JSON storage for questionnaires
| Warning | Work in progress! We are in the process of updating the information on this page. Subject to change. |
This document formally describes new formats used for storing questionnaire data, which were designed with the goal to minimize storage footprint.
Questionnaire templates (or simply templates) are created in the Questionnaire Designer tool. They represent suitably arranged questions, but – what is important – no answers.
After the user fills in the questions, we have an answered questionnaire. In the end, there is the original template plus a set of answers. All this data must be stored somehow. This can be done in many ways, each with its own pros and cons.
All new formats are JSON-based. This means they serialize questionnaire content into a JSON string. After reading this document, you should be able to read (decode) this string.
Questionnaire data model originally worked with 3 entities:
- resco_questionnaire entity representing either the template or answered questionnaire.
- resco_questiongroup entity representing question groups.
- resco_question entity representing individual questions.
The above entities are documented here. A questionnaire consists of questions that can be optionally grouped into groups.
In original storage format, questions and groups were stored as separate records and the filled questionnaire was fully self-describing, i.e., independent from its template. (For example, the questionnaire could be used even after its template was modified or deleted.) While this is a very flexible concept, drawbacks of this design appeared soon:
- Too many records and consequently long sync times.
- Integrity problems. (Because a questionnaire is a distributed set of records.)
- Large storage consumption.
To address the above problems, Resco did several upgrades to the format. Among them, two concepts are especially important:
- Template-dependent questionnaires: Design-related data is not copied into answered questionnaires saving thus a considerable portion of storage. Template changes are addressed by template versioning. (Drawback is that the template is needed for the processing of answered questionnaires.)
- JSON storage format addresses the problem of integrity and of “too many records”: When the questionnaire is saved, its child records (group/question) are serialized into one JSON string, stored as a specific questionnaire field (resco_serializedanswers) and the child records are subsequently discarded. When the answered questionnaire is opened again, the original group and question records are reconstructed from the JSON storage.
However surprising it may sound, the JSON format did not solve all the storage problems. The reason was that it stored too many details and even some redundant information.
Another problem was the limits imposed by the backend servers, which limit field size (JSON string was originally stored in a single field) to 1M characters (Dynamics) or 128K characters (Salesforce).
Resco offers also optional JSON compression, which squeezes the storage requirements much further. However, this format has also drawbacks, especially if the customers want to process the results themselves.
To finalize the format evolution, with release 15.1, Resco introduced the next-generation storage format. And to fit all possible (often contradictory) requirements, there are two formats, both JSON-based:
- Flexible JSON
- This format cures the basic problems of the original JSON format, i.e., it leaves out nearly all information that can be reconstructed from the template. Moreover, customers can customize the format by deciding which data items are stored.
- Minimal JSON
- Goes even further as it stores the bare minimum – key/value pairs, where the key is the question name and the value represents the user answer. This is the most storage-efficient JSON format possible. Those who don’t need additional data (such as question scores) can enjoy minimal storage footprint and easy result processing.
Apart from the question data, both formats also contain the questionnaire header, containing the most important items such as IDs of both the answered questionnaire and its template. The weight of these items is small, especially in the case of large questionnaires. They were added to make the JSON information as complete as possible, so that the customers may eventually export JSON strings “as-is” into data warehouses for long-term storage.
Of course, both formats can be optionally compressed in the same way as the original JSON format.
Legacy JSON
This chapter has informational value only as we are deprecating this format.
The JSON string is constructed as follows:
{
"formatversion": 1,
"answers": [AnswerItem1..AnswerItemN],
"groups": [GroupItem1..GroupItemM]
}
AnswerItems is a dictionary that can have the following keys:
| Field name without the "resco_" prefix |
Default | Type | Notes |
|---|---|---|---|
| questionid | GUID | ||
| name | string | (unique in context of questionnaire) | |
| answeredon | null | Date | (‘2011-10-05T14:48:00.000Z’) |
| value | null | string | question response |
| defaultvalue | “” | string | |
| description | “” | string | |
| questiongroupid | null | string | (“resco_questiongroup,guid,groupname”) |
| icon | “” | string | |
| index | 0 | int | order of the question in questionnaire |
| kind | 0 | int | type of the question, datetime, lookup, text… |
| label | “” | string | display name of the question |
| min | “” | string | minimum number |
| max | “” | string | maximum number |
| options | “” | string | configuration for various question types |
| precision | 0 | int | precision of decimal numbers |
| style | “” | string | style name for the question |
| valuelabel | “” | string | human readable (formatted) value, used in reports |
| visible | true | Boolean | initial visibility of the question |
| required | 473220000 | int | None (473220000), SystemRequired (473220001) |
| isseparator | false | Boolean | if true this is static question (description, image…) |
| displayformat | 473220000 | int | question format (text, URL, email, duration…) |
| answerstorage | 1 | int | where the answer is stored: 0-record, 1-Json. |
| showonreport | 1 | int | whether this question should be displayed on the report |
| rules | “” | string | question specific rules |
| score | 0 | int | question score |
| localization | “” | string | |
| layout | “” | string | position of the question on the grid |
GroupItems is a dictionary that can have the following keys:
| Field name without the "resco_" prefix |
Default | Type | Notes |
|---|---|---|---|
| questiongroupid | GUID | referred by question.questiongroupid | |
| description | “” | string | |
| expanded | true | Boolean | |
| index | 0 | int | order of the group in questionnaire |
| label | “” | string | display name of the group |
| repeatconfig | “” | string | config for repeating groups |
| repeatindex | 0 | int | |
| rules | “” | string | group specific rules |
| templategroupid | GUID | reference to a group from the template | |
| visible | true | Boolean | initial visibility of the question |
| localization | “” | string | |
| layout | “” | string | layout definition (list/grid…) |
What is serialized: If the value of the question field is different from null and from the default, the field is serialized (added to the AnswerItems, resp. GroupItems dictionary).
A serialization sample can be found here.
How to improve this format?
The biggest problem with this format is that it stores too many fields. For example, IDs are not needed as the question/group records are re-created on an as-needed basis; rules, question labels or descriptions can be read from the template, etc.
It sounds tempting to “improve” storage format by simply leaving out all the above fields. However, questionnaire business logic allows changing even seemingly immutable fields. For example, some customers want to dynamically change question labels depending on previous answers, and they want to keep this info for reporting.
Example of minimal JSON
Let’s have the questionnaire form filled as follows:
Age = 12
City = Senec
Sex = Male
[Computer]
Price = 427.12
Comments = Not working at all
[Repair]
Technician = contact;<id>
Repairedon = 2022-02-16 11:18:34
No of issues = 2
[Topic]
Mark = 4.5 Passed = 0 Signature [empty]
[Topic]
Mark = 2.1 Passed = 1 Signature [image1] [image2]
The form contains 3 simple questions, 2 groups (Computer, Repair), and one repeatable group Topic. The serialized JSON in this minimal format might look like this:
{
"@ver": "m1.0",
"@q": {
"id": "eb614e48-4c4f-e911-a9a5-000d3a37da27"
},
"@root": {
"age": 12,
"city": "Senec",
"sex": "Male"
},
"computer": {
"price": 427.12,
"comments": "Not working at all"
},
"repair": {
"technician": "contact;f868ef1e-cff8-e811-a980-000d3a37d634",
"repairedon": "2022-02-16T11:18:34.000Z",
"no_of_issues": 2
},
"topic": [
{ "mark": 4.5, "passed": 0 },
{ "mark": 2.1, "passed": 1, "sig": {"c": 2, "s": "annotation"} }
]
}
The second topic has two images stored in annotation records with subjects sig#002$1 and sig#002$2. Both annotations are attached to the resco_questionnaire record with the ID specified in @q.id. (See the Images chapter for details.)
Example of flexible JSON
The same data stored in the flexible JSON format might look like this (note the different specification of image storage):
{
"@ver": "f1.0",
"@q": {
"id": "eb614e48-4c4f-e911-a9a5-000d3a37da27",
"on": "2022-03-16T11:18:34.000Z",
"s": 2427
},
"@root": {
"age": { "v": 12, "s": 2, "on": 3544, "vl": "12" },
"city": { "v": "Senec", "s": 0, "on": 3734 },
"sex": { "v": "Male", "s": 5, "on": 3900 }
},
"computer": {
"@props" : { "e": 0},
"price": { "v": 427.12, "s": 427, "on": 4000, "vl": "$427.12" },
"comments": {
"v": "The device is not working at all",
"s": 0,
"on": 4120
}
},
"repair": {
"technician": { "v": "contact;f868ef1e-cff8-e811-a980-000d3a37d634", "s": 5, "on": 4200, "vl": "WaBez" },
"repairedon": { "v": "2022-02-16T11:18:34.000Z", "s": 0, "on": 4370 }
},
"topic": [ {
"mark": { "v": 4.5, "s": 5, "on": 4600, "vl": "4.5" },
"passed": { "v": 0, "s": 0, "on": 4690, "vl": "Did not pass" }
},{
"mark": { "v": 2.1, "s": 2, "on": 4900, "vl": "2.1" },
"passed": { "v": 1, "s": 10, "on": 4950, "vl": "Passed" },
"sig": { "v": {"c": 2, "s": "annotation"}, "on": 5200 }
}
]
}