Difference between revisions of "SurveyCTO Choice Lists"
Revision as of 18:51, 23 July 2023
Choice Lists are the answer options from which an enumerator chooses in a select_one or select_multiple question.
Read First
- There are many operations that can be performed with choice lists, such as dynamic populating and filtering.
Choice Lists
As stated previously, choice lists are the answer options from which an enumerator chooses in a select_one or select_multiple question. They are listed in the choices tab in the SurveyCTO questionnaire. Open Data Kit (ODK), the platform on which SurveyCTO is built, places very few restrictions on how you can code your options.
Only Numeric Values in the Name Column
The name column is unfortunately named, as you should never have text in this column, only numbers. Text values are more difficult to work with when used in relevance or constraint conditions.
Also, one step of data cleaning is to replace all string variables (some exceptions exist) with numeric variables. That task is greatly reduced if we assign a numeric code to each answer option already when coding the questionnaire. SurveyCTO provides a Stata do-file that creates labels and adds them to the numeric values.
Negative and Standardized Values for Non-Answers
For any answer that is a type of non-answer, for example "Don't know", "Question does not apply", "Decline/Refused to Answer" or "Other", we should use a code that is negative and that has the same meaning across a project.
We want the value to be negative for the following two reasons:
- The first reason is that these values stand out when cleaning the data set and are therefore easier to address. This is the case when tabulating a variable or looking at its distribution, and also when looking at descriptive statistics: negative values distort, for example, means to such a degree that the project team will be reminded to look for these codes. To increase this effect, it is better to pick -999 than -9 to represent, for example, "Don't Know".
- The second reason is that we might want to add more answer options in later rounds. If the number 9 represents "Don't Know", there is a chance that we will need that value for a category that we add if we end up with more than eight categories. We could, of course, assign the new category the code 10, but then we would have a non-answer in the middle of actual answers, which is not optimal. We should absolutely never shift the "Don't know" code from 9 to 10 and give the added category the code 9. This is the worst solution of all, because in a panel data set the same code that means an actual answer in the follow-up data means "Don't know" in the earlier data.
Using the same code for the same non-answer across a project reduces the risk of confusion and makes the data easier to clean.
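The effect of large negative codes on descriptive statistics can be illustrated with a minimal Python sketch. The specific codes in NON_ANSWER_CODES and the response values are hypothetical and chosen only for illustration; a project would define its own list once and reuse it in every questionnaire.

```python
from statistics import mean

# Hypothetical project-wide codes for non-answers (illustrative values only).
NON_ANSWER_CODES = {
    -999: "Don't know",
    -888: "Declined/refused to answer",
    -777: "Question does not apply",
    -666: "Other",
}


def split_answers(values):
    """Separate real answers from non-answer codes."""
    real = [v for v in values if v not in NON_ANSWER_CODES]
    flagged = [v for v in values if v in NON_ANSWER_CODES]
    return real, flagged


# Hypothetical responses to a numeric question; two respondents did not know.
responses = [2, 3, -999, 1, 4, -999, 2]

raw_mean = mean(responses)            # pulled far below zero by the -999 codes
real, flagged = split_answers(responses)
clean_mean = mean(real)               # mean of the real answers only

print("raw mean:", raw_mean)
print("mean of real answers:", clean_mean)
print("non-answers found:", len(flagged))
```

Because the raw mean is clearly implausible, the codes are hard to overlook; with a code such as 9 the mean would shift only slightly and the problem could go unnoticed.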
Dynamically Populated Choice Lists
Sometimes, we want to ask a respondent to select one or several answers out of answers the respondent has given earlier in the interview. For example, we might want to ask who in the household out of the members listed in the household roster module is currently employed. It is possible to do this by dynamically loading previous answers as answer options.
Best Practice
Any answer can be used in dynamic choice lists, but when using variables inside a repeat group, one more step is required because we cannot reference a field that was inside a repeat group directly. We need to first store the value of this field so that SurveyCTO knows the item that we are referring to. The calculated field is the intermediate step required. The same applies if you want to build a relevance with the value of a field that is filled inside a repeat group.
Coding Example
Here is a code example of how the answers to a field inside a repeat group are used to dynamically load the answer options for a select_one or a select_multiple question.
This example dynamically loads answers from a repeat group to be able to use them later in the questionnaire: as relevance, or dynamically populated choice lists for example. If you were to dynamically load answers from fields not inside a repeat group, then you simply reference those fields directly in the choice tab.
Note that this coding example can be improved in many ways. It has excluded some possible improvements in order to highlight the functionality discussed in this section. The most obvious improvement would be to filter the answer options so that only the answer options needed are displayed. See the section about choice filters.
Note that SurveyCTO will allow you to upload a form to the server even if you reference an answer from a repeat group directly in your programming, without first storing it in a calculated field. The error message will appear only on the tablet when swiping to the next question. For that reason, always test your form before field data collection.
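As a rough illustration of the intermediate step described in this section, the following Python sketch models the logic rather than reproducing the form itself. The roster contents, MAX_MEMBERS and the helper names are hypothetical. In a real form the stored value would be a calculated field, typically built with a function such as indexed-repeat(); that is an assumption here, since the original coding example is not reproduced.

```python
MAX_MEMBERS = 4  # hypothetical maximum number of roster entries


def stored_name(roster, index):
    """Model of a calculated field that stores the name entered in
    repeat instance `index` (1-based), or blank if there is no such instance."""
    if index <= len(roster):
        return roster[index - 1]["name"]
    return ""


def build_choices(roster):
    """Build the answer options for the later question.

    One stored field per possible roster slot is used as the label of a
    choice; slots that were never filled are filtered out.
    """
    stored = {i: stored_name(roster, i) for i in range(1, MAX_MEMBERS + 1)}
    return {code: label for code, label in stored.items() if label != ""}


# Hypothetical household roster with three members.
roster = [{"name": "Amina"}, {"name": "Joseph"}, {"name": "Grace"}]
choices = build_choices(roster)
print(choices)
```

Dropping the blank slots corresponds to the improvement mentioned above, where only the answer options that are actually needed are displayed.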
Dynamically Populated Choice Lists from Select_One
A specific example of a dynamically populated choice list is when you populate a select_multiple question with answers from a select_one question asked in a repeat group. For example, say that you list crops grown in a repeat group where each repeat is a crop, and later you want to be able to ask "which crop did you grow the most?" and only the crops already selected in the repeat group should display.
Code Example
The key to this is that it is possible to take values from multiple fields and replicate the format in which SurveyCTO stores the result, and then use select_multiple functions on that replicated list. Here is an example for the recommended implementation, and the text below explains in detail.
In this code example, we use a simple crop roster and then ask the respondent to select the crop out of the crops already selected that they grew the most. Use this code example as a starting point and modify it to the specific requirements that you need.
To understand what is going on, we start with a slightly simplified explanation of how ODK/SurveyCTO stores data on the tablet. A select_multiple answer is stored like this: 1 3 12, where each number is the code of one selected answer and the codes are separated by spaces. If we have the individual values 1, 3, and 12, we can create this list manually. If the values are stored in the fields Field1, Field2 and Field3, each of type select_one, we can create the list with concat(${Field1}, ' ', ${Field2}, ' ', ${Field3}). The ' ' part makes sure that each value is separated by a space. Using concat() quickly gets messy as the number of fields increases.
If each individual value comes from a repeat group instead, we can use join(' ', ${field}), where field is the name of the select_one field in the repeat group with the individual values, and ' ' is the symbol (a space) that should separate the values. Using join() or concat() in the way described here, we get a list of the format 1 3 12. On this list we can use functions like selected(), which we can use in a filter expression to display only the choice options in this list. While selected() is meant to be used on a list created by a select_multiple field, there is nothing stopping us from using that function on lists we created manually, as long as the format is identical.
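The idea can be made concrete with a small Python model of the three functions. These are simplified stand-ins written for illustration, not SurveyCTO's own implementation; the point is only that a manually built list is indistinguishable from a stored select_multiple answer once the format matches.

```python
def concat(*parts):
    """Model of concat(): glue all arguments together as text."""
    return "".join(str(p) for p in parts)


def join(separator, values):
    """Model of join(): combine one field's values across repeat instances."""
    return separator.join(str(v) for v in values)


def selected(answer, code):
    """Model of selected(): is `code` one of the space-separated items?

    The answer is split into whole items first, so code 2 does not
    match just because it appears inside 12.
    """
    return str(code) in answer.split()


# Three separate select_one answers combined by hand.
from_fields = concat(1, " ", 3, " ", 12)

# The same values collected from three instances of a repeat group.
from_repeat = join(" ", [1, 3, 12])

print(from_fields == from_repeat)
print(selected(from_repeat, 3), selected(from_repeat, 2))
```

Both routes produce the same text, which is why functions written for select_multiple answers can be applied to either one.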
In the code example linked to above, the field crop in the repeat group uses the same choice list as the field crop_most. This is important for the simplicity of the filter we use. For crop_most the filter expression only displays the choice options for which the value in the filter column in the choice tab is in the list created by join(). If the question were different it would be perfectly possible to change crop_most to a select_multiple.
Note that in the code example, a similar filter is used to prevent the same crop from being selected twice in the repeat group. By using the not() function we get the exact opposite of the filter we use for crop_most.
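The two filters can be compared side by side in a short Python sketch. The crop codes and labels are hypothetical, and selected() is again a simplified model of the SurveyCTO function; the sketch only shows that the second filter is the logical negation of the first.

```python
# Hypothetical shared choice list used by both crop and crop_most.
CROPS = {1: "Maize", 2: "Beans", 3: "Rice", 12: "Cassava"}


def selected(answer, code):
    """Simplified model of selected() on a space-separated list."""
    return str(code) in answer.split()


def choices_for_crop_most(crop_list):
    """Show only crops that were already chosen in the repeat group."""
    return {c: label for c, label in CROPS.items() if selected(crop_list, c)}


def choices_for_next_repeat(crop_list):
    """Show only crops that were not chosen yet (the not() version)."""
    return {c: label for c, label in CROPS.items() if not selected(crop_list, c)}


crop_list = "1 12"  # list built from the repeat group so far

print(choices_for_crop_most(crop_list))
print(choices_for_next_repeat(crop_list))
```

Together the two results always partition the full choice list, which is what makes the shared list convenient.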
Advanced special case
If the individual values do not come from a select_one field but from text fields instead, then it is possible to combine this method with the method where the choice lists are dynamically created from text fields. If you are an advanced SurveyCTO user you can try this yourself; otherwise let us know and we will create an example if there is enough demand.
Conditional Filtering
This subsection discusses how to restrict ('filter') the choices available from dynamically populated choice lists. In the example below, we use a household roster, where we want follow-up questions to only allow choices of
- males under 20
- females over 20.
The key is that we can lift the fields from inside a repeat group up one level and combine them there using the join() function. This example builds upon the Dynamically Populated Choice Lists example by allowing us to focus on particular parameters during follow-up questions.
Coding Example
Here is a code example. The explanation is below:
- Set up a calculated field inside the repeat group for each condition that you want to filter later (lines 17-20). The field should yield the repeat number if TRUE, blank if FALSE.
- Add a set of joining calculated fields outside of the repeat group (lines 31-34). These result in a list of the repeat group calculated field results that adhere to the earlier condition, which are stored in select_multiple format, e.g. 1 4 5.
- Leverage this field by using it as the base for the filter in the questions on lines 35 and 36. Note the relevance fields only show these questions if at least 1 instance of the condition is TRUE.
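The three steps above can be modelled in Python as follows. The roster data and all helper names are hypothetical, and the functions are simplified stand-ins for the calculated fields, join() and selected(); the sketch shows the flow of values rather than the form's actual syntax.

```python
# Hypothetical household roster; list position + 1 is the repeat number.
roster = [
    {"name": "Amina", "sex": "female", "age": 34},
    {"name": "Joseph", "sex": "male", "age": 15},
    {"name": "Grace", "sex": "female", "age": 12},
    {"name": "Peter", "sex": "male", "age": 41},
    {"name": "Samuel", "sex": "male", "age": 8},
]


def flag(index, condition):
    """Step 1: inside the repeat, yield the repeat number if TRUE, blank if FALSE."""
    return str(index) if condition else ""


def joined(flags):
    """Step 2: outside the repeat, combine the flags into one space-separated list."""
    return " ".join(flags)


def selected(answer, code):
    """Simplified model of selected(); blanks vanish when the list is split."""
    return str(code) in answer.split()


males_under_20 = joined(
    flag(i, m["sex"] == "male" and m["age"] < 20) for i, m in enumerate(roster, 1)
)
females_over_20 = joined(
    flag(i, m["sex"] == "female" and m["age"] > 20) for i, m in enumerate(roster, 1)
)


def follow_up_choices(flag_list):
    """Step 3: keep only roster members whose repeat number is in the list."""
    return [m["name"] for i, m in enumerate(roster, 1) if selected(flag_list, i)]


def is_relevant(flag_list):
    """Show the follow-up question only if at least one member qualifies."""
    return len(flag_list.split()) > 0


print(follow_up_choices(males_under_20), is_relevant(males_under_20))
print(follow_up_choices(females_over_20), is_relevant(females_over_20))
```

Keeping one flag field per condition means each follow-up question can use its own list without affecting the others.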