Repeat Groups and Rosters in SurveyCTO
This section lists code examples that fulfill special requirements related to rosters and repeat groups. These can be used to develop interesting functionalities within forms, particularly with responses from a household, plot, or crop roster.
Read First
- Repeat Groups and rosters can be utilized to carry out useful commands
- Examples include using repeat groups to repeat a set of questions over previously selected responses, and filtering to remove already selected options from select_one fields
- While some of the examples may seem very specific, they can be generalized to any task that has the same outline
Repeat Group Using Previous Choices
There are many cases when you want to repeat a set of questions over previously selected responses, such as a set of crops cultivated or activities performed. Here we use an example based on a select_multiple question of 14 potential crops, where we want to ask follow up questions about the selected crops.
Both approaches require the use of the index() command, which yields the number of the repeat. Note that if NO CROPS is selected in the select_multiple question, the repeat groups are skipped. Also, it isn't possible to select NO CROPS and any other crops from the choices.
Code Example
Here is a code example showing the 2 main ways to achieve this. The comments in the labels of the calculated fields describe the steps taken in the SurveyCTO form.
1) Repeating for all options and adding a constraint inside the group
Here we code the repeat group count to cycle through all the possible options of the 'crop' choices (14 in total) and add a constraint to the questions contained within the repeat group. The setup only needs a variable (${crop_name1}) to pull the crop name from the select_multiple options. There are two advantages and one drawback to this method:
- easier to code
- output links the crop ID to the repeat count
- can create large number of fields/variables when there are a lot of choices, particularly for nested repeats (e.g. crops within plots within seasons) which can slow down form processing on the tablet
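The logic of this first method can be sketched outside SurveyCTO. The Python below is only an illustration under assumptions: the crop labels, the function name repeat_all_options and the space-separated answer string "1 4 5" are hypothetical stand-ins for the form's choice list and select_multiple answer, not part of the original form.

```python
# Hypothetical choice list: crop ID -> label (the example form has 14 crops).
CROPS = {crop_id: f"Crop {crop_id}" for crop_id in range(1, 15)}


def repeat_all_options(crops_selected: str) -> list[dict]:
    """Simulate method 1: one repeat instance per possible choice.

    crops_selected mimics a select_multiple answer, assumed here to be a
    space-separated string of crop IDs such as "1 4 5".
    """
    selected = crops_selected.split()
    rows = []
    for index in range(1, len(CROPS) + 1):  # index() counts repeats from 1
        rows.append({
            "index": index,                 # repeat number equals crop ID
            "crop_name": CROPS[index],      # name pulled from the choice list
            # Questions are only shown when this crop was actually selected.
            "asked": str(index) in selected,
        })
    return rows


rows = repeat_all_options("1 4 5")
asked = [row["index"] for row in rows if row["asked"]]
print(len(rows), asked)
```

Every one of the 14 repeats exists in the output even though only three are asked, which is the source of both the easy ID-to-repeat link and the large number of empty fields.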
2) Only repeating for selected options
This method utilizes how the select_multiple fields are stored (e.g. 1, 4, 5). It selects the corresponding number (i.e. the crop ID) in the list in order through the selected-at(${crops2}, index()-1) calculate expression. Note that the '-1' is needed because selected-at() counts positions in the list from 0 (not 1), as in programming languages like Python, while index() starts at 1.
Once we know the crop ID, the crop name is pulled in the same manner as the other example. As the number of repeats is dynamic, we need to tell the repeat group to only repeat the number of times for which there are selected crops, which is: count-selected(${crops2}). The setup of this method requires calculated fields for the crop ID and the crop name. There are two advantages and two drawbacks to this method:
- more elegant, less missing data
- runs faster on large forms
- can be trickier to code
- output crop IDs do not line up with repeat count, need to account for this in data cleaning
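As a rough illustration of the second method, the Python below imitates selected-at() and count-selected() with small helper functions. The helpers, the crop labels and the space-separated answer string are assumptions made for this sketch; they mirror the behavior described above rather than reproduce SurveyCTO's implementation.

```python
# Hypothetical choice list: crop ID -> label.
CROPS = {crop_id: f"Crop {crop_id}" for crop_id in range(1, 15)}


def selected_at(answer: str, position: int) -> str:
    """Stand-in for selected-at(): item at a 0-based position, or ''."""
    items = answer.split()
    return items[position] if 0 <= position < len(items) else ""


def count_selected(answer: str) -> int:
    """Stand-in for count-selected(): number of selected items."""
    return len(answer.split())


def repeat_selected_only(crops_selected: str) -> list[dict]:
    """Simulate method 2: one repeat instance per selected crop."""
    rows = []
    # The repeat count is dynamic: count-selected(${crops2}).
    for index in range(1, count_selected(crops_selected) + 1):
        crop_id = selected_at(crops_selected, index - 1)  # index() is 1-based
        rows.append({
            "index": index,
            "crop_id": crop_id,
            "crop_name": CROPS[int(crop_id)],
        })
    return rows


rows = repeat_selected_only("1 4 5")
for row in rows:
    print(row["index"], row["crop_id"], row["crop_name"])
```

The second repeat holds crop ID 4, not 2, which is why the crop ID has to be stored in its own calculated field and used instead of the repeat count during data cleaning.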
Filtering in Repeated Questions
This example demonstrates how to remove already selected options from select_one fields in repeat groups. It has several applications, such as choosing a favorite something, 2nd favorite, etc., or programming flexibility about the order in which the respondent answers questions about a particular plot, crop, family member, etc. It is important that the repeat_count is always well defined: once every option has been used, no choices remain to select from and the form will not allow you to progress.
Code Example
Here is a code example. We have a situation where a team is going out to map the plots of an association, but first they want to know the order in which they will map the 10 plots.
To set this up, we have a repeat group with a select_one field to choose the 1st plot that is mapped. In the choices tab, the values have corresponding values in the filter.
The choices become filtered after each repetition through the expression not(selected(${plot_order}, filter)) in the choice_filter column. This removes choices that have already been selected for this question. Note that SurveyCTO aggregates over each instance of ${plot_order} in this filter expression, despite each of them being an individual select_one.
The example includes a join() of the responses that dynamically updates, to help visualize how the filter is being populated.
Note that this could be coded without a repeat group, with the size of the filter increasing with each plot selected.
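The filtering behavior described above can be imitated in a few lines of Python. This is a sketch under assumptions: the CHOICES rows, the function names and the sample picks are hypothetical, and the set lookup stands in for SurveyCTO aggregating over every earlier instance of ${plot_order}.

```python
# Hypothetical choices sheet: each plot has a value and a matching filter value.
CHOICES = [
    {"value": str(n), "filter": str(n), "label": f"Plot {n}"}
    for n in range(1, 11)
]


def available_choices(previous_answers: list[str]) -> list[str]:
    """Mimic not(selected(${plot_order}, filter)) across all earlier repeats."""
    already_picked = set(previous_answers)
    return [c["value"] for c in CHOICES if c["filter"] not in already_picked]


def joined(previous_answers: list[str]) -> str:
    """Mimic join(' ', ${plot_order}): the running list of answers so far."""
    return " ".join(previous_answers)


answers: list[str] = []
for pick in ["3", "1", "7"]:  # illustrative mapping order
    if pick not in available_choices(answers):
        raise ValueError(f"Plot {pick} was already selected")
    answers.append(pick)

remaining = available_choices(answers)
print(joined(answers), remaining)

# Once all 10 plots have been picked, nothing is left to choose from,
# which is why the repeat count must never exceed the number of choices.
exhausted = available_choices([c["value"] for c in CHOICES])
```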
Dealing With 'Other' Choices in a Repeat Group
Inside a self-contained repeat group, dealing with 'other' choices is relatively straightforward - we can add an 'other' response to a dynamically populated choice list and select from there. However, this gets complicated under more complex structures.
Imagine that we are surveying smallholder farmers about their crop production and sales over the last year. To aid with the analysis and structure the interview, we should take into account the crops produced, which plots they were grown on, and in which season. This might require having 3 layers of repeat groups: season > plots > crops. We are easily able to set up an 'other' crop as an option for crops produced at the bottom level. However, once we want to refer to this 'other' crop at either the plot or season level or in a different repeat group, SurveyCTO cannot then link this selected 'other' crop's ID with its name.
There are several ways around this, though none are perfect:
- Ask enumerators to reserve an 'other' crop option for a specific crop throughout the whole survey, where 'other crop 1' would always refer to one type, 'other crop 2' to another type and so on. In theory this would work, but is not recommended because of the large potential for enumerator error.
- Keep all questions related to the crops contained within their respective repeat group at the lowest level. Again, this is not ideal for the form structure, as we don't always, for example, want to ask about a crop's sales or processing over plots or seasons - we may just want to ask at the crop level.
- Define all the potential crops at the start (highest level) of the agriculture section. If we establish what the 'other' crops are at the start of the section, we can then dynamically add these to the crop choices in each lower level repeat group. The main drawback with this approach is that once the crops are defined you cannot define more later on if the respondent forgets one. Out of the 3 approaches, this is the most viable solution to the problem described above, especially when enumerators are well trained to predefine these 'other' crops and good form piloting has already defined all potential crops in the choice list. Also, any crops that the respondent may forget probably aren't that important anyway. The coding example below shows how to develop a form under this approach.
Coding Example
This is the link to the code example. We have a survey where we want crop production data by plot in each season, so we set up a repeat group structure of season > plots > crops. After this, we also want sales data. Let's say that a farmer combines all their production together after harvest, so that asking about crop sales per plot is confusing for them. In this case, we would want to ask about sales of each crop per season, so we would need a new repeat group structure of season > crops. If we just code 'other' crops at the plot level, when these are aggregated, SurveyCTO will not know which 'other' crop is being referred to or what its name is. This approach overcomes this problem.
- After asking if they grew any crops, we then ask what specific crops they cultivated during the past year. It is vital here that ALL 'other' crops are defined. Therefore if there are 3 crops missing from the list, 'Other crop 1', 'Other crop 2' and 'Other crop 3' must be selected and subsequently defined. The choice_filter column restricts the choices to being the undefined versions of the 'other' crops.
- We then ask about the crops actually cultivated at the season > plot level. Even though we are referring to the same choice options, the list filters IN the defined 'other' crops and filters OUT the undefined options. Note the value and filter numbers for the crop choices. Only those that were selected in the first stage appear in the second stage. The ID and the name for the defined 'other' crops continue throughout the rest of the survey in this structure.
- To structure the questions at the season > crops level, we first must aggregate all the crops that have been cultivated at the plot level. We do this through the expression de-duplicate(' ', join(' ', ${crop1_id})), which first joins all the crop IDs at the plot level, then de-duplicates this list. We are left with a select_multiple type of number list, e.g. 1, 5, -81, 7. In this case, -81 would refer to the first of the defined 'other' crops.
- We can then pull the crops from this number list in the crop2 repeat group, and pull the name from the choices from ${crop}. Note that an error will result if the 'other' crops were defined inside the crop1 group rather than at the uppermost level.
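The aggregation step in this list can be illustrated with a short Python sketch. The helpers join and de_duplicate are stand-ins written for this example, and keeping the first occurrence of each ID in order is an assumption about how de-duplicate() behaves. The crop labels, including "Sorghum" as the name typed in for the first 'other' crop, are hypothetical.

```python
def join(separator: str, values: list[str]) -> str:
    """Stand-in for join(): concatenate values from all repeat instances."""
    return separator.join(values)


def de_duplicate(separator: str, joined_list: str) -> str:
    """Stand-in for de-duplicate(): drop repeated items, keep first occurrences."""
    seen: list[str] = []
    for item in joined_list.split(separator):
        if item and item not in seen:
            seen.append(item)
    return separator.join(seen)


# Hypothetical labels. "-81" is the first 'other' crop, named at the top level.
CROP_LABELS = {"1": "Maize", "5": "Beans", "7": "Cassava", "-81": "Sorghum"}

# Hypothetical crop1_id values collected in the plot-level repeats of one season.
crop1_id_by_plot = [["1", "5"], ["5", "-81", "7"]]
all_ids = [crop_id for plot in crop1_id_by_plot for crop_id in plot]

# Equivalent in spirit to de-duplicate(' ', join(' ', ${crop1_id})).
season_crops = de_duplicate(" ", join(" ", all_ids))

# The crop2 repeat group then walks this list and looks up each name.
crop2_names = [CROP_LABELS[crop_id] for crop_id in season_crops.split()]
print(season_crops, crop2_names)
```

The lookup of "-81" only succeeds because its name was recorded once at the top level. If it had been defined inside a plot-level repeat, there would be no single label to retrieve here.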
Roster Age Order
In this example we list the names and ages of up to 10 children in a repeat group. In the second repeat group, we list the children from youngest to oldest and ask the respondent to select a child that has their mother present. If the first child's mother is not present, then the second youngest child is listed and the respondent is again asked to select the child that has his or her mother present. If two children have the same age, then both children are listed at the same time and the respondent is asked to select one of them.
If a child is selected, no more children are listed and the name and index in the first repeat group are stored in separate calculated fields so that they can be referenced later. If no child has a mother present, then these two calculated fields are empty.
Code Explanation
The example code can be found here. The first loop and how the child names are dynamically loaded into the choice list used in the second loop is explained in the Dynamically Populated Choice Lists section.
The second loop is repeated up to the same number of times as there are unique ages in the child roster, but the repeats are stopped as soon as a child is selected. That is achieved by first joining all ages in child_ages and removing duplicates. Duplicates are removed because children with the same age are listed at the same time. Then the number of elements in child_ages is counted in the field unique_ages_num. This is used as the maximum number of repeats, but the repeat count itself is created in mother_select_repeat_count, which is set to 0 as soon as a child is selected.
The most complicated part of the code is the filter expression in select_mother_present. The expression has two parts (insert code here). The second part is the easiest and is always used to show the option Not Present.
The first part of the filter expression uses indexed-repeat() to get the age of all children and evaluates how they rank in the list in child_ages using the rank-value() function. The filter column in the choices tab makes sure that each child is associated with their correct age in indexed-repeat(). So the children that will be listed are the children whose age has a rank in the child_ages list that equals the value on the right hand side of the equality sign.
We cannot compare the age rank described above directly with mother_index as mother_index returns ascending values and rank-value() returns descending values. Remember that we want to list the youngest child first but rank-value() returns the oldest child as rank 1. Therefore ${unique_ages_num} + 1 - ${mother_index} converts the mother_index so that the expression to the right of the equality sign (will be seen once code inserted in previous paragraph) descends from the number of unique ages (unique_ages_num) to 1, and thereby lists the youngest child first.
When no child is selected in a repeat, child_index_mother_present is set to zero but when a child is selected, it is set to the index in the first loop of that child. When child_index_mother_present takes a non-zero value, child_index_selected is no longer zero and the repeat group count is set to zero and exited. child_index_selected is then used to return the name of the child selected.
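The ordering logic can be checked with a small Python simulation. Everything here is a sketch under assumptions: rank_value is a stand-in that gives rank 1 to the largest unique age, as the text describes for rank-value(). The roster, the function names and the mother_present set (which replaces the respondent's answers) are hypothetical.

```python
def rank_value(value: int, values: list[int]) -> int:
    """Stand-in for rank-value(): rank 1 goes to the largest unique value."""
    return 1 + sum(1 for other in set(values) if other > value)


def children_listed(children: list[tuple[str, int]], mother_index: int) -> list[str]:
    """Names shown in repeat number mother_index (1 = youngest age group)."""
    child_ages = sorted({age for _, age in children})  # de-duplicated ages
    unique_ages_num = len(child_ages)
    # Convert the ascending repeat index into a descending rank.
    target_rank = unique_ages_num + 1 - mother_index
    return [name for name, age in children
            if rank_value(age, child_ages) == target_rank]


def first_child_with_mother(children, mother_present):
    """Return (roster index, name) of the selected child, or None."""
    unique_ages_num = len({age for _, age in children})
    for mother_index in range(1, unique_ages_num + 1):
        for name in children_listed(children, mother_index):
            if name in mother_present:
                roster_index = [n for n, _ in children].index(name) + 1
                return roster_index, name  # repeats stop once a child is chosen
    return None


roster = [("Ana", 7), ("Ben", 3), ("Cy", 7), ("Dee", 5)]  # hypothetical roster
order = [children_listed(roster, i) for i in (1, 2, 3)]
chosen = first_child_with_mother(roster, mother_present={"Cy", "Dee"})
print(order, chosen)
```

With this roster, the two 7-year-olds share a rank and are listed together in the last repeat, matching the handling of children with the same age described earlier.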
Additional Resources
- Repeating Fields
- Grouping and Repeating Questions
- Understanding indexed-repeat() and the Structure of Repeat-Group Data
- Follow-ups: Asking follow-up questions for a list of selected items
- Guide to Choice Filters Part 3 (Examples)
- See examples 3-5 for help on related problems with filtering
- Creating an open response field after a multiple choice question that asks users to "specify other"
- Rosters: Two Methods for Repeated Questions
- Rosters: A Third Hybrid Approach for Repeated Questions
- Rosters: Choosing Among Earlier Entries
- Rosters: Collecting Repeated Information with Multiple Repeats