Difference between revisions of "SurveyCTO Coding Practices"

 
(26 intermediate revisions by 4 users not shown)
Line 1: Line 1:
<onlyinclude>This article discusses solutions to common issues in the SurveyCTO programming language. For a general introduction to how to structure your approach to CAPI programming or best practices settings, see the [[Questionnaire Programming]] topic.</onlyinclude>
#REDIRECT[[SurveyCTO Additional Topics#Integrating Calculations]]


'''Read First''' All coding examples linked to in this section are stored in Google Drive. SurveyCTO also allows you to pull this code directly to your server, using the URL of the Google Sheet (alternatively, you can copy the code to Excel).
        [[SurveyCTO Additional Topics#Conducting Audio Audits]]


== Labelling ==
[[SurveyCTO Additional Topics#Collecting Sensor Data]]


To speed up data import, all SurveyCTO surveys should have a language labelling column called "label:stata" in both the questionnaire and the value labelling tab; it will be used to download and process the data. These labels should be in English, be no longer than 32 characters, and use no special characters. The research assistant who will be responsible for data management can be of great assistance in preparing this. See the SurveyCTO documentation on "Translating a form into multiple languages" for more details.
Thousands of users in more than 150 countries depend on '''SurveyCTO''' to conduct [[Computer-Assisted Personal Interviews (CAPI) | computer-assisted personal interviews (CAPI)]].  This article discusses common approaches to sophisticated [[Questionnaire Design | design]] and [[Questionnaire Programming | programming]] in '''SurveyCTO'''. For a general introduction to how to structure your approach to '''CAPI programming''', see [[Questionnaire Programming|questionnaire programming]].
==Read First==
* Use [[SurveyCTO Form_Settings|form settings]] to specify basic settings and version for your '''SurveyCTO form'''.
* Add labels to the '''SurveyCTO form''' to speed up data import and [[Data Cleaning |cleaning]].
* Set up '''repeat groups/rosters''' for households, crops, activities, or other units, and filter them based on previous choices.
* Use '''question groups''' to apply a relevance condition to multiple fields, display multiple questions on the same screen, or frame all questions on a module in a group.
*'''Employ dynamically populated choice lists''' to list choices contingent upon previous responses.  
*'''Gather additional data''' via audio audits and sensor data to improve data quality, monitoring, and precision.
* '''SurveyCTO''' also lets users  change the format of question text using [[SurveyCTO HTML Input|basic HTML commands]].


== Randomization ==
== Variable Names ==
In the field, the [[randomization in SurveyCTO | best practice when randomizing anything]] is to prepare the randomization before the field activities start, and preload the result of the randomization into the survey so that it is replicable. What follows are some examples of SurveyCTO forms that randomly select survey participants:


* [[SurveyCTO Random Draw of Beneficiaries 1|Random draw of beneficiaries from a large pool]], without knowing if the potential beneficiaries are valid participants - this form randomly prioritizes participation over a group of IDs, which are then verified by the enumerator until a final group of 8 participants are registered.
To speed up data import, all '''SurveyCTO surveys''' should have a language labeling column called '''"label:stata"''' in both the questionnaire and the value labeling tab. This column will be used to download and process the data. These labels should be in English, no longer than 32 characters, and use no special characters.


* [[SurveyCTO Random Draw of Beneficiaries 2|Random draw of any number of beneficiaries using repeat group]] - here we randomly prioritize a group of IDs using an elegant and concise repeat group solution, however this is not recommended for use in the field as it's not replicable without adaptation.
The research assistant responsible for [[Data Management | data management]] helps to prepare this. For detailed instructions on how to use multiple languages on '''SurveyCTO''', go to your '''SurveyCTO server''', open up the documentation pages and search for '''"Translating a form into multiple languages"'''.
== Randomizing ==
In the field, the [[Randomization in SurveyCTO | best practice when randomizing anything]] is to prepare randomization before the field activities begin, ideally in [[Randomization in Stata | Stata]], and preload the assignments into the survey. Ways to randomly select survey participants in '''SurveyCTO''' include:
=== Random Draw of Beneficiaries - Method 1 ===
[[SurveyCTO Random Draw of Beneficiaries 1|Randomly drawing beneficiaries from a large pool]] without knowing if the potential beneficiaries are valid participants: this form randomly prioritizes participation over a group of [[ID Variable Properties | IDs]], which are then verified by the enumerator until a final group of 8 participants are registered.


== Repeat Groups / Rosters ==
=== Random Draw of Beneficiaries - Method 2 ===
This section lists code examples for special requirements in relation to rosters and repeat groups. These can be used to develop interesting functionality within forms, particularly interacting with the responses from a household, plot or crop roster. Here are some examples of this:
[[SurveyCTO Random Draw of Beneficiaries 2|Randomly drawing any number of beneficiaries using a repeat group]]: this form randomly prioritizes a group of IDs using an elegant and concise repeat group solution. However, this is not recommended for use in the field as it's not [[Reproducible Research | replicable]] without adaptation.


* [[SurveyCTO Repeat Group Using Previous Choices|Setting Up Repeat Group Using Previous Choices]] - there are many cases when you want to repeat a set of questions over previously selected responses, such as a set of crops cultivated or activities performed. This example shows the 2 main ways of coding this.
== Managing Repeat Groups / Rosters ==
This section lists code examples that fulfill special requirements related to rosters and repeat groups. These can be used to develop interesting functionalities within forms, particularly with responses from a household, plot or crop roster. Here are some examples:
* [[SurveyCTO Repeat Group Using Previous Choices|Setting Up Repeat Group Using Previous Choices]]: this form shows two main ways of coding to repeat a set of questions over previously selected responses (e.g. a set of crops cultivated or activities performed).
* [[SurveyCTO Select from Roster Age Order|Select Member in Roster Based on Criteria]]: in this example we have a roster of children, and the respondent is asked to select the youngest child if that child's mother is present; if she is not present, the respondent is asked to select the second youngest child if that child's mother is present, and so forth.
*  [[SurveyCTO Conditional Filtering|Filtering on Conditions of Repeat Group Questions]]:  this form utilizes responses found inside a repeat group roster as conditions upon which to filter choices for questions further down in a form.
* [[SurveyCTO Filtering in Repeated Questions|Filtering in Repeated Choice Questions]] - this form shows how to code a repeating question where the list of choices is reduced if an option was previously selected.
* [[SurveyCTO Dealing with 'Other' Crops Over Different Repeat Group Levels |Dealing with 'Other' Crops Over Different Repeat Group Levels]]: this form presents a solution for introducing new crops in different repeats and being able to recall them at other points in the survey. In general, many challenges arise when coding agriculture sections of household surveys. There is a lot of data to capture at different and changing levels: per season, per plot, per crop, etc. Sometimes you might want to change the level of questions from crop within plot within season to, for example, just the crop level. It's important that the respondents are able to recall harvest and sales information as accurately as possible, therefore we must structure surveys well to account for this. This form presents useful information for doing so.
== Using Question Groups ==
Use question groups freely, but do not overuse them. In general, use question groups to:
*[[Relevance Condition to Multiple Fields|Apply a relevance condition to multiple fields]].
*[[Multiple Questions Displayed at the Same Time |Display multiple questions on the same screen]].
* Frame all the questions on a module in a group: only do this at the highest level of the survey (i.e. do not use this for sub-levels of a module).
=== Applying Choice Lists ===
Choice lists are the answer options from which an enumerator chooses in a ''select_one'' or ''select_multiple'' question. They are listed in the choices tab in the SurveyCTO questionnaire. Open Data Kit (ODK), the platform on which SurveyCTO's form syntax is based, places very few restrictions on how you can code your options. However, there are [[SurveyCTO Choice Lists|choice list best practices]] that matter for data quality:
* [[SurveyCTO Dynamically Populated Choice Lists|Dynamically Populated Choice Lists - basic]]: it is possible to program dynamically populated choice lists using answers given by the respondents in a previous question.
* [[SurveyCTO Dynamically Populated Choice Lists From Select One|Dynamically Populated Choice Lists - from repeated select_one]]: a specific example of dynamically populated choice list is when you populate a ''select_multiple'' question with answers from a ''select_one'' asked in a repeat group. For example, say that you list crops grown in a repeat group where each repeat is a crop, and later you want to be able to ask "''which crop did you grow the most?''" and only the crops already selected in the repeat group should display.
#REDIRECT[[SurveyCTO Additional Topics#Integrating Calculations]]


* [[SurveyCTO Select from Roster Age Order|Select Member in Roster Based on Criteria]] - in this example we have a roster of children, and the respondent is asked to select the youngest child if that child's mother is present; if she is not present, the respondent is asked to select the second youngest child if that child's mother is present, and so forth.
== Integrating Calculations==
[[SurveyCTO Programming|SurveyCTO]] has developed a best practices guide for using calculations, which help with the design of smarter [[Survey Pilot|surveys]]. For example, you can use calculations to find out how long it takes respondents to reach a certain point in the '''survey''', to monitor your respondents’ observance of suggested response times for skill assessments, or to ensure that [[Personally Identifiable Information (PII) | PII]] doesn’t get captured in your [[Data Analysis | data analysis]]. [https://www.surveycto.com/blog/using-calculations/ This guide] provides tips and examples for using calculations in '''SurveyCTO'''.


[[SurveyCTO Conditional Filtering|Filtering on Conditions of Repeat Group Questions]] - this example utilizes responses found inside a repeat group roster as conditions upon which to filter choices for questions further down in a form.
== Conducting Audio Audits==
[[SurveyCTO Programming|SurveyCTO]] supports [[Monitoring Data Quality#Data Quality Checks for Remote Surveys|random audio audits]] as a part of the [[Survey Pilot|survey]] meta-data. '''Audio audits''' are audio recordings that occur during an interview without an indication that the recording has been initiated. They are one of several tools that research teams can use to ensure that they are [[Primary Data Collection|collecting]] the highest possible [[Data Quality Assurance Plan | quality of data]]. They also provide a cost-effective way for research teams to better understand how [[Enumerator Training|enumerators]] are conducting '''surveys''' in the field. You can learn more about best practices, logistical and [[Research Ethics | ethical]] considerations of '''audio audits''' in [https://www.surveycto.com/blog/audio-audits-best-practices/ this SurveyCTO article].


* [[SurveyCTO Filtering in Repeated Questions|Filtering in Repeated Choice Questions]] - this example shows how to code a repeating question where the list of choices is reduced if an option was previously selected
== Collecting Sensor Data==
[[SurveyCTO Programming|SurveyCTO]] can collect sensor meta-data using built-in Android device sensors. Android devices can come with a number of sensors beyond GPS, including an accelerometer, gyroscope, light sensor, and microphone, among others. The sensor data field types on '''SurveyCTO''' use these sensors to capture data during the [[Survey Pilot|survey]] that can provide users with an idea of:
* Light conditions around the device.
* How much the device moved.
* How loud the sounds were around the device.
* The tone of the sounds around the device.
* An estimate of whether a conversation was taking place around the device.
'''SurveyCTO''' has built '''Stata commands''' to help users easily analyze large volumes of sensor data streams. Sensor streams can be time-consuming to work with because for every submission, a sensor stream records a stream of observations (potentially thousands) and stores it as an additional .csv file attached to the submission. You can learn about these commands in [https://www.surveycto.com/blog/stata-sensor-data/ this article] and you can access the scto package [https://github.com/surveycto/scto here]. Visit [https://support.surveycto.com/hc/en-us/articles/360009552713-Using-sensor-meta-data?flash_digest=bc582e369a0560a8424218ec245209eefb41bfe4 this SurveyCTO help article] to learn more about sensor data.


=== Agriculture Survey Advice ===
== Related Pages ==
There are many challenges encountered when coding agriculture sections of household surveys. There is a lot of data to capture at different and changing levels: per season, per plot, per crop, etc., and sometimes you might want to change the level of questions from crop within plot within season, to, for example, just the crop level. It's important that the respondents are able to recall harvest and sales information as accurately as possible, therefore we must structure surveys well to account for this. Here are some example forms that talk through the main issues and suggest designs to overcome these issues.
[[Special:WhatLinksHere/SurveyCTO_Coding_Practices|Click here to see pages that link to this topic]].
 
* [[SurveyCTO Dealing with 'Other' Crops Over Different Repeat Group Levels |Dealing with 'Other' Crops Over Different Repeat Group Levels]] - in SurveyCTO it's very difficult to introduce new crops in different repeats and be able to recall them at other points in the survey. This example form talks about these difficulties and suggests a structure to refer back to them in other sections.
 
== Groups ==
Use a lot of groups but do not overuse them. In general, groups are used to fulfill one of the purposes below:
 
* [[Relevance Condition to Multiple Fields|Apply a relevance condition to multiple fields]].
 
* [[Multiple Questions Displayed at the Same Time |Display multiple questions on the same screen]].
 
* Frame all the questions on a module in a group. Only do this at the highest level of the survey, i.e. do not use this for sub-levels of a module.
 
== Choice Lists ==
Choice lists are the answer options an enumerator can choose from in a ''select_one'' or ''select_multiple'' question. They are listed in the choices tab in the SurveyCTO questionnaire. Open Data Kit (ODK), the platform on which SurveyCTO's form syntax is based, places very few restrictions on how you can code your options. However, there are [[SurveyCTO Choice Lists|choice list best practices]] that matter for data quality.
 
* [[SurveyCTO Dynamically Populated Choice Lists|Dynamically Populated Choice Lists - basic]] - it is possible to program dynamically populated choice lists using answers given by the respondents in a previous question.
 
* [[SurveyCTO Dynamically Populated Choice Lists From Select One|Dynamically Populated Choice Lists - from repeated select_one]] - a specific example of dynamically populated choice list is when you populate a ''select_multiple'' question with answers from a ''select_one'' asked in a repeat group. For example, say that you list crops grown in a repeat group where each repeat is a crop, and later you want to be able to ask "''which crop did you grow the most?''" and only the crops already selected in the repeat group should display.
 
== SurveyCTO Calculations==
* SurveyCTO has developed a great best practices guide for using calculations. Calculations help with the design of smarter surveys. For example, you can use calculations to find out how long it takes respondents to reach a certain point in your survey, you can monitor your respondents’ observance of suggested response times for skill assessments or you can use calculations to ensure that personally identifiable information (PII) doesn’t get captured in your data analysis. This guide provides 9 tips and examples for using calculations and can be found [https://www.surveycto.com/blog/using-calculations/ here].
 
== SurveyCTO Audio Audits==
* SurveyCTO supports random audio audits as part of the meta-data of a survey. Audio audits are audio recordings that take place during a survey interview without an indication that the recording has been initiated. They are one of several tools that survey administrators can use to ensure that the highest possible quality of data is being collected. They are also a cost effective means for administrators to better understand how their surveys are actually being conducted in the field. You can learn more about best practices, logistical and ethical considerations of using audio audits in this SurveyCTO [https://www.surveycto.com/blog/audio-audits-best-practices/ article].
 
== SurveyCTO Sensor Data==
* With SurveyCTO, you can collect sensor meta-data using built-in Android device sensors. Android devices can come with a number of sensors beyond GPS including an accelerometer, gyroscope, light sensor, microphone, among others. The sensor data field types on SurveyCTO use these sensors to capture data during the survey that can provide users with an idea of:
**The light conditions around the device.
**How much the device moved.
**How loud the sounds were around the device.
**The pitch of the sounds around the device.
**An estimate of whether a conversation was taking place around the device.
 
You can visit this SurveyCTO [https://support.surveycto.com/hc/en-us/articles/360009552713-Using-sensor-meta-data?flash_digest=bc582e369a0560a8424218ec245209eefb41bfe4 help article] to learn more about sensor data.
 
== Stata Commands for SurveyCTO Sensor Data==
* SurveyCTO has also built Stata commands to help users easily analyze large volumes of sensor data streams. Sensor streams can be time-consuming to work with because for every submission, a sensor stream records a stream of observations (potentially thousands) and stores it as an additional .csv file attached to the submission. You can learn about these commands in this [https://www.surveycto.com/blog/stata-sensor-data/ article] and you can access the scto package [https://github.com/surveycto/scto here].
 
== Other Tips and Tricks ==
* [[SurveyCTO HTML Input|Question font formatting in HTML]] - SurveyCTO accepts HTML commands in the text of questions. This can be used to <font color="red"> highlight </font> and '''emphasize''' key information, among other uses.
 
=== Categories to add to this page: ===
* Household rosters
** General examples
** Updating rosters from previous rounds on tablet during interview
* ID and identification
** Assigning IDs in the field - both when the sample is known before launch of survey and when respondents are sampled in the field


== Additional Resources ==
Tips on coding complex agricultural surveys in SurveyCTO, from IFPRI: https://www.surveycto.com/best-practices/pro-tips-for-agricultural-surveys-from-ifpri/  
*DIME Analytics (World Bank), [https://docs.google.com/document/d/1yVFdWugHV37vRaXmt-FnW80r015JVESR0bIHloFq378/edit#heading=h.ifwt68s1w4n DIME Analytics SurveyCTO Style Guide]
*DIME Analytics (World Bank), [https://docs.google.com/spreadsheets/d/1I2uXEgga0LwEPnz8m_pPrgTgmIjsC4yl/edit#gid=1987277168 Survey CTO Style Guide Template]
*DIME Analytics (World Bank), [https://osf.io/8e7bj Introduction to SurveyCTO]
*DIME Analytics (World Bank), [https://osf.io/2nepd Advanced SurveyCTO Programming]
* Jan Schenk, [https://medium.com/@janschenk/variable-names-in-survey-research-a18429d2d4d8 Variable Names in Survey Research]
* Simrin Makhija (IFPRI), [https://www.surveycto.com/best-practices/pro-tips-for-agricultural-surveys-from-ifpri/ Pro tips for designing and deploying complex agricultural surveys in SurveyCTO]
* SurveyCTO, [https://support.surveycto.com/hc/en-us/articles/360009552713-Using-sensor-meta-data Using sensor meta-data]


[[Category: SurveyCTO Coding Practices ]]
[[Category: Technical Tools]]

Latest revision as of 23:53, 20 July 2023

SurveyCTO Additional Topics#Conducting Audio Audits

SurveyCTO Additional Topics#Collecting Sensor Data

Thousands of users in more than 150 countries depend on SurveyCTO to conduct computer-assisted personal interviews (CAPI). This article discusses common approaches to sophisticated design and programming in SurveyCTO. For a general introduction to how to structure your approach to CAPI programming, see questionnaire programming.

Read First

  • Use form settings to specify basic settings and version for your SurveyCTO form.
  • Add labels to the SurveyCTO form to speed up data import and cleaning.
  • Set up repeat groups/rosters for households, crops, activities, or other units, and filter them based on previous choices.
  • Use question groups to apply a relevance condition to multiple fields, display multiple questions on the same screen, or frame all questions on a module in a group.
  • Employ dynamically populated choice lists to list choices contingent upon previous responses.
  • Gather additional data via audio audits and sensor data to improve data quality, monitoring, and precision.
  • SurveyCTO also lets users change the format of question text using basic HTML commands.

Variable Names

To speed up data import, all SurveyCTO surveys should have a language labeling column called "label:stata" in both the questionnaire and the value labeling tab. This column will be used to download and process the data. These labels should be in English, no longer than 32 characters, and use no special characters.

The research assistant responsible for data management helps to prepare this. For detailed instructions on how to use multiple languages on SurveyCTO, go to your SurveyCTO server, open up the documentation pages and search for "Translating a form into multiple languages".
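
The label rules above (English, at most 32 characters, no special characters) are easy to check before a form is uploaded. The following is a minimal Python sketch, not part of SurveyCTO itself. It assumes that "no special characters" means ASCII letters, digits, spaces and underscores only, and the function name `check_stata_label` is hypothetical.

```python
import re

MAX_LABEL_LENGTH = 32  # limit stated in this section

# Assumption: "no special characters" is read here as ASCII letters, digits,
# spaces and underscores only; adjust the pattern to your team's convention.
ALLOWED = re.compile(r"^[A-Za-z0-9_ ]+$")


def check_stata_label(label: str) -> list[str]:
    """Return a list of problems found in one label:stata value."""
    problems = []
    if not label.strip():
        problems.append("label is empty")
        return problems
    if len(label) > MAX_LABEL_LENGTH:
        problems.append(f"longer than {MAX_LABEL_LENGTH} characters ({len(label)})")
    if not ALLOWED.match(label):
        problems.append("contains special characters")
    return problems


if __name__ == "__main__":
    # Made-up labels for illustration only.
    for label in ["Household size", "Número de niños",
                  "Total area cultivated in the last season"]:
        print(repr(label), check_stata_label(label) or "ok")
```

Running such a check over every value in the "label:stata" column of both tabs flags labels that are likely to cause trouble at import time. It cannot confirm that a label is in English, so that still needs a manual review.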

Randomizing

In the field, the best practice when randomizing anything is to prepare randomization before the field activities begin, ideally in Stata, and preload the assignments into the survey. Ways to randomly select survey participants in SurveyCTO include:

Random Draw of Beneficiaries - Method 1

Randomly drawing beneficiaries from a large pool without knowing if the potential beneficiaries are valid participants: this form randomly prioritizes participation over a group of IDs, which are then verified by the enumerator until a final group of 8 participants are registered.

Random Draw of Beneficiaries - Method 2

Randomly drawing any number of beneficiaries using a repeat group: this form randomly prioritizes a group of IDs using an elegant and concise repeat group solution. However, this is not recommended for use in the field as it's not replicable without adaptation.
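
The recommended practice at the top of this section (randomize before fieldwork with a reproducible procedure, then preload the result) can be illustrated with a short sketch. This article suggests doing this in Stata; the example below uses Python instead, purely to show the idea. The IDs, the seed, and the column names `id` and `priority` are made up for illustration and are not taken from any SurveyCTO form.

```python
import csv
import io
import random


def prioritize_ids(ids, seed):
    """Return the IDs in a random priority order that a fixed seed reproduces."""
    rng = random.Random(seed)   # local generator: same seed, same order
    ordered = sorted(ids)       # sort first so input order cannot change the result
    rng.shuffle(ordered)
    return ordered


def write_preload(out, ordered_ids):
    """Write one row per ID with its priority rank (1 = contact first)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "priority"])  # hypothetical column names
    for rank, pid in enumerate(ordered_ids, start=1):
        writer.writerow([pid, rank])


if __name__ == "__main__":
    pool = [f"HH{n:03d}" for n in range(1, 21)]  # made-up IDs
    order = prioritize_ids(pool, seed=20230720)  # arbitrary fixed seed
    buffer = io.StringIO()
    write_preload(buffer, order)
    print(buffer.getvalue())
```

Sorting the IDs before shuffling and fixing the seed means that anyone who reruns the script on the same pool gets the same priority order, which is what makes the draw replicable. The resulting table could then be saved as a CSV file and attached to the form as preloaded data.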

Managing Repeat Groups / Rosters

This section lists code examples that fulfill special requirements related to rosters and repeat groups. These can be used to develop interesting functionalities within forms, particularly with responses from a household, plot or crop roster. Here are some examples:

  • Setting Up Repeat Group Using Previous Choices: this form shows two main ways of coding to repeat a set of questions over previously selected responses (e.g. a set of crops cultivated or activities performed).
  • Select Member in Roster Based on Criteria: in this example we have a roster of children, and the respondent is asked to select the youngest child if that child's mother is present; if she is not present, the respondent is asked to select the second youngest child if that child's mother is present, and so forth.
  • Filtering on Conditions of Repeat Group Questions: this form utilizes responses found inside a repeat group roster as conditions upon which to filter choices for questions further down in a form.
  • Filtering in Repeated Choice Questions - this form shows how to code a repeating question where the list of choices is reduced if an option was previously selected.
  • Dealing with 'Other' Crops Over Different Repeat Group Levels: this form presents a solution for introducing new crops in different repeats and being able to recall them at other points in the survey. In general, many challenges arise when coding agriculture sections of household surveys. There is a lot of data to capture at different and changing levels: per season, per plot, per crop, etc. Sometimes you might want to change the level of questions from crop within plot within season to, for example, just the crop level. It's important that the respondents are able to recall harvest and sales information as accurately as possible, therefore we must structure surveys well to account for this. This form presents useful information for doing so.
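
The logic behind "Filtering in Repeated Choice Questions" in the list above, where each repeat offers only the options not already chosen, can be sketched outside SurveyCTO. The Python below only illustrates that logic and is not SurveyCTO form syntax; the crop names and the function name `remaining_choices` are hypothetical.

```python
def remaining_choices(all_choices, already_selected):
    """Return the choices still available, preserving the original order."""
    taken = set(already_selected)
    return [c for c in all_choices if c not in taken]


if __name__ == "__main__":
    crops = ["maize", "beans", "cassava", "rice"]  # made-up choice list
    selected = []
    for answer in ["beans", "rice"]:               # simulated answers, one per repeat
        options = remaining_choices(crops, selected)
        assert answer in options                   # an earlier pick cannot reappear
        selected.append(answer)
    print(remaining_choices(crops, selected))
```

In a real form the same effect is achieved with the techniques shown in the linked example; the sketch is only meant to make the intended behavior easy to reason about when testing a form.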

Using Question Groups

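Use question groups freely, but do not overuse them. In general, use question groups to:

  • Apply a relevance condition to multiple fields.
  • Display multiple questions on the same screen.
  • Frame all the questions on a module in a group: only do this at the highest level of the survey (i.e. do not use this for sub-levels of a module).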

Applying Choice Lists

Choice lists are the answer options from which an enumerator chooses in a select_one or select_multiple question. They are listed in the choices tab in the SurveyCTO questionnaire. Open Data Kit (ODK), the platform on which SurveyCTO's form syntax is based, places very few restrictions on how you can code your options. However, there are choice list best practices that matter for data quality:

  • Dynamically Populated Choice Lists - basic: it is possible to program dynamically populated choice lists using answers given by the respondents in a previous question.
  • Dynamically Populated Choice Lists - from repeated select_one: a specific example of dynamically populated choice list is when you populate a select_multiple question with answers from a select_one asked in a repeat group. For example, say that you list crops grown in a repeat group where each repeat is a crop, and later you want to be able to ask "which crop did you grow the most?" and only the crops already selected in the repeat group should display.

Integrating Calculations

SurveyCTO has developed a best practices guide for using calculations, which help with the design of smarter surveys. For example, you can use calculations to find out how long it takes respondents to reach a certain point in the survey, to monitor your respondents’ observance of suggested response times for skill assessments, or to ensure that PII doesn’t get captured in your data analysis. This guide provides tips and examples for using calculations in SurveyCTO.

Conducting Audio Audits

SurveyCTO supports random audio audits as a part of the survey meta-data. Audio audits are audio recordings that occur during an interview without an indication that the recording has been initiated. They are one of several tools that research teams can use to ensure that they are collecting the highest possible quality of data. They also provide a cost-effective way for research teams to better understand how enumerators are conducting surveys in the field. You can learn more about best practices, logistical and ethical considerations of audio audits in this SurveyCTO article.

Collecting Sensor Data

SurveyCTO can collect sensor meta-data using built-in Android device sensors. Android devices can come with a number of sensors beyond GPS, including an accelerometer, gyroscope, light sensor, and microphone, among others. The sensor data field types on SurveyCTO use these sensors to capture data during the survey that can provide users with an idea of:

  • Light conditions around the device.
  • How much the device moved.
  • How loud the sounds were around the device.
  • The tone of the sounds around the device.
  • An estimate of whether a conversation was taking place around the device.

SurveyCTO has built Stata commands to help users easily analyze large volumes of sensor data streams. Sensor streams can be time-consuming to work with because for every submission, a sensor stream records a stream of observations (potentially thousands) and stores it as an additional .csv file attached to the submission. You can learn about these commands in this article and you can access the scto package here. Visit this SurveyCTO help article to learn more about sensor data.
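
Because each submission carries its own stream file, a common first step is to collapse every stream into a few summary statistics per submission. The sketch below shows that idea in Python. It is not the scto Stata package and does not reproduce its commands. The column name `value` and the sample data are assumptions, so check the header of your own stream files before adapting it.

```python
import csv
import io
import statistics


def summarize_stream(rows, value_column="value"):
    """Collapse one sensor stream (an iterable of dict rows) into summary statistics."""
    # Skip empty readings rather than treating them as zero.
    values = [float(r[value_column]) for r in rows if r[value_column] not in ("", None)]
    if not values:
        return {"n": 0, "mean": None, "min": None, "max": None}
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
    }


def summarize_csv(path, value_column="value"):
    """Summarize the stream stored in one .csv file attached to a submission."""
    with open(path, newline="") as f:
        return summarize_stream(csv.DictReader(f), value_column)


if __name__ == "__main__":
    # Made-up stream with one missing reading, for illustration only.
    sample = io.StringIO("second,value\n0,10.0\n1,12.5\n2,\n3,17.5\n")
    print(summarize_stream(csv.DictReader(sample)))
```

Calling `summarize_csv` once per attached file yields one row of statistics per submission, which can then be merged back onto the main survey data by submission ID.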

Related Pages

Click here to see pages that link to this topic.

Additional Resources