Difference between revisions of "Back Checks"

(26 intermediate revisions by 6 users not shown)
Line 1: Line 1:
Back checks are a quality control method used to verify data collected during a survey. After survey data has been collected, certain households are re-interviewed on selected questions to verify the legitimacy of the data collected in the actual survey.
Back-checks are a [[Monitoring Data Quality | quality]] control method implemented to verify the quality and legitimacy of [[Primary Data Collection | data collected]] during a survey. Throughout the course of fieldwork, a back-check team returns to a randomly-selected subset of households for which data has been collected. The back-check team re-interviews these respondents with a short subset of survey questions, otherwise known as a back-check survey. Back-checks are used to verify the quality and legitimacy of key data collected in the actual survey. This page will provide points on how to coordinate, sample for, and design questionnaires for back-checks.


==Read First ==
*Back-checks are an important tool to detect fraud (e.g. enumerators sitting under a tree and filling out questionnaires themselves).
*Back-checks help researchers to assess the accuracy and quality of the data collected.
*Back-checks can be conducted by in-person visits or phone calls. A complementary approach to in-person back-checks is conducting [[Random Audio Audits]].
*Problems identified through back-checks can be remedied by further [[Enumerator Training | training enumerators]] or replacing low-performing or problematic enumerators.


==Coordinating Back-Checks==
*The total duration of each back-check survey should be around 10-15 minutes.
*The back-checks should be conducted by a team of specialized back-check enumerators. The back-check enumerators should be experienced, skilled enumerators.
*The back-check team should be independent from the rest of the survey staff. They should be trained separately and have minimal contact with the survey team.
*Administer 20% of back-checks within the first two weeks of fieldwork. This helps the research team to identify early whether the [[Questionnaire Design | questionnaire]] is effective, whether enumerators are doing their jobs well, and which changes to make to ensure high quality data collection.


== Purpose==
==Sampling for Back-Checks==
Back checks are done to monitor the quality of the field work. This gives us valuable information on whether the questionnaire accurately captures the key outcomes of the study or not, and on whether the enumerators are performing their jobs as expected.  
*Aim to back-check 10-20% of the total observations.
*The back-check sample should be [[Stratified Random Sample | stratified]] across survey teams/enumerators. Every team and every enumerator must be back-checked as soon as possible and regularly.
*Include missing respondents in the back-check sample to verify that enumerators are not biasing your sample by not tracking hard-to-find respondents. Also include observations flagged in other quality tests like [[Monitoring Data Quality#Guidelines#High Frequency Checks | high frequency checks]] and observations collected by enumerators suspected of cheating.


==Best Practices during back checks ==  
==Designing the Back-Check Survey==
Back-check questions are drawn from the original [[Questionnaire Design | questionnaire]]. There are five types of questions that should be included in a back-check to gauge data and enumerator quality:


Here are some of the best practices that should be done while performing back checks:
*Questions to identify respondent and interview information:  
*The number of back checks that can be done depends on the budget of the survey team. The survey team should aim for at least 10% of the total observations.
: These questions verify the identity of the respondent and check if, when, and where the original survey took place. This is useful to check for fraud and verify reported completion rates.  
*Back checks should also be front-heavy, i.e. the majority of them should occur early in the survey. This helps reveal whether the questionnaire and the enumerators are performing well, so that problems can be remedied through training or replacement.
*The back checks should include surveys done by all the survey teams/surveyors. Households should be selected randomly from these teams.
*The back check sample should be proportional in terms of respondent selection with the actual survey.


==How to Select Back Check Questions ==
* Questions to detect fraud
Back check questions should be selected with the performance of both the questionnaire and the surveyor in mind. Using different types of questions during the back check helps in finding the cause of poor data quality i.e. questionnaire language, surveyor performance, CAPI errors etc. Some of the questions that should be asked during a back check are as follows:
:Include questions that ask for straightforward information with no expected variation or room for error. These should be questions that do not require particularly skilled enumeration, and do not vary over time (specifically the time period between the main interview and the back-check). Examples include type of dwelling, education level, marital status, occupation, whether the respondent has children or not, etc. The specific variables to include will depend on the survey instrument and context. If values differ between the questionnaire and the back-check survey, they indicate poor quality data, a serious enumerator problem, and/or potentially falsified work.


To test for questionnaire language, back checks can be done on questions which can be interpreted differently by different surveyors. Asking such questions during both the survey and the back check tells the survey team whether or not the surveyor is interpreting a question correctly.
* Questions to detect errors in survey execution
:These are questions for which capable enumerators should get the true answer. These should be questions which involve relatively complex logic or consistency checks. If values for these questions differ between the questionnaire and the back-check survey, they indicate that the enumerator may need more training.


Surveyor performance can be tested using questions whose answers cannot change over time. Simple questions, like the age of the respondent or the number of members in the household, should not differ between the survey and the back check.
* Questions to detect problems with the questionnaire or key outcomes
:These should be a selection of questions that are key outcome variables for the survey. The back-check provides an additional accuracy check and is useful to flag difficulties and/or inconsistencies in enumerator interpretation of the questions. If these values differ between the questionnaire and the back-check, this indicates the need for further enumerator training or, in particular cases, questionnaire modification.


To test for CAPI errors, question sections with complex skips can be tested.  
* Questions that determine repeated sections of the questionnaire
: These should be included to check whether enumerators are falsifying data to reduce the length of interviews. For example, if there is a long series of questions about each household member, verify that the number of household members is correct. If an agricultural survey asks for production information by plot, verify the number of plots is correct.  


==A framework for back checks from Innovations for Poverty Action==
Note that it is important that enumerators do not know what questions will be audited. To that end, you may consider randomizing questions or changing the back-check survey regularly during data collection.
The following framework for back checks has been developed by [http://www.poverty-action.org/ Innovations for Poverty Action].


;Identifying Respondents and Interview Information
== Analyzing Back-Check Data ==
:- Check if we have the right person
:- Check whether the interview took place and, if so, when.


;Type 1 Variables
After completing a back-check, you can compare the back-check data to the original survey data. This can be done by using the Stata command <code>bcstats</code>, developed by [http://www.poverty-action.org/ Innovations for Poverty Action]. This command produces a dataset of the comparisons between the back-check and original survey data. The command also completes enumerator checks and stability checks for variables.
:-Straightforward questions where we expect no variation.
:-For example - education level, marital status, occupation, has children or not, etc.
 
;Type 2 Variables 
:- Questions where we expect capable enumerators to get the true answer.
 
;Type 3 Variables
:- Questions that we expect to be difficult. We back check these questions to understand if they were correctly interpreted in the field.
 
The total duration of the back checks should be around 10-15 minutes.
 
=== Comparing Back Checks to Actual Survey Data ===
After completing a back check, you can now compare the data obtained from the back check to your actual survey data. This can be done by using the Stata command <code> bcstats </code> developed by [http://www.poverty-action.org/ '''Innovations for Poverty Action.'''] This command compares the back check data and the survey data, and produces a data set of the comparisons between the two data sets. The command also completes enumerator checks and stability checks for variables.


The steps are as follows:
  <nowiki>
ssc install bcstats
bcstats, surveydata(filename) bcdata(filename) id(varlist) [options]
</nowiki>
To learn about the options for <code>bcstats</code> and back-checks, please type <code>help bcstats</code> in Stata after installing the command.


<code>
==Back to Parent==
ssc install bcstats  <br />
This article is part of the topic [[Field Management]].
bcstats, surveydata(''filename'') bcdata(''filename'') id(''varlist'') [options]
</code>.
 
To learn about the options for bcstats, please type <code>help bcstats</code> in Stata after installing the command.
 
=== Action to take after back checks ===
The three types of questions asked during the back check help determine whether problems in the data are due to the surveyor or the questionnaire. The remedial actions after back checks are as follows:
=== Type 1 data ===
Since type 1 variables should have little to no variation between the main survey and the back check, discrepancies in the data are most likely due to surveyor errors. A breakdown of the discrepancy percentage and the suggested corrective measures are as follows:
* More than 10% discrepancy - You should warn the surveyor.
* Discrepancy of 20-30% - 2nd back check needs to be conducted to correct the errors.
** If the errors are surveyor errors, then 3 additional surveys by the surveyors in the same week should be audited. If 20-30% discrepancies are found in those surveys as well, then the surveyor should be put on probation.
*Discrepancy of more than 40%- 2nd back check to determine who made the errors and maybe resurvey the household. If the surveyor made the errors, resurvey the household and audit all the surveys done by the surveyor in the batch.
**If one more survey has more than 40% discrepancy, fire the surveyor immediately and redo all surveys with 20% or more discrepancy.
 
=== Type 2 Data ===
Since we expect qualified surveyors to get the answers to type 2 questions, there shouldn't be much variation in the data from the survey and the data from the back check. High discrepancy could mean that the surveyors are not particularly well trained to ask the question. Some suggested corrective measures are as follows:
 
* If the discrepancy is more than 10%, consider retraining the surveyor.
* If a particular surveyor is responsible for more than 30% of the errors in a single survey, follow the steps for Type 1.
 
=== Type 3 Data ===
*If the discrepancy is more than 10%, discuss with your survey team and let your PIs know. They may decide to edit the survey or add additional rounds of surveying.
 
== Back to Parent ==
This article is part of the topic [[Monitoring Data Quality]]


== Additional Resources ==
* World Health Organization's  [http://unstats.un.org/unsd/hhsurveys/pdf/Chapter_10.pdf '''Quality Assurance in Surveys: standards, guidelines, and procedures''']. This chapter provides, in detail,  the approach and methodology on quality control during surveys.
*DIME Analytics’ [https://github.com/worldbank/DIME-Resources/blob/master/stata1-4-quality.pdf Real Time Data Quality Checks]
 
*DIME Analytics’ [https://github.com/worldbank/DIME-Resources/blob/master/stata2-4-quality.pdf Data Quality Assurance]
[[Category: Data Quality ]]
* World Health Organization's  [http://unstats.un.org/unsd/hhsurveys/pdf/Chapter_10.pdf Quality Assurance in Surveys: standards, guidelines, and procedures]. This chapter provides, in detail,  the approach and methodology on quality control during surveys.
*[https://ideas.repec.org/c/boc/bocode/s458173.html bcstats], a  Stata program written by an IPA staff member for conducting back checks on survey data.
[[Category: Field Management ]]

Revision as of 17:51, 17 September 2020

Back-checks are a quality control method implemented to verify the quality and legitimacy of data collected during a survey. Throughout the course of fieldwork, a back-check team returns to a randomly-selected subset of households for which data has been collected. The back-check team re-interviews these respondents with a short subset of survey questions, otherwise known as a back-check survey. Back-checks are used to verify the quality and legitimacy of key data collected in the actual survey. This page will provide points on how to coordinate, sample for, and design questionnaires for back-checks.

Read First

  • Back-checks are an important tool to detect fraud (e.g. enumerators sitting under a tree and filling out questionnaires themselves).
  • Back-checks help researchers to assess the accuracy and quality of the data collected.
  • Back-checks can be conducted by in-person visits or phone calls. A complementary approach to in-person back-checks is conducting Random Audio Audits.
  • Problems identified through back-checks can be remedied by further training enumerators or replacing low-performing or problematic enumerators.

Coordinating Back-Checks

  • The total duration of each back-check survey should be around 10-15 minutes.
  • The back-checks should be conducted by a team of specialized back-check enumerators. The back-check enumerators should be experienced, skilled enumerators.
  • The back-check team should be independent from the rest of the survey staff. They should be trained separately and have minimal contact with the survey team.
  • Administer 20% of back-checks within the first two weeks of fieldwork. This helps the research team to identify early whether the questionnaire is effective, whether enumerators are doing their jobs well, and which changes to make to ensure high quality data collection.

Sampling for Back-Checks

  • Aim to back-check 10-20% of the total observations.
  • The back-check sample should be stratified across survey teams/enumerators. Every team and every enumerator must be back-checked as soon as possible and regularly.
  • Include missing respondents in the back-check sample to verify that enumerators are not biasing your sample by not tracking hard-to-find respondents. Also include observations flagged in other quality tests like high frequency checks and observations collected by enumerators suspected of cheating.
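The sampling guidance above can be illustrated with a short sketch. It is written in Python using only the standard library and is not part of any back-check tool; the function name `draw_backcheck_sample`, the 15% share, the seed, and the household and enumerator IDs are all assumptions made for illustration. It draws a random share of completed interviews within each enumerator, so that every enumerator is back-checked.

```python
import math
import random
from collections import defaultdict


def draw_backcheck_sample(interviews, share=0.15, seed=20200917):
    """Draw a random share of interviews within each enumerator (stratum).

    interviews: iterable of (household_id, enumerator_id) pairs.
    share: assumed target share, within the 10-20% range suggested above.
    seed: arbitrary fixed seed so the draw can be reproduced and audited.
    """
    rng = random.Random(seed)
    by_enumerator = defaultdict(list)
    for household_id, enumerator_id in interviews:
        by_enumerator[enumerator_id].append(household_id)

    sample = {}
    for enumerator_id, households in sorted(by_enumerator.items()):
        # Round up, with a floor of one, so no enumerator is skipped.
        n = max(1, math.ceil(share * len(households)))
        sample[enumerator_id] = sorted(rng.sample(sorted(households), n))
    return sample


# Hypothetical fieldwork log: 40 households spread over 4 enumerators.
interviews = [(f"HH{i:03d}", f"E{i % 4 + 1}") for i in range(1, 41)]
sample = draw_backcheck_sample(interviews)
```

In practice the random draw would be combined with purposively added cases, such as missing respondents and observations flagged by other quality checks, as described in the list above.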

Designing the Back-Check Survey

Back-check questions are drawn from the original questionnaire. There are five types of questions that should be included in a back-check to gauge data and enumerator quality:

  • Questions to identify respondent and interview information:
These questions verify the identity of the respondent and check if, when, and where the original survey took place. This is useful to check for fraud and verify reported completion rates.
  • Questions to detect fraud
Include questions that ask for straightforward information with no expected variation or room for error. These should be questions that do not require particularly skilled enumeration, and do not vary over time (specifically the time period between the main interview and the back-check). Examples include type of dwelling, education level, marital status, occupation, whether the respondent has children or not, etc. The specific variables to include will depend on the survey instrument and context. If values differ between the questionnaire and the back-check survey, they indicate poor quality data, a serious enumerator problem, and/or potentially falsified work.
  • Questions to detect errors in survey execution
These are questions for which capable enumerators should get the true answer. These should be questions which involve relatively complex logic or consistency checks. If values for these questions differ between the questionnaire and the back-check survey, they indicate that the enumerator may need more training.
  • Questions to detect problems with the questionnaire or key outcomes
These should be a selection of questions that are key outcome variables for the survey. The back-check provides an additional accuracy check and is useful to flag difficulties and/or inconsistencies in enumerator interpretation of the questions. If these values differ between the questionnaire and the back-check, this indicates the need for further enumerator training or, in particular cases, questionnaire modification.
  • Questions that determine repeated sections of the questionnaire
These should be included to check whether enumerators are falsifying data to reduce the length of interviews. For example, if there is a long series of questions about each household member, verify that the number of household members is correct. If an agricultural survey asks for production information by plot, verify the number of plots is correct.

Note that it is important that enumerators do not know what questions will be audited. To that end, you may consider randomizing questions or changing the back-check survey regularly during data collection.
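One possible way to vary the back-check survey over time is sketched below in Python (standard library only). The question pools, variable names, and the weekly rotation are hypothetical and would need to be replaced with variables from the actual survey instrument. Identification questions are always kept, while the other question types are redrawn each week from a seeded random generator so the selection is reproducible for the research team but not predictable for enumerators.

```python
import random

# Hypothetical question pools, one per question type described above.
QUESTION_POOLS = {
    "identification": ["respondent_name", "interview_date", "interview_location"],
    "fraud": ["dwelling_type", "education_level", "marital_status", "occupation"],
    "execution": ["total_income", "total_consumption", "land_area_check"],
    "key_outcomes": ["outcome_a", "outcome_b", "outcome_c"],
    "repeated_sections": ["n_household_members", "n_plots"],
}


def build_backcheck_form(week, per_type=2, base_seed=101):
    """Select the back-check questions to use in a given week of fieldwork."""
    # A different, reproducible random stream for each week.
    rng = random.Random(base_seed * 1000 + week)
    form = {}
    for question_type, pool in QUESTION_POOLS.items():
        if question_type == "identification":
            # Always verify who was interviewed, when, and where.
            form[question_type] = list(pool)
        else:
            k = min(per_type, len(pool))
            form[question_type] = sorted(rng.sample(pool, k))
    return form


week_1_form = build_backcheck_form(week=1)
```

The number of questions drawn per type should stay small enough that the whole back-check survey still fits the 10-15 minute target.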

Analyzing Back-Check Data

After completing a back-check, you can compare the back-check data to the original survey data. This can be done by using the Stata command bcstats, developed by Innovations for Poverty Action. This command produces a dataset of the comparisons between the back-check and original survey data. The command also completes enumerator checks and stability checks for variables.

The steps are as follows:

 
ssc install bcstats 
bcstats, surveydata(filename) bcdata(filename) id(varlist) [options]

To learn about the options for bcstats and back-checks, please type help bcstats in Stata after installing the command.
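To make the idea behind the comparison concrete, here is a minimal Python sketch (standard library only) of a per-enumerator discrepancy rate. It is not bcstats and does not reproduce its output; the function name `discrepancy_rates`, the data layout, and the example values are assumptions chosen for illustration.

```python
from collections import defaultdict


def discrepancy_rates(survey, backcheck, enumerator_of, variables):
    """Share of back-checked values that differ from the survey, by enumerator.

    survey, backcheck: dicts mapping household ID -> {variable: value}.
    enumerator_of: dict mapping household ID -> enumerator of the main survey.
    variables: names of the variables included in the back-check.
    """
    compared = defaultdict(int)
    differing = defaultdict(int)
    for household_id, bc_row in backcheck.items():
        survey_row = survey.get(household_id)
        if survey_row is None:
            continue  # no matching survey record; review these separately
        enumerator = enumerator_of[household_id]
        for var in variables:
            compared[enumerator] += 1
            if survey_row[var] != bc_row[var]:
                differing[enumerator] += 1
    return {e: differing[e] / compared[e] for e in compared}


# Hypothetical example data for three back-checked households.
survey = {
    "HH001": {"marital_status": "married", "n_members": 5},
    "HH002": {"marital_status": "single", "n_members": 3},
    "HH003": {"marital_status": "married", "n_members": 4},
}
backcheck = {
    "HH001": {"marital_status": "married", "n_members": 5},
    "HH002": {"marital_status": "single", "n_members": 4},
    "HH003": {"marital_status": "widowed", "n_members": 6},
}
enumerator_of = {"HH001": "E1", "HH002": "E1", "HH003": "E2"}

rates = discrepancy_rates(
    survey, backcheck, enumerator_of, ["marital_status", "n_members"]
)
```

In this made-up example, enumerator E1 has one differing value out of four compared and E2 has two out of two, so E2's work would be the first to review.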

Back to Parent

This article is part of the topic Field Management.

Additional Resources