Difference between revisions of "Back Checks"

Line 1: Line 1:
<onlyinclude>
Back-checks are a [[Monitoring Data Quality | quality]] control method implemented to verify the quality and legitimacy of [[Primary Data Collection | data collected]] during a survey. Throughout the course of fieldwork, a back-check team returns to a randomly selected subset of households for which data has been collected and re-interviews these respondents with a short subset of survey questions, known as a back-check survey. This page explains how to coordinate, sample for, and design questionnaires for back-checks.
Back checks are a quality control method used to verify data collected during a survey. After survey data has been collected, a randomly-selected subset of households is re-interviewed with a very short questionnaire to verify the legitimacy of key data collected in the actual survey.
 
</onlyinclude>
==Read First ==
Back checks are an important tool to detect fraud, i.e. enumerators sitting under a tree and filling out questionnaires themselves, and to assess the accuracy of the data collected. Back checks can be conducted by in-person visits or phone calls. A complementary approach to in-person back checks is to do [[Random Audio Audits]].
*Back-checks help to evaluate how effective the instrument is and how well the enumerators are collecting quality data.
*Back-checks are an important tool to detect fraud (i.e. enumerators sitting under a tree and filling out questionnaires themselves).
*Back-checks help researchers to assess the accuracy of the data collected.  
*Back-checks can be conducted by in-person visits or phone calls. A complementary approach to in-person back checks is conducting [[Random Audio Audits]].
*Problems identified through back checks can be remedied by further [[Enumerator Training | training enumerators]] or replacing low-performing or problematic enumerators.
 
==Coordinating Back-Checks==
*The total duration of each back-check survey should be around 10-15 minutes.
*The back-checks should be conducted by a specialized team of a few exclusively back-checking enumerators. The back-check enumerators should be of the highest trust and quality.
*Administer 20% of back-checks within the first two weeks of fieldwork. This helps the research team to identify early whether the [[Questionnaire Design | questionnaire]] is effective, whether enumerators are doing their jobs well, and which changes to make to ensure high quality data collection.


==Best Practices during back checks ==  
==Sampling for Back-Checks==
Here are some of the best practices for back checks:
*Aim to back-check 10-20% of the total observations.
* Aim to back check at least 10% of the total observations
*The back-check sample should be [[Stratified Random Sample | stratified]] across survey teams/enumerators. Every team and every enumerator must be back-checked as soon as possible and regularly.
* Back checks should also be front-heavy i.e. majority of them occurring in the first few days / weeks of data collection. This helps find whether the questionnaire/enumerators are doing their jobs well and can be remedied through training/replacement.
*Include missing respondents in the back-check sample to verify that enumerators are not biasing your sample by not tracking hard-to-find respondents. Also include observations flagged in other quality tests like [[Monitoring Data Quality#Guidelines#High Frequency Checks | high frequency checks]] and observations collected by enumerators suspected of cheating.
*The back check sample should be stratified across survey teams/surveyors.  
* The back checks should be done in person by an independent third party.  
* It is important that enumerators do not know what questions will be audited. To that end, many people randomly select a small number of questions from the survey instrument to back check, and change the back check form regularly during data collection.


==How to Select Back Check Questions ==
==Designing the Back-Check Survey==
Back check questions should be selected with the performance of both the questionnaire and the surveyor in mind. Using different types of questions during the back check helps in finding the cause of poor data quality, i.e. questionnaire language, surveyor performance, survey fraud, etc. Some of the questions that should be asked during a back check are as follows:
Back-check questions are drawn from the original [[Questionnaire Design | questionnaire]]. [http://www.poverty-action.org/ Innovations for Poverty Action] identifies four groups of questions to include in a back-check to best gauge data and enumerator quality.


* To test for translation issues, back check questions which can be interpreted differently by different surveyors.
# Questions to identify respondent and interview information: these questions verify the identity of the respondent and check if, when, and where the original survey took place.  
* To test whether enumerators are falsifying data to shorten interviews, back check questions that determine repeated sections of the questionnaire. For example, if there is a long series of questions about household members, verify the correct number of household members. If an agricultural survey asks for production information by plot, verify the number of plots.  
* To test for fraud, check simply that an enumerator visited the household and conducted an interview with the correct respondent


==A framework for back checks from Innovations for Poverty Action==
# Type 1 Variable Questions: these questions ask straightforward information with no expected variation or room for error. They may include questions about education level, marital status, occupation, whether the respondent has children or not, etc. If Type 1 variable values differ between the questionnaire and the backcheck survey, they indicate poor quality data, a serious enumerator problem, and potentially falsified work.  
The following framework for back checks has been developed by [http://www.poverty-action.org/ Innovations for Poverty Action].


;Identifying Respondents and Interview Information
# Type 2 Variable Questions: these are questions for which capable enumerators should get the true answer. If the Type 2 response values differ between the questionnaire and the backcheck survey, they indicate that the enumerator may need more training.
:- Check if we have the right person
:- Check whether the interview took place and, if so, when.


;Type 1 Variables
# Type 3 Variable Questions: these questions are expected to be difficult. They help research teams to understand if the questionnaire is effectively designed and if enumerators are interpreting difficult and/or nuanced questions correctly and uniformly. If Type 3 variable values differ between the questionnaire and the backcheck, they indicate the need for further enumerator training or, in particular cases, questionnaire modification.  
:- Straightforward questions where we expect no variation.
:- For example - education level, marital status, occupation, has children or not, etc.


;Type 2 Variables 
Back-check surveys may also test for translation issues by including questions that could be interpreted differently by different surveyors. Finally, to test whether enumerators are falsifying data to shorten interviews, back-check the questions that determine how many times a repeated section of the questionnaire is asked. For example, if there is a long series of questions about household members, verify the number of household members. If an agricultural survey asks for production information by plot, verify the number of plots.
:- Questions where we expect capable enumerators to get the true answer.


;Type 3 Variables
Note that it is important that enumerators do not know what questions will be audited. To that end, you may consider randomly changing the back-check survey regularly during data collection.
:- Questions that we expect to be difficult. We back check these questions to understand if they were correctly interpreted in the field.  


The total duration of the back checks should be around 10-15 minutes.
== Analyzing Back-Check Data ==


=== Comparing Back Checks to Actual Survey Data ===
After completing a back-check, you can compare the back-check data to the original survey data. This can be done by using the Stata command <code>bcstats</code>, developed by [http://www.poverty-action.org/ Innovations for Poverty Action]. This command produces a dataset of the comparisons between the back-check and original survey data. The command also completes enumerator checks and stability checks for variables.
After completing a back check, you can now compare the data obtained from the back check to your actual survey data. This can be done by using the Stata command <code> bcstats </code> developed by [http://www.poverty-action.org/ '''Innovations for Poverty Action.'''] This command compares the back check data and the survey data, and produces a data set of the comparisons between the two data sets. The command also completes enumerator checks and stability checks for variables.


The steps are as follows:


<code>  
:: <code> ssc install bcstats  </br>
ssc install bcstats  </br>
:: bcstats, surveydata(''filename'') bcdata(''filename'') id(''varlist'') [options]</code>.
bcstats, surveydata(''filename'') bcdata(''filename'') id(''varlist'') [options]
</code>


To learn about the options for bcstats and survey back checks, please type <code> help bcstats </code> on Stata after installing the command.
To learn about the options for <code>bcstats</code> and back-checks, type <code>help bcstats</code> in Stata after installing the command.


== Back to Parent ==
==Back to Parent==
This article is part of the topic [[Field Management]].


Line 58: Line 51:
*DIME Analytics’ [https://github.com/worldbank/DIME-Resources/blob/master/stata2-4-quality.pdf Data Quality Assurance]
* World Health Organization's  [http://unstats.un.org/unsd/hhsurveys/pdf/Chapter_10.pdf '''Quality Assurance in Surveys: standards, guidelines, and procedures''']. This chapter provides, in detail,  the approach and methodology on quality control during surveys.
*[https://ideas.repec.org/c/boc/bocode/s458173.html bcstats], a  Stata program written by an IPA staff member for conducting back checks on survey data
*[https://ideas.repec.org/c/boc/bocode/s458173.html bcstats], a  Stata program written by an IPA staff member for conducting back checks on survey data.
[[Category: Field Management ]]

Revision as of 21:58, 14 May 2019

Back-checks are a quality control method implemented to verify the quality and legitimacy of data collected during a survey. Throughout the course of fieldwork, a back-check team returns to a randomly selected subset of households for which data has been collected and re-interviews these respondents with a short subset of survey questions, known as a back-check survey. This page explains how to coordinate, sample for, and design questionnaires for back-checks.

Read First

  • Back-checks help to evaluate how effective the instrument is and how well the enumerators are collecting quality data.
  • Back-checks are an important tool to detect fraud (e.g. enumerators sitting under a tree and filling out questionnaires themselves).
  • Back-checks help researchers to assess the accuracy of the data collected.
  • Back-checks can be conducted by in-person visits or phone calls. A complementary approach to in-person back-checks is conducting Random Audio Audits.
  • Problems identified through back-checks can be remedied by giving enumerators further training or by replacing low-performing or problematic enumerators.

Coordinating Back-Checks

  • The total duration of each back-check survey should be around 10-15 minutes.
  • The back-checks should be conducted by a specialized team of a few exclusively back-checking enumerators. The back-check enumerators should be of the highest trust and quality.
  • Administer 20% of back-checks within the first two weeks of fieldwork. This helps the research team to identify early whether the questionnaire is effective, whether enumerators are doing their jobs well, and which changes to make to ensure high quality data collection.
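
The timing guidance above reduces to simple arithmetic. The Python sketch below is illustrative only: the function name and the example figure of 2,000 interviews are assumptions, and the 10% back-check rate is just the lower end of the range recommended on this page.

```python
def backcheck_plan(total_interviews, backcheck_pct=10, early_pct=20):
    """Return (total back-checks, back-checks due in the first two weeks).

    Percentages are whole numbers; counts are rounded up with integer
    ceiling division so that no back-check is lost to rounding.
    """
    total_backchecks = -(-total_interviews * backcheck_pct // 100)
    early_backchecks = -(-total_backchecks * early_pct // 100)
    return total_backchecks, early_backchecks


total, early = backcheck_plan(2000)
print(total, early)  # prints "200 40"
```

With 2,000 planned interviews and a 10% back-check rate, roughly 40 of the 200 back-checks would fall in the first two weeks.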

Sampling for Back-Checks

  • Aim to back-check 10-20% of the total observations.
  • The back-check sample should be stratified across survey teams/enumerators. Every team and every enumerator must be back-checked as soon as possible and regularly.
  • Include missing respondents in the back-check sample to verify that enumerators are not biasing your sample by not tracking hard-to-find respondents. Also include observations flagged in other quality tests like high frequency checks and observations collected by enumerators suspected of cheating.
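
The sampling rules above can be sketched in a few lines. This is a minimal Python illustration, not a tool referenced by this page: the field names "hhid" and "enumerator", the 15% rate, and the "flagged" argument (for missing respondents, observations flagged by other checks, or suspected enumerators) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def draw_backcheck_sample(interviews, rate_pct=15, flagged=(), seed=20190514):
    """Draw a back-check sample stratified by enumerator.

    interviews: list of dicts with hypothetical keys "hhid" and "enumerator".
    rate_pct:   whole-number percentage of each enumerator's interviews to draw.
    flagged:    household IDs that are always included in the sample.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_enumerator = defaultdict(list)
    for row in interviews:
        by_enumerator[row["enumerator"]].append(row["hhid"])

    sample = set(flagged)
    for enumerator in sorted(by_enumerator):
        hhids = sorted(by_enumerator[enumerator])
        # Round up, and draw at least one, so every enumerator is back-checked.
        n = max(1, -(-len(hhids) * rate_pct // 100))
        sample.update(rng.sample(hhids, n))
    return sorted(sample)
```

Stratifying inside the loop guarantees that each enumerator contributes to the sample, which a single draw over all observations would not.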

Designing the Back-Check Survey

Back-check questions are drawn from the original questionnaire. Innovations for Poverty Action identifies four groups of questions to include in a back-check to best gauge data and enumerator quality.

  1. Questions to identify respondent and interview information: these questions verify the identity of the respondent and check if, when, and where the original survey took place.
  2. Type 1 Variable Questions: these questions ask straightforward information with no expected variation or room for error. They may include questions about education level, marital status, occupation, whether the respondent has children or not, etc. If Type 1 variable values differ between the questionnaire and the backcheck survey, they indicate poor quality data, a serious enumerator problem, and potentially falsified work.
  3. Type 2 Variable Questions: these are questions for which capable enumerators should get the true answer. If the Type 2 response values differ between the questionnaire and the backcheck survey, they indicate that the enumerator may need more training.
  4. Type 3 Variable Questions: these questions are expected to be difficult. They help research teams to understand if the questionnaire is effectively designed and if enumerators are interpreting difficult and/or nuanced questions correctly and uniformly. If Type 3 variable values differ between the questionnaire and the backcheck, they indicate the need for further enumerator training or, in particular cases, questionnaire modification.

Back-check surveys may also test for translation issues by including questions that could be interpreted differently by different surveyors. Finally, to test whether enumerators are falsifying data to shorten interviews, back-check the questions that determine how many times a repeated section of the questionnaire is asked. For example, if there is a long series of questions about household members, verify the number of household members. If an agricultural survey asks for production information by plot, verify the number of plots.

Note that it is important that enumerators do not know what questions will be audited. To that end, you may consider randomly changing the back-check survey regularly during data collection.
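
One way to change the back-check survey regularly is to redraw its questions from a pool for each round of fieldwork. The Python sketch below is only an illustration of that idea: the pool layout, the type labels "type1" to "type3", and the per-round seed are assumptions, not part of the IPA framework.

```python
import random


def build_backcheck_form(question_pool, per_type, round_number, base_seed=12345):
    """Draw the back-check questions for one round of fieldwork.

    question_pool: dict mapping a question type (e.g. "type1") to a list of
                   variable names taken from the original questionnaire.
    per_type:      dict mapping the same types to how many questions to draw.
    """
    # A different seed per round changes the form but keeps it reproducible.
    rng = random.Random(base_seed + round_number)
    form = {}
    for qtype in sorted(question_pool):
        candidates = sorted(question_pool[qtype])
        k = min(per_type.get(qtype, 0), len(candidates))
        form[qtype] = sorted(rng.sample(candidates, k))
    return form
```

The questions that identify the respondent and the interview would normally stay on every form; only the Type 1 to Type 3 questions need to rotate.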

Analyzing Back-Check Data

After completing a back-check, you can compare the back-check data to the original survey data. This can be done by using the Stata command bcstats, developed by Innovations for Poverty Action. This command produces a dataset of the comparisons between the back-check and original survey data. The command also completes enumerator checks and stability checks for variables.

The steps are as follows:

ssc install bcstats
bcstats, surveydata(filename) bcdata(filename) id(varlist) [options]

To learn about the options for bcstats and back-checks, type help bcstats in Stata after installing the command.
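
For readers who want to see the underlying comparison, the Python sketch below computes the share of back-checked values that differ from the original survey, by enumerator and variable type. It is a simplified stand-in for illustration, not a reimplementation of bcstats: the data layout, the "enumerator" key, and the variable names are assumptions.

```python
from collections import defaultdict


def mismatch_rates(survey, backcheck, var_types):
    """Share of back-checked values that differ from the original survey.

    survey, backcheck: dicts keyed by household ID; each value is a dict of
        answers, and survey rows also carry a hypothetical "enumerator" key.
    var_types: dict mapping variable name to "type1", "type2", or "type3".
    Returns {(enumerator, variable type): mismatch rate}.
    """
    compared = defaultdict(int)
    differing = defaultdict(int)
    for hhid, bc_row in backcheck.items():
        original = survey.get(hhid)
        if original is None:
            continue  # back-check with no matching survey record
        for var, vtype in var_types.items():
            if var not in bc_row or var not in original:
                continue  # variable was not asked in both interviews
            key = (original["enumerator"], vtype)
            compared[key] += 1
            if bc_row[var] != original[var]:
                differing[key] += 1
    return {key: differing[key] / compared[key] for key in compared}


survey = {
    1: {"enumerator": "A", "educ": "primary", "hhsize": 4},
    2: {"enumerator": "A", "educ": "none", "hhsize": 6},
    3: {"enumerator": "B", "educ": "secondary", "hhsize": 3},
}
backcheck = {
    1: {"educ": "primary", "hhsize": 5},
    2: {"educ": "primary", "hhsize": 6},
    3: {"educ": "secondary", "hhsize": 3},
}
rates = mismatch_rates(survey, backcheck, {"educ": "type1", "hhsize": "type2"})
```

In this made-up example, enumerator A has a mismatch rate of 0.5 on both Type 1 and Type 2 variables, while enumerator B has none. A high Type 1 rate would be the more serious signal under the framework described above.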

Back to Parent

This article is part of the topic Field Management.

Additional Resources