Difference between revisions of "Monitoring Data Quality"


Revision as of 19:41, 8 March 2017

Read First

  • The best time to write the data quality checks is in parallel with Questionnaire Design and Questionnaire Programming. Checks that are not written alongside the questionnaire are often completed too late to be relevant, or omit important tests.
  • Prepare thoroughly before the survey and follow the steps below during the survey, so that any error that is caught can be corrected quickly, before it is too late.

Things we are testing for during Data Quality tests

Data quality tests check several aspects of the survey. Primarily, we want to test whether the questionnaire is written in a way that accurately captures what we want to study. We also want to test that the questionnaire has been correctly programmed in the CAPI software, and to test for the human factor, i.e. surveyor performance.

Important steps in data quality monitoring

Duplicates and Survey Logs

It is very important to run quality checks on the data while the survey is still ongoing, because it is difficult to fix problems or re-collect data if an error is found only after the survey has been completed.

  • Testing for Duplicates - Since SurveyCTO/ODK data often contains duplicates, the first thing you need to do is check for duplicates and remove them.
  • Test that all data from the field is on the server - Survey logs from the field can then be matched against the survey data logs on the server to see whether all the data from the field has been transferred to the server.
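
The duplicate check above can be sketched as follows. This is a minimal illustration in Python, not the procedure from the main article: the field names `hhid`, `enumerator` and `hh_size` are hypothetical, and the rule "drop rows that are identical in every field, then review IDs that still repeat" is an assumption about how duplicates might be handled.

```python
from collections import defaultdict


def drop_exact_duplicates(submissions):
    """Keep the first copy of any rows that are identical in every field."""
    seen = set()
    unique = []
    for row in submissions:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def find_duplicates(submissions, id_field="hhid"):
    """Return only the IDs that occur more than once, with their rows."""
    groups = defaultdict(list)
    for row in submissions:
        groups[row[id_field]].append(row)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


# Hypothetical submissions downloaded from the server.
submissions = [
    {"hhid": "H001", "enumerator": "E1", "hh_size": 4},
    {"hhid": "H002", "enumerator": "E1", "hh_size": 6},
    {"hhid": "H002", "enumerator": "E1", "hh_size": 6},  # same form sent twice
    {"hhid": "H003", "enumerator": "E2", "hh_size": 3},
    {"hhid": "H003", "enumerator": "E2", "hh_size": 5},  # same ID, different answers
]

cleaned = drop_exact_duplicates(submissions)
needs_review = find_duplicates(cleaned)
```

In this sketch, exact copies are removed automatically, while IDs that repeat with different answers are only flagged, since deciding which submission is correct usually requires checking with the field team.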

Tip: Verify that the data is complete on the day of the survey or the day after. Since the interviewer is most likely still close by, it is easy to re-interview and recover the data if significant chunks are missing.
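
Matching the field logs against the server can be thought of as a set comparison on the household ID. The sketch below assumes that both sources can be reduced to a list of IDs; the function name `compare_logs` and the example IDs are hypothetical.

```python
def compare_logs(field_log_ids, server_ids):
    """Compare the IDs recorded in the field with the IDs found on the server."""
    field, server = set(field_log_ids), set(server_ids)
    return {
        # Interviews the field team logged but that never reached the server.
        "missing_on_server": sorted(field - server),
        # Submissions on the server that nobody logged in the field.
        "not_in_field_log": sorted(server - field),
    }


# Hypothetical IDs from the paper or digital field log and from the server export.
field_log_ids = ["H001", "H002", "H003", "H004"]
server_ids = ["H001", "H002", "H004", "H009"]

report = compare_logs(field_log_ids, server_ids)
```

Both directions are worth reporting: an ID missing on the server suggests data that was not uploaded, while an ID absent from the field log may point to a typo in the ID or an unlogged interview.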

To see how to remove duplicates and check that all the field data is on the server, please see the main article at Duplicates and Survey Logs.

High Frequency Checks

After you have verified that all the data is on the server, the following steps should be undertaken:

  • High frequency tests of data quality
    • IPA Template only (the template assumes SurveyCTO)
      • If the data was not collected with SurveyCTO, it is possible to adapt the data to the template, or the template to the data, but it might be easier to write your own tests in Stata
    • IPA Template + additional tests in Stata
    • Tests written in Stata only
      • Option if data is not collected with SurveyCTO
      • This may also be ideal if you wish to add checks that are not covered by the IPA template or not provided by SurveyCTO.
  • Follow up using the Data Explorer in SurveyCTO
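
To give a sense of what hand-written high frequency tests can look like, here is a minimal sketch in Python. It is not the IPA template, and the article does not prescribe these particular checks: the range check, the per-enumerator missing rate, and the variable names `age`, `income` and `enumerator` are all assumptions chosen for illustration. The same logic could be written in Stata.

```python
from collections import defaultdict


def out_of_range(rows, field, low, high):
    """Return the IDs whose value for `field` is present but outside [low, high]."""
    return [
        r["hhid"]
        for r in rows
        if r[field] is not None and not (low <= r[field] <= high)
    ]


def missing_rate_by_enumerator(rows, field):
    """Share of each enumerator's interviews where `field` was left empty."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for r in rows:
        totals[r["enumerator"]] += 1
        if r[field] is None:
            missing[r["enumerator"]] += 1
    return {e: missing[e] / totals[e] for e in totals}


# Hypothetical survey data; None marks a missing answer.
rows = [
    {"hhid": "H001", "enumerator": "E1", "age": 34, "income": 1200},
    {"hhid": "H002", "enumerator": "E1", "age": 210, "income": None},
    {"hhid": "H003", "enumerator": "E2", "age": 51, "income": 900},
    {"hhid": "H004", "enumerator": "E2", "age": 27, "income": 1500},
]

flagged_ages = out_of_range(rows, "age", 0, 120)
income_missing = missing_rate_by_enumerator(rows, "income")
```

Running checks like these every day of data collection, and breaking results down by enumerator, makes it easier to tell a questionnaire or programming problem (all enumerators affected) from a surveyor performance problem (one enumerator affected).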

Back Checks

Back Checks, also known as Survey Audits, are a second visit to the household to confirm the interview was conducted and verify key pieces of information. Best practice is for back checks to be completed by an independent third party.
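
Once back check data is available, it can be compared with the original interviews. The sketch below assumes that both data sets share a household ID and that a simple per-enumerator mismatch rate on a few key questions is the statistic of interest; the field names and the function `back_check_mismatch_rates` are hypothetical.

```python
from collections import defaultdict


def back_check_mismatch_rates(original, back_check, fields):
    """Share of compared answers that differ, per original enumerator."""
    original_by_id = {r["hhid"]: r for r in original}
    compared = defaultdict(int)
    mismatched = defaultdict(int)
    for revisit in back_check:
        first = original_by_id.get(revisit["hhid"])
        if first is None:
            continue  # back-checked household has no original interview
        for f in fields:
            compared[first["enumerator"]] += 1
            if first[f] != revisit[f]:
                mismatched[first["enumerator"]] += 1
    return {e: mismatched[e] / compared[e] for e in compared}


# Hypothetical original interviews and back check revisits.
original = [
    {"hhid": "H001", "enumerator": "E1", "hh_size": 4, "owns_land": "yes"},
    {"hhid": "H002", "enumerator": "E1", "hh_size": 6, "owns_land": "no"},
    {"hhid": "H003", "enumerator": "E2", "hh_size": 3, "owns_land": "yes"},
]
back_check = [
    {"hhid": "H001", "hh_size": 4, "owns_land": "yes"},
    {"hhid": "H003", "hh_size": 5, "owns_land": "yes"},
]

rates = back_check_mismatch_rates(original, back_check, ["hh_size", "owns_land"])
```

A high mismatch rate for one enumerator is a prompt for follow-up rather than proof of a problem, since some answers can legitimately change between visits.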

Additional Resources