Difference between revisions of "Monitoring Data Quality"
Kbjarkefur (talk | contribs) |
Revision as of 15:51, 25 January 2017
Read First
- Data quality checks should be done before and during the survey, as there is little we can do once the survey is over if the data contains errors.
- Prepare the checks before the survey starts and run them routinely while it is in the field, so that any error that is caught can be corrected quickly, before it is too late.
Guidelines
Since time is short during the survey itself, it is important to prioritize which checks that time is spent on.
Important steps in the quality checks
- Test that all data is on the server
- Test for duplicates
- High frequency tests of data quality
  - IPA Template only (the template assumes SurveyCTO)
    - If the survey is not written in SurveyCTO, it is possible to adapt the data to the template, or the template to the data, but it might be easier to write your own tests in Stata
  - IPA Template + additional tests in Stata
  - Tests written in Stata only
    - An option if the data is not collected with SurveyCTO
- Follow up using the Data Explorer in SurveyCTO
Verifying that all survey data is on the server
Removing Duplicates
Before analyzing the outcomes of quality checks, and sometimes even before running real-time quality checks, we need to check for duplicates in the data. Duplicates are common in ODK/SurveyCTO data and need to be removed before starting other data quality checks.
To remove duplicates, you can use DIME's Stata command ieduplicates, which is part of the ietoolkit Stata package.
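The underlying check is simple: find every ID that occurs in more than one submission. The sketch below illustrates that logic in Python rather than Stata, and it is not how ieduplicates is implemented. The ID variable name `hhid`, the function name, and the sample records are all hypothetical.

```python
from collections import Counter


def find_duplicate_ids(records, id_field="hhid"):
    """Return the ID values that appear in more than one record, sorted."""
    counts = Counter(rec[id_field] for rec in records)
    return sorted(id_ for id_, n in counts.items() if n > 1)


# Hypothetical submissions downloaded from the server; "hhid" is an assumed ID variable.
submissions = [
    {"hhid": "A01", "enumerator": "E1"},
    {"hhid": "A02", "enumerator": "E1"},
    {"hhid": "A01", "enumerator": "E2"},  # same household submitted twice
]

duplicate_ids = find_duplicate_ids(submissions)
print(duplicate_ids)
```

Flagged IDs still need to be reviewed by the field team to decide which submission to keep and which to correct or drop, so a list like this is a starting point and not a resolution.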
Comparing Server Data to Field Logs
Comparing server data to field logs ensures that all the data collected during the survey has made it to your server. This can be done by writing code that generates a survey log counting the number of surveys on the server, and then matching that log against the field logs.
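As a minimal sketch of that comparison, the Python example below counts server submissions per date and enumerator and reports every cell where the count differs from the field log. The grouping keys (`date`, `enumerator`), the function names, and all sample values are assumptions made for illustration; a real project would group by whatever its field logs record, and the same logic can be written in Stata.

```python
from collections import Counter


def count_surveys(records, keys=("date", "enumerator")):
    """Build the survey log: number of server submissions per key combination."""
    return Counter(tuple(rec[k] for k in keys) for rec in records)


def compare_logs(server_records, field_log_counts, keys=("date", "enumerator")):
    """Return {key: (count_on_server, count_in_field_log)} where the two differ."""
    server_counts = count_surveys(server_records, keys)
    mismatches = {}
    for key in sorted(set(server_counts) | set(field_log_counts)):
        on_server = server_counts.get(key, 0)
        in_log = field_log_counts.get(key, 0)
        if on_server != in_log:
            mismatches[key] = (on_server, in_log)
    return mismatches


# Hypothetical server data: one row per submitted survey.
server = [
    {"date": "2017-01-23", "enumerator": "E1"},
    {"date": "2017-01-23", "enumerator": "E1"},
    {"date": "2017-01-23", "enumerator": "E2"},
]

# Hypothetical field log: surveys each enumerator reports having completed.
field_log = {
    ("2017-01-23", "E1"): 2,
    ("2017-01-23", "E2"): 2,
}

mismatches = compare_logs(server, field_log)
print(mismatches)
```

A cell with fewer surveys on the server than in the field log suggests a submission that has not been uploaded yet; the reverse suggests a duplicate or an unlogged interview.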
Data Quality Checks
Comparing back checks with the main data