Difference between revisions of "Duplicates and Survey Logs"
Mrijanrimal (talk | contribs) |
Revision as of 15:13, 26 January 2017
Read First
- To ensure that the survey data is complete, both of these steps must be done. Skipping either one might result in an incomplete data set.
Data Duplicates
Before analyzing the outcomes of quality checks, and sometimes even before running real-time quality checks, we need to check the data for duplicates. Duplicates are common in ODK/SurveyCTO data and need to be removed before starting other data quality checks.
- The data should be downloaded daily and checked for duplicates daily.
- Do not postpone this check: duplicates are much easier to resolve while the field team still remembers the interview, and other quality checks depend on ID variables that uniquely identify each observation.
To remove duplicates, you can use DIME's Stata command ieduplicates, which can be found in the ietoolkit Stata package.
ssc install ietoolkit
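Before resolving duplicates, it can help to see how many there are. The following is a minimal sketch using Stata's built-in duplicates and isid commands rather than ieduplicates itself. The file name survey_raw.dta and the variables hhid and submissiondate are hypothetical and should be replaced with your own.

```stata
* Hypothetical file and variable names: replace with your own
use "survey_raw.dta", clear

* Summarize how many observations share the same ID value
duplicates report hhid

* Flag every observation whose ID appears more than once
duplicates tag hhid, generate(dup_count)

* List the flagged observations so the field team can review them
sort hhid
list hhid submissiondate if dup_count > 0, sepby(hhid)

* Once duplicates are resolved, confirm the ID is unique
* (isid stops with an error if it is not)
isid hhid
```

Running this on each daily download shows new duplicates while the interviews are still fresh.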
Comparing Server Data to Field Logs
Comparing server data to field logs makes sure that all the data collected during the survey has made it to your server. This can be done in a few steps:
- Gener
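One possible way to make this comparison in Stata is sketched below. It assumes that the field log has been entered as a dataset with one row per completed interview, and that duplicates have already been removed from the server data. The file names survey_raw.dta and field_log.dta and the ID variable hhid are hypothetical.

```stata
* Hypothetical file and variable names: replace with your own
use "survey_raw.dta", clear

* Match each server observation to the field log on the ID variable
* (1:1 requires the ID to be unique in both files)
merge 1:1 hhid using "field_log.dta"

* _merge == 1: on the server but not in the field log
* _merge == 2: in the field log but not on the server
* _merge == 3: found in both
tabulate _merge

* Interviews that were logged in the field but never reached the server
list hhid if _merge == 2
```

Observations with _merge equal to 1 or 2 are the ones to follow up on with the field team.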
Back to Parent
This article is part of the topic Monitoring Data Quality