Difference between revisions of "Duplicates and Survey Logs"

Revision as of 15:19, 26 January 2017

Read First

To ensure that the survey data is complete, both of these steps must be done. Skipping either one might result in an incomplete data set.

Data Duplicates

Before analyzing the outcomes of quality checks, and sometimes even before running real-time quality checks, we need to check the data for duplicates. Duplicates are common in ODK/SurveyCTO data and need to be removed before starting other data quality checks.

The data should be downloaded and checked for duplicates daily.
This should not be postponed, for two reasons. First, it is much easier to resolve a duplicate while the field team still remembers the interview. Second, other quality checks depend on ID variables that uniquely identify each observation.

To remove duplicates, you can use DIME's Stata command ieduplicates, which is part of the ietoolkit Stata package.

ssc install ietoolkit
ieduplicates ID_varname

The command identifies the duplicates in the ID variable and exports them to an Excel file, which is also used to correct the duplicates in Stata. Field supervisors without knowledge of Stata can make the corrections in the Excel file, and the duplicates will be corrected the next time you run the code.
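To see what a duplicate in the ID variable looks like before handing the data to ieduplicates, a minimal sketch with Stata's built-in duplicates command may help. It does not reproduce the ieduplicates workflow or its Excel report. The toy data, the variable name hh_id, and the enumerator names are assumptions made up for illustration.

```stata
* Hypothetical toy data: hh_id is an assumed name for the ID variable
clear
input hh_id str10 enumerator
101 "Asha"
102 "Binod"
102 "Binod"
103 "Chandra"
end

* dup holds the number of extra copies of each hh_id (0 means unique)
duplicates tag hh_id, generate(dup)

* List the observations that need to be resolved with the field team
list hh_id enumerator if dup > 0

* Once the duplicates are resolved, this check should pass without error:
* isid hh_id
```

Running a check like this every day keeps the list of flagged observations short enough to discuss with the field team while the interviews are still fresh.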

Three main types of duplicates in SurveyCTO

Comparing Server Data to Field Logs

Comparing server data to field logs makes sure that all the data collected during the survey has made it to your server. This can be done in a few steps:

Gener
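One possible way to make the comparison between server data and field logs in Stata is to merge the two lists on the ID variable and inspect the observations that do not match. This is only a sketch under stated assumptions, not necessarily the procedure this article intended: the variable name hh_id, the toy values, and the idea that the field log has been entered as a data set with one row per interview are all hypothetical.

```stata
* Hypothetical field log: one row per interview reported by the field team
clear
input hh_id
101
102
103
104
end
tempfile field_log
save `field_log'

* Hypothetical server data: one row per submission received
clear
input hh_id
101
102
103
end

* Match the two lists on the ID variable
merge 1:1 hh_id using `field_log'

* _merge == 2: recorded in the field log but missing from the server
list hh_id if _merge == 2

* _merge == 1: on the server but not recorded in the field log
list hh_id if _merge == 1
```

Note that merge 1:1 stops with an error if hh_id is not unique in either data set, which is one reason duplicates need to be resolved before this comparison.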

Back to Parent

This article is part of the topic Monitoring Data Quality


Revision history: revision as of 15:17, 26 January 2017 by Mrijanrimal (edit to Data Duplicates); revision as of 15:19, 26 January 2017 by Mrijanrimal (minor edit: moved page Real Time Data Quality Checks to Removing Data Duplicates).