Difference between revisions of "Iefieldkit"

Revision as of 17:17, 30 April 2020

Primary data collection and cleaning involve highly repetitive but extremely important processes that contribute to high-quality, reproducible research. DIME Analytics has developed iefieldkit as a Stata package to standardize and simplify best practices in primary data collection. iefieldkit consists of commands that automate error-checking for electronic Open Data Kit (ODK)-based survey modules, duplicate checking and resolution, data cleaning and survey harmonization, and codebook creation.

Read First

Objective

One of the most important developments in economics over the past two decades has been the rise of empirical research, through primary as well as secondary data collection. The authors of iefieldkit developed the package to directly support researchers collecting data in a wide range of fields, such as agriculture, health, energy and environment, transport, financial and private sector development, gender, governance, and fragility, conflict and violence (FCV). iefieldkit therefore supports general best practices in primary data collection from start to finish: before, during, and after data collection.

The four commands in this package (ietestform, ieduplicates, iecompdup, and iecodebook) make inputs and outputs significantly more human-readable by working with spreadsheets instead of Stata do-files. In doing so, they allow field personnel who do not specialize in code tools to understand and review the tasks involved in primary data collection. iefieldkit thus recognizes the vital role played by field personnel in supporting data management and data cleaning even if they are not proficient in Stata.

Before Data Collection

Before data collection occurs, ietestform allows for rapid error-checking of ODK-based electronic surveys, including best practices for SurveyCTO-styled forms. This ensures that data, once collected, will import into Stata cleanly, for example by avoiding name conflicts and ensuring compliant variable naming and labelling.

ietestform complements the ODK syntax test on the SurveyCTO server. It runs tests that inform researchers how to use ODK programming language features to ensure high data quality. This command is especially useful if the data to be imported into Stata is subject to restrictions beyond ODK syntax.
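
As a rough sketch of how this check might be run: the example below assumes the package is installed from SSC and that the option names match the installed version's help file (check help ietestform). Both file paths are hypothetical placeholders.

```stata
* Install the package once (assumes access to SSC)
ssc install iefieldkit, replace

* Test a form definition and save the resulting report as a CSV file.
* Both paths are hypothetical; replace them with the project's own files.
ietestform using "survey/baseline_form.xlsx", ///
    reportsave("survey/baseline_form_report.csv")
```

The report is a spreadsheet-readable file, so field staff can review the flagged issues without opening Stata.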

During Data Collection

During data collection, ieduplicates and iecompdup (both previously released as a part of the package ietoolkit but now moved to this package) provide a workflow for detecting and resolving duplicate entries in the dataset, ensuring that the final survey dataset will be a correct record of the survey sample to merge onto the master sampling database.
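
A minimal sketch of this workflow follows. It assumes a raw dataset with an ID variable hhid and a submission key variable key; these names, the file paths, and the ID value are all hypothetical. The syntax is as recalled from recent help files and may differ in older releases, so check help ieduplicates and help iecompdup for the installed version.

```stata
* Load the raw survey data (hypothetical path)
use "raw/survey_raw.dta", clear

* Write a duplicates report to Excel. hhid should be unique in the final
* data; key uniquely identifies each submission. Both names are assumptions.
ieduplicates hhid using "duplicates/duplicates_report.xlsx", ///
    uniquevars(key)

* Reload the raw data before comparing, on the assumption that ieduplicates
* leaves the data in memory without the unresolved duplicates.
use "raw/survey_raw.dta", clear

* Compare the two submissions sharing one ID value (hypothetical value)
* to see which variables differ between them.
iecompdup hhid, id(55424)
```

Decisions about which submission to keep are recorded in the Excel report rather than in code, which is what lets field personnel take part in resolving duplicates.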

After Data Collection

After data collection, the iecodebook commands provide a workflow for rapidly cleaning, harmonizing, and documenting datasets. iecodebook takes its input from an Excel sheet, which gives a better-structured and easier-to-follow overview, especially for non-technical users, than the same operations written directly in a do-file.
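
The sketch below illustrates one way the iecodebook subcommands might be combined. The file paths are hypothetical, and the exact options available depend on the installed version (see help iecodebook).

```stata
* Load the dataset to be cleaned (hypothetical path)
use "data/survey_deduplicated.dta", clear

* Step 1: create an Excel template listing the variables in memory.
* Run this once; the file is then filled in by hand.
iecodebook template using "codebooks/cleaning.xlsx"

* Step 2: after the spreadsheet has been filled in (new names, labels,
* recodes), apply those instructions to the data in memory.
iecodebook apply using "codebooks/cleaning.xlsx"

* Step 3: export a codebook documenting the cleaned dataset.
iecodebook export using "codebooks/final_codebook.xlsx"
```

Because the cleaning instructions live in the spreadsheet, a reviewer can audit renames and recodes row by row instead of reading a long do-file.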

Additional Resources

  • Visit the iefieldkit GitHub page