Reproducibility is the ability to duplicate the results of a study using the same materials and procedures as were used by the original investigators (Bollen et al., 2015). In data work and coding, this translates to computational reproducibility: the ability to reproduce outputs using the same code and data inputs.
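A minimal sketch of what computational reproducibility can look like in practice, assuming a toy "analysis" that draws a random subsample. The function names (`run_analysis`, `fingerprint`), the seed value, and the records are all hypothetical and chosen only for illustration. The idea is that fixing the seed and hashing the output lets two runs of the same code on the same inputs be compared exactly.

```python
import hashlib
import json
import random


def run_analysis(records, seed):
    """Toy analysis (hypothetical): mean outcome of a random subsample of 3 records."""
    rng = random.Random(seed)  # local generator, so no hidden global state is involved
    sample = rng.sample(records, k=3)
    mean_outcome = sum(r["outcome"] for r in sample) / len(sample)
    return {
        "sample_ids": sorted(r["id"] for r in sample),
        "mean_outcome": mean_outcome,
    }


def fingerprint(result):
    """Hash a canonical JSON serialization so that two runs can be compared exactly."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Illustrative input data (not from any real study).
records = [{"id": i, "outcome": 10 * i} for i in range(1, 9)]

first = run_analysis(records, seed=12345)
second = run_analysis(records, seed=12345)

# Same code, same data, same seed: the fingerprints should match.
print(fingerprint(first) == fingerprint(second))
```

Passing the seed explicitly, rather than relying on a global random state, is one design choice that makes it easier to document exactly which inputs produced a given output.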
In addition to traditional data sources, such as information gathered during surveys, data can be collected from a variety of alternative sources.
Administrative data is any data collected by national or local government bodies (e.g., ministries and agencies) outside the context of an impact evaluation. Examples include national census data, tax data, and school enrollment data. Administrative data is generally collected not for research purposes but to document or track policy beneficiaries, firm owners, and the general population. Researchers should aim to use administrative data in addition to survey data rather than in place of it.
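As a rough illustration of using administrative data alongside survey data rather than instead of it, the sketch below attaches administrative fields to survey rows by a shared key. All identifiers, field names, and values here (`school_id`, `official_enrollment`, `add_admin_fields`, and so on) are hypothetical. It assumes both sources share a clean common identifier, which in practice often requires careful record linkage.

```python
def add_admin_fields(survey_rows, admin_by_key, key):
    """Left-join style merge: keep every survey row, add admin fields when a match exists."""
    merged = []
    for row in survey_rows:
        combined = dict(row)  # copy so the original survey data is left untouched
        admin = admin_by_key.get(row[key])
        combined["admin_matched"] = admin is not None  # flag unmatched rows for review
        if admin is not None:
            combined.update(admin)
        merged.append(combined)
    return merged


# Hypothetical survey responses collected for the study.
survey = [
    {"household_id": "H1", "school_id": "S10", "reported_enrolled": True},
    {"household_id": "H2", "school_id": "S11", "reported_enrolled": False},
    {"household_id": "H3", "school_id": "S99", "reported_enrolled": True},
]

# Hypothetical administrative records, indexed by school identifier.
admin_enrollment = {
    "S10": {"official_enrollment": 412},
    "S11": {"official_enrollment": 230},
}

combined = add_admin_fields(survey, admin_enrollment, key="school_id")
for row in combined:
    print(row)
```

Keeping every survey row and flagging unmatched ones, instead of dropping them, preserves the survey sample and makes gaps in administrative coverage visible.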
Monitoring data is collected to understand how the assigned treatment is implemented in the field. Typically, survey round data helps us understand changes in the outcome variables over the duration of the project, and monitoring data helps us understand how these changes are related to the treatment.
NOTE: While most of these practices apply to any ODK-based platform, this guide was drafted using SurveyCTO, so some features may not be applicable or available on all ODK-based platforms.
