GitHub is a web-based hosting service for managing code and tracking the changes made to it. It is a useful collaborative tool through all stages of research and fieldwork. This page links to resources on how to get started with GitHub.
Data Management
Due to the long life span of a typical impact evaluation, multiple generations of team members often contribute to the same data work. Clear methods for organizing the data folder, structuring the data sets in the folder, and identifying the observations in the data sets are critical.
This page outlines the steps in a typical research project and lists the data security topics that a research team should consider at each step. If you follow these best practices, not even the full research team will have access to identifying data, but such access is very rarely needed for the analysis.
Impact evaluation projects should follow a clear file naming convention, as many team members will need to understand and interact with the files over the project lifetime. It is very important to use a naming convention that is clear not only to you but also to someone looking at the files years later.
The master do-file is the main do-file that calls upon and runs all the other do-files of a project. It plays a critical role throughout all stages of the research project and functions as a map to the data folder. This page outlines the components of a well-structured and replicable master do-file.
In today's world of research, researchers regularly handle data, send it over the internet, and store it in the cloud. At any point, especially when the internet is involved, the data is exposed to some risk. Keeping data safe and encrypted is hence a key component of IRB requirements and research ethics.
Checking for duplicates ensures that the answers of a survey respondent are not recorded twice. Matching field survey logs with server logs ensures that all collected data has been fully transferred to the server.
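As an illustration only, the two checks described above could be sketched in Python as follows. The field name `respondent_id`, the helper functions, and the example records are all hypothetical and not taken from any particular survey tool; a real project would adapt them to its own ID variable and log format.

```python
from collections import Counter


def find_duplicate_ids(submissions, id_key="respondent_id"):
    """Return the sorted IDs that appear more than once in the submissions."""
    counts = Counter(row[id_key] for row in submissions)
    return sorted(rid for rid, n in counts.items() if n > 1)


def compare_logs(field_log_ids, server_ids):
    """Return (IDs logged in the field but absent on the server,
    IDs on the server but absent from the field log), both sorted."""
    field, server = set(field_log_ids), set(server_ids)
    return sorted(field - server), sorted(server - field)


# Hypothetical example data: records downloaded from the server
# and the list of interviews the field team logged as completed.
server_submissions = [
    {"respondent_id": "H001", "age": 34},
    {"respondent_id": "H002", "age": 51},
    {"respondent_id": "H002", "age": 51},  # submitted twice
    {"respondent_id": "H004", "age": 27},  # not in the field log
]
field_log = ["H001", "H002", "H003"]

server_ids = [row["respondent_id"] for row in server_submissions]
duplicates = find_duplicate_ids(server_submissions)
missing_on_server, not_in_field_log = compare_logs(field_log, server_ids)

print("Duplicate IDs:", duplicates)
print("Logged in field but missing on server:", missing_on_server)
print("On server but missing from field log:", not_in_field_log)
```

Any ID reported by either check would then be followed up with the field team before the data is cleaned or analyzed.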
All research projects collect and use multiple datasets for a given unit of observation.
Data documentation is the process of recording any aspect of project design, sampling, data collection, cleaning and analysis that may affect results.
This article discusses different aspects of data storage, such as types of storage, data backup, and data retention. It is important to make sure you have appropriate data storage solutions before you start receiving data. You should plan your data storage for the full life cycle of a project and not just for your immediate needs. Changing the data storage solution mid-project can be costly and can break the code already written for the project, making earlier research outputs non-reproducible.
