Difference between revisions of "Data Documentation"
Revision as of 11:30, 5 April 2018
Documenting every aspect of the data work that may affect the analysis is a crucial part of working with data. Impact evaluation projects often take years to complete and are carried out by large teams. If the data work is not documented while it is ongoing, many details are likely to be lost, and a considerable amount of time will be spent trying to understand what was previously done. For example, suppose it became clear during field work that some respondents did not understand a test because they had reading difficulties. If the field coordinator did not document this issue, the research assistant will not know to flag these respondents during data cleaning. And if the research assistant does not document why the observations were flagged and what the flag means, they may not be handled correctly during analysis.
There are different ways to document data work. One widespread practice is to send e-mails reporting issues to the team. Though this is easily done, finding answers later in the project is time-consuming, and someone on the team needs to remember that an e-mail was sent. For data cleaning, data analysis and variable construction, it is best practice to document the data work through comments in the code. These comments are very helpful for someone reading the code carefully, but if they are not also recorded elsewhere, it may take a long time to go through all the do-files to find the answer to a specific question. It is usually advisable to keep all data work documentation in one file or folder, though how it is structured and when, how and by whom it is updated will vary from one project to another. One advantage of submitting code for code review and depositing data in the microdata catalog is that in both cases the data work documentation will be reviewed. This does not guarantee that everything that should be documented actually is, as reviewers cannot ask about issues unknown to them.
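The reading-difficulty example can be made concrete with a small sketch of a flag that is documented where it is created. It is written in Python for illustration only; projects that work in do-files would express the same idea in Stata. The respondent IDs, the variable name `flag_reading_difficulty`, the codebook dictionary and the sample records are all hypothetical and do not come from any real project.

```python
# Hypothetical IDs reported by the field coordinator: respondents who did not
# understand the test because of reading difficulties.
READING_DIFFICULTY_IDS = {"R003", "R007"}

# Codebook entry kept next to the code, so the meaning of the flag is
# documented in one place and not only in an e-mail.
FLAG_DOCUMENTATION = {
    "flag_reading_difficulty": (
        "1 if the field coordinator reported that the respondent did not "
        "understand the test because of reading difficulties; 0 otherwise."
    )
}


def flag_reading_difficulty(records, flagged_ids):
    """Return copies of the records with a 0/1 reading-difficulty flag added."""
    return [
        {
            **record,
            "flag_reading_difficulty": int(record["respondent_id"] in flagged_ids),
        }
        for record in records
    ]


# Made-up records, used only to show the function in action.
records = [
    {"respondent_id": "R001", "test_score": 14},
    {"respondent_id": "R003", "test_score": 2},
]
flagged = flag_reading_difficulty(records, READING_DIFFICULTY_IDS)
```

Keeping the codebook entry beside the flag means that whoever runs the analysis can look up why an observation was flagged without searching through old messages.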
Read first
Field Work Documentation
Sampling
- Sample selection
- Replacement criteria
Field work dates
Tracking respondents
- Total number of respondents listed
- Total number of respondents visited
- Refusal rates
- Total number of respondents in final sample
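As a rough illustration, the tracking figures listed above could be collected into a single summary. The Python sketch below assumes the refusal rate is defined as refusals divided by respondents visited; a project may define it differently and should state its definition in the documentation. The function name `tracking_summary` and all counts are made up for this example.

```python
def tracking_summary(listed, visited, refused, final_sample):
    """Collect respondent-tracking counts and a refusal rate in one record.

    Assumption: refusal rate = refused / visited.
    """
    if visited <= 0:
        raise ValueError("visited must be positive to compute a refusal rate")
    return {
        "listed": listed,
        "visited": visited,
        "refused": refused,
        "refusal_rate": refused / visited,
        "final_sample": final_sample,
    }


# Illustrative numbers only, not taken from any survey.
summary = tracking_summary(listed=500, visited=450, refused=45, final_sample=400)
```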
Issues in the field
Report any problems that occurred during the administration of the survey (strikes, inclement weather, inability to enter parts of the country)