Difference between revisions of "DataWork Folder"
Mrijanrimal (talk | contribs) |
Revision as of 20:02, 30 January 2017
Because the data folder is used throughout the impact evaluation project, it is important to set it up correctly. A well-organized folder makes the data work more efficient and reduces sources of error.
Read First
Guidelines
Survey Round
Each round of the survey should have its own sub-folder inside the data folder. For example, inside the main data folder you can have sub-folders such as baseline, follow up 1, follow up 2, midline, and endline.
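As an illustration only, the round sub-folders could be created with a short script. This is a minimal Python sketch, not a prescribed tool: the function name `create_round_folders` and the underscore-style folder names are assumptions made for the example, and the round names are simply the examples listed above.

```python
from pathlib import Path

# Hypothetical round names, based on the examples in the text above.
ROUNDS = ["baseline", "follow_up_1", "follow_up_2", "midline", "endline"]


def create_round_folders(data_folder, rounds=ROUNDS):
    """Create one sub-folder per survey round inside data_folder."""
    root = Path(data_folder)
    created = []
    for name in rounds:
        path = root / name
        # exist_ok=True makes the script safe to re-run on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_round_folders("DataWork")` would create the five round folders under a folder named `DataWork` (a hypothetical name) and leave any existing folders untouched.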
Data folder
The data sub-folder inside a survey round should be further divided into multiple types of data which are explained below:
- Raw Folder
This folder should contain the data sets exactly as you received them, saved as soon as you get them. This includes data downloaded from the internet, data received from data collection, and data received from other projects. The data in this folder should be exactly as you got it, and absolutely no changes should be made to it. Even simple changes, such as correcting obvious mistakes, renaming variables, converting the format from csv to Stata or another format, or renaming files, should not be made to the data in this folder. The only exception is a file name that must be changed before the file can be imported; that file name change can be made in this folder.
- Intermediate Folder - Raw datasets on which simple changes have been made, as mentioned above, should be put in the intermediate folder.
- Final folder - The final data folder contains cleaned datasets and final constructed datasets.
- Dofiles
- Have a master dofile that runs all other dofiles needed for this project. This is also your map to the data folder.
- Organize all other master files in sub-folders.
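The folder types described above can be sketched in code. The following Python example is only an illustration under stated assumptions: the layout `data/raw`, `data/intermediate`, `data/final`, and a `dofiles` folder placed inside each round folder is one possible reading of the guidelines, and the function names are hypothetical. It also shows one way to respect the rule that raw data is never changed: the file is copied to the intermediate folder, and a checksum confirms that the raw copy is still identical afterwards.

```python
import hashlib
import shutil
from pathlib import Path

# Data types named in the guidelines above.
DATA_TYPES = ["raw", "intermediate", "final"]


def create_data_folders(round_folder):
    """Create data/raw, data/intermediate, data/final and dofiles in a round folder."""
    round_path = Path(round_folder)
    for name in DATA_TYPES:
        (round_path / "data" / name).mkdir(parents=True, exist_ok=True)
    # Assumption: dofiles live next to the data folder inside the round folder.
    (round_path / "dofiles").mkdir(parents=True, exist_ok=True)
    return round_path


def file_checksum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def copy_raw_to_intermediate(round_folder, file_name):
    """Copy a raw file to the intermediate folder, leaving the raw file untouched."""
    round_path = Path(round_folder)
    source = round_path / "data" / "raw" / file_name
    target = round_path / "data" / "intermediate" / file_name
    before = file_checksum(source)
    shutil.copy2(source, target)
    # Guard: the raw file must be byte-for-byte identical after the copy.
    if file_checksum(source) != before:
        raise RuntimeError(f"Raw file changed during copy: {source}")
    return target
```

Any corrections or renaming would then be applied to the copy in the intermediate folder, never to the file in the raw folder.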