Data Management

Revision as of 16:17, 25 January 2017 by Kbjarkefur

Because a typical impact evaluation has a long life span, with multiple generations of team members contributing to the same data work, clear methods for organizing the data folder, structuring the data sets in that folder, and identifying the observations in those data sets are critical.


Read First

  • Never work with a data set in which the observations do not have standardized IDs. If you receive a data set without IDs, the first thing you need to do is create them.
  • Always create master data sets for all units of observation relevant to the analysis.
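
As a rough illustration of the first rule above, the sketch below assigns standardized IDs to records that arrived without them and then checks that every record has a unique ID. It is written in Python purely for illustration; the function names, the `HH` prefix, the zero-padded width, and the example `village` field are all assumptions, not conventions prescribed by this article.

```python
def assign_ids(rows, prefix="HH", width=4):
    """Return copies of the rows with a zero-padded 'id' field, e.g. HH0001.

    The prefix and width are hypothetical; pick whatever scheme your
    project standardizes on and apply it once, when the data first arrive.
    """
    return [
        {"id": f"{prefix}{i:0{width}d}", **row}
        for i, row in enumerate(rows, start=1)
    ]


def ids_are_unique_and_complete(rows, id_field="id"):
    """True only if every row has a non-empty ID and no ID is repeated."""
    ids = [row.get(id_field) for row in rows]
    return all(ids) and len(ids) == len(set(ids))


# Hypothetical example: two household records received without IDs.
raw_rows = [{"village": "A"}, {"village": "B"}]
rows_with_ids = assign_ids(raw_rows)
```

Running the check on `raw_rows` fails because no IDs exist yet, while the same check on `rows_with_ids` passes. A check like this is cheap enough to run every time a data set is loaded.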

Guidelines

  • Organize information on the topic into subsections. For each subsection, include a brief description or overview, with links to articles that provide details.

Organization of Project folder

Setting up your folder for the first time

Specific rules for different folders

  • Data folder
    • Raw folder: do not make any edits to files here
    • Final folder
  • Dofiles
    • Have a master dofile that runs all other dofiles needed for this project. This is also your map to the data folder
    • Organize all other master files in subfolders
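
The folder layout listed above could be created in one step when a project is first set up. The sketch below is a minimal Python illustration, not an official setup tool; the exact folder names (`Data/Raw`, `Data/Final`, `Dofiles`) and the function name `set_up_project` are assumptions based on the list above.

```python
from pathlib import Path

# Hypothetical layout derived from the folders named in this section.
SUBFOLDERS = ["Data/Raw", "Data/Final", "Dofiles"]


def set_up_project(root):
    """Create the project subfolders under root and return their paths.

    Existing folders are left untouched, so the function is safe to
    run again on a project that has already been set up.
    """
    root = Path(root)
    created = []
    for sub in SUBFOLDERS:
        path = root / sub
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `set_up_project("my_project")` would create the three folders under `my_project`. Creating them from a script rather than by hand helps keep every project on the same structure, which makes a master dofile easier to read as a map of the folder.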

Naming conventions

  • Use the version control in Box/Dropbox instead of naming folders _v01, _v02, etc.

Master data sets

Back to Parent

This article is part of the topic *topic name, as listed on main page*


Additional Resources

  • list here other articles related to this topic, with a brief description and link