Difference between revisions of "Checklist: Data Cleaning"

 
(6 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Get printable version by clicking on ''printable version'' in the menu to the left. The latest version of this checklist can be found at https://dimewiki.worldbank.org/wiki/Checklist:_Data_Cleaning.  
Get printable version [https://dimewiki.worldbank.org/index.php?title=Checklist:_Data_Cleaning&printable=yes here]. For more detailed instructions on how to implement the different tasks in this checklist, see [[Data Cleaning]]. Note that this checklist is best displayed in Chrome, Firefox, Safari or any other modern browser.
 
For more detailed instructions on how to implement the different tasks in this checklist, see [[Data Cleaning]].


<div id="chk_datacleaning"></div>
Line 7: Line 5:
==Back to Parent ==  
This article is part of the topic [[Check Lists]]
 
== Additional Resources ==
*DIME Analytics’ guidelines on data cleaning [https://github.com/worldbank/DIME-Resources/blob/master/stata1-3-cleaning.pdf 1] and [https://github.com/worldbank/DIME-Resources/blob/master/welcome-datacleaning.pdf 2]
* The Stata Cheat Sheets on [http://geocenter.github.io/StataTraining/pdf/StataCheatsheet_processing_15_June_2016_TE-REV.pdf Data processing] and [http://geocenter.github.io/StataTraining/pdf/StataCheatsheet_Transformation15_June_2016_TE-REV.pdf Data Transformation] are helpful reminders of relevant Stata code.
* The [https://github.com/Quartz/bad-data-guide#values-are-missing Quartz guide to bad data] on GitHub has many helpful tips for dealing with the kinds of data problems that often come up in real-world settings.
* See this [[Checklist:_Data_Cleaning| data cleaning checklist]] to ensure that common cleaning actions have been completed. Note that this is not an exhaustive list. An exhaustive list is impossible to create, because each dataset and each analysis requires different cleaning depending on context.


[[Category: Data Cleaning]] [[Category: Check Lists]]

Latest revision as of 14:26, 12 June 2019

Get printable version here. For more detailed instructions on how to implement the different tasks in this checklist, see Data Cleaning. Note that this checklist is best displayed in Chrome, Firefox, Safari or any other modern browser.

Back to Parent

This article is part of the topic Check Lists

Additional Resources

  • DIME Analytics’ guidelines on data cleaning 1 and 2
  • The Stata Cheat Sheets on Data processing and Data Transformation are helpful reminders of relevant Stata code.
  • The Quartz guide to bad data on GitHub has many helpful tips for dealing with the kinds of data problems that often come up in real-world settings.
  • See this data cleaning checklist to ensure that common cleaning actions have been completed. Note that this is not an exhaustive list. An exhaustive list is impossible to create, because each dataset and each analysis requires different cleaning depending on context.