Data cleaning is an essential step between data collection and data analysis. Raw primary data is always imperfect and must be prepared before it can support high-quality analysis and replicable results.
Data Cleaning
De-identification is the process of removing or masking personally identifiable information (PII) in order to reduce the risk that subjects’ identities can be connected with the data.
In the context of a survey, personally identifiable information (PII) refers to variables that can, either on their own or in combination with other variables, be used to identify a single surveyed individual with reasonable certainty.
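As a rough illustration of removing and masking PII, the sketch below drops direct identifiers from a small survey dataset and replaces the respondent ID with a salted hash. The column names (`name`, `phone`, `respondent_id`), the salt value, and the `deidentify` helper are all hypothetical and chosen only for this example. The sketch covers direct identifiers only. Variables that identify a respondent in combination with others still need separate review.

```python
import hashlib

# Hypothetical column names, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "phone"}


def deidentify(rows, id_column="respondent_id", salt="replace-with-secret"):
    """Drop direct identifiers and replace the ID with a salted hash."""
    cleaned = []
    for row in rows:
        # Removing: keep only columns that are not direct identifiers.
        out = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        # Masking: hash the ID so records stay linkable without exposing it.
        raw = f"{salt}:{row[id_column]}".encode("utf-8")
        out[id_column] = hashlib.sha256(raw).hexdigest()[:12]
        cleaned.append(out)
    return cleaned


# Made-up records used only to exercise the function.
survey = [
    {"respondent_id": 101, "name": "A. Example", "phone": "555-0100", "age": 34},
    {"respondent_id": 102, "name": "B. Example", "phone": "555-0101", "age": 51},
]
public = deidentify(survey)
```

In practice the salt would be stored securely and apart from the shared data, since anyone who holds it could recompute the hashes from a list of known IDs.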
Primary data collection and cleaning involve highly repetitive but extremely important processes that contribute to high-quality, reproducible research.
For more detailed instructions on how to implement the different tasks in this checklist, see Data Cleaning.
For more detailed instructions on Sections 1 and 2 of this checklist, see Data Cleaning and De-identification.
