Tidying Data
Data arrives in many shapes and sizes, but most often in the form of data tables. A table can organize the same information in different ways, and not every layout produces a dataset that is easy to work with. Fortunately, numerous papers on database management have identified a format that makes data much easier to work with. In the context of statistics, this format is called tidy data. Specifically, a tabular dataset is tidy when each column corresponds to one variable, each row corresponds to one observation, and all variables in the dataset share the same unit of observation.
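To make the definition concrete, here is a minimal sketch in plain Python. The table, its column names (`country`, `year`, `cases`), and the helper `tidy_rows` are all hypothetical and chosen only for illustration. The untidy version stores values of the variable "year" as column headers; the tidy version gives every variable its own column and every observation its own row.

```python
# Hypothetical untidy table: one row per country, with the yearly
# counts spread across columns whose names are really values of "year".
untidy = [
    {"country": "A", "2019": 10, "2020": 11},
    {"country": "B", "2019": 20, "2020": 21},
]


def tidy_rows(rows, id_key, var_name, value_name):
    """Return one dict per (id, variable) pair, i.e. one row per observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier is copied onto every output row
            tidy.append({
                id_key: row[id_key],
                var_name: int(column),  # assumes the spread columns are integer years
                value_name: value,
            })
    return tidy


tidy = tidy_rows(untidy, id_key="country", var_name="year", value_name="cases")
for record in tidy:
    print(record)
```

Each resulting record describes a single (country, year) observation, and the three keys correspond to the three variables. Data-frame libraries typically provide this reshaping directly (for example, `DataFrame.melt` in pandas); the loop above only spells out what such a function does.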