Difference between revisions of "Data Analysis"
Revision as of 19:37, 6 November 2017
Read First
Data analysis typically has two stages:
- Exploratory Analysis
- Final Analysis
In exploratory analysis, the emphasis is on producing easily understood summaries of the trends in the data, so that the reports, publications, and presentations that need to be produced can begin to be outlined. Once those stories begin to come together, the code is rewritten in a "final" form that is appropriate for public release with the results.
Preparing the Data Set for Analysis
Once data is collected, it must be recombined into a final format for analysis, including the construction of derived variables not present in the initial collection. See Data Cleaning.
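As a minimal sketch of what constructing derived variables can look like, the Python snippet below combines two hypothetical raw tables and adds variables that were not collected directly. All field names (hh_id, income, members, plot_area) and values are assumptions made up for illustration; they do not come from any real data set.

```python
# Hypothetical raw data, as it might look after collection and cleaning.
households = [
    {"hh_id": 1, "income": 1200.0, "members": 4},
    {"hh_id": 2, "income": 900.0, "members": 3},
]
plots = [
    {"hh_id": 1, "plot_area": 0.5},
    {"hh_id": 1, "plot_area": 1.5},
    {"hh_id": 2, "plot_area": 1.0},
]

def build_analysis_data(households, plots):
    """Recombine the raw tables into one record per household."""
    # Aggregate plot-level data up to the household level.
    total_area = {}
    for plot in plots:
        total_area[plot["hh_id"]] = total_area.get(plot["hh_id"], 0.0) + plot["plot_area"]

    analysis_data = []
    for hh in households:
        record = dict(hh)  # copy, so the raw data stays untouched
        # Derived variables: not present in the initial collection.
        record["total_plot_area"] = total_area.get(hh["hh_id"], 0.0)
        record["income_per_capita"] = hh["income"] / hh["members"]
        analysis_data.append(record)
    return analysis_data

analysis_data = build_analysis_data(households, plots)
```

Keeping this step in code, rather than editing the data by hand, means the final analysis data set can be rebuilt from the raw data at any time.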
Outputting the Result of the Analysis
Just like the rest of your code, the output of results must be replicable. There are different degrees of replicability. The minimum requirement is that every part of the results used in a table is replicable.
Even better is for all parts of the same table to be output to a single file. Tables sometimes consist of results from multiple estimations, and it is preferable that these are output to a single file. See the Stata command estout.
Optimally, all tables are output in a way that requires no manual formatting. A very common tool for that is LaTeX. DIME has prepared material for getting started with LaTeX that assumes no prior knowledge of LaTeX and aims to explain the workflow from software such as Stata and R to final reports using LaTeX. [[1]]
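The idea of putting several estimations into one table without manual formatting can be sketched in Python as follows. This is an illustration only, not the estout workflow or DIME's material: the function name build_latex_table, the column labels, and the coefficient values are all hypothetical, and the sketch assumes variable names contain no characters that need escaping in LaTeX.

```python
def build_latex_table(estimations, variables):
    """Combine results from several estimations into one LaTeX tabular.

    estimations: list of (column_label, {variable: coefficient}) pairs.
    variables:   row order for the table.
    """
    lines = [
        "\\begin{tabular}{l" + "c" * len(estimations) + "}",
        "\\hline",
        " & ".join([""] + [label for label, _ in estimations]) + " \\\\",
        "\\hline",
    ]
    for var in variables:
        cells = [var]
        for _, coefs in estimations:
            # Leave the cell empty if the variable is not in this estimation.
            cells.append(f"{coefs[var]:.3f}" if var in coefs else "")
        lines.append(" & ".join(cells) + " \\\\")
    lines += ["\\hline", "\\end{tabular}"]
    return "\n".join(lines)

# Hypothetical coefficients from two estimations.
estimations = [
    ("(1)", {"treatment": 0.25}),
    ("(2)", {"treatment": 0.2, "age": 0.013}),
]
table_tex = build_latex_table(estimations, ["treatment", "age"])
```

Writing table_tex to a .tex file and including that file in the report with \input means the table is regenerated whenever the analysis is re-run.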
Different Specific Types of Analysis
Principal Component Analysis
Principal Component Analysis (PCA) is an analytical tool that seeks to explain the maximum amount of variance in the data with the fewest principal components.
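To make "variance explained" concrete, here is a minimal Python sketch of PCA for the special case of two variables, where the eigenvalues of the 2x2 covariance matrix have a closed form. The data values are made up, and the sketch works on the raw covariance matrix; in practice variables are often standardized first, and a statistical package would be used for more than two variables.

```python
import math

# Hypothetical data: two strongly correlated variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

var_x = covariance(x, x)
var_y = covariance(y, y)
cov_xy = covariance(x, y)

# Eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]]: each one is the
# variance captured by one principal component.
half_trace = (var_x + var_y) / 2
spread = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
lambda_1 = half_trace + spread  # variance along the first component
lambda_2 = half_trace - spread  # variance along the second component

# Share of total variance explained by the first principal component.
share_first = lambda_1 / (lambda_1 + lambda_2)
```

Because the two variables move together almost perfectly, share_first is close to 1: a single component summarizes nearly all of the variation.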
Cost Effectiveness Analysis
One type of analysis is cost-effectiveness analysis.