Difference between revisions of "Sampling & Power Calculations"

Line 1: Line 1:
<span style="font-size:150%">
<span style="color:#ff0000"> '''NOTE: this article is only a template. Please add content!''' </span>
</span>
add introductory 1-2 sentences here
== Read First ==
* include here key points you want to make sure all readers understand
== Guidelines ==
* organize information on the topic into subsections. for each subsection, include a brief description / overview, with links to articles that provide details
===Subsection 1===
===Subsection 2===
===Subsection 3===
== Back to Parent ==
This article is part of the topic [[*topic name, as listed on main page*]]
== Additional Resources ==
* list here other articles related to this topic, with a brief description and link
[[Category: *category name* ]]
== Read First ==
Creating a statistically valid sample that is representative of the population of interest is a crucial aspect of impact evaluation design. This task can be roughly divided into two phases: sample design and implementation. Implementation typically means writing a software program to enact the sampling strategy. Sampling code requires extra care! Errors cannot be corrected after the intervention (or survey) has started. Always ask a second person to double-check your code before you use the sample it generated in the field. For DIME projects, you should always consult a member of DIME Analytics before sending a sample to the field.
Line 8: Line 35:
=== Sample Design ===
==== Power Calculations ====
[[Power Calculations]] are a statistical tool to help determine sample size. You can estimate either sample size or minimum detectable effect. Which you should estimate depends on the research design and constraints of a specific impact evaluation. The types of questions you can answer through power calculations include:
[[Power Calculations]] are a statistical tool to help determine sample size. This is important: a sample that is too small may leave you unable to detect a statistically significant effect, while a sample that is too large can waste limited resources.
You can estimate either sample size or minimum detectable effect. Which you should estimate depends on the research design and constraints of a specific impact evaluation. The types of questions you can answer through power calculations include:


* Given that I want to be able to statistically distinguish program impact of a 10% change in my outcome of interest, what is the minimum sample size needed?
Line 26: Line 54:


==== Sampling Unit ====
The most basic sampling technique is a [[Simple Random Sample]]. Often, impact evaluations use [[Stratified Random Sample]] and/or [[Multi-Stage (Clustered) Sample]].
The most basic sampling technique is a [[Simple Random Sample]]. This works well for studies of small populations, with a complete sampling frame for the population. More typically, impact evaluations rely on [[Multi-stage (Cluster) Sampling|Multi-Stage]] or [[Multi-stage (Cluster) Sampling|Clustered]] Sampling, often with [[Stratified Random Sample|stratification]].


==== Randomization in Stata ====
 
All sampling code you produce must be reproducible. Any code that includes randomization needs version, seed and sort to be reproducible. See [[Randomization in Stata|reproducible randomization in Stata]] for details.
All code you produce should be reproducible. Any code that includes randomization needs version, seed and sort to be reproducible. See [[Randomization in Stata|reproducible randomization in Stata]] for details.


== Additional Resources ==


* [http://unstats.un.org/unsd/demographic/sources/surveys/Series_F98en.pdf Designing Household Survey Samples: Practical Guidelines] United Nations, Department of Economic and Social Affairs, Statistics Division - 2008

Revision as of 17:39, 6 February 2017

Read First

Creating a statistically valid sample that is representative of the population of interest is a crucial aspect of impact evaluation design. This task can be roughly divided into two phases: sample design and implementation. Implementation typically means writing a software program to enact the sampling strategy. Sampling code requires extra care! Errors cannot be corrected after the intervention (or survey) has started. Always ask a second person to double-check your code before you use the sample it generated in the field. For DIME projects, you should always consult a member of DIME Analytics before sending a sample to the field.

Do not randomize the sample from a temporary data set or a data set constructed only for this purpose. Instead, always randomize from a master data set. If no master data set exists for the unit of observation you are sampling on, it is very important that you start by creating one.

Guidelines

Sample Design

Power Calculations

Power Calculations are a statistical tool to help determine sample size. This is important: a sample that is too small may leave you unable to detect a statistically significant effect, while a sample that is too large can waste limited resources. You can estimate either sample size or minimum detectable effect. Which you should estimate depends on the research design and constraints of a specific impact evaluation. The types of questions you can answer through power calculations include:

  • Given that I want to be able to statistically distinguish program impact of a 10% change in my outcome of interest, what is the minimum sample size needed?
  • Given that I only have budget to sample 1,000 households, what is the minimum effect size that I will be able to distinguish from a null effect?
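The two questions above can be sketched with the textbook normal-approximation formula for comparing two means. This is only an illustration under strong assumptions that are not part of this article: individual-level randomization into two equal arms, a known outcome standard deviation, and no clustering or attrition. The function names and the example values (outcome mean 100, standard deviation 50) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(mde, sd, alpha=0.05, power=0.80):
    """Units needed in EACH of two equal arms to detect a difference of `mde`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / mde) ** 2)


def minimum_detectable_effect(n_per_arm, sd, alpha=0.05, power=0.80):
    """Smallest difference in means detectable with `n_per_arm` units per arm."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_arm)


# Hypothetical outcome: mean 100, standard deviation 50.
# Question 1: a 10% change is a difference of 10 units.
print(sample_size_per_arm(mde=10, sd=50))  # 393 per arm

# Question 2: 1,000 households split into two arms of 500.
print(round(minimum_detectable_effect(n_per_arm=500, sd=50), 2))  # about 8.86
```

Clustered designs need larger samples than this sketch suggests, because outcomes within a cluster are correlated.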


Population (Sampling Frame)

What is the population of interest for the impact evaluation? In other words, what population does your sample need to represent? This will vary depending on the study design. Some data on the overall population is required, in order to draw a representative sample.

Stratification

To ensure a representative sample you can use stratification. A typical variable to stratify on is gender. When you stratify on gender and sample the same fraction from each stratum, you guarantee that your sample has the same proportion of women as the sampling frame you are drawing from.
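A minimal sketch of stratified sampling, assuming the same sampling fraction is drawn within every stratum (proportional allocation). The frame, the field names "id" and "gender", and the seed value are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(frame, stratum_key, fraction, seed):
    """Draw the same fraction of units from every stratum of `frame`."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    # Sort by ID first so the draw does not depend on the input order.
    for unit in sorted(frame, key=lambda u: u["id"]):
        strata[unit[stratum_key]].append(unit)
    sample = []
    for stratum in sorted(strata):
        units = strata[stratum]
        n_draw = round(len(units) * fraction)
        sample.extend(rng.sample(units, n_draw))
    return sample


# Hypothetical frame: 600 women and 400 men.
frame = [{"id": i, "gender": "female" if i < 600 else "male"} for i in range(1000)]
sample = stratified_sample(frame, "gender", fraction=0.10, seed=20170206)

# The sample keeps the 60/40 split of the frame.
print(Counter(unit["gender"] for unit in sample))
```

A simple random sample of 100 units from the same frame would match the 60/40 split only on average, not in every draw.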


Sample Selection

Sampling Frame

You should always work from a master data set of the population. If you do not have a master data set for the unit of observation you are sampling from (for example, households, villages, clinics, schools) you should always start by creating one.

Sampling Unit

The most basic sampling technique is a Simple Random Sample. This works well for studies of small populations, with a complete sampling frame for the population. More typically, impact evaluations rely on Multi-Stage or Clustered Sampling, often with stratification.
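As a rough illustration of a two-stage design, the sketch below first draws villages (the clusters) and then draws households within each selected village. The village and household IDs, the cluster sizes, and the seed are hypothetical, and clusters are drawn with equal probability, which is only one of several possible first-stage rules.

```python
import random


def two_stage_cluster_sample(households_by_village, n_villages, n_per_village, seed):
    """Stage 1: draw villages. Stage 2: draw households within each drawn village."""
    rng = random.Random(seed)
    # Sort keys and members so the result depends only on the seed and the frame.
    villages = rng.sample(sorted(households_by_village), n_villages)
    sample = {}
    for village in villages:
        households = sorted(households_by_village[village])
        n_draw = min(n_per_village, len(households))
        sample[village] = rng.sample(households, n_draw)
    return sample


# Hypothetical frame: 20 villages with 30 households each.
frame = {
    f"village_{v:02d}": [f"hh_{v:02d}_{h:02d}" for h in range(30)]
    for v in range(20)
}
sample = two_stage_cluster_sample(frame, n_villages=5, n_per_village=10, seed=20170206)

for village, households in sample.items():
    print(village, len(households))
```

Only a list of villages is needed up front; household listings are required just for the villages that are selected, which is one practical reason clustered designs are common.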

Randomization in Stata

All sampling code you produce must be reproducible. Any code that includes randomization needs three things to be reproducible: a fixed software version, a fixed seed, and a stable sort order. See reproducible randomization in Stata for details.
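The same three ingredients apply in any language. The sketch below is a Python analogue, not the Stata procedure described on the linked page: the seed is fixed, the IDs are sorted before drawing, and the interpreter version should be recorded alongside the code because random number routines can change between releases. The IDs and seed value are hypothetical.

```python
import random
import sys


def draw_sample(ids, n, seed):
    """Draw n IDs reproducibly: same frame and seed give the same sample."""
    ordered = sorted(ids)      # sort: fix the order of the frame
    rng = random.Random(seed)  # seed: fix the random number stream
    return sorted(rng.sample(ordered, n))


ids = [f"hh_{i:04d}" for i in range(200)]
reordered = ids[::-1]  # same frame, different row order

first = draw_sample(ids, 20, seed=20170206)
second = draw_sample(reordered, 20, seed=20170206)

# version: record the interpreter version with the sample
print(sys.version_info[:3])
print(first == second)  # True: the row order of the input does not matter
```

Without the sort step, the reordered frame would generally yield a different sample even with the same seed.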

Additional Resources