- 1 Read First
- 2 Guidelines
- 2.1 Why is sample size important?
- 2.2 What factors influence what sample size I need?
- 2.3 Additional considerations for clustered sampling
- 2.4 Indirect Effects on Sample Size
- 3 Back to Parent
- 4 Additional Resources
The size of the sample determines whether the impact of the studied program or intervention can be statistically distinguished from a null effect.
Why is sample size important?
It is rarely cost-effective to collect data from the full population of interest. Rather, a sample is used. The size of the sample will affect the precision of your estimates. It is important to think about the trade-off between precision and cost, i.e. the marginal value of additional observations.
What factors influence what sample size I need?
Each element of the sample size formula is described below, along with the expected direction of its relationship to sample size.
Expected size of impact
D: Minimum Detectable Effect Size (MDES)
The smallest effect size you want to be able to reliably distinguish from zero. If you set the MDES at 10%, a 7% increase in income would not necessarily be distinguishable from a null effect. The appropriate MDES depends on the expected impact of the program. For example, if a program is expected to raise incomes by a minimum of 10%, it may not be necessary to be able to distinguish program impacts of less than 10% from a null effect.
The smaller the effect we want to be able to distinguish, the larger the sample size required.
Variation in outcome
σ: standard deviation in population outcome measure
The higher the level of variance in the outcome, the larger the sample size required, as the image below illustrates.
Statistical confidence / precision
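α relates to “type I error” (a false positive): the significance level, typically set to 5%
β relates to “type II error” (a false negative): typically set to 20%, which corresponds to statistical power (1 − β) of 80%
The more precision required (lower α, higher power), the larger the sample size required.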
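The relationships described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not necessarily the exact formula this article refers to: it assumes two equal-sized arms, a two-sided test, a comparison of means, and a normal approximation. The function name `sample_size_per_arm` and its parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(mdes, sigma, alpha=0.05, power=0.80):
    """Approximate observations needed per arm under simple random sampling.

    Assumed formula (equal arms, two-sided test, normal approximation):
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / mdes^2

    mdes  : minimum detectable effect size (D), in the outcome's units
    sigma : standard deviation of the outcome in the population
    alpha : type I error rate (significance level)
    power : 1 - beta, where beta is the type II error rate
    """
    if mdes <= 0 or sigma <= 0:
        raise ValueError("mdes and sigma must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / mdes ** 2)


if __name__ == "__main__":
    # Smaller MDES, larger sigma, lower alpha, or higher power all raise n.
    for d in (0.4, 0.2, 0.1):
        print(f"MDES = {d}: n per arm = {sample_size_per_arm(d, sigma=1.0)}")
```

Under these assumptions, an MDES of 0.2 standard deviations with α = 5% and 80% power requires roughly 393 observations per arm, and halving the MDES roughly quadruples that number.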
Additional considerations for clustered sampling
Level of clustering
ρ: intracluster correlation coefficient
m: number of units per cluster
Taken together, the second part of this formula (which distinguishes it from the simple random sampling (SRS) formula above) represents what is referred to as the Design Effect.
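As a rough illustration of how clustering inflates the required sample, the sketch below uses the standard design effect for equal-sized clusters, 1 + (m − 1)ρ. This form is an assumption, since the formula itself is not reproduced in the text here, and the function names are hypothetical.

```python
from math import ceil


def design_effect(rho, m):
    """Design effect for equal-sized clusters: 1 + (m - 1) * rho.

    rho : intracluster correlation coefficient, between 0 and 1
    m   : number of units sampled per cluster
    """
    if not 0 <= rho <= 1:
        raise ValueError("rho must be between 0 and 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1 + (m - 1) * rho


def clustered_sample_size(n_srs, rho, m):
    """Scale a simple-random-sampling sample size by the design effect."""
    return ceil(n_srs * design_effect(rho, m))


if __name__ == "__main__":
    # With rho = 0 (or one unit per cluster) clustering costs nothing;
    # the penalty grows with both rho and cluster size.
    for rho in (0.0, 0.05, 0.2):
        print(f"rho = {rho}: design effect = {design_effect(rho, m=21):.2f}")
```

The design choice this highlights is that adding units within a cluster is less valuable than adding clusters: for a fixed total sample, a larger m raises the design effect.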
Indirect Effects on Sample Size
The MDE is “diluted” by the proportion of compliers.
If program take-up is 50%, the observed effect in the treatment group will be half the size it would be with 100% take-up.
Because required sample size is inversely proportional to the square of the MDE, if the MDE is half the size, n quadruples.
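The arithmetic behind this dilution can be sketched as follows, assuming the program has no effect on units that do not take it up and that n is inversely proportional to the squared effect size. The function names are hypothetical.

```python
def diluted_effect(effect_on_compliers, take_up):
    """Average effect observed across the whole treatment group.

    Assumes non-compliers experience no effect from the program.
    """
    if not 0 < take_up <= 1:
        raise ValueError("take_up must be in (0, 1]")
    return effect_on_compliers * take_up


def sample_size_multiplier(take_up):
    """Factor by which n grows relative to 100% take-up: 1 / take_up^2."""
    if not 0 < take_up <= 1:
        raise ValueError("take_up must be in (0, 1]")
    return 1 / take_up ** 2


if __name__ == "__main__":
    for take_up in (1.0, 0.75, 0.5, 0.25):
        print(f"take-up {take_up:.0%}: n multiplied by "
              f"{sample_size_multiplier(take_up):.2f}")
```

With 50% take-up the multiplier is 4, matching the quadrupling described above.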
Poor data quality effectively increases required sample size
- Missing observations
- High measurement error
The best way to avoid this is to have a field coordinator on the ground monitoring data collection.
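A simple way to budget for these data quality problems is sketched below. It rests on two assumptions that are not from this article: missing observations are lost at random at a known rate, and measurement error is classical (independent noise added to the outcome), so it raises the outcome variance and therefore n proportionally. The function and parameter names are hypothetical.

```python
from math import ceil


def adjust_for_data_quality(n_required, missing_rate=0.0, noise_to_signal=0.0):
    """Inflate a required sample size for missing data and noisy outcomes.

    n_required      : sample size needed with complete, error-free data
    missing_rate    : expected share of sampled units with no usable data
    noise_to_signal : measurement-error variance divided by the true
                      outcome variance (0 means no measurement error)
    """
    if not 0 <= missing_rate < 1:
        raise ValueError("missing_rate must be in [0, 1)")
    if noise_to_signal < 0:
        raise ValueError("noise_to_signal must be non-negative")
    # Classical measurement error inflates outcome variance, and n scales
    # linearly with variance.
    n_with_noise = n_required * (1 + noise_to_signal)
    # Sample enough units that the expected number of usable ones is enough.
    return ceil(n_with_noise / (1 - missing_rate))


if __name__ == "__main__":
    print(adjust_for_data_quality(400, missing_rate=0.5))
    print(adjust_for_data_quality(400, noise_to_signal=0.25))
```

Non-random missingness can also bias estimates, which a larger sample does not fix; this sketch only addresses the loss of precision.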
Back to Parent
This article is part of the topic Sampling & Power Calculations