Selection bias occurs when participants in a program (the treatment group) differ systematically from non-participants (the control group) in ways that also affect the outcome of interest. It threatens the validity of a program evaluation whenever the treatment and control groups are selected non-randomly.
- Selection bias means that treatment and control groups are not comparable, and therefore the impact evaluation is not internally valid.
- The most reliable way to avoid selection bias in treatment assignment is to run a randomized controlled trial.
Selection bias can be either positive or negative. For example, suppose an evaluation of an after-school program for at-risk youth compares those who volunteered for the program to those who did not. The volunteers are likely to be more motivated and eager than the non-volunteers, which may make the program appear more effective than it is. On the other hand, comparing outcomes for participants in the after-school program to those of 'average' students may understate the effect of the program, because at-risk youth likely perform worse on average regardless of the program.
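There is generally no reliable way to estimate the size of selection bias from the evaluation data alone, because it depends on differences between the groups that are not observed.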
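The two directions of bias in the after-school example can be illustrated with a small simulation. This is only a sketch under invented assumptions: the score scale, the `TRUE_EFFECT` value, the `motivation` variable, and both function names are hypothetical and chosen for illustration. The true effect is known here only because the data are simulated; with real evaluation data it is not.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed program effect on a test score (hypothetical value)

def volunteer_comparison(n=20000, seed=0):
    """Naive estimate: volunteers vs. non-volunteers among at-risk youth."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)  # not observed by the evaluator
        baseline = 50 + 5 * motivation + rng.gauss(0, 1)
        # More motivated youth are more likely to volunteer.
        if motivation + rng.gauss(0, 1) > 0:
            treated.append(baseline + TRUE_EFFECT)
        else:
            untreated.append(baseline)
    return statistics.mean(treated) - statistics.mean(untreated)

def average_student_comparison(n=20000, seed=1):
    """Naive estimate: at-risk participants vs. 'average' students."""
    rng = random.Random(seed)
    # At-risk youth start from a lower assumed baseline (50) than average students (60).
    participants = [rng.gauss(50, 5) + TRUE_EFFECT for _ in range(n)]
    average_students = [rng.gauss(60, 5) for _ in range(n)]
    return statistics.mean(participants) - statistics.mean(average_students)

upward = volunteer_comparison()
downward = average_student_comparison()
print(f"true effect:                 {TRUE_EFFECT:.2f}")
print(f"volunteers vs. non-volunteers: {upward:.2f}  (overstates the effect)")
print(f"participants vs. average:      {downward:.2f}  (understates the effect)")
```

In the first comparison the gap in motivation is attributed to the program; in the second, the lower baseline of at-risk youth hides the program's effect.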
How to avoid selection bias?
The best way to avoid selection bias is to use randomization. Randomly assigning beneficiaries to treatment and control groups, for example, ensures that the two groups are comparable on average in both observable and unobservable characteristics.
It is important both to randomize treatment assignment and to draw a random sample of survey respondents. For example, in an evaluation of the impact of tablets in the classroom, treatment is randomly assigned at the classroom level, and a random sample of students is then drawn from each classroom for the survey.
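The two stages in the tablet example can be sketched as follows. The function name `assign_and_sample`, the classroom and student identifiers, the half-and-half split, and the sample size of five students per classroom are all assumptions made for illustration, not a prescribed design.

```python
import random

def assign_and_sample(classrooms, students_per_class=5, seed=42):
    """classrooms: dict mapping a classroom id to a list of student ids.

    Stage 1: randomly assign half of the classrooms to treatment.
    Stage 2: draw a random sample of students within every classroom.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible

    ids = sorted(classrooms)   # sort first so the result does not depend on input order
    rng.shuffle(ids)
    treated = set(ids[: len(ids) // 2])
    assignment = {c: ("treatment" if c in treated else "control") for c in ids}

    survey_sample = {
        c: rng.sample(classrooms[c], min(students_per_class, len(classrooms[c])))
        for c in sorted(classrooms)
    }
    return assignment, survey_sample

# Hypothetical data: 10 classrooms with 30 students each.
classrooms = {f"class_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(10)}
assignment, survey_sample = assign_and_sample(classrooms)
print(assignment)
print(survey_sample["class_0"])
```

Fixing the seed and sorting the identifiers before shuffling makes the assignment reproducible, which helps when the randomization needs to be documented or audited.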
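Non-randomized evaluations attempt to reduce selection bias by making the control group as comparable as possible to the treatment group, typically by matching on observable characteristics. The more data available for matching, the more convincing the comparison, although matching cannot account for differences in unobserved characteristics.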
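As a rough illustration of matching on observables, the sketch below pairs each participant with the non-participant whose baseline score is closest (one-to-one nearest-neighbor matching with replacement, one of several possible matching methods). Everything here is an assumption for illustration: the simulated data, the `TRUE_EFFECT` value, and the function names. Matching recovers the effect in this simulation only because selection depends on the observed baseline score by construction; it would not remove bias from unobserved differences.

```python
import random
import statistics

TRUE_EFFECT = 3.0  # assumed program effect, known only because the data are simulated

def simulate(n=1000, seed=7):
    """Return (treated, controls) as lists of (baseline_score, outcome) pairs."""
    rng = random.Random(seed)
    treated, controls = [], []
    for _ in range(n):
        baseline = rng.gauss(50, 10)  # observed characteristic
        # Selection depends on the observed baseline plus pure noise.
        participates = baseline + rng.gauss(0, 10) > 50
        outcome = 0.8 * baseline + rng.gauss(0, 2)
        if participates:
            treated.append((baseline, outcome + TRUE_EFFECT))
        else:
            controls.append((baseline, outcome))
    return treated, controls

def naive_estimate(treated, controls):
    """Simple difference in mean outcomes, ignoring the baseline."""
    return (statistics.mean([y for _, y in treated])
            - statistics.mean([y for _, y in controls]))

def matched_estimate(treated, controls):
    """Mean outcome gap between each treated unit and its nearest control."""
    diffs = []
    for x_t, y_t in treated:
        _, y_c = min(controls, key=lambda c: abs(c[0] - x_t))
        diffs.append(y_t - y_c)
    return statistics.mean(diffs)

treated, controls = simulate()
naive = naive_estimate(treated, controls)
matched = matched_estimate(treated, controls)
print(f"true effect: {TRUE_EFFECT:.2f}, naive: {naive:.2f}, matched: {matched:.2f}")
```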
Selection bias in sampling
Selection bias can be a problem even in randomized controlled trials. For example:
- High levels of attrition between survey rounds: the respondents to the follow-up survey may be systematically different from the baseline sample. For example, if wealthier households are more likely to migrate, the sample at follow-up will be systematically poorer.
- High item non-response: missing data can raise concerns about selection bias for a particular question. For example, if half the sample answers 'don't know' to a question on income, those respondents will be excluded from the analysis. If people with lower levels of numeracy or less regular income are less likely to know their income, the remaining responses will be biased.
- Survey mode: for example, phone surveys limit the set of respondents to those who have access to a mobile phone. If part of the population of interest does not have access, this could bias the results, because respondents of higher socioeconomic status are more likely to have phones and impacts may differ by socioeconomic status.
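One simple diagnostic for the attrition case above is to compare a baseline characteristic between households that were re-interviewed and those that were lost. The sketch below assumes a hypothetical data layout (a list of baseline rows with an `id` and a `wealth` index) and a hypothetical function name, `attrition_summary`; it describes the problem but does not correct for it.

```python
import statistics

def attrition_summary(baseline, followup_ids, variable):
    """Compare a baseline variable between respondents kept and lost at follow-up.

    baseline: list of dict rows, each with an "id" key and the chosen variable.
    followup_ids: set of ids that were successfully re-interviewed.
    """
    stayed = [row[variable] for row in baseline if row["id"] in followup_ids]
    lost = [row[variable] for row in baseline if row["id"] not in followup_ids]
    return {
        "attrition_rate": len(lost) / len(baseline),
        "mean_stayed": statistics.mean(stayed) if stayed else None,
        "mean_lost": statistics.mean(lost) if lost else None,
    }

# Hypothetical baseline data: the two wealthiest households migrate before follow-up.
baseline = [
    {"id": 1, "wealth": 1.0},
    {"id": 2, "wealth": 2.0},
    {"id": 3, "wealth": 3.0},
    {"id": 4, "wealth": 4.0},
    {"id": 5, "wealth": 8.0},
    {"id": 6, "wealth": 9.0},
]
summary = attrition_summary(baseline, followup_ids={1, 2, 3, 4}, variable="wealth")
print(summary)  # mean_stayed is 2.5 and mean_lost is 8.5 for this toy data
```

A large gap between the two means, as in this toy data, suggests that the follow-up sample no longer represents the baseline sample on that characteristic.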