Randomized Evaluations: Principles of Study Design


Randomized evaluations are field experiments in which subjects are randomly assigned to one of two groups: the treatment group, which receives the policy intervention being evaluated, and the control group, which remains in the status quo (untreated).

For the purpose of this article, we will rely on the example of a study designed to evaluate the effects of cash transfers on farm outputs. In this setup, members of the treatment group will receive cash transfers in addition to an information pamphlet related to cropping systems like inter-cropping, while the control group will only receive the pamphlet.

The results of such a trial are then used to answer questions about the effectiveness of an intervention. Studying a program on a small scale first can prevent resources from being allocated to programs that turn out to be ineffective.

Read First

  • This section covers the key principles of study design to guide researchers on best practices in conducting field evaluations. For more about evaluations and their types, please see experimental methods.
  • Various biases, such as selection bias or recall bias, can affect the results of an experiment.
  • We will also look at ways of tackling these through efficient study design.

Step 1: Comprehensive protocol for the evaluation

The first step involves stating a hypothesis that specifies the anticipated link between the predictor variables and the outcomes. This is the alternative hypothesis; it is tested against the null hypothesis, which states that there is no such link.

In the cash transfer experiment mentioned at the beginning, this would involve laying out the hypothesis that cash transfers have an effect on farm output, against the null hypothesis that they have none. We would also need to define the target population in terms of geographical coverage, and perhaps an inclusion criterion, such as limiting the study to farmers with annual incomes under $1000.

Key Concerns

  • The sample to be studied must be clearly specified, including exclusion/inclusion criteria.
  • Pilot studies can help identify the ideal target population and ascertain take-up rates, both of which feed into sample-size and power calculations.
  • The sample size must be large enough to give a high probability of detecting an effect of the expected size, if one exists. (See power calculations for information on under-powered evaluations.)
  • A good study can be designed by consulting experienced researchers.
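The sample-size concern above can be illustrated with a minimal sketch. It assumes two equal-sized arms, a two-sided test of a difference in means, a common outcome standard deviation, and the normal approximation. The function names and all numbers are hypothetical and chosen only for illustration. The take-up adjustment assumes that only a share of the treatment group actually receives the transfer, which dilutes the average effect by that share.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Subjects needed in EACH arm to detect a difference in means of `effect`.

    Assumes equal-sized arms, a two-sided test, a common outcome standard
    deviation `sd`, and the normal approximation.
    """
    if effect == 0 or sd <= 0:
        raise ValueError("effect must be non-zero and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


def inflate_for_take_up(n_per_arm, take_up):
    """Scale the sample when only a share `take_up` of the treatment group
    is actually treated, so the average effect shrinks to effect * take_up."""
    if not 0 < take_up <= 1:
        raise ValueError("take_up must be in (0, 1]")
    return math.ceil(n_per_arm / take_up ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: detect a 50-unit gain in output when sd is 200.
    n = sample_size_per_arm(effect=50, sd=200)
    print(n, inflate_for_take_up(n, take_up=0.8))
```

A pilot study would typically supply the standard deviation and the take-up rate used here. Clustered designs need a further adjustment that this sketch does not cover.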

Step 2: Randomization

Broadly speaking, randomization involves allocating each member of the sample (sized using the calculations in Step 1) to one of two groups: the treatment group or the control group. Random assignment is the basis for establishing a causal effect, which is the cornerstone of a randomized evaluation. (See Randomization in Stata for a technical explanation.)

In our cash-transfer study, for instance, this would involve randomly assigning half, or close to half, of the farmers to the treatment group and the rest to the control group. Alongside this process, we would also need to collect baseline data on demographics and on output in the previous period, in order to check that our treatment and control groups are indeed similar in all respects except the intervention.
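A minimal sketch of such an assignment is given below. It assumes simple (unstratified) randomization at the farmer level. The function name, the seed value, and the farmer identifiers are hypothetical. Sorting the identifiers and fixing the seed make the draw reproducible, so the assignment can be audited later.

```python
import random


def assign_treatment(farmer_ids, seed=20240101):
    """Randomly assign half of the farmers (rounded down) to treatment.

    Returns a dict mapping each farmer id to "treatment" or "control".
    """
    ids = sorted(farmer_ids)      # fixed starting order, so the seed reproduces the draw
    rng = random.Random(seed)     # local generator; global random state is untouched
    rng.shuffle(ids)
    n_treat = len(ids) // 2
    return {
        fid: ("treatment" if position < n_treat else "control")
        for position, fid in enumerate(ids)
    }


if __name__ == "__main__":
    farmers = [f"farmer_{i:03d}" for i in range(1, 11)]  # hypothetical ids
    for fid, group in sorted(assign_treatment(farmers).items()):
        print(fid, group)
```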

Key Concerns

  • Effective randomization is important to tackle the issue of confounding, that is, when a characteristic is associated with both the intervention and the outcome. For example, if younger people are less likely to experience symptoms of heart disease, and also less likely to visit their doctors for annual check-ups, then a study that tries to evaluate the impact of a coronary-health campaign on hospital visits is said to be confounded by age.
  • The assignment process must be concealed from the investigator, so that knowledge of who the subjects are cannot influence it. (See also Research Ethics.)
  • Baseline characteristics must be measured in both groups, and these should not be significantly different. Randomization inference, a concept that is gaining ground in the field of randomized evaluations, is one way to assess whether an observed difference is larger than chance alone would produce.
  • Care must be taken to minimize attrition, that is, subjects dropping out after assignment.
  • Even when subjects drop out or do not comply, outcomes should be analyzed according to the groups to which subjects were originally assigned. This is called intention-to-treat analysis.
Fig.1. Attrition
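The baseline balance check mentioned in the concerns above can be sketched with randomization inference. The sketch assumes a simple design in which group sizes are fixed and any reshuffling of the labels was equally likely. The function names, the number of draws, and the data are hypothetical. It re-draws the assignment many times and reports how often the difference in baseline means is at least as large as the one actually observed.

```python
import random
from statistics import mean


def diff_in_means(values, labels):
    """Mean of `values` in the treatment group minus mean in the control group."""
    treated = [v for v, g in zip(values, labels) if g == "treatment"]
    control = [v for v, g in zip(values, labels) if g == "control"]
    return mean(treated) - mean(control)


def randomization_inference_p(values, labels, draws=2000, seed=1):
    """Share of re-drawn assignments with |difference| >= the observed one."""
    observed = abs(diff_in_means(values, labels))
    rng = random.Random(seed)
    shuffled = list(labels)
    extreme = 0
    for _ in range(draws):
        rng.shuffle(shuffled)  # re-draw the assignment, keeping group sizes fixed
        if abs(diff_in_means(values, shuffled)) >= observed:
            extreme += 1
    return extreme / draws


if __name__ == "__main__":
    # Hypothetical baseline farm output for eight farmers.
    baseline = [410, 395, 430, 405, 400, 420, 415, 390]
    groups = ["treatment"] * 4 + ["control"] * 4
    print(randomization_inference_p(baseline, groups))
```

A small value suggests the groups differ at baseline by more than chance would explain. A stratified or clustered design would need the re-draws to follow the same procedure as the original assignment.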

Step 3: Intervention, followed by measuring the outcomes

The next step is to apply the intervention, and then measure outcomes, called endline characteristics, once the pre-determined time period since the intervention has passed.

In our cash-transfer example, this would involve providing cash transfers to the treatment group while ensuring that the control group members do not receive them. After the intervention timeline is completed, we will collect data on farm outputs again for the two groups and compare.
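The comparison of endline outcomes can be sketched as a difference in means between the groups as originally assigned. The sketch assumes individual-level randomization and uses a normal approximation for the confidence interval. The function name and the records are hypothetical.

```python
import math
from statistics import mean, variance


def endline_difference(records):
    """Compare endline outcomes by ASSIGNED group.

    `records` holds (assigned_group, endline_output) pairs. Returns the
    difference in means, its standard error (unequal variances allowed),
    and an approximate 95% confidence interval.
    """
    records = list(records)
    treated = [y for g, y in records if g == "treatment"]
    control = [y for g, y in records if g == "control"]
    if len(treated) < 2 or len(control) < 2:
        raise ValueError("need at least two observations per group")
    effect = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return effect, se, (effect - 1.96 * se, effect + 1.96 * se)


if __name__ == "__main__":
    # Hypothetical endline farm outputs.
    data = [("treatment", 12), ("treatment", 14), ("treatment", 16),
            ("control", 9), ("control", 10), ("control", 11)]
    print(endline_difference(data))
```

Grouping by assignment rather than by who actually received the transfer keeps the comparison tied to the randomization. In practice the estimate would usually come from a regression that also controls for baseline output.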

Key Concerns

  • Sufficient time should be given for the intervention to have its intended effect. If outcomes are measured prematurely, the effect realized so far may be smaller than the minimum detectable effect size (MDES) the evaluation was powered for, so a real effect can go undetected.
  • Blinding of the investigator to the assignment is crucial. Where feasible, it is also important for the subject to be blind to both the assignment and the intervention, so that knowledge of either does not change behavior or encourage spillovers. Blinding both the investigator and the subject is called double blinding.
  • Also refer to measuring abstract concepts, and questionnaire design.
Fig.2. Spillovers
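The minimum detectable effect size mentioned in the concerns above can be sketched as follows. It assumes two equal-sized arms, a two-sided test, a common standard deviation, and the normal approximation. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect(n_per_arm, sd, alpha=0.05, power=0.80):
    """Smallest true difference in means detectable with the given power.

    Assumes equal-sized arms, a two-sided test, a common standard
    deviation `sd`, and the normal approximation.
    """
    if n_per_arm <= 0 or sd <= 0:
        raise ValueError("n_per_arm and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * math.sqrt(2 / n_per_arm)


if __name__ == "__main__":
    # Hypothetical: 250 farmers per arm, output standard deviation of 200.
    print(round(minimum_detectable_effect(250, 200), 1))
```

If the effect that has materialized by the time of measurement is below this value, the evaluation is unlikely to detect it, even when the intervention works.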

Final Step: Quality Control

Quality control is not just a step that needs to be exercised when measuring outcomes. It is a constant, rigorous process that needs to be carried out at various stages to ensure the integrity of the evaluation.

It includes addressing concerns about design and the measurement of outcomes, as well as handling data and ensuring the anonymity of subjects. (See data quality.)

Key Concerns

  • Lack of quality control can lead to erroneous conclusions, for instance, concluding that a treatment was ineffective when the real problem was a poorly executed evaluation.
  • This is where training manuals can help, by setting out rigorous standards for investigators, and providing ways to enforce these standards.
  • Training can also include information on standardizing data-collection and reporting.
  • See Data for Development Impact for a more comprehensive discussion on quality control.

Back to Parent

This article is part of Experimental Methods. However, most of the principles highlighted above can be applied in general to all kinds of evaluations, and therefore act as a crucial pointer to anyone looking to foray into the world of evaluations.

Additional Resources