Difference between revisions of "Matching"

Revision as of 14:38, 7 August 2023

Matching is a quasi-experimental method in which the researcher uses statistical techniques to construct an artificial control group by matching each treated unit with a non-treated unit of similar characteristics. Matching is useful for estimating the impact of a program or event for which it is not ethically or logistically feasible to randomize. This page outlines approaches to and limitations of matching methods.

Read First

Matching requires extensive datasets with information on the characteristics of treated and non-treated units before the treatment.
To implement matching in Stata, use the iematch command. For more information on matching implementation, see Additional Resources.
Matching methods rely on the assumption that there are no systematic differences in unobserved characteristics between the treatment units and the matched comparison units.

Overview

Matching is a quasi-experimental method in which the researcher uses statistical techniques to construct an artificial control group by matching each treated unit with a non-treated unit of similar characteristics. Consider, for example, a researcher who wants to measure the effect of a water filter installation program on health outcomes, but the program has no clear assignment rules or randomization to explain why participating households enrolled and non-participating households did not.

Using a dataset that contains information on the units that enrolled in the program and the units that did not, the researcher can use matching methods to identify the non-participant units most similar to the participant units. The dataset should contain baseline data, and the characteristics on which units are matched should be pre-intervention traits. Matching on characteristics that the program itself may have affected is very risky, because the resulting estimate is likely to be biased. The researcher can then approximate the characteristics that most influence a unit's decision to enroll and find matches to serve as the control group. These matches make it possible to estimate the counterfactual and the impact of the program.

Approaches and Variations

Propensity Score Matching

Propensity score matching is a matching method that computes the probability that a unit will enroll in the program. This probability is called the propensity score, and it is used to match units in the treatment group with unenrolled units that have similar propensity scores. For more information, see Propensity Score Matching.
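The matching step can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (for example, from a logit model of enrollment on baseline characteristics) and pairs each treated unit with the nearest unmatched control. The function name, the unit IDs, the scores, and the caliper value are all hypothetical and chosen only for illustration; this is one simple variant (greedy 1:1 nearest-neighbor matching without replacement), not the only way to match on propensity scores.

```python
def match_on_propensity(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping unit ID -> estimated propensity score.
    caliper: optional maximum score gap allowed for a valid match.
    Returns a dict mapping treated ID -> matched control ID.
    """
    available = dict(controls)  # controls not yet used in a match
    matches = {}
    # Process treated units from highest to lowest score (an arbitrary but fixed order).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        # Closest remaining control in terms of propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match; this treated unit is dropped
        matches[t_id] = c_id
        del available[c_id]
    return matches


# Hypothetical propensity scores for illustration only.
treated = {"T1": 0.80, "T2": 0.55, "T3": 0.30}
controls = {"C1": 0.78, "C2": 0.52, "C3": 0.10, "C4": 0.33}

pairs = match_on_propensity(treated, controls, caliper=0.05)
print(pairs)
```

Treated units with no control inside the caliper are left unmatched, which trades sample size for closer matches.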

Matched Difference-in-Differences

Matched difference-in-differences combines matching methods with difference-in-differences to reduce the risk of bias in the estimation. To implement:

1. Match treatment units to control units.
2. Compute the difference-in-differences.

This method controls for unobserved, time-invariant differences between the two groups. For more information, see Difference-in-Differences.
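The two steps can be sketched as follows, assuming the matching has already been done and that each unit has one baseline and one endline observation of the outcome. The function name, the matched pairs, and the outcome values (for instance, days of illness per month in the water filter example) are hypothetical.

```python
def matched_did(pairs, outcomes):
    """Average difference-in-differences over matched pairs.

    pairs: list of (treated_id, control_id) tuples from the matching step.
    outcomes: dict mapping unit ID -> (baseline, endline) outcome values.
    """
    pair_estimates = []
    for t_id, c_id in pairs:
        t_pre, t_post = outcomes[t_id]
        c_pre, c_post = outcomes[c_id]
        # Change in the treated unit minus change in its matched control.
        pair_estimates.append((t_post - t_pre) - (c_post - c_pre))
    return sum(pair_estimates) / len(pair_estimates)


# Hypothetical matched pairs and outcomes for illustration only.
pairs = [("T1", "C1"), ("T2", "C2")]
outcomes = {
    "T1": (10, 6), "C1": (9, 8),
    "T2": (12, 7), "C2": (13, 12),
}

estimate = matched_did(pairs, outcomes)
print(estimate)  # -3.5
```

Because each unit is compared with its own baseline, any fixed difference in levels between a treated unit and its match drops out of the estimate.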

Synthetic Control Method

The synthetic control method estimates the impact of an event or intervention (e.g. a political event or natural disaster) experienced by a single unit (e.g. a state or country). The method uses data on the treated unit and the untreated units, assigning weights to the untreated units so that their weighted combination most closely resembles the treated unit before the event. This weighted combination is the synthetic control. The process requires extensive panel data on the characteristics of the treated and untreated units.
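The weighting idea can be sketched in a few lines under strong simplifying assumptions: only pre-event outcome values are matched (no additional covariates), and the weights are constrained to be non-negative and to sum to one. The solver below is a plain Frank-Wolfe iteration with exact line search, swapped in here because it needs no external libraries; dedicated synthetic control software typically uses other optimizers and also weights predictors. The function name, donor labels, and data are hypothetical.

```python
def synthetic_weights(treated, donors, iterations=5000):
    """Find donor weights (non-negative, summing to 1) that minimize the
    squared distance between the treated unit and the weighted donors.

    treated: list of pre-event outcome values for the treated unit.
    donors: list of lists, one per untreated unit, same length as treated.
    """
    n, periods = len(donors), len(treated)
    weights = [1.0 / n] * n  # start from equal weights
    for _ in range(iterations):
        synth = [sum(w * d[k] for w, d in zip(weights, donors)) for k in range(periods)]
        resid = [s - t for s, t in zip(synth, treated)]
        # Gradient of the squared error with respect to each weight (up to a factor of 2).
        grad = [sum(r * d[k] for k, r in enumerate(resid)) for d in donors]
        j = min(range(n), key=lambda i: grad[i])  # donor that most reduces the error
        # Moving weight toward donor j shifts the synthetic unit along this direction.
        direction = [donors[j][k] - synth[k] for k in range(periods)]
        denom = sum(v * v for v in direction)
        if denom == 0:
            break
        # Exact line search for a quadratic objective, clipped to stay feasible.
        step = -sum(r * v for r, v in zip(resid, direction)) / denom
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break  # no feasible move improves the fit
        weights = [(1 - step) * w for w in weights]
        weights[j] += step
    return weights


# Hypothetical pre-event outcomes for three untreated units and one treated unit.
donors = {"A": [10.0, 12.0, 14.0], "B": [20.0, 21.0, 23.0], "C": [5.0, 9.0, 8.0]}
treated_unit = [12.0, 14.1, 15.5]

weights = synthetic_weights(treated_unit, list(donors.values()))
synthetic = [sum(w * d[k] for w, d in zip(weights, donors.values())) for k in range(3)]
print(dict(zip(donors, weights)))
print(synthetic)
```

After the event, the gap between the treated unit's outcome and the same weighted combination of donor outcomes serves as the impact estimate.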

Limitations

Matching methods have two main limitations: they require extensive datasets to properly match units, and they rely on broad assumptions that are difficult to prove. First, matching requires extensive datasets, ideally on baseline characteristics. Such data are not always available. Second, the validity of matching methods relies on the assumption that there are no systematic differences in unobserved characteristics between the treatment units and the matched comparison units. It is difficult to prove this assumption correct, making matching methods a less robust approach than, for example, randomized controlled trials (RCTs) or regression discontinuity design (RDD), which do not require this assumption. As mentioned, the matched difference-in-differences method relaxes this assumption in part by controlling for unobserved, time-invariant characteristics.

Back to Parent

This article is part of the topic Quasi-Experimental Methods.

Additional Resources

Barbara Sianesi’s An Introduction to Matching Methods for Causal Inference and Their Implementation on Stata
King et al.’s Comparative Effectiveness of Matching Methods for Causal Inference
The University of Wisconsin’s Propensity Score Matching in Stata
Heinrich et al.’s Primer for Applying Propensity Score Matching
Rosenbaum’s Design of Observational Studies
Gertler et al.’s Impact Evaluation in Practice

