MediaWiki API result

{
    "batchcomplete": "",
    "continue": {
        "gapcontinue": "Remote_Sensing",
        "continue": "gapcontinue||"
    },
    "warnings": {
        "main": {
            "*": "Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."
        },
        "revisions": {
            "*": "Because \"rvslots\" was not specified, a legacy format has been used for the output. This format is deprecated, and in the future the new format will always be used."
        }
    },
    "query": {
        "pages": {
            "132": {
                "pageid": 132,
                "ns": 0,
                "title": "Recall Bias",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "<onlyinclude>Recall bias is bias caused by inaccurate or incomplete recollection of events by the respondent. It is a particular concern for retrospective survey questions. </onlyinclude>\n\n== Read First ==\n* Research shows that lower salience and longer recall periods increase forgetfulness [http://blogs.worldbank.org/impactevaluations/decomposing-response-error-improve-consumption-survey-design] [http://www.sciencedirect.com/science/article/pii/S0304387811000939]\n\n== Guidelines ==\n\n=== How long is \"too long\" for recall? ===\nIt depends on the type of event respondents are being asked to recall. Research shows strong evidence of recall bias in food consumption, but little evidence for agricultural production. \nAs a rule of thumb, infrequent events (e.g. purchases of major assets) will be memorable for longer periods of time than routine events (e.g. use of public transportation). \n\n===How to avoid recall bias?===\nUseful strategies:\n# Reduce recall periods as much as possible. For example, add follow-up surveys by phone, or personal diaries.\n# Conduct focus groups to understand salience of the indicator in question, and gauge a reasonable recall period. \n# When [[Survey Pilot|Piloting your Survey]], carefully test recall periods; if possible try shorter and longer periods and check for differences in variance\n\n\n== Back to Parent ==\nThis article is part of the topic [[Questionnaire Design]]\n\n\n== Additional Resources ==\n* DIME Analytics (World Bank), [https://osf.io/rqb5m/ Survey Instrument Design and Pilot]\n* IFPRI, [https://www.ifpri.org/blog/do-you-remember-measuring-anchoring-bias-recall-data Measuring anchoring bias in recall data]\n* Jed Friedman (World Bank), [http://blogs.worldbank.org/impactevaluations/decomposing-response-error-improve-consumption-survey-design Response Error in Consumption Surveys], and the [https://documents1.worldbank.org/curated/en/122481467999693721/pdf/WPS7646.pdf related paper]\n* Financial Access Initiative, [http://www.financialaccess.org/blog/2015/7/30/reliability-of-self-reported-data-recall-bias The Reliability of Self-reported Data]\n* Jishnu Das, Jeffrey Hammer, Carolina S\u00e1nchez-Paramo, [http://www.sciencedirect.com/science/article/pii/S0304387811000708 The impact of recall periods on reported morbidity and health seeking behavior]\n* Kathleen Beegle, Calogero Carletto, Kristen Himelein, [http://www.sciencedirect.com/science/article/pii/S0304387811000939 Reliability of recall in agricultural data]\n* Philip Wollburg, Marco Tiberti and Alberto Zezza, [https://www.sciencedirect.com/science/article/pii/S0306919220302098?fbclid=IwAR1at8ueH2h4j3mHXlGvcIGEX4wgoxTgN6IdmxGejJJsz3DJkmra2bn6jas Recall length and measurement error in agricultural surveys]\n[[Category: Questionnaire Design]]"
                    }
                ]
            },
            "68": {
                "pageid": 68,
                "ns": 0,
                "title": "Regression Discontinuity",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "Regression Discontinuity Design (RDD) is a [[Quasi-Experimental Methods | quasi-experimental]] impact evaluation method used to evaluate programs that have a cutoff point determining who is eligible to participate. RDD allows researchers to compare the people immediately above and below the cutoff point to identify the impact of the program on a given outcome. This page will cover when to use RDD, sharp vs. fuzzy design, how to interpret results, and methods of treatment effect estimation. \n\n==Read First==\n*A RDD requires a continuous eligibility score on which the population of interest is ranked and a clearly defined cutoff point above or below which the population is determined eligible for a program. \n*The eligibility index must be continuous around the cutoff point and the population of interest around the cutoff point must be very similar in observable and unobservable characteristics.  \n* RDD estimates local average treatment effects around the cutoff point, where treatment and comparison units are most similar. It provides useful evidence on whether a program should be cut or expanded at the margin. However, it does not answer whether a program should exist or not: in this case, the average treatment effect provides better evidence than the local average treatment effect.\n\n==Overview==\n\nRDD is a key method in the toolkit of any applied researcher interested in unveiling the causal effects of policies. The method was first used in 1960 by Thistlethwaite and Campbell, who were interested in identifying the causal impacts of merit awards, assigned based on observed test scores, on future academic outcomes ([https://www.princeton.edu/~davidlee/wp/RDDEconomics.pdf Lee and Lemieux, 2010]). The use of RDD has increased exponentially in the last few years. Researchers have used it to evaluate electoral accountability; SME policies; social protection programs such as conditional cash transfers; and educational programs such as school grants. \n\nIn RDD, assignment of treatment and control is not random, but rather based on some clear-cut threshold (or cutoff point) of an observed variable such as age, income, and score. Causal inference is then made comparing individuals on both sides of the cutoff point. \n\n==Conditions and Assumptions==\n\n===Conditions===\n\nTwo main conditions are needed in order to apply a regression discontinuity design:\n#A continuous eligibility index: a continuous measure on which the population of interest is ranked (i.e. test score, poverty score, age). \n#A clearly defined cutoff point: a point on the index above or below which the population is determined to be eligible for the program. For example, students with a test score of at least 80 of 100 might be eligible for a scholarship, households with a poverty score less than 60 out of 100 might be eligible for food stamps, and individuals age 67 and older might be eligible for pension. The cutoff points in these examples are 80, 60, and 67, respectively. The cutoff point may also be referred to as the threshold.\n\n===Assumptions===\n\n#The eligibility index should be continuous around the cutoff point. There should be no jumps in the eligibility index at the cutoff point or any other sign of individuals manipulating their eligibility index in order to increase their chances of being included in or excluded from the program. 
The McCrary Density Test tests this assumption by checking eligibility index density function for discontinuities around the cutoff point.\n#Individuals close to the cutoff point should be very similar, on average, in observed and unobserved characteristics. In the RD framework, this means that the distribution of the observed and unobserved variables should be continuous around the threshold. Even though researchers can check similarity between observed covariates, the similarity between unobserved characteristics has to be assumed. This is considered a plausible assumption to make for individuals very close to the cutoff point, that is, for a relatively narrow window. \n\n==Fuzzy vs. Sharp Design==\nThe assignment rule indicates how people are assigned or selected into the program. In practice, the assignment rule can be deterministic or probabilistic (see [https://onlinelibrary.wiley.com/doi/abs/10.1111/1468-0262.00183 Hahn et al., 2001]). If deterministic, the regression discontinuity takes a sharp design; if probabilistic, the regression discontinuity takes a fuzzy design. \n\n[[File:RD_sharp_fuzzy.png|upright=2.5]]\n===Sharp RDD===\nIn sharp designs, the probability of treatment changes from 0 to 1 at the cutoff. There are no cross-overs and no no-shows. For example, if a scholarship award is given to all students above a threshold test score of 80%, then the assignment rule defines treatment status deterministically with probabilities of 0 or 1. Thus, the design is sharp.\n\n===Fuzzy RDD===\nIn fuzzy designs, the probability of treatment is discontinuous at the cutoff, but not to the degree of a definitive 0 to 1 jump. For example, if food stamp eligibility is given to all households below a certain income, but not all households receive the food stamps, then the assignment rule defines treatment status probabilistically but not perfectly. Thus, the design is fuzzy. The fuzziness may result from imperfect compliance with the law/rule/program; imperfect implementation that treated some non-eligible units or neglected to treat some eligible units; spillover effects; or manipulation of the eligibility index. \n\nA fuzzy design assumes that, in the absence of the assignment rule, some of those who take up the treatment would not have participated in the program. The eligibility index acts as a nudge. The subgroup that participates in a program due to the selection rule is called ''compliers'' (see e.g. [https://pdfs.semanticscholar.org/8714/260129e51abb09fd89d6ff79065af17bb106.pdf Angrist and Imbens (1994)], and [http://www.math.mcgill.ca/dstephens/AngristIV1996-JASA-Combined.pdf Imbens, Angrist, and Rubin (1996)]); under the RDD, the treatment effects are estimated only for the group of compliers. The estimates of the causal effect under the fuzzy design require more assumptions than under the sharp design, but are weaker than any IV approach. \n\n==How to Interpret==\n\nRDD estimates local average treatment effects around the cutoff point, where treatment and comparison units are most similar. The units to the left and right of the cutoff look more and more similar as they near the cutoff. 
Given that the design meets all assumptions and conditions outlined above, the units directly to the left and right of the cutoff point should be so similar that they lay the groundwork for a comparison as well as does randomized assignment of the treatment.\n\nBecause the RDD estimates the local average treatment effects around the cutoff point, or locally, the estimate does not necessarily apply to units with scores further away from the cutoff point. These units may not be as similar to each other as the eligible and ineligible units close to the cutoff. RDD\u2019s inability to compute an average treatment effect for all program participants is both a strength and a limitation, depending on the question of interest. If the evaluation primarily seeks to answer whether the program should exist or not, then the RDD will not provide a sufficient answer: the average treatment effect for the entire eligible population would be the most relevant parameter in this case. However, if the policy question of interest is whether the program should be cut or expanded at the margin, then the RDD produces precisely the local estimate of interest to inform this important policy decision. \n \nNote that the most recent advances in the RDD literature suggest that it is not very accurate to interpret a discontinuity design as a local experiment. To be considered as good as a local experiment for the units close enough to the cutoff point, one must use a very narrow bandwidth and drop the assignment variable (or a function of it) from the regression equation. For more details on this point see [http://faculty.chicagobooth.edu/max.farrell/research/Calonico-Cattaneo-Farrell-Titiunik_2018_RESTAT.pdf Cattaneo et al. (2018)]. \n\n==Treatment Effect Estimation==\n\n===Parametric Methods===\nThe estimation of the treatment effects can be performed parametrically as follows:\n\n<math>y_i=\\alpha+\\delta X_i+h(Z_i)+\\varepsilon_i</math>\n\nwhere <math>y_i</math> is the outcome of interest of individual i, <math>X_i</math> is an indicator function that takes the value of 1 for individuals assigned to the treatment and 0 otherwise, <math>Z_i</math> is the assignment variable that defines an observable clear cutoff point, and <math>h(Z_i )</math> is a flexible function in <math>Z</math>. The identification strategy hinges on the exogeneity of<math> Z</math> at the threshold. It is standard to center the assignment variable at the cutoff point. In this case, one would use <math>h(Z_i-Z_0 )</math> instead with <math>Z_0</math> being the cutoff. With that assumption, the parameter of interest, <math>\\delta</math>, provides the treatment effect estimate. \n\nIn the case of a sharp design with perfect compliance, the parameter <math>\\delta</math> identifies the average treatment effect on the treated (ATT). In the case of a fuzzy design, <math>\\delta</math> corresponds to the intent-to-treat effects \u2013 i.e. the effect of the eligibility rather than the treatment itself on the outcomes of interest. The LATE can be estimated using an IV approach. This could be done as follows:\n\nFirst stage: <math>P_i =\\alpha+\\delta X_i+h(Z_i)+\\varepsilon_i</math>\n\nSecond stage: <math>y_i=\\mu+\\delta\\hat{P_i}+h(Z_i)+u_i</math>,\n\nwhere <math>P_i</math> is a dummy variable that identifies actual participation of individual i in the program/intervention. 
Notice that with a parametric specification, the researcher should specify <math>h(Z_i )</math> the same way in both regressions ([http://faculty.smu.edu/millimet/classes/eco7377/papers/imbens%20lemieux.pdf Imbens and Lemieux, 2008]). \n\nDespite the natural appeal of parametric method such as the one just outlined, this method has some direct practical implications. First, the right functional form of <math>h(Z_i )</math> is never known. Researchers are thus encouraged to fit the model with different specifications of <math>h(Z_i )</math> ([https://www.princeton.edu/~davidlee/wp/RDDEconomics.pdf Lee and Lemieux, 2010]), particularly when they must consider data farther away from the cutoff point to have enough statistical power. \n\nAlthough some authors test the sensitivity of results using high order polynomials, there is some recent discussion arguing against the use of high order polynomials given that they assign too much weight to observations away of the cutoff point ([https://www.nber.org/papers/w20405.pdf Imbens and Gelman, 2014]).  \n\n===Non-Parametric Methods===\nAnother way of estimating treatment effects with RDD is via non-parametric methods. In fact, the use of non-parametric methods has been growing in the last few years to both estimate treatment effects and check robustness of estimates obtained parametrically. This might be partially explained by the increasing number of available Stata commands, but perhaps more importantly, by some attractive properties of the method compared to parametric ones (see i.e. [https://www.nber.org/papers/w20405.pdf Imbens and Gelman (2014)] for this point). In particular, non-parametric methods provide estimates based on data closer to the cut-off, reducing bias that may otherwise result from using data farther away from the cutoff to estimate local treatment effects.\n\nThe use of non-parametric methods does not come without costs: the researcher must make an array of decisions about the kernel function, the algorithm for selecting optimal bandwidth size, and the specifications. [https://www.nber.org/papers/w20405.pdf Imbens and Gelman (2014)] suggest the use of local linear and at most local quadratic polynomials.\n\n====Bandwidth size====\nIn practice, the bandwidth size depends on data availability. Ideally, one would like to have enough sample to run the regressions using information very close to the cutoff. The main advantage of using a very narrow bandwidth is that the functional form h(Z_i ) becomes much less of a worry and treatment effects can be obtained with parametric regression using a linear or piecewise linear specification of the assignment variable (see [https://www.princeton.edu/~davidlee/wp/RDDEconomics.pdf Lee and Lemieux (2010)] for this point). However, [https://files.eric.ed.gov/fulltext/ED511782.pdf Schochet (2008)] points out two diadvantages to a narrow versus wider bandwidth: \n#For a given sample size, a narrower bandwidth could yield less precise estimates if the outcome-score relationship can be correctly modelled using a wider range of scores. \n#External validity: extrapolating results to units further away from the threshold using the estimated parametric regression lines may be more defensible if you have a wider range of scores to fit these lines over.\n\n===Placebo Tests===\n\nFalsification (or placebo) tests are really important when using RDD as identification strategy. The researcher needs to convince the reader (and referees!) 
that the discontinuity exploited to inform causal impacts of an intervention was very much likely caused by the assignment rule to the intervention. In practice, researchers use fake cutoffs or different cohorts to run those tests. Examples can be seen [https://www.princeton.edu/~davidlee/wp/RDrand.pdf here], [http://onlinelibrary.wiley.com/doi/10.1111/rssa.12003/abstract here], and [https://www.cambridge.org/core/services/aop-cambridge-core/content/view/192AB48618B0E0450C93E97BE8321218/S0003055416000253a.pdf/deliberate_disengagement_how_education_can_decrease_political_participation_in_electoral_authoritarian_regimes.pdf here].    \n\t\n== Back to Parent ==\nThis article is part of the topic [[Quasi-Experimental Methods]]\n\n== Additional Resources ==\n*For more information on Stata commands for RDD, please visit this [https://sites.google.com/site/matiasdcattaneo/software page]. \n*For those interested in some fresh discussion on power calculation for RDD, please see the World Bank blogs [http://blogs.worldbank.org/impactevaluations/power-calculations-regression-discontinuity-evaluations-part-1 here], [http://blogs.worldbank.org/impactevaluations/power-calculations-regression-discontinuity-evaluations-part-2 here] and [http://blogs.worldbank.org/impactevaluations/power-calculations-regression-discontinuity-evaluations-part-3 here].\n*See examples and visualizations of RDD application in David Evan\u2019s [https://blogs.worldbank.org/impactevaluations/regression-discontinuity-porn blog post].\n* For those interested in knowing more RDD and its recent ramifications, check this practical introduction [http://www-personal.umich.edu/~cattaneo/books/Cattaneo-Idrobo-Titiunik_2017_Cambridge.pdf here]. For more advanced content, check this [http://www.emeraldinsight.com/doi/book/10.1108/S0731-9053201738 e-book].\n*Owen Ozier\u2019s World Bank [http://pubdocs.worldbank.org/en/663221440083705317/08-Regression-Discontinuity-Owen-Ozier.pdf Regression Discontinuity training]\n*Lee and Thomas\u2019 [http://www.princeton.edu/~davidlee/wp/RDDEconomics.pdf Regression discontinuity designs in economics] \n*Northwestern's [https://www.ipr.northwestern.edu/workshops/past-workshops/quasi-experimental-design-and-analysis-in-education/quasi-experiments/docs/QE-Day2.pdf Regression Discontinuity] slides\n*[https://www.chime.ucla.edu/seminars/RCMAR_Wherry.pdf An Introduction to Regression Discontinuity Design] by Laura Wherry\n* Pischke\u2019s [http://econ.lse.ac.uk/staff/spischke/ec533/RD.pdf Regression Discontinuity Design] slides\n* Yamamoto\u2019s [http://web.mit.edu/teppei/www/teaching/Keio2016/05rd.pdf Regression Discontinuity Design] slides \n\n[[Category: Quasi-Experimental Methods]]"
                    }
                ]
            }
        }
    }
}
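
The JSON above is one page of a continued query: the "continue.gapcontinue" value marks where the next request should resume, and the "warnings.revisions" entry notes that a deprecated legacy output format was used because "rvslots" was not specified. The Python sketch below shows one way such a result could be requested and paged through; it is illustrative only, and the endpoint URL and the exact generator parameters are assumptions (generator=allpages is implied by the "gapcontinue" key but not stated in the response).

import requests

# Assumption: the wiki is not named in the response above; replace with its actual api.php endpoint.
API_URL = "https://example.org/w/api.php"

params = {
    "action": "query",
    "format": "json",         # plain JSON rather than the HTML debugging view
    "generator": "allpages",  # assumed from the "gapcontinue" continuation key
    "gaplimit": 2,
    "prop": "revisions",
    "rvprop": "content",
    "rvslots": "main",        # avoids the legacy-format warning shown under warnings.revisions
}

session = requests.Session()
while True:
    data = session.get(API_URL, params=params).json()
    for page in data.get("query", {}).get("pages", {}).values():
        rev = page["revisions"][0]
        # With rvslots=main the wikitext sits under slots["main"]["*"];
        # the legacy response above stores it directly under rev["*"].
        text = rev.get("slots", {}).get("main", {}).get("*", rev.get("*"))
        print(page["title"], len(text or ""))
    if "continue" not in data:
        break
    # e.g. {"gapcontinue": "Remote_Sensing", "continue": "gapcontinue||"}
    params.update(data["continue"])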