Administrative and Monitoring Data
While impact evaluations most commonly rely on primary data, secondary data can often provide important context for impact evaluation design and data analysis. In some cases, secondary data is the only source that covers the relevant population for an impact evaluation; administrative data from a program conducted in a district is one example. Similarly, monitoring data can help assess who received the treatment and whether implementation followed the initial impact evaluation design.
- Impact evaluations rely on many different sources of secondary data: administrative, geospatial, sensor, telecom, and crowdsourced data.
- An important step in designing an impact evaluation is to evaluate which of the available data sources is best suited to a particular context.
- Administrative data is any data collected by national or local governments, ministries, or agencies outside the context of an impact evaluation.
- Monitoring data is data that is collected to track the implementation of treatment in a given impact evaluation.
Administrative data is any data collected by national or local governments, ministries, or government agencies outside the context of an impact evaluation. It can include data from land registries, road networks, infrastructure investments, tax records, energy billing, or social transfers. Generally, administrative data is collected to document or track the beneficiaries of a government policy or the general population, not for research purposes. Research teams should aim to use administrative data in addition to survey data rather than in place of it. Combining administrative data with other sources (survey, census, remote sensing, and crowdsourcing) makes it possible to create multi-sector, georeferenced data sets, maps, and dashboards that are tailored to each specific country context and policy interest.
Administrative data offers advantages in quality, cost, and time. It is often considered more accurate than self-reported survey data; a firm, for example, is more likely to report its turnover accurately to the financial administration than to a research team conducting a firm survey. Furthermore, apart from potential access costs, administrative data does not impose additional costs, because it is collected independently of the impact evaluation. Finally, administrative data can provide information at high frequency because it is often collected on a regular basis. This makes administrative data especially attractive for research teams retrospectively evaluating interventions for which no dedicated data collection took place.
Nonetheless, administrative data comes with a few potential challenges: access, merging, and quality. Accessing administrative data requires strong relationships with national and/or local authorities, and in some cases authorities may not be inclined to share the information. Once accessed, consolidating administrative data with other data often entails merging different databases, which can be an extensive task when no common unique identifiers exist across them. Finally, while administrative data can be highly accurate in some cases, in others it may be poorly reported, incomplete, or nonexistent. Not all governments have the same capacity to collect this information.
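The merging challenge can be illustrated with a minimal sketch in Python. It assumes a survey file and an administrative file that share no unique identifier, so a match key is built from normalized name and district fields. All field names (`hh_id`, `registry_no`, `name`, `district`, `transfer_amount`) and the records themselves are hypothetical, and exact matching on normalized strings is only one possible approach; real merges often need fuzzy matching and manual review.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def match_key(record):
    """Build a match key from the (hypothetical) district and name fields."""
    return (normalize(record["district"]), normalize(record["name"]))


def merge_on_key(survey_rows, admin_rows):
    """Merge survey rows with admin rows that share exactly one match key.

    Returns (merged, unmatched, ambiguous) so that rows without a clean
    one-to-one match are reported rather than silently dropped.
    """
    admin_index = {}
    for row in admin_rows:
        admin_index.setdefault(match_key(row), []).append(row)

    merged, unmatched, ambiguous = [], [], []
    for row in survey_rows:
        candidates = admin_index.get(match_key(row), [])
        if len(candidates) == 1:
            # Survey values take precedence where both sources share a field.
            merged.append({**candidates[0], **row})
        elif not candidates:
            unmatched.append(row)
        else:
            ambiguous.append(row)
    return merged, unmatched, ambiguous


# Hypothetical example records.
survey = [
    {"hh_id": "S1", "name": "María  López", "district": "San José"},
    {"hh_id": "S2", "name": "John Smith", "district": "North"},
]
admin = [
    {"registry_no": "A-17", "name": "maria lopez", "district": "san jose",
     "transfer_amount": 120},
    {"registry_no": "A-18", "name": "Jon Smith", "district": "North",
     "transfer_amount": 90},
]

merged, unmatched, ambiguous = merge_on_key(survey, admin)
print(len(merged), "merged;", len(unmatched), "unmatched;", len(ambiguous), "ambiguous")
```

In this example the first household matches despite differences in accents, case, and spacing, while the second is left unmatched because of a spelling difference ("John" versus "Jon"). Keeping unmatched and ambiguous rows in separate lists makes the scale of the merging problem visible before any analysis is run.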
Monitoring data is collected to understand how the assigned treatment is implemented in the field. Typically, survey round data helps us understand changes in the outcome variables over the duration of the project, and monitoring data helps us understand how these changes relate to the intervention. For example, monitoring data could record who actually received the treatment and whether the treatment was implemented according to the research design. Our analysis might be invalid if we lack this information and base it only on what the research team intended to happen. Monitoring data therefore helps us assess what is usually referred to as internal validity.
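As an illustration, the comparison between what was assigned and what monitoring data records can be sketched as follows. This is a minimal example under stated assumptions: the unit IDs, the two arm labels (`"treatment"` and `"control"`), and the function name `compliance_summary` are all hypothetical, and monitoring data is reduced to a simple set of units recorded as having received the treatment.

```python
def compliance_summary(assignment, received):
    """Compare assigned treatment status with receipt recorded in monitoring data.

    assignment: dict mapping unit ID to "treatment" or "control" (hypothetical labels).
    received:   set of unit IDs that monitoring data records as treated.
    """
    counts = {"treatment": [0, 0], "control": [0, 0]}  # [units, units that received]
    for unit_id, arm in assignment.items():
        counts[arm][0] += 1
        if unit_id in received:
            counts[arm][1] += 1

    def share(arm):
        n_units, n_received = counts[arm]
        # Return None rather than dividing by zero when an arm is empty.
        return n_received / n_units if n_units else None

    return {
        # Share of units assigned to treatment that actually received it.
        "take_up_in_treatment": share("treatment"),
        # Share of control units that received the treatment anyway.
        "contamination_in_control": share("control"),
        # Units in the monitoring data that are missing from the assignment list.
        "unrecognized_units": sorted(received - set(assignment)),
    }


# Hypothetical example: five villages, three assigned to treatment.
assignment = {
    "v1": "treatment", "v2": "treatment", "v3": "treatment",
    "v4": "control", "v5": "control",
}
received = {"v1", "v2", "v5"}

summary = compliance_summary(assignment, received)
print(summary)
```

Here two of the three treatment villages received the intervention and one of the two control villages received it as well. Gaps like these between assignment and delivery are what monitoring data is meant to surface before the analysis relies on the design alone.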
This article is part of the topic Secondary Data Sources