Primary Data Collection
Primary data collection is the process of gathering data directly through surveys, interviews, or experiments. Household surveys are a typical example. Because researchers collect the data themselves, they can personally ensure that it meets the standards of quality, availability, statistical power, and sampling required for a particular research question. With globally increasing access to specialized survey tools, survey firms, and field manuals, primary data has become the dominant source for empirical inquiry in development economics.
- The DIME Research Standards provide a comprehensive checklist to ensure that the collection and handling of research data are in line with global best practices.
- Personal interviews are the most effective medium for primary data collection. Depending on the research question, these interviews may take the form of household surveys, business (firm) surveys, or agricultural (farm) surveys.
- iefieldkit is a Stata package that aids primary data collection. It currently supports three major components of that workflow: survey design, survey completion, and data cleaning and survey harmonization.
While impact evaluations often benefit from secondary sources of data like administrative data, census data, or household data, these sources may not always be available. In such cases, researchers need to collect data directly through a series of well-designed interviews and surveys. The process of collecting primary data requires a great deal of foresight, planning, and coordination. Listed below are the crucial steps involved in the preparation and collection of primary data:
Acquire human subjects approval
There are strict rules about acquiring approval for research that involves human subjects. Researchers must understand the ethics and rules for the security of sensitive data, and should therefore use proper tools for encryption and de-identification of personally identifiable information (PII).
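As an illustration of de-identification, the following is a minimal Python sketch and not a DIME tool. It assumes records are held as dictionaries, and the field names (`hh_id`, `name`, `phone`, `hh_size`) and the key are hypothetical. It drops direct identifiers and replaces the ID with a keyed hash. Note that this is pseudonymization only: it does not replace encryption of the raw data, and the key itself must be stored securely.

```python
import hashlib
import hmac

# Hypothetical names of fields treated as direct identifiers (an assumption).
PII_FIELDS = {"name", "phone", "gps_lat", "gps_lon"}


def pseudonymize_id(raw_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (truncated) in place of the raw ID."""
    digest = hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(records, secret_key, id_field="hh_id"):
    """Drop PII fields and replace the ID with a pseudonym in each record."""
    cleaned = []
    for rec in records:
        out = {k: v for k, v in rec.items() if k not in PII_FIELDS}
        out[id_field] = pseudonymize_id(str(rec[id_field]), secret_key)
        cleaned.append(out)
    return cleaned


# Illustrative records only; no real data.
records = [
    {"hh_id": "1001", "name": "A. Respondent", "phone": "555-0100", "hh_size": 5},
    {"hh_id": "1002", "name": "B. Respondent", "phone": "555-0101", "hh_size": 3},
]
deidentified = deidentify(records, secret_key=b"replace-with-a-securely-stored-key")
```

The same key always maps an ID to the same pseudonym, so de-identified rounds of data can still be merged with each other.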
Compile the survey budget
Researchers must prepare a survey budget before procuring a survey firm. This step allows researchers to calculate the expected costs of conducting a study and compare them with the proposals from firms that have submitted an expression of interest (EOI).
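The arithmetic behind such a comparison can be sketched in a few lines of Python. This is a simplified illustration, not a DIME budget template: the cost categories, rates, contingency share, and firm proposals are all hypothetical, and a real budget would include many more line items.

```python
import math


def survey_budget(n_surveys, n_enumerators, surveys_per_enumerator_day,
                  daily_rate, training_days, transport_per_day,
                  contingency=0.10):
    """Estimate field costs from a handful of assumed parameters."""
    # Days needed for the whole team to complete all interviews.
    field_days = math.ceil(n_surveys / (n_enumerators * surveys_per_enumerator_day))
    # Enumerators are paid for both training days and field days.
    wages = (field_days + training_days) * n_enumerators * daily_rate
    # Transport is only incurred on field days.
    transport = field_days * n_enumerators * transport_per_day
    subtotal = wages + transport
    return {
        "field_days": field_days,
        "wages": wages,
        "transport": transport,
        "total": subtotal * (1 + contingency),
    }


# Hypothetical inputs for illustration.
budget = survey_budget(n_surveys=2000, n_enumerators=10,
                       surveys_per_enumerator_day=4, daily_rate=30,
                       training_days=5, transport_per_day=10)

# Hypothetical proposals received from firms; a positive gap means
# the proposal exceeds the researchers' own estimate.
firm_proposals = {"Firm A": 26000, "Firm B": 22000}
gaps = {firm: amount - budget["total"] for firm, amount in firm_proposals.items()}
```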
Determine relevant parameters of a study
After agreeing upon a budget, researchers decide on parameters such as the sampling frame (a list of individuals or units in a population from which a sample can be drawn), the sample size, and the statistical power. Based on these, they can then randomize treatment.
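To make these parameters concrete, here is a minimal Python sketch under stated assumptions: a two-arm design with equal arm sizes, individual-level randomization, a standardized effect size, and the normal-approximation formula n = 2(z_{1-α/2} + z_{power})² / d² per arm. Clustered designs need a larger sample than this formula suggests. The sampling frame of household IDs and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_size_per_arm(effect_size, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided test of a difference in means.

    effect_size is the minimum detectable effect in standard deviations.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def draw_and_assign(sampling_frame, n_per_arm, seed):
    """Draw a simple random sample from the frame and split it into two arms."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    # rng.sample returns units in random order, so the first half of the
    # draw is itself a random subset.
    sample = rng.sample(sampling_frame, 2 * n_per_arm)
    return {
        unit: ("treatment" if i < n_per_arm else "control")
        for i, unit in enumerate(sample)
    }


# Hypothetical sampling frame: 5,000 household IDs.
frame = list(range(1, 5001))
n = sample_size_per_arm(effect_size=0.2)
assignment = draw_and_assign(frame, n_per_arm=n, seed=20240101)
```

Recording the seed alongside the code means the same frame always yields the same sample and treatment assignment, which supports reproducibility.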
Procure a survey firm
Carry out a pre-pilot of the survey
The first stage of the survey pilot involves two things: piloting content and piloting protocols. Clear protocols are important to ensure that field data collection is carried out consistently across teams and regions, and to generate reproducible research.
Refine and review the survey design
The first stage of the survey pilot allows researchers to develop a design for the instrument. The researchers then conduct the second stage of the survey pilot, called the content-focused pilot, to review and refine the structure of the instrument.
Translate the survey instrument
After the content-focused pilot, the questionnaire (or instrument) is translated. This step is crucial to the effectiveness of the study.
Program the instrument
Once IRB approval is obtained, the instrument is programmed. This step makes it easier to administer surveys that rely on methods like Computer-Assisted Personal Interviews (CAPI) or Computer-Assisted Field Entry (CAFE).
Also refer to these guidelines for coding in SurveyCTO.
Train enumerators and monitor data quality
After validating the programming of the questionnaire, researchers train enumerators and monitor data quality to generate a final draft of the instrument. Monitoring can be done through back checks, high frequency checks, and other methods.
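The kinds of checks mentioned above can be sketched as follows. This is an illustrative Python example rather than the DIME or SurveyCTO tooling: it assumes submissions are dictionaries, the field names (`hh_id`, `hh_size`, `consent`) are hypothetical, and the checks shown (duplicate IDs, missing values, and a comparison of back-check answers against the original interview) are only a small subset of what a real monitoring system would run.

```python
from collections import Counter


def duplicate_ids(submissions, id_field="hh_id"):
    """High frequency check: IDs that appear in more than one submission."""
    counts = Counter(s[id_field] for s in submissions)
    return sorted(i for i, c in counts.items() if c > 1)


def missing_rate(submissions, field):
    """High frequency check: share of submissions with no value for a field."""
    missing = sum(1 for s in submissions if s.get(field) in (None, ""))
    return missing / len(submissions)


def back_check_mismatches(submissions, back_checks, fields, id_field="hh_id"):
    """Back check: list answers that differ between the two visits."""
    # If an ID is duplicated, the last submission is used for comparison.
    original = {s[id_field]: s for s in submissions}
    mismatches = []
    for bc in back_checks:
        orig = original.get(bc[id_field])
        if orig is None:
            continue  # back check has no matching original interview
        for f in fields:
            if orig.get(f) != bc.get(f):
                mismatches.append((bc[id_field], f, orig.get(f), bc.get(f)))
    return mismatches


# Illustrative data only.
submissions = [
    {"hh_id": 1, "hh_size": 4, "consent": "yes"},
    {"hh_id": 2, "hh_size": None, "consent": "yes"},
    {"hh_id": 2, "hh_size": 6, "consent": "yes"},
    {"hh_id": 3, "hh_size": 5, "consent": ""},
]
back_checks = [{"hh_id": 1, "hh_size": 4}, {"hh_id": 3, "hh_size": 7}]

dups = duplicate_ids(submissions)
hh_size_missing = missing_rate(submissions, "hh_size")
mismatches = back_check_mismatches(submissions, back_checks, ["hh_size"])
```

Running such checks daily while the survey is in the field lets the team raise problems with enumerators before they affect many interviews.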
Maintain an organized data folder
DIME has created a Stata command, iefolder, which creates the DataWork folder. It helps increase project efficiency and reduces the risk of error in a study, which in turn increases the effectiveness of the study's final results.
- Brief from Oxfam: Planning Survey Research
- DIME Guide on Planning, Preparing & Monitoring Household Surveys
- DIME Analytics Guidelines on Preparing for Data Collection
- Guidelines and tools for Preparing for Data Collection from the World Bank's Results Based Financing Impact Evaluation Toolkit
- Oxfam provides a detailed case study of how to use electronic data collection (SurveyCTO) combined with Stata code to improve data quality in the field.