Difference between revisions of "Primary Data Collection"

Line 12: Line 12:


Before moving on to the discussion of concerns about ownership and handling, however, it is important to understand the process of '''collecting primary data'''. The process of '''primary data collection''' consists of several steps, from [[Questionnaire Design|questionnaire development]] to [[Enumerator Training|enumerator training]]. Each of these steps is listed below, and each requires detailed [[Field Management|planning]] and coordination among the members of the '''research team'''.
=== Acquire approval from human subjects ===
There are strict rules about [[Human Subjects Approval | acquiring approval from human subjects]]. Researchers must understand the [[Research Ethics|ethics]] and rules for [[Data Security|security of sensitive data]], and should use proper tools for [[Encryption | encryption]] and [[De-identification | de-identification]] of [[Personally Identifiable Information_(PII)|personally identifiable information (PII)]].


=== Compile the survey budget ===
== Develop Questionnaire ==
The '''first stage''' of the [[Survey Pilot|survey pilot]] allows researchers to develop a [[Questionnaire_Design|design]] for the instrument. The researchers then conduct the '''second stage''' of the survey pilot, called [[Piloting_Survey_Content|content-focused pilot]], to review and refine the structure of the instrument.
 
== Pilot Questionnaire ==
The '''first stage''' of the [[Survey Pilot|survey pilot]], the '''pre-pilot''', involves two things: [[Piloting Survey Content|piloting content]] and [[Piloting Survey Protocols|piloting protocols]]. Clear protocols allow researchers to ensure that [[Preparing for Field Data Collection|field collection]] is carried out consistently across teams and/or regions, and that published [[Reproducible_Research|research is reproducible]].
== Pilot Recruitment Strategy ==
== TOR and Procurement ==
Researchers must prepare a [[Survey Budget | survey budget]] before [[Procuring a Survey Firm|procuring a survey firm]]. This step allows researchers to calculate expected costs of conducting a study, and compare these with the proposals of firms that submit an '''expression of interest (EOI)'''.


Line 24: Line 28:
The next step is to [[Procuring a Survey Firm|procure a survey firm]] after issuing detailed [[Survey Firm TOR|terms of reference (TOR)]], and performing due diligence among local research firm options.


=== Carry out a pre-pilot===
== Data Quality Assurance Plan ==
The '''first stage''' of the [[Survey Pilot|survey pilot]], the '''pre-pilot''', involves two things: [[Piloting Survey Content|piloting content]] and [[Piloting Survey Protocols|piloting protocols]]. Clear protocols allow researchers to ensure that [[Preparing for Field Data Collection|field collection]] is carried out consistently across teams and/or regions, and that published [[Reproducible_Research|research is reproducible]].
== Obtain Ethical Approval ==  
 
There are strict rules about [[Human Subjects Approval | acquiring approval from human subjects]]. Researchers must understand the [[Research Ethics|ethics]] and rules for [[Data Security|security of sensitive data]], and should use proper tools for [[Encryption | encryption]] and [[De-identification | de-identification]] of [[Personally Identifiable Information_(PII)|personally identifiable information (PII)]].
=== Refine and review the survey design ===
The '''first stage''' of the [[Survey Pilot|survey pilot]] allows researchers to develop a [[Questionnaire_Design|design]] for the instrument. The researchers then conduct the '''second stage''' of the survey pilot, called [[Piloting_Survey_Content|content-focused pilot]], to review and refine the structure of the instrument.
 
=== Translate the survey instrument ===
After the content-focused pilot, the research firm [[Questionnaire_Translation|translates the instrument]] into all local languages. This step helps to ensure that the survey can be taken by more people, therefore making the study more effective.
 
=== Program the instrument ===
After obtaining [[IRB Approval|IRB approval]], researchers [[Questionnaire Programming|program the questionnaire]]. This step makes it easier to share surveys that rely on methods like [[Computer-Assisted Personal Interviews (CAPI)]] or [[Computer-Assisted Field Entry (CAFE)]].
<br/> Also refer to [[SurveyCTO_Coding_Practices|SurveyCTO coding practices]] to learn more about programming surveys.


=== Train enumerators and monitor data quality ===
== Train Enumerators ==  
After validating the programming of the questionnaire, the researchers [[Enumerator Training | train enumerators]] and [[Monitoring_Data_Quality|monitor data quality]] to generate a '''final draft''' of the instrument. '''Monitoring''' can be done in the form of [[Back_Checks|back checks]], [[Monitoring Data Quality#High Frequency Checks|high frequency checks]], as well as other methods.


=== Maintain  an organized data folder ===
DIME Analytics has created a Stata command, <code>[[iefolder]]</code>. Part of the DIME Analytics Stata package <code>[[ietoolkit]]</code>, it helps increase project efficiency and reduces the risk of error in a study.



Revision as of 23:24, 25 May 2020

Primary data collection is the process of gathering data through surveys, interviews, or experiments. Household surveys are a typical example of primary data. In this form of data collection, researchers can personally ensure that primary data meets the standards of quality, availability, statistical power, and sampling required for a particular research question. With globally increasing access to specialized survey tools, survey firms, and field manuals, primary data has become the dominant source for empirical inquiry in development economics.

Read First

Overview

While impact evaluations often benefit from secondary sources of data like administrative data, census data, or household data, these sources may not always be available. In such cases, the research team will need to collect data directly using well-designed interviews and surveys, and the research team typically owns the data that it collects. Even then, however, the research team must keep in mind certain ethical concerns related to owning and handling sensitive or personally identifiable information (PII).

Before moving on to the discussion of concerns about ownership and handling, however, it is important to understand the process of collecting primary data. The process of primary data collection consists of several steps, from questionnaire development to enumerator training. Each of these steps is listed below, and each requires detailed planning and coordination among the members of the research team.

Develop Questionnaire

The first stage of the survey pilot allows researchers to develop a design for the instrument. The researchers then conduct the second stage of the survey pilot, called content-focused pilot, to review and refine the structure of the instrument.

Pilot Questionnaire

The first stage of the survey pilot, the pre-pilot, involves two things: piloting content and piloting protocols. Clear protocols allow researchers to ensure that field collection is carried out consistently across teams and/or regions, and that published research is reproducible.

Pilot Recruitment Strategy

TOR and Procurement

Researchers must prepare a survey budget before procuring a survey firm. This step allows researchers to calculate expected costs of conducting a study, and compare these with the proposals of firms that submit an expression of interest (EOI).
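
As a rough illustration of comparing an in-house cost estimate with firm proposals, the following Python sketch may help. The cost components, function names, and all figures are hypothetical and are not taken from any DIME budget template; a real survey budget would include many more line items.

```python
import math


def expected_survey_cost(n_interviews, interviews_per_enumerator_day,
                         enumerator_day_rate, fixed_costs):
    """Rough expected cost: enumerator days times a day rate, plus fixed costs.

    All inputs are illustrative assumptions, not a standard budget formula.
    """
    enumerator_days = math.ceil(n_interviews / interviews_per_enumerator_day)
    return enumerator_days * enumerator_day_rate + fixed_costs


def compare_proposals(budget, proposals):
    """Return each firm's proposed price as a percent deviation from the budget."""
    return {
        firm: round((price - budget) / budget * 100, 1)
        for firm, price in proposals.items()
    }


# Hypothetical example: 1,000 interviews, 4 per enumerator-day.
budget = expected_survey_cost(1000, 4, 30.0, 5000.0)
deviations = compare_proposals(budget, {"Firm A": 15000.0, "Firm B": 11250.0})
```

A positive deviation means the proposal is above the team's own estimate. Large gaps in either direction are a prompt to check the assumptions behind both figures.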

Determine relevant parameters of a study

After agreeing upon a budget, researchers decide on parameters such as the sampling frame (a list of individuals or units in a population from which a sample can be drawn), the sample size, and the statistical power, on the basis of which they can then randomize treatment.
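
To make the link between sample size and statistical power concrete, here is a minimal Python sketch of the textbook normal-approximation formula for comparing the means of two equal-sized arms under individual-level randomization. It is an assumption-laden illustration (no clustering, no covariates, no attrition), and the function name is hypothetical; it is not the procedure prescribed by this page.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(mde, sd, alpha=0.05, power=0.80):
    """Units needed in EACH arm to detect a difference in means of `mde`.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / mde)^2, which assumes
    a two-sided test, equal arm sizes, a common standard deviation `sd`,
    and individual (not clustered) randomization.
    """
    if mde <= 0 or sd <= 0:
        raise ValueError("mde and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / mde) ** 2
    return math.ceil(n)


# Detecting an effect of 0.2 standard deviations at 5% significance, 80% power.
n_per_arm = sample_size_per_arm(mde=0.2, sd=1.0)
```

Halving the minimum detectable effect roughly quadruples the required sample, which is why these parameters are settled together with the budget.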

Procure a survey firm

The next step is to procure a survey firm after issuing detailed terms of reference (TOR), and performing due diligence among local research firm options.

Data Quality Assurance Plan

Obtain Ethical Approval

There are strict rules about acquiring approval from human subjects. Researchers must understand the ethics and rules for security of sensitive data, and should use proper tools for encryption and de-identification of personally identifiable information (PII).
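
One common de-identification step is to drop direct identifiers from the analysis data and replace them with a pseudonymous ID. The Python sketch below illustrates the idea with a keyed hash from the standard library. The field names, the `anon_id` column, and the choice of HMAC-SHA256 are assumptions made for illustration; this is not a substitute for the encryption tools or the approved data security protocol of a given study.

```python
import hashlib
import hmac

# Hypothetical list of direct identifiers; a real study defines its own.
PII_FIELDS = {"name", "phone"}


def pseudonym(respondent_key, secret):
    """Keyed hash of the identifying values, truncated to 16 hex characters."""
    digest = hmac.new(secret, respondent_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret):
    """Return a copy of `record` without PII fields, plus a stable anon_id."""
    key_source = "|".join(str(record.get(f, "")) for f in sorted(PII_FIELDS))
    cleaned = {k: v for k, v in record.items() if k not in PII_FIELDS}
    cleaned["anon_id"] = pseudonym(key_source, secret)
    return cleaned


# The secret key must be stored securely and separately from the data.
row = {"name": "Example Person", "phone": "000-0000", "age": 30}
clean_row = deidentify(row, secret=b"replace-with-a-real-secret")
```

The same respondent always maps to the same `anon_id` under a given key, so de-identified rounds can still be merged, while the original identifiers stay in a separate, encrypted file.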

Train Enumerators

After validating the programming of the questionnaire, the researchers train enumerators and monitor data quality to generate a final draft of the instrument. Monitoring can be done in the form of back checks, high frequency checks, as well as other methods.
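
The monitoring methods mentioned above can be sketched in a few lines of Python. The example below shows two simple high frequency checks (duplicate respondent IDs and implausibly short interviews) and a back check mismatch rate. The field names `respondent_id` and `duration_min`, the function names, and any thresholds are hypothetical; actual checks depend on the instrument and the study's quality assurance plan.

```python
from collections import Counter


def duplicate_ids(submissions):
    """High frequency check: respondent IDs that appear more than once."""
    counts = Counter(s["respondent_id"] for s in submissions)
    return sorted(rid for rid, c in counts.items() if c > 1)


def too_short(submissions, min_minutes):
    """High frequency check: interviews shorter than a plausible minimum."""
    return [s["respondent_id"] for s in submissions
            if s["duration_min"] < min_minutes]


def back_check_mismatch_rate(original, recheck, fields):
    """Back check: share of re-asked answers that differ from the original.

    `original` and `recheck` map respondent_id -> dict of answers.
    Respondents missing from `original` are skipped.
    """
    compared = mismatched = 0
    for rid, re_answers in recheck.items():
        if rid not in original:
            continue
        for field in fields:
            compared += 1
            if original[rid].get(field) != re_answers.get(field):
                mismatched += 1
    return mismatched / compared if compared else 0.0
```

Running such checks on each day's submissions, rather than at the end of fieldwork, lets the team correct problems while enumerators are still in the field.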

DIME Analytics has created a Stata command, iefolder, which sets up a standardized project folder structure. Part of the DIME Analytics Stata package ietoolkit, it helps increase project efficiency and reduces the risk of error in a study.

Related Pages

Click here for pages that link to this topic.

Additional Resources