Crowdsourced data collection is a participatory method of building a dataset with the help of a large group of people. This page provides a brief overview of crowdsourced data collection in development and highlights points to consider when crowdsourcing data. Crowdsourced data is a form of secondary data.
- Secondary Data refers to data that is collected by any party other than the researcher. Secondary data provides important context for any investigation into a policy intervention.
- When crowdsourcing data, researchers collect plentiful, valuable, and dispersed data at a cost typically lower than that of traditional data collection methods.
- Consider the trade-offs between sample size and sampling issues before deciding to crowdsource data.
- Ensuring data quality means making sure the platform on which you are collecting crowdsourced data is well-tested.
Crowdsourced data collection allows researchers to outsource simple tasks or questionnaires, gather data in real time, and, thanks to its relatively low cost, obtain far more numerous and geographically widespread observations than traditional data collection.
Notably, crowdsourced data collection lets researchers reach people and places that would otherwise be difficult to access, yielding insight into local markets, events, and prices. Researchers may crowdsource data collection via a number of platforms, including mobile apps and internet marketplaces like Amazon Mechanical Turk.
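As a concrete illustration, posting a simple one-question task to Amazon Mechanical Turk can be scripted with the boto3 library. The sketch below builds the parameters for a single price-reporting "HIT"; the task text, reward, and assignment counts are illustrative assumptions, not recommendations.

```python
def build_price_hit(product, reward_usd="0.05", max_assignments=30):
    """Build the keyword arguments for mturk.create_hit() for a
    one-question price-reporting task (illustrative values)."""
    question_xml = f"""<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionIdentifier>price</QuestionIdentifier>
    <QuestionContent><Text>What price did you last pay for {product}?</Text></QuestionContent>
    <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
  </Question>
</QuestionForm>"""
    return {
        "Title": f"Report the local price of {product}",
        "Description": "A one-question task: report a price you observed.",
        "Reward": reward_usd,                # paid per completed assignment (USD, as a string)
        "MaxAssignments": max_assignments,   # how many contributors answer this task
        "AssignmentDurationInSeconds": 300,  # time a contributor has to answer
        "LifetimeInSeconds": 86400,          # how long the task stays available
        "Question": question_xml,
    }

# To actually post the task (requires AWS credentials and boto3):
#   import boto3
#   mturk = boto3.client("mturk",
#       endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com")
#   mturk.create_hit(**build_price_hit("1 kg of rice"))
```

Note the deliberately flat structure: a single free-text question with no skip logic, in line with the guidance below on keeping contributor tasks simple.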
Considerations when crowdsourcing data
- Ensure a large network of contributors. This is essential to crowdsourcing success. If collecting geographically specific data, keep in mind that the potential for crowdsourcing is limited in rural areas due to technology constraints and low levels of connectivity.
- Follow network growth carefully. Crowdsourcing requires a crowd, not a handful!
- Consider the trade-offs between sample size and sampling issues. The reliability of crowdsourced data is often questioned because of the lack of an underlying sampling frame. Crowdsourcing may not be the right tool if you need rigorous sampling and data structure.
- Request simple tasks from contributors. The instruments used in crowdsourced data collection should not look like traditional questionnaires, with skip codes, relevance conditions, and constraints. Remember that contributors will not have the training of typical enumerators.
- Ensure that the platform on which you are collecting crowdsourced data is well-tested. One research team, for example, took the promises of a Silicon Valley partner at face value, only to find that the available version of the technology delivered less than hoped.
- Quantify trade-offs carefully. What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?
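The sample-size versus sampling-frame trade-off can be made concrete with a small simulation. In this sketch (standard library only; the population values and the bias mechanism are illustrative assumptions), a large crowdsourced sample that over-represents better-connected, higher-price households ends up further from the truth than a much smaller probability sample.

```python
import random
import statistics

random.seed(42)

# "True" population: prices paid by 100,000 households.
population = [random.gauss(100, 20) for _ in range(100_000)]
true_mean = statistics.mean(population)

# Small probability sample: 200 households drawn from a proper sampling frame.
random_sample = random.sample(population, 200)

# Large crowdsourced sample: 5,000 contributions, but households with higher
# prices (a stand-in for better-connected respondents) are over-represented.
crowd_sample = random.choices(population,
                              weights=[max(v, 0.0) for v in population],
                              k=5_000)

print(f"true mean:       {true_mean:.1f}")
print(f"random (n=200):  {statistics.mean(random_sample):.1f}")
print(f"crowd (n=5000):  {statistics.mean(crowd_sample):.1f}")
```

The crowd sample's error is bias, so it does not shrink as more contributions arrive; the small random sample's error is noise, which does.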
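One way to start quantifying the trade-off is a back-of-the-envelope precision comparison: how many observations does a fixed budget buy under each method, and what standard error does that imply? All numbers below are illustrative assumptions, not benchmarks, and the comparison deliberately covers only noise, not the sampling bias discussed above.

```python
import math

BUDGET = 5_000.0          # total data-collection budget (USD); assumed
COST_ENUMERATOR = 10.0    # cost per observation with trained enumerators; assumed
COST_CROWD = 0.25         # cost per crowdsourced observation; assumed
SD_ENUMERATOR = 15.0      # outcome standard deviation, enumerator data; assumed
SD_CROWD = 25.0           # noisier responses from untrained contributors; assumed

def standard_error(budget, unit_cost, sd):
    """Standard error of a mean given how many observations the budget buys."""
    n = int(budget // unit_cost)
    return sd / math.sqrt(n), n

se_enum, n_enum = standard_error(BUDGET, COST_ENUMERATOR, SD_ENUMERATOR)
se_crowd, n_crowd = standard_error(BUDGET, COST_CROWD, SD_CROWD)

print(f"enumerators:  n={n_enum:>6}, SE={se_enum:.2f}")
print(f"crowdsourced: n={n_crowd:>6}, SE={se_crowd:.2f}")
```

Under these assumptions the crowdsourced sample wins on noise alone; whether it wins overall depends on the quality and bias concerns in the bullets above.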
Additional resources
- Hunt and Specht’s Crowdsourced Mapping in Crisis Zones: Collaboration, Organisation and Impact
- Bott, Gigler and Young's The Role of Crowdsourcing for Better Governance in Fragile State Contexts
- Komarov, Reinecke and Gajos’ Crowdsourcing Performance Evaluations of User Interfaces tests whether Amazon Mechanical Turk results differ from traditional questionnaire results
- In a DAI blogpost, Kelsey Stern Buchbinder explains the use of crowdsourced data in development and its role in providing on-the-ground insights