Crowd-sourced Data

Crowdsourced data collection is a participatory method of building a dataset with the help of a large group of people. This page provides a brief overview of crowdsourced data collection in development and highlights points to consider when crowdsourcing data.

Read First

  • Through crowdsourced data collection, researchers can collect plentiful, valuable, and geographically dispersed data at a cost typically lower than that of traditional data collection methods.
  • Crowdsourced data may introduce sampling issues. Consider the trade-offs between sample size and sampling issues before deciding to crowdsource data.
  • Make sure the platform on which you are collecting crowdsourced data is well-tested.


Crowdsourced data collection allows researchers to cheaply outsource simple tasks or questionnaires, gather data in real time, and, because each observation costs relatively little, obtain far more numerous and widespread observations than traditional data collection allows. Notably, crowdsourced data collection makes it easier to reach more people and places, giving researchers insight into local markets, events, or even prices. Researchers may crowdsource data collection via a number of platforms, including mobile apps and internet marketplaces like Amazon Mechanical Turk.

Considerations when crowdsourcing data

  • Ensure a large network of contributors: this is essential to crowdsourcing success. If collecting geographically specific data, keep in mind that the potential for crowdsourcing is limited in rural areas due to technology constraints and low levels of connectivity.
  • Follow network growth carefully. Crowdsourcing requires a crowd, not a handful!
  • Consider the trade-offs between sample size and sampling issues. The reliability of crowdsourced data is often questioned because of the lack of an underlying sampling frame. Crowdsourcing may not be the right tool if you need rigorous sampling and data structure.
  • Request simple tasks from contributors. The instruments used in crowdsourced data collection should not look like traditional questionnaires that include skip codes, relevance conditions, and constraints. Remember that contributors will not have the training of typical enumerators.
  • Ensure that the platform on which you are collecting crowdsourced data is well-tested. In one case, DIME took the promises of a Silicon Valley partner at face value, but the available version of their technology delivered less than hoped.
  • Quantify trade-offs carefully. What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?
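
One rough way to quantify the trade-off between cost and precision is to compare the expected mean squared error (squared bias plus sampling variance) of an estimated average under each approach for a fixed budget. The sketch below is illustrative only. The budget, per-observation costs, standard deviation, and bias are hypothetical numbers, and it assumes independent observations with the same spread under both methods, which a real crowdsourced sample may not satisfy.

```python
def observations_affordable(budget, cost_per_observation):
    """Number of whole observations a fixed budget can pay for."""
    return int(budget // cost_per_observation)


def mean_squared_error(bias, std_dev, n):
    """MSE of a sample mean: squared bias plus sampling variance (std_dev^2 / n)."""
    return bias ** 2 + std_dev ** 2 / n


# All values below are hypothetical, chosen only to illustrate the comparison.
budget = 10_000.0   # total data collection budget
std_dev = 20.0      # spread of the outcome, e.g. a local price

# Traditional enumeration: expensive per observation, assumed unbiased.
n_traditional = observations_affordable(budget, cost_per_observation=25.0)
mse_traditional = mean_squared_error(bias=0.0, std_dev=std_dev, n=n_traditional)

# Crowdsourcing: cheap per observation, but with an assumed bias of 2.0
# because there is no underlying sampling frame.
n_crowd = observations_affordable(budget, cost_per_observation=1.0)
mse_crowd = mean_squared_error(bias=2.0, std_dev=std_dev, n=n_crowd)

print(f"Traditional: n={n_traditional}, MSE={mse_traditional:.2f}")
print(f"Crowdsourced: n={n_crowd}, MSE={mse_crowd:.2f}")
```

With these made-up inputs, the crowdsourced sample is 25 times larger, yet its mean squared error is higher, because additional observations shrink sampling variance but do nothing to reduce bias. With a smaller assumed bias, the comparison would favor crowdsourcing, so the result depends entirely on how large the sampling problem is in a given setting.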

Back to Parent

This article is part of the topic Secondary Data Sources

Additional Resources