Crowd-sourced Data

Revision as of 11:23, 5 April 2018 by Admin

Modern ICT makes it possible to collect large amounts of data by outsourcing the task to a 'crowd'. Crowd-sourcing typically involves large numbers of untrained contributors, who provide the needed data points.

Read First

Crowd-sourcing has many applications (e.g. this wiki!); this article focuses on crowd-sourcing data collection for impact evaluations.


Advantages of Crowd-Sourced Data

  • Crowdsourced data is inexpensive, so the number of observations can be far larger than for a traditional survey. This can increase statistical power (though sampling issues need to be carefully considered).
  • Data can be gathered in real time.
  • Simple tasks can be cheaply outsourced.
  • Crowdsourcing can be more participatory than traditional data collection.
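
The link between the number of observations and statistical power can be illustrated with a rough calculation. The sketch below is only an illustration, not a DIME tool: it assumes a two-sided difference-in-means test with two equal-sized groups, a common standard deviation, and simple random sampling, which crowdsourced data often does not satisfy. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n_per_arm, sd, alpha=0.05, power=0.80):
    """Approximate minimum detectable effect (MDE) for a two-sided
    difference-in-means test with two equal arms and a common SD.

    Assumes simple random sampling; this is an illustrative assumption.
    """
    z = NormalDist()  # standard normal distribution
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * sd * sqrt(2 / n_per_arm)


# Hypothetical sample sizes: a small survey versus larger crowdsourced samples.
for n in (200, 2000, 20000):
    print(n, round(minimum_detectable_effect(n, sd=1.0), 3))
```

Under these assumptions the detectable effect shrinks with the square root of the sample size, so a hundredfold increase in observations reduces it about tenfold. A larger sample does not correct for a non-representative crowd.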

Pitfalls of Crowd-Sourced Data

Lessons DIME has learned from crowdsourcing:

  • Recruiting a large network of contributors is essential to crowdsourcing success. The potential for crowdsourcing is limited in rural areas by technology constraints and low levels of social media connectivity. There are examples of successful recruiting through Facebook and similar social media in South Asia and Latin America, but fewer in Africa.
  • Follow network growth carefully. Crowdsourcing requires a crowd, not a handful!
  • The reliability of crowdsourced data is often questioned because of the lack of an underlying sampling frame. Crowdsourcing may not be the right tool when rigorous sampling and data structure are required.
  • Make sure the technology is well tested. In one case, DIME took the promises of a Silicon Valley partner at face value, but the available version of their technology delivered less than hoped. In practice, it looked much like traditional enumeration: a few contributors filling out long mobile surveys with no training. This removed the advantage of multiple observations and triangulation that we had assumed. It also made the case for choosing this model, in which contributors have very little training, over traditional enumeration much less clear.
  • Stick with simple tasks. Instruments should not look like a typical questionnaire, with skip codes, relevancies, and constraints. Contributors will not have the training of typical enumerators.
  • Quantify trade-offs carefully. What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?
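
One rough way to quantify the last trade-off is to compare the precision each approach buys for the same budget. The sketch below assumes that the outcome is a simple mean, that measurement error is independent noise added to the true variation, and that every cost and standard deviation shown is a hypothetical placeholder rather than a DIME figure.

```python
from math import sqrt


def standard_error(budget, cost_per_obs, sd_true, sd_noise):
    """Standard error of a sample mean given a fixed budget.

    sd_true is the real variation in the outcome; sd_noise is extra
    measurement error (assumed independent of the true value).
    """
    n = int(budget // cost_per_obs)  # observations the budget can pay for
    if n < 1:
        raise ValueError("budget does not cover a single observation")
    return sqrt((sd_true ** 2 + sd_noise ** 2) / n)


# Hypothetical inputs: trained enumerators are costly but accurate,
# crowd contributors are cheap but noisier.
traditional = standard_error(budget=10_000, cost_per_obs=20.0, sd_true=1.0, sd_noise=0.2)
crowd = standard_error(budget=10_000, cost_per_obs=1.0, sd_true=1.0, sd_noise=1.5)

print(f"traditional enumeration SE: {traditional:.4f}")
print(f"crowdsourcing SE:           {crowd:.4f}")
```

With these made-up numbers the cheaper, noisier observations still give a smaller standard error, but the result depends entirely on the inputs. The calculation also covers random error only: bias from the lack of a sampling frame does not shrink as the number of observations grows.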

Back to Parent

This article is part of the topic Secondary Data Sources

Additional Resources