Difference between revisions of "Crowd-sourced Data"

 
(6 intermediate revisions by 2 users not shown)
Line 1: Line 1:
'''Crowdsourced data''' collection is a participatory method of building a [[ Master Dataset | dataset]] with the help of a large group of people. This page provides a brief overview of crowdsourced data collection in development and highlights points to consider when crowdsourcing data. '''Crowdsourced Data''' is a form of [[Secondary Data Sources | secondary data]].  
'''Crowdsourced data''' collection is a participatory method of building a [[ Master Dataset | dataset]] with the help of a large group of people. This page provides a brief overview of '''crowdsourced''' data collection in development and highlights points to consider when crowdsourcing data. '''Crowdsourced Data''' is a form of [[Secondary Data Sources | secondary data]].  


== Read First ==
== Read First ==
* [[Secondary Data Sources | Secondary Data]] refers to data that is collected by any party other than the researcher. ''' Secondary data''' provides important context for any investigation into a policy intervention.  
* [[Secondary Data Sources | Secondary Data]] refers to data that is collected by any party other than the researcher. ''' Secondary data''' provides important context for any investigation into a policy intervention.  
*'''Crowdsourced data collection.''' Researchers collect plentiful, valuable, and dispersed data at a cost typically lower than that of traditional data collection methods.
*When '''crowdsourcing data''', researchers collect plentiful, valuable, and dispersed data at a cost typically lower than that of traditional [[Primary Data Collection|data collection]] methods.
*'''Sampling issues.''' Consider the trade-offs between sample size and [[Sampling | sampling]] issues before deciding to crowdsource data.  
*Consider the trade-offs between sample size and [[Sampling | sampling]] issues before deciding to '''crowdsource data'''.  
* [[Data Quality Assurance Plan | Data Quality]]. Make sure the platform on which you are collecting crowdsourced data is well-tested.
* Ensuring [[Data Quality Assurance Plan | data quality]] means making sure the platform on which you are collecting '''crowdsourced data''' is well-tested.
 
== Overview ==
== Overview ==


'''Crowdsourced data''' collection allows researchers to cheaply outsource simple tasks or questionnaires, gather data in real-time, and obtain far more numerous and widespread observations than in traditional data collection given its relatively low cost. Notably, crowdsourced data collection allows researchers to more easily reach people and places, giving researchers insight into [https://mdp.berkeley.edu/data-crowdsourcing-the-gap-between-ideation-and-implementation/ local markets], [http://www.lse.ac.uk/international-development/conflict-and-civil-society/current-projects/crowdsourcing-conflict-and-peace-events-in-the-syrian-conflict events], or even [https://www.technologyreview.com/s/520151/crowdsourcing-mobile-app-takes-the-globes-economic-pulse/ prices]. Researchers may crowdsource data collection via a number of platforms including mobile apps or internet marketplaces like [https://www.mturk.com/ Amazon Mechanical Turk].
Because it is relatively inexpensive, '''crowdsourced data''' collection allows researchers to outsource simple tasks or [[Questionnaire Design | questionnaires]], gather data in real time, and obtain far more numerous and widespread observations than traditional [[Primary Data Collection|data collection]].
 
Notably, '''crowdsourced data collection''' allows researchers to more easily reach people and places, giving researchers insight into [https://mdp.berkeley.edu/data-crowdsourcing-the-gap-between-ideation-and-implementation/ local markets], [http://www.lse.ac.uk/international-development/conflict-and-civil-society/current-projects/crowdsourcing-conflict-and-peace-events-in-the-syrian-conflict events], or even [https://www.technologyreview.com/s/520151/crowdsourcing-mobile-app-takes-the-globes-economic-pulse/ prices]. Researchers may '''crowdsource data collection''' via a number of platforms including mobile apps or internet marketplaces like [https://www.mturk.com/ Amazon Mechanical Turk].


== Considerations when crowdsourcing data ==
== Considerations when crowdsourcing data ==
* Ensure a large network of contributors: this is essential to crowdsourcing success. If collecting geographically specific data, keep in mind that the potential for crowdsourcing is limited in rural areas due to technology constraints and low levels of connectivity.  
* '''Ensure a large network of contributors.''' This is essential to crowdsourcing success. If collecting geographically specific data, keep in mind that the potential for crowdsourcing is limited in rural areas due to technology constraints and low levels of connectivity.  
* Follow network growth carefully. Crowdsourcing requires a crowd, not a handful!
* '''Follow network growth carefully.''' Crowdsourcing requires a [[Sampling | crowd]], not a handful!
* Consider the trade-offs between sample size and [[Sampling | sampling issues]]. The reliability of crowdsourcing data is often questioned because of the lack of an underlying sampling frame. Crowdsourcing may not be the right tool if you need rigorous sampling and data structure.
* '''Consider the trade-offs between sample size and [[Sampling | sampling issues]].''' The reliability of crowdsourced data is often questioned because of the lack of an underlying [[Sampling#Establish the Sampling Frame and Master Dataset|sampling frame]]. Crowdsourcing may not be the right tool if you need rigorous '''sampling''' and data structure.
* Request simple tasks from contributors. The instruments used in crowdsourced data collection should not look like traditional [[Questionnaire Design | questionnaires]] that includes skip codes, relevancies, constraints. Remember that contributors will not have the training of typical [[Enumerator Training | enumerators]].  
* '''Request simple tasks from contributors.''' The instruments used in crowdsourced data collection should not look like traditional [[Questionnaire Design | questionnaires]] that include skip codes, relevancies, and constraints. Remember that contributors will not have the training of typical [[Enumerator Training | enumerators]].
* Ensure that the platform on which you are collecting crowdsourced data is well-tested: in one case, DIME [https://blogs.worldbank.org/impactevaluations/lessons-crowdsourcing-failure took the promises] of a Silicon Valley partner at face value -- but the available version of their technology delivered less than hoped.
* '''Ensure that the platform on which you are collecting crowdsourced data is well-tested.''' In one case, DIME [https://blogs.worldbank.org/impactevaluations/lessons-crowdsourcing-failure took the promises] of a Silicon Valley partner at face value, but the available version of their technology delivered less than hoped.
* Quantify trade-offs carefully. What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?
* '''Quantify trade-offs carefully.''' What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?


== Related Pages ==
== Related Pages ==
This article is part of the topic [[Secondary Data Sources]]
[[Special:WhatLinksHere/Crowd_sourced_Data|Click here for pages that link to this topic]].


== Additional Resources ==
== Additional Resources ==

Latest revision as of 16:37, 7 August 2023

Crowdsourced data collection is a participatory method of building a dataset with the help of a large group of people. This page provides a brief overview of crowdsourced data collection in development and highlights points to consider when crowdsourcing data. Crowdsourced Data is a form of secondary data.

Read First

  • Secondary Data refers to data that is collected by any party other than the researcher. Secondary data provides important context for any investigation into a policy intervention.
  • When crowdsourcing data, researchers collect plentiful, valuable, and dispersed data at a cost typically lower than that of traditional data collection methods.
  • Consider the trade-offs between sample size and sampling issues before deciding to crowdsource data.
  • Ensuring data quality means making sure the platform on which you are collecting crowdsourced data is well-tested.

Overview

Because it is relatively inexpensive, crowdsourced data collection allows researchers to outsource simple tasks or questionnaires, gather data in real time, and obtain far more numerous and widespread observations than traditional data collection.

Notably, crowdsourced data collection allows researchers to more easily reach people and places, giving researchers insight into local markets, events, or even prices. Researchers may crowdsource data collection via a number of platforms including mobile apps or internet marketplaces like Amazon Mechanical Turk.

Considerations when crowdsourcing data

  • Ensure a large network of contributors. This is essential to crowdsourcing success. If collecting geographically specific data, keep in mind that the potential for crowdsourcing is limited in rural areas due to technology constraints and low levels of connectivity.
  • Follow network growth carefully. Crowdsourcing requires a crowd, not a handful!
  • Consider the trade-offs between sample size and sampling issues. The reliability of crowdsourced data is often questioned because of the lack of an underlying sampling frame. Crowdsourcing may not be the right tool if you need rigorous sampling and data structure.
  • Request simple tasks from contributors. The instruments used in crowdsourced data collection should not look like traditional questionnaires that include skip codes, relevancies, and constraints. Remember that contributors will not have the training of typical enumerators.
  • Ensure that the platform on which you are collecting crowdsourced data is well-tested. In one case, DIME took the promises of a Silicon Valley partner at face value, but the available version of their technology delivered less than hoped.
  • Quantify trade-offs carefully. What are the cost savings compared to traditional enumeration? Will they offset losses in precision or quality?
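
One way to make the last consideration concrete is to put cost and accuracy side by side. The sketch below is only an illustration under stated assumptions: every number in it is hypothetical, the Design class and its fields are invented for this example, and accuracy is summarized as the root-mean-squared error of a sample mean, where the bias term stands in for the error introduced by collecting data without a sampling frame.

```python
import math
from dataclasses import dataclass


@dataclass
class Design:
    """A data collection design described by hypothetical parameters."""
    name: str
    cost_per_obs: float  # assumed cost of one observation
    n_obs: int           # number of observations collected
    bias: float          # assumed systematic error from non-random sampling

    def total_cost(self) -> float:
        return self.cost_per_obs * self.n_obs

    def rmse(self, sd: float) -> float:
        # RMSE of a sample mean: sqrt(bias^2 + variance / n)
        return math.sqrt(self.bias ** 2 + sd ** 2 / self.n_obs)


# All values below are made up for illustration only.
sd = 10.0  # assumed standard deviation of the outcome
survey = Design("traditional survey", cost_per_obs=20.0, n_obs=400, bias=0.0)
crowd = Design("crowdsourced", cost_per_obs=1.0, n_obs=4000, bias=1.0)

for design in (survey, crowd):
    print(f"{design.name}: cost={design.total_cost():.0f}, "
          f"RMSE={design.rmse(sd):.3f}")
```

With these made-up inputs the crowdsourced design costs half as much but has roughly twice the error, because adding observations shrinks the variance term and leaves the bias untouched. Changing the assumed bias or costs can reverse the comparison, which is why the trade-off should be quantified for each project.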

Related Pages

Click here for pages that link to this topic.

Additional Resources