Difference between revisions of "SurveyCTO Random Draw of Beneficiaries 1"

Revision as of 18:42, 30 June 2022

Best Practice

This example comes from an agriculture survey in Brazil, where the survey firm did a poor job in listing the members of each association. We are looking to survey 8 people in each association, but we cannot be sure that each person on the list is actually a valid member. The pools are quite large (>100) and most are assumed to be valid IDs, but we still need to be careful that there are enough IDs chosen for the list.

We approach the problem in 2 stages:

  1. Randomly select enough IDs from the total to be almost certain that 8 will be valid (here we choose 25)
  2. Have the enumerator validate if these IDs are in fact proper members of the association. If not, more IDs are pulled from the randomized list to verify until 8 are selected for participation.

Ideally you should not need to select participants in this manner; you would have a sampling frame from which to select a sample before your survey teams go to the field to administer the household survey. But in this case it was not possible because of a problem with the listing process.

The approach works nicely for selecting a small number of people, but would be rather clunky if you wanted to select, say, 200 from 1,000 members. The main advantage over other methods is that you can easily preload the randomized numbers created in Stata or R to ensure replicability.
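As a rough illustration of preparing replicable random numbers outside the form, the sketch below builds one row of 25 seeded draws per association and serializes the rows as CSV. It is written in Python rather than Stata or R purely for illustration; the column names (association_id, rand1 to rand25), the seed value, and the helper names are assumptions, not part of the original form.

```python
import csv
import io
import random


def make_preload_rows(association_ids, n_draws=25, seed=20220630):
    """Return one dict per association holding n_draws seeded random numbers.

    The column names and the seed are hypothetical; a fixed seed means the
    same rows are produced on every run.
    """
    rng = random.Random(seed)
    rows = []
    for assoc in association_ids:
        row = {"association_id": assoc}
        for i in range(1, n_draws + 1):
            # Uniform draw in [0, 1), stored as text with six decimals.
            row[f"rand{i}"] = f"{rng.random():.6f}"
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text, ready to be saved as a preload file."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Because the generator is seeded, rerunning the script reproduces exactly the same draws, which is the replicability property described above.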

This approach can be altered or scaled for application in various situations. It's important to choose a number of draws that all but guarantees enough IDs remain once duplicates and invalid IDs have been dropped. If you are choosing from a smaller pool, or IDs are more likely to be dropped for not being valid, you should increase the number of draws and the number of validation questions. In this example we assume that we will be able to select 8 people from the 25 drawn IDs.
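One way to check whether a given number of draws is enough is a quick simulation. The sketch below is a minimal Python example, not part of the original form; it assumes draws are made with replacement from a pool of pool_size IDs and that each unique ID is valid independently with probability p_valid. Both are simplifying assumptions, and the function name and defaults are hypothetical.

```python
import random


def prob_enough_valid(pool_size, n_draws=25, target=8, p_valid=0.9,
                      n_sims=20000, seed=1):
    """Estimate the chance that n_draws yield at least `target` valid IDs.

    Assumes draws with replacement and independent validity per ID.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_sims):
        # Draw with replacement, then deduplicate by collecting into a set.
        unique_ids = {rng.randint(1, pool_size) for _ in range(n_draws)}
        # Each unique ID is treated as valid with probability p_valid.
        n_valid = sum(rng.random() < p_valid for _ in unique_ids)
        if n_valid >= target:
            successes += 1
    return successes / n_sims
```

Calling the function with a smaller pool_size or a lower p_valid shows how quickly the chance of reaching 8 valid IDs falls, which is the signal to increase the number of draws.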

Code Example

Here is the code example for a form that selects 8 members (IDs) from an association to participate in a survey round. The label column of the calculate fields also denotes what action is taking place on each line.

Stage 1 - Random draws

  • We first define 25 random numbers (0 to 1). In practice, random numbers should be preloaded into SurveyCTO forms rather than generated inside the form as they are here.
  • Each of these 25 random numbers is scaled to select an ID between 1 and N (with ${num} holding N), using the calculate field: round((${randX} * ${num}+.5), 0 )
  • We then concatenate all of these IDs so that the field ${randmems} has the structure of a select_multiple field and is a list of 25 numbers, e.g. "43 1 83 1 30 9 30 etc."
  • This list is then deduplicated, so that we are left with a list of up to 25 unique IDs to verify, e.g. "43 1 83 30 9 etc." This is a list of randomly selected participants, and the same technique can be applied in many different contexts.
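The Stage 1 steps can be sketched outside SurveyCTO as follows. This is a minimal Python illustration, not the form itself; it assumes that round() rounds halves up, and the function names are hypothetical. The string format mirrors the space-separated select_multiple structure of ${randmems}.

```python
import math
import random


def scale_to_id(rand_value, pool_size):
    """Mirror round(rand * N + 0.5, 0), assuming halves round up."""
    scaled = math.floor((rand_value * pool_size + 0.5) + 0.5)
    # Clamp to guard against floating-point edge cases just below 1.
    return min(pool_size, scaled)


def draw_ids(pool_size, n_draws=25, seed=None):
    """Return a space-separated string of n_draws IDs, duplicates included."""
    rng = random.Random(seed)
    ids = [scale_to_id(rng.random(), pool_size) for _ in range(n_draws)]
    return " ".join(str(i) for i in ids)


def dedup(randmems):
    """Drop repeated IDs while keeping the order of first appearance."""
    return " ".join(dict.fromkeys(randmems.split()))
```

For instance, dedup("43 1 83 1 30 9 30") returns "43 1 83 30 9", matching the example lists in the bullets above.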

Stage 2 - Verification of IDs

  • The fields ${idX} pull each of these IDs in order from the list. Don't forget that the first item is indexed at '0', not '1'. These fields are then set up as choice names.
  • Then we ask the enumerator to verify whether each of the first 8 IDs is a valid association member. If they all are, the survey finishes. If not, the survey asks about sets of 3 further IDs until there are at least 8 valid IDs.
  • The field ${idsX_missing} in each question holds the number of valid IDs still needed. When this reaches '0' or below, the form presents the final list.
  • Note that if no IDs are valid, the ${idsX} fields take the value of '0', which must be removed before concatenating the final list. This is done in the calculate fields ${idsX_final}.
  • To prepare the final list, we need to take the index numbers from the verification stage (the choice values) and pull the associated member ID from the deduplicated randomly selected ID list from the first stage. The calculate field: selected-at(${randmems_dedup}, selected-at(${concat_final}, X)-1) does this for each of the 8 final IDs.
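The Stage 2 logic can be sketched in the same way. The Python below is an illustration under assumptions, not the form itself: the is_valid callback stands in for the enumerator's answers, the choice values are assumed to be 1-based positions in the deduplicated list, and the helper names are hypothetical. The selected_at helper imitates the 0-based indexing of SurveyCTO's selected-at().

```python
def selected_at(space_list, index):
    """Return the item at a 0-based index of a space-separated list."""
    items = space_list.split()
    return items[index] if 0 <= index < len(items) else ""


def verify_ids(randmems_dedup, is_valid, target=8, first_batch=8, next_batch=3):
    """Check the first 8 IDs, then sets of 3, until `target` are valid."""
    n_ids = len(randmems_dedup.split())
    valid_positions = []  # 1-based positions, like the choice values
    shown = 0
    batch = first_batch
    while shown < n_ids and len(valid_positions) < target:
        last = min(shown + batch, n_ids)
        for pos in range(shown + 1, last + 1):
            if is_valid(selected_at(randmems_dedup, pos - 1)):
                valid_positions.append(pos)
        shown = last
        batch = next_batch
    # Keep at most `target` positions, then map each position back to its ID.
    concat_final = " ".join(str(p) for p in valid_positions[:target])
    n_final = len(concat_final.split())
    return [
        selected_at(randmems_dedup, int(selected_at(concat_final, x)) - 1)
        for x in range(n_final)
    ]
```

The last expression mirrors selected-at(${randmems_dedup}, selected-at(${concat_final}, X)-1): the inner lookup retrieves a 1-based position, and subtracting 1 converts it to the 0-based index that the outer lookup expects.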

Back to Parent

This article is part of the topic SurveyCTO Coding Practices