Iedropone


This article describes the use cases, workflow, and reasoning behind the command. For instructions on how to use the command in Stata and for a complete list of the available options, see the help file by typing help iedropone in Stata.

Read First

  • iedropone drops observations only when the number of matching observations is exactly the number you expect, so that no additional observations are dropped by mistake.
  • This command is a part of the package ietoolkit. To install all the commands in this package, type ssc install ietoolkit in Stata.

Intended use cases

It is common that observations need to be dropped when cleaning a dataset. For example, we might know that an interview was done incorrectly and that the data for that observation needs to be dropped. Or perhaps a whole village should be dropped. It is important that these observations are dropped so that they do not introduce error in the analysis. It is equally important that we do not delete more observations than intended. When we first write drop if HHID == 123456, we can easily check that Stata deletes exactly one observation. And if we want to delete all observations from one village and we know that there are 12 observations in that village, we can write drop if village_code == 123 and check that exactly 12 observations are deleted.
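The manual check described above can be sketched in plain Stata as follows. The dataset is made up for illustration: the 15 observations and the HHID values are assumptions, and only the variable names village_code and HHID come from the text.

```stata
* Hypothetical data: 12 households in village 123 and 3 in village 456
clear
set obs 15
gen village_code = cond(_n <= 12, 123, 456)
gen HHID = 100000 + _n

* Count the matching observations before dropping them
count if village_code == 123
assert r(N) == 12

* Drop the village and confirm that only the 3 other households remain
drop if village_code == 123
assert _N == 3
```

This check works, but it has to be written and maintained by hand for every drop statement, and it is easy to forget once the code has been run successfully the first time.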

However, the data can change during the cleaning process, especially if data collection is still ongoing. That means that more observations than intended might be deleted, or that observations that are supposed to be deleted are not. Let's say that someone changes all village codes of 123 to missing because the code is incorrect. If that change happens before the code that drops that village, then those twelve observations are no longer deleted. In these examples, where we delete based on ID information, we are likely to catch the mistake eventually, but perhaps not before some damage is done. And when we delete observations without having clear IDs available, this might be an issue that we never catch. This is where iedropone comes in. It tests that exactly one observation is dropped if no number is specified, and it can be set to test for any other number of observations. If the number of observations that would be dropped differs from the expected number, then iedropone returns an error and you get a chance to investigate why the number has changed.
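A minimal sketch of this behavior, assuming ietoolkit is installed and using the numobs() option as documented in help iedropone. The data are the same made-up 15 observations as before, and the recoding to missing is simulated inside preserve and restore so that it does not affect the rest of the example.

```stata
* Requires ietoolkit (ssc install ietoolkit)
* Hypothetical data: 12 households in village 123 and 3 in village 456
clear
set obs 15
gen village_code = cond(_n <= 12, 123, 456)
gen HHID = 100000 + _n

* Simulate an upstream change: village code 123 is recoded to missing
preserve
    replace village_code = . if village_code == 123
    * No observation matches any longer, so iedropone is expected to stop with an error
    capture noisily iedropone if village_code == 123, numobs(12)
    display "return code: " _rc
restore

* On the unchanged data both drops go through
* Default: exactly one observation must match the condition
iedropone if HHID == 100015
* numobs(12): exactly 12 observations must match the condition
iedropone if village_code == 123, numobs(12)
```

In a real do-file the command would be run without capture, so that the error stops the run and prompts an investigation.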

Intended Work Flow

Simply replace the command drop with iedropone, and keep running the code as normal.

Instructions

These instructions are meant to help you understand how to use the command. For technical instructions on how to implement the command in Stata, see the help files by typing help iedropone in Stata.

This command is easy to use. The syntax works the same way as Stata's built-in command drop; however, iedropone has additional options (see the help file). The main challenge when using iedropone is to document well why a particular number of observations was dropped, so that anyone who gets an error message from iedropone in the future can find out why those observations were dropped and what could have caused an incorrect number of observations to be dropped now.
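One way to document a drop is to put the reason and the expected count in comments directly above the command. The sketch below is only an illustration: the data, the household ID, and the stated reason are all hypothetical.

```stata
* Requires ietoolkit (ssc install ietoolkit)
* Hypothetical data, as in the examples above
clear
set obs 15
gen village_code = cond(_n <= 12, 123, 456)
gen HHID = 100000 + _n

* Reason (hypothetical): household 100003 was interviewed with the wrong
* version of the questionnaire, so the whole interview is discarded.
* Expected: exactly 1 observation.
* If this errors: check whether HHID was recoded earlier in the cleaning
* code, or whether the household was interviewed again and now appears twice.
iedropone if HHID == 100003
```

Keeping the reason next to the command means that whoever sees the error later can compare the current data against what was expected when the drop was written.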
