Difference between revisions of "Ietestform"

Revision as of 21:40, 18 November 2020

DIME Analytics has created iefieldkit as a package in Stata to support the process of primary data collection from start to finish. In most cases, third-party survey firms or local partners collect data on behalf of the research team. Therefore, data quality assurance is a particularly important aspect of data collection. ietestform allows the research team to test Open Data Kit (ODK)-based electronic survey forms for common errors, and to check SurveyCTO-based forms against best practices, before field data collection starts. For example, the SurveyCTO server has a built-in test feature that tests the ODK syntax of a form when it is uploaded by the research team. ietestform complements these built-in tests to ensure that the collected data is in a format that is easily readable in Stata, and warns users about practices that we have learnt are prone to data quality errors.

Read First

Please refer to Stata coding practices for coding best practices in Stata.
ietestform is part of the package iefieldkit, which has been developed by DIME Analytics.
To install ietestform, as well as other commands in the iefieldkit package, type ssc install iefieldkit in Stata.
For instructions and available options, type help ietestform.

Overview

In Open Data Kit (ODK)-based electronic survey kits, including SurveyCTO, survey forms (or questionnaires) are typically built in Excel using a specialized structured syntax. Before the research team starts with field data collection, they can use ietestform to test ODK-based electronic survey forms for common errors, as well as best practices for SurveyCTO-based forms.

For example, the SurveyCTO server has a built-in feature that tests the ODK syntax of a form when it is uploaded by the research team. ietestform complements these built-in tests to ensure that the collected data is in a format that is easily readable in Stata, and warns users about practices that we have learnt are prone to data quality errors. Therefore, the ietestform command should be used after the survey form has been tested on a SurveyCTO server to make sure there are no syntax errors.

Syntax

The basic syntax for ietestform is as follows:

* Test the form and write the report; /// continues the command on the next line
ietestform ///
   , surveyform("filename.xlsx") ///
     report("report.csv")

The ietestform command generates a report in .csv format. The report flags errors in coding, as well as practices that are not strictly wrong but may indicate bad practice and therefore need a manual review. Because the report is a plain-text .csv file, it can be viewed in spreadsheet software such as Excel, and it can be tracked with version control tools like GitHub.

If you think that the command incorrectly flagged issues in your SurveyCTO form, please report the case on the iefieldkit issues page (https://github.com/worldbank/iefieldkit/issues) to help DIME Analytics improve the command. Refer to the following sections for a detailed explanation of the tests performed by ietestform. These tests are meant to flag errors that may interrupt field work. Note that ietestform should be used only after the form has passed the ODK syntax checks on the SurveyCTO server.

Required Column

Required fields ensure that the enumerators cannot proceed without entering a response to a particular field (each question is a field). This prevents submissions of incomplete forms, and helps ensure that enumerators complete forms in the right order. A field is required if it has the "Yes" value in the required column.

It is common that respondents do not have an answer, or do not want to share an answer, to a question, but a missing value should never be used to represent such non-answers. Instead, the questionnaire should allow non-answers, for example, "I do not know" or "Decline to answer" as valid answers. Therefore, almost all fields should be required in an ODK survey while still being able to handle non-answers.

Note that only field types that show up when filling in the form are affected by the value in the required column. For example, fields like begin_group, end_repeat, and text_audit do not show up while filling in the form, so the tests related to the required column ignore these fields.

ietestform runs two tests related to the required column, depending on whether a field is of the note type or not. Fields of the note type are those for which the enumerator does not have to enter any input. Instead, the enumerator only needs to read out a specific text note.

Non-note fields: required

ietestform tests that all fields that are not of the note type have the value "Yes" in the required column, that is, that they are required. The final report then lists all fields that are not of the note type and are not required.

Even when some type of non-response by a respondent, such as “Declined to answer”, is acceptable, there should always be a valid method to record the reason for no response. The enumerator should not leave the input field empty in this case. The absence of a recorded answer should only mean that the enumerator did not ask the question during the survey. In cases where it is acceptable to skip a question, you should use an appropriate relevance condition.

Fields that record GPS coordinates, for instance, are among the fields that may intentionally have a "No" value under the required column. Such fields often have the type geopoint, geoshape, or geotrace. If you know that you will have no problem collecting GPS coordinates, then you should have a "Yes" value in the required column to ensure that you get valid data points.

However, if GPS coordinates are difficult to collect, then it might be a good idea to not have a "Yes" value under the required column. This will allow the enumerator to complete the other fields and submit the survey even if it is not possible to record GPS coordinates. In this case, ietestform will still report these fields, but you can still proceed with launching the survey if it was an active decision you are happy with.
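As a rough illustration of the logic of this test (not ietestform's actual Stata implementation), the Python sketch below scans a few hypothetical survey-sheet rows and lists the visible, non-note fields that are not required. The example rows, the set of field types treated as having no view, and the function name are all assumptions made for this example.

```python
# Field types assumed, for this sketch only, to never show up while filling in the form.
NO_VIEW_TYPES = {"begin_group", "end_group", "begin_repeat", "end_repeat",
                 "text_audit", "calculate", "deviceid", "caseid"}

def not_required_fields(rows):
    """Return names of visible, non-note fields whose required value is not 'yes'."""
    flagged = []
    for row in rows:
        # The first word of the type cell is the field type
        # (for example "select_one village" -> "select_one").
        parts = row["type"].split()
        field_type = parts[0] if parts else ""
        if field_type in NO_VIEW_TYPES or field_type == "note":
            continue
        if row.get("required", "").strip().lower() != "yes":
            flagged.append(row["name"])
    return flagged

# Hypothetical rows from the survey sheet of a form.
survey = [
    {"type": "begin_group", "name": "hh", "required": ""},
    {"type": "text", "name": "resp_name", "required": "yes"},
    {"type": "geopoint", "name": "gps", "required": ""},
    {"type": "note", "name": "intro", "required": ""},
    {"type": "end_group", "name": "hh", "required": ""},
]
flagged_fields = not_required_fields(survey)
print(flagged_fields)
```

Here only the geopoint field is listed. As described above, that can be a deliberate choice, so a flagged field is a prompt for review rather than proof of an error.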

Note fields: not required

While fields of the note type can have a "Yes" value in the required column, they cannot record an input. Therefore, if an enumerator comes across such a field during a live survey, they cannot move past this field. In this case, there is no way to continue with the interview, and the enumerator will not be able to submit the data already collected from previous questions. ietestform therefore reports a list of all fields that are of the note type, and have a "Yes" value in the required column.

Note that there are cases in which required note fields may be useful. Since enumerators cannot move past these fields, you may use them with a relevance condition so that these fields show up if an earlier entry in the form is incorrect. This will force the enumerator to go back and correct the error before continuing with the interview.

For example, enumerators often enter respondent IDs twice to make sure there is no typo in the ID. You may name the two entry fields id1 and id2. Then you can follow these fields with a required note field which has the relevance expression as ${id1} != ${id2}. In this case, the note type field will only appear if the two entries are not identical. You can use the note text to inform the enumerator that the two ID fields are not identical, and that the enumerator must go back and change the values in order to continue.
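The sketch below illustrates, in Python, how required note fields could be listed together with their relevance expression, so that deliberate uses (like the ID check described here) can be told apart from likely mistakes. It is only an illustration under assumed data: the rows, the field names id_mismatch and welcome, and the function name are hypothetical, and this is not how ietestform itself is implemented.

```python
def required_notes(rows):
    """Return (name, relevance) for every note field marked as required."""
    return [(r["name"], r.get("relevance", ""))
            for r in rows
            if r["type"].strip() == "note"
            and r.get("required", "").strip().lower() == "yes"]

# Hypothetical rows from the survey sheet of a form.
survey = [
    {"type": "text", "name": "id1", "required": "yes", "relevance": ""},
    {"type": "text", "name": "id2", "required": "yes", "relevance": ""},
    # Deliberate: only shown, and only blocking, when the two IDs differ.
    {"type": "note", "name": "id_mismatch", "required": "yes",
     "relevance": "${id1} != ${id2}"},
    # Probably a mistake: a required note without a relevance condition always blocks the form.
    {"type": "note", "name": "welcome", "required": "yes", "relevance": ""},
]
notes = required_notes(survey)
for name, relevance in notes:
    print(name, "->", relevance or "(no relevance condition)")
```

Both notes would appear in a report of required note fields. The one without a relevance condition is the one most likely to need fixing.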

Matching begin_ and end_

The ietestform command checks that all begin_group fields are matched by an end_group, and that all begin_repeat fields are matched by an end_repeat. While the ODK syntax tester on the SurveyCTO server also tests for matching begin_ and end_ values, ietestform provides additional information that makes it faster and easier to solve this problem, especially when the survey form (or questionnaire) is very large.

For example, ODK does not require end_group and end_repeat fields to have field names (begin_group and begin_repeat fields are required to have names). This makes it difficult to identify where the error is in the underlying survey form. ietestform fills that gap because it also requires end_group and end_repeat fields to have names, and requires those names to match the corresponding begin_group and begin_repeat fields. ietestform lists these missing names in the report, along with the row number (in the Excel form) of other invalid begin_ and end_ pairs.

For a begin_ and end_ pair to be considered valid by ietestform, the following three criteria must be met:

For each begin_ field, there must be an end_ field.
The corresponding end_ field must be of the correct type. That is, a begin_group should not be closed by an end_repeat, and a begin_repeat should not be closed by an end_group.
The names of the end_ fields must match the names of begin_ fields. The SurveyCTO server already tests to make sure that the begin_ names are unique, so each begin_ and end_ pair will also be unique if this condition is met.
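The three criteria above can be sketched with a simple stack: each begin_ field is pushed, and each end_ field is compared against the most recently opened begin_ field. The Python below is a minimal illustration under assumed inputs (a list of hypothetical (type, name) pairs, with row numbers starting at 2 because row 1 of the Excel sheet is taken to hold the column headers). It is not ietestform's own code.

```python
def check_pairs(rows):
    """Return a list of (row_number, message) for invalid begin_/end_ pairs."""
    problems = []
    stack = []  # currently open begin_ fields: (kind, name, row_number)
    for row_number, (field_type, name) in enumerate(rows, start=2):
        if field_type in ("begin_group", "begin_repeat"):
            stack.append((field_type[len("begin_"):], name, row_number))
        elif field_type in ("end_group", "end_repeat"):
            kind = field_type[len("end_"):]
            if not stack:
                # Criterion 1: every end_ needs a begin_.
                problems.append((row_number, f"{field_type} has no matching begin_"))
                continue
            open_kind, open_name, open_row = stack.pop()
            if kind != open_kind:
                # Criterion 2: the end_ must be of the same kind as the begin_.
                problems.append((row_number,
                                 f"{field_type} closes begin_{open_kind} from row {open_row}"))
            if not name:
                problems.append((row_number, f"{field_type} has no name"))
            elif name != open_name:
                # Criterion 3: names must match.
                problems.append((row_number,
                                 f"name '{name}' does not match '{open_name}' from row {open_row}"))
    for open_kind, open_name, open_row in stack:
        # Criterion 1 again: every begin_ needs an end_.
        problems.append((open_row, f"begin_{open_kind} '{open_name}' is never closed"))
    return problems

# Hypothetical form: the repeat is closed by the wrong type, and the last end_ has no name.
rows = [
    ("begin_group", "hh"),
    ("text", "q1"),
    ("begin_repeat", "members"),
    ("text", "age"),
    ("end_group", "members"),
    ("end_group", ""),
]
problems = check_pairs(rows)
for row_number, message in problems:
    print(row_number, message)
```

Reporting the row number alongside each problem is what makes such a check practical on large forms, which is the gap this section describes.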

Naming and Labeling

ODK applies very few restrictions to field names and other inputs. Therefore, datasets created in ODK often contain variable names and labels that are not valid in Stata and will cause an error when the dataset is imported into Stata. For example, ODK only requires that all variable names be unique and that they avoid a few special characters. The ODK syntax test on the SurveyCTO server tests for only these restrictions. ietestform performs some additional tests which ensure that the datasets are valid, and optimized for being imported into Stata.

Stata-specific labels

ietestform returns a flag if your survey form is not programmed to display Stata-specific labels. In SurveyCTO, for instance, you can program your form to display questions in multiple languages. This is done by creating label columns named label:english, label:swahili, label:hindi, and so on. You can then choose which language to use for labels when exporting the dataset to Stata from SurveyCTO.

You can use the same feature to create Stata-specific labels, by adding a label "language" called label:stata. You can obviously add and modify labels after importing the dataset to Stata as well. However, this is the simplest way to add Stata-specific labels. If this practice is not used, the data set may end up being incorrectly labeled, or require labor intensive re-labeling after importing to Stata. ietestform applies the same test on the choices sheet as well, to ensure that all labels in the choices sheet are optimized for importing into Stata.

Length of variable labels

In Stata, there is a restriction on the length of variable labels. Variable labels in Stata cannot be longer than 80 characters, and Stata truncates variable labels that are longer. ietestform checks for this by listing all fields with entries in Stata's label column that are longer than 80 characters.
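A minimal Python sketch of this check is shown below. It assumes the labels live in a column called label:stata, as described in the previous section. The rows and the function name are hypothetical, and the code only illustrates the rule, not the command's implementation.

```python
STATA_LABEL_LIMIT = 80  # maximum variable label length stated above

def long_labels(rows, column="label:stata"):
    """Return (name, length) for fields whose label exceeds the Stata limit."""
    return [(r["name"], len(r[column]))
            for r in rows
            if len(r.get(column, "")) > STATA_LABEL_LIMIT]

# Hypothetical rows from the survey sheet of a form.
survey = [
    {"name": "age", "label:stata": "Age of respondent in completed years"},
    {"name": "income", "label:stata": "Total household income from all sources "
                                      "over the last twelve months, in local currency units"},
]
too_long = long_labels(survey)
for name, length in too_long:
    print(name, length)
```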

Length of variable names

Similarly, Stata also restricts the length of variable names to 32 characters. If the name is longer than that, Stata will either truncate the name, or replace the name with generic names like var1, var2, etc. if the truncated name is no longer unique. While you can make these changes in Stata as well, it is much easier to solve these issues before starting with the data collection. ietestform therefore flags all fields with variable names longer than 32 characters.

Length of field names in repeat groups

With respect to field names in repeat groups, ietestform lists two kinds of fields in the report. Firstly, it lists fields in repeat groups that have names that will be too long in the wide format after exporting to Stata. Secondly, it lists fields in repeat groups for which the risk of having names that are too long is high, but not certain.

It is important to remember that when you use the SurveyCTO-generated Stata do-file, or export a dataset in wide format, a suffix is automatically added to the variable names that are created inside repeat groups. For example, if a group of questions is repeated three times, the wide version of the resulting dataset will contain three variables for each question in the repeat group. Each of these three variables will have the same name, followed by the suffix _1, _2, or _3, that is, varname_1, varname_2, and varname_3. Therefore, variables created inside a single repeat group should not have a name that is longer than 30 characters, so that the final length is not longer than 32 characters.

Similarly, if the field is in a nested repeat group (a repeat group inside another repeat group), a suffix will be added once for each repeat group. In this case, the actual restriction on the length that will be used by ietestform is given by this formula: 32 − (2 × depth of nested repeats). In this case, ietestform will list all variables that have names longer than the number given by this formula.

However, these restrictions assume that each repeat group is repeated no more than 9 times. If a group were repeated more than 9 times, the suffixes would be _10, _11, etc., which take up three characters. For example, for the 10th repetition of a repeat group, the variable name would be suffixed as varname_10. In this case, ietestform lists all fields with names that are longer than 32 − (3 × depth of nested repeats). This is an example of the second kind of listing, since it is uncertain whether this will create an issue with names that are too long. However, if you think that field names are so long that they might be reported by this test, you may consider reducing the length of the field names.
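The arithmetic behind the two thresholds can be written out as a short Python sketch. The function names and the three result categories are assumptions made for illustration. Only the two formulas, 32 − (2 × depth) and 32 − (3 × depth), come from the text above.

```python
STATA_NAME_LIMIT = 32  # maximum variable name length in Stata

def name_limits(repeat_depth):
    """Return (certain_limit, risk_limit) for a field nested in repeat_depth repeat groups."""
    certain = STATA_NAME_LIMIT - 2 * repeat_depth  # one suffix like "_1" per level
    risk = STATA_NAME_LIMIT - 3 * repeat_depth     # one suffix like "_10" per level
    return certain, risk

def classify(name, repeat_depth):
    """Classify a field name as 'too long', 'at risk', or 'ok'."""
    certain, risk = name_limits(repeat_depth)
    if len(name) > certain:
        return "too long"   # too long even with single-digit suffixes
    if len(name) > risk:
        return "at risk"    # too long only if a suffix reaches two digits
    return "ok"

for name, depth in [("a" * 31, 1), ("a" * 30, 1), ("a" * 27, 2)]:
    print(len(name), depth, classify(name, depth))
```

For a field in a single repeat group the thresholds are 30 and 29 characters. For a field nested two levels deep they are 28 and 26.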

Repeat group naming conflicts

ietestform also flags name conflicts that could result from repeat suffixes (like _1, _2) that are added to field names inside a repeat group. The ODK syntax test in SurveyCTO checks whether field names are unique. For example, the names myvar and myvar_1 are both unique according to the ODK syntax test. But if myvar is a field in a repeat group, it will appear with a repeat suffix as myvar_1 for the first repetition of the repeat group. This will then create a name conflict with the variable named myvar_1 which lies outside the repeat group.

In such cases, ietestform flags all variables inside a repeat group that could possibly create such a naming conflict. For example, if there is a variable with the name myvar, the command checks if there are any other variable names with the format myvar_#, where # is one or more digits. Similarly, if the variable myvar is in a nested repeat group (a repeat group inside a repeat group), then ietestform checks for myvar_#, myvar_#_# and so on.

Note: If the variables myvar and myvar_1 are both in non-nested repeat groups, there will be no naming conflicts. In this case, the repeat suffixes will generate myvar_1 and myvar_1_1. However, ietestform will still list these fields, as it may not be clear to someone going through the dataset that myvar_1 is from the field myvar, and not from myvar_1.
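The pattern check described in this section can be sketched with a regular expression. The Python below is a simplified illustration: it looks for name_#, name_#_#, and so on for every repeat-group field regardless of nesting depth, and the field lists and function name are hypothetical.

```python
import re

def repeat_name_conflicts(repeat_fields, all_fields):
    """Map each repeat-group field to other field names of the form name_#, name_#_#, ..."""
    conflicts = {}
    for name in repeat_fields:
        # One or more "_<digits>" groups after the exact field name.
        pattern = re.compile(re.escape(name) + r"(_\d+)+")
        hits = [other for other in all_fields
                if other != name and pattern.fullmatch(other)]
        if hits:
            conflicts[name] = hits
    return conflicts

# Hypothetical field names; myvar and age are assumed to sit inside repeat groups.
all_fields = ["hhid", "myvar", "myvar_1", "myvar_total", "age", "age_2_1"]
repeat_fields = ["myvar", "age"]
conflicts = repeat_name_conflicts(repeat_fields, all_fields)
print(conflicts)
```

Note that myvar_total is not reported, because only suffixes made of digits can collide with the suffixes added on export.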

Leading and trailing spaces

ietestform also reports any fields that have leading (" ABC") or trailing ("ABC ") spaces, as these can cause unexpected problems. For example, consider a list in the choices sheet that is meant to be called "village", but is actually written as "village ". In Excel you will not see this extra space unless you look closely. While some tools will treat this as "village", others might treat it as "village ", which is not the same value. ietestform will flag these fields so you can prevent such errors.
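A check of this kind amounts to comparing each cell with its trimmed version. The Python sketch below shows the idea on hypothetical choices-sheet rows. The row numbering assumes that row 1 of the sheet holds the column headers, and the function name is made up for this example.

```python
def cells_with_stray_spaces(rows):
    """Return (row_number, column, value) for cells with leading or trailing whitespace."""
    flagged = []
    for row_number, row in enumerate(rows, start=2):  # row 1 holds the column headers
        for column, value in row.items():
            if isinstance(value, str) and value != value.strip():
                flagged.append((row_number, column, value))
    return flagged

# Hypothetical rows from the choices sheet; the second list name has a trailing space.
choices = [
    {"list_name": "village", "value": "1", "label": "Village A"},
    {"list_name": "village ", "value": "2", "label": "Village B"},
]
stray = cells_with_stray_spaces(choices)
for row_number, column, value in stray:
    print(row_number, column, repr(value))
```

Printing the value with repr() makes the otherwise invisible space show up inside the quotes.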

Choice Lists

ietestform tests also deal with choice lists, that is, lists that are created for select_one and select_multiple types of fields in the choices sheet on Excel. The choices sheet lists all response labels in a separate Excel sheet, along with corresponding integer values. The ODK syntax is very lenient when it comes to choice lists which are then translated into value labels in Stata. This can lead to a lot of errors such as typographical errors, missing values, and duplicate values which affect the datasets imported into Stata. ietestform flags issues like these that can arise due to coding errors in ODK-based platforms. For example, unused choice lists and duplicate labels could mean that the person coding the survey copied and pasted the elements of a list incompletely or incorrectly.

Numeric value and name

Stata usually stores categorical data by assigning integer (numeric) values to string (alphabetical) labels. For example, this means assigning a value of 2 to "Yes", 1 to "No", and 0 to "Declined to answer". Although SurveyCTO allows string values for questions that have categorical responses, we recommend using integer values instead. This is because string values take up more memory, especially when importing large datasets, and many Stata functions that deal with categorical variables cannot handle string variables. ietestform therefore reports all list items that have a non-numeric value in the value or name column.
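As an illustration, the Python sketch below flags choice-list items whose code cannot be read as a whole number. It assumes integer codes, in line with the recommendation above. The example list, its codes, and the function names are hypothetical.

```python
def is_integer(text):
    """True if the text can be parsed as a whole number."""
    try:
        int(text)
        return True
    except ValueError:
        return False

def non_numeric_items(choices):
    """Return (list_name, value) for items whose code is not a whole number."""
    return [(c["list_name"], c["value"])
            for c in choices
            if not is_integer(c["value"])]

# Hypothetical rows from the choices sheet.
choices = [
    {"list_name": "yesno", "value": "1", "label": "Yes"},
    {"list_name": "yesno", "value": "0", "label": "No"},
    {"list_name": "yesno", "value": "dk", "label": "I do not know"},
]
non_numeric = non_numeric_items(choices)
print(non_numeric)
```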

Unused choice lists

ietestform checks that all choice lists defined in the choices sheet are actually used in at least one select_one or select_multiple field in the survey sheet. While it is not incorrect to have some lists that are unused, it could still be a sign of choice lists that are not in sync with an updated version of the survey form. In such cases, unused choice lists can cause errors, or contain items that will not be displayed during the survey.

For example, imagine you have 10 villages in a choice list called village, but you incorrectly type vilage for one of them. Then, according to ODK syntax you will have two lists: one called village with 9 items, and one called vilage with 1 item. In this case, it is likely that there are no select_one or select_multiple fields that use the choice list called vilage, so ietestform is a good way to spot a typographical error like this.
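The village example can be reproduced with a set difference between the lists defined in the choices sheet and the lists referenced by select_one or select_multiple fields. The Python below is a sketch on hypothetical data, and the function name is an assumption.

```python
def unused_choice_lists(survey_types, choice_list_names):
    """Return choice lists that no select_one or select_multiple field refers to."""
    used = set()
    for field_type in survey_types:
        parts = field_type.split()
        # A select field's type cell looks like "select_one <list_name>".
        if len(parts) >= 2 and parts[0] in ("select_one", "select_multiple"):
            used.add(parts[1])
    return sorted(set(choice_list_names) - used)

# Hypothetical type column of the survey sheet.
survey_types = ["text", "select_one village", "select_multiple crops"]
# Hypothetical list_name column of the choices sheet: one village row is misspelled.
choice_list_names = ["village"] * 9 + ["vilage"] + ["crops", "crops"]

unused = unused_choice_lists(survey_types, choice_list_names)
print(unused)
```

The misspelled list is the only one left over, which is how a typo in a single row of the choices sheet surfaces in this test.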

Duplicate value and label

ietestform makes sure that there are no duplicates in the names given to individual items in a choice list, and the codes (under the value column) assigned to each item in the choices sheet. This test will list all items of the choice list that have the same two values under the name and value columns.

ietestform also makes sure that there is only one label in a given choice list for a given code. This test lists all list items that have the same two values in the name and label columns. For example, suppose that for the choice list called village, "Village A" and "Village B" both have the same code, that is, 1, under the value column. Then ietestform will list both "Village A" and "Village B" along with the name of the choice list, that is, village.
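The Village A and Village B example can be sketched by counting how often each combination of choice list and code occurs. The Python below covers only this duplicate-code case, uses hypothetical rows and column names (list_name, value, label), and is not the command's own implementation.

```python
from collections import Counter

def duplicate_codes(choices):
    """Return the items whose (list_name, value) pair occurs more than once."""
    counts = Counter((c["list_name"], c["value"]) for c in choices)
    return [c for c in choices
            if counts[(c["list_name"], c["value"])] > 1]

# Hypothetical rows from the choices sheet: two villages share the code 1.
choices = [
    {"list_name": "village", "value": "1", "label": "Village A"},
    {"list_name": "village", "value": "1", "label": "Village B"},
    {"list_name": "village", "value": "3", "label": "Village C"},
]
duplicates = duplicate_codes(choices)
for item in duplicates:
    print(item["list_name"], item["value"], item["label"])
```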

Missing label, value, or name

In the first part of this test, ietestform lists all items in a choice list that have an entry under the label column, but have nothing under the value or name column. In the second part of the test, it also lists cases where the exact opposite occurs. This can sometimes happen when the survey form is programmed in multiple languages, or when the coding is incomplete.
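Both parts of this test can be sketched as a single pass over the choices sheet. In the Python below, the rows, the column names value and label, and the function name are assumptions made for illustration, and row numbers again assume that row 1 holds the column headers.

```python
def incomplete_items(choices):
    """Return (row_number, problem) for items missing either a value or a label."""
    flagged = []
    for row_number, c in enumerate(choices, start=2):  # row 1 holds the column headers
        has_label = bool(c.get("label", "").strip())
        has_value = bool(c.get("value", "").strip())
        if has_label and not has_value:
            flagged.append((row_number, "label without value"))
        elif has_value and not has_label:
            flagged.append((row_number, "value without label"))
    return flagged

# Hypothetical rows from the choices sheet.
choices = [
    {"list_name": "yesno", "value": "1", "label": "Yes"},
    {"list_name": "yesno", "value": "", "label": "No"},
    {"list_name": "yesno", "value": "-99", "label": ""},
]
incomplete = incomplete_items(choices)
print(incomplete)
```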

Outdated Syntax

SurveyCTO periodically updates its expression syntax, and the newer syntax tends to have more advanced features than previous versions. It is recommended to use the latest syntax to ensure full functionality of the expression and avoid potential issues.

ietestform tests to make sure the latest syntax is being used in the survey form. This includes flagging cases where:

the outdated position() syntax is used instead of index()
the outdated jr:choice-name() syntax is used instead of choice-label()
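A text search is enough to illustrate this kind of test. The Python sketch below scans every cell of some hypothetical survey-sheet rows for the two outdated functions named above and suggests the replacement. Which columns ietestform actually inspects is not stated here, so scanning all cells is an assumption, as are the example rows and the function name.

```python
import re

# Outdated function call (as a regular expression) -> suggested replacement.
OUTDATED = {
    r"\bposition\(": "index()",
    r"jr:choice-name\(": "choice-label()",
}

def outdated_syntax(rows):
    """Return (field_name, column, suggestion) for cells that use outdated functions."""
    flagged = []
    for row in rows:
        for column, value in row.items():
            for pattern, suggestion in OUTDATED.items():
                if isinstance(value, str) and re.search(pattern, value):
                    flagged.append((row["name"], column, suggestion))
    return flagged

# Hypothetical rows from the survey sheet of a form.
survey = [
    {"name": "idx", "calculation": "position(..)"},
    {"name": "lbl", "calculation": "jr:choice-name(${village}, '${village}')"},
    {"name": "ok", "calculation": "index()"},
]
outdated = outdated_syntax(survey)
for field_name, column, suggestion in outdated:
    print(field_name, column, "-> consider", suggestion)
```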

Encryption

Encryption of survey forms is an integral part of reducing the risk of exposing confidential or personally identifiable data. You can learn how to encrypt your form on SurveyCTO here.

Related Pages

Click here for pages that link to this topic.
This page is part of the topic iefieldkit.

Additional Resources

Other frameworks for testing ODK-based or SurveyCTO forms include:

IPA, ipacheckscto (https://github.com/PovertyAction/ipacheckscto)
PMA2020, xform-test (http://xform-test-docs.pma2020.org/)

@@ Line 1: / Line 1: @@
-'''ietestform''' is a Stata command used to test ODK based SurveyCTO forms before they are used in the field. SurveyCTO's server has a test feature that tests the ODK syntax of the form. This command is not meant as a substitute to that test, but a complement as '''ietestform''' test, for example, for cases that are likely to be typos that affect the ODK logic or that the data it generates will suit the format required by Stata, and it also points the user to commonly used best practices unless they are already used in the form.
+[https://www.worldbank.org/en/research/dime/data-and-analytics DIME Analytics] has created <code>[[iefieldkit]]</code> as a package in [[Stata Coding Practices|Stata]] to support the process of [[Primary Data Collection|primary data collection]] from start to finish. In most cases, third party [[Survey Firm|survey firms]] or local partners collect data on behalf of the [[Impact Evaluation Team|research team]]. Therefore, [[Data Quality Assurance Plan|data quality assurance]] is a particularly important aspect of data collection. <code>ietestform</code> allows the research team to test [https://opendatakit.org/ Open Data Kit (ODK)-based] electronic  [[Field Surveys|survey forms]] for common errors, as well as [[SurveyCTO Coding Practices | best practices]] for [https://www.surveycto.com/ SurveyCTO-based] forms before [[Preparing for Field Data Collection|field data collection]] starts. For example, the [[SurveyCTO Server Management|SurveyCTO server]] has a built-in test feature that tests the '''ODK''' syntax of a form when it is uploaded by the '''research team'''. <code>ietestform</code> complements these built-in tests to ensure that the collected data is in a format that is easily readable in Stata, and warns users who use practices we have learnt are prone to data quality errors.
-There are other frameworks for testing ODK/SurveyCTO forms similarly to '''ietestform'''. Two examples are [https://github.com/PovertyAction/ipacheckscto IPA's ipacheckscto] and [http://xform-test-docs.pma2020.org/ PMA2020's xform-test].
+==Read First==
+* Please refer to [[Stata Coding Practices|Stata coding practices]] for coding best practices in Stata.
+* <code>ietestform</code> is part of the package <code>[[iefieldkit]]</code>, which has been developed by [https://www.worldbank.org/en/research/dime/data-and-analytics DIME Analytics].
+* To install <code>ietestform</code>, as well as other commands in the <code>iefieldkit</code> package, type <syntaxhighlight lang="Stata" inline>ssc install iefieldkit</syntaxhighlight> in Stata.
+* For instructions and available options, type <syntaxhighlight lang="Stata" inline>help ietestform</syntaxhighlight>.
-This article is meant to describe use cases, work flow and the reasoning used when developing the tests in this command. For instructions on how to use the command specifically in Stata and for a complete list of the options available, see the help files by typing <code>help '''ietestform'''</code> in Stata after installing it. This command is a part of the package [[Stata_Coding_Practices#iefieldkit|iefieldkit]], to install all the commands in this package including this command, type <code>ssc install iefieldkit</code> in Stata or by following the [https://github.com/worldbank/iefieldkit installation instructions here].
+== Overview ==
+In '''Open Data Kit (ODK)-based''' electronic survey kits, including [https://www.surveycto.com/ SurveyCTO], '''survey forms''' (or questionnaires) are typically [[SurveyCTO Programming#Programming in Excel|built in Excel]] using a specialized structured syntax. Before the [[Impact Evaluation Team|research team]] starts with [[Preparing for Field Data Collection|field data collection]], they can use <code>ietestform</code> to test '''ODK-based''' [[Field Surveys|electronic survey forms]] for common errors, as well as [[SurveyCTO Coding Practices | best practices]] for '''SurveyCTO-based''' forms.
-== Intended use cases ==
+For example, the [[SurveyCTO Server Management|SurveyCTO server]] has a built-in feature that tests the '''ODK''' syntax of a form when it is uploaded by the '''research team'''. <code>ietestform</code> complements these built-in tests to ensure that the collected data is in a format that is easily readable in Stata, and warns users who use practices we have learnt are prone to data quality errors. Therefore, the <code>ietestform</code> command should be used after testing the survey form on a '''SurveyCTO server''' to make sure there are no syntax errors.
-This command is intended to be used '''after''' it is tested on SurveyCTO's server to make sure that there are no syntax errors in the form, but '''before''' it is used in the field. This command writes a report that outputs the results of several tests (the tests are described below). The report is in csv-format so it can be viewed in Excel and is raw text so it can be tracked in versioning control frameworks like GitHub.
-If you are not sure why something was caught by this command and listed in the report, then read the explanations of each test below. If you think that the command incorrectly catches cases in your SurveyCTO form then please report that [https://github.com/worldbank/iefieldkit/issues here] and we will be very happy to work on improving the command.
+== Syntax ==
+The basic syntax for <code>ietestform</code> is as follows:
+<syntaxhighlight lang="Stata">ietestform
+   , surveyform("filename.xlsx")
+     report("report.csv")</syntaxhighlight>
+The <code>ietestform</code> command generates a report in '''.csv''' format. The report flags errors in [[SurveyCTO Programming|coding]], as well as practices that are not strictly wrong, but which may indicate bad practices, and therefore need a manual review. The report generated by <code>ietestform</code> can be displayed in a number of software applications, and can also be used with collaboration tools like [https://github.com/ GitHub].
-This command has many different tests but only some of them are direct errors. The tests that are not errors are meant to highlight things that experienced ODK coders in SurveyCTO usually are looking for to spot potential errors or bad practices.
+If you think that the command incorrectly flagged issues in your '''SurveyCTO''' form, please report the case [https://github.com/worldbank/iefieldkit/issues here] to help [https://www.worldbank.org/en/research/dime/data-and-analytics DIME Analytics] improve the command. Refer to the following sections for a detailed explanation of the tests performed by <code>ietestform</code>. These tests are meant to flag errors that may interrupt [[Preparing for Field Data Collection|field work]]. Note that the <code>ietestform</code> should be used only after the form has passed the '''ODK''' syntax checks on the '''SurveyCTO server'''.
-It is important to note that it is not necessary the case that something is incorrect just because something was caught in a test in this command. Instead, think of the report as a tool that points to things you should investigate further.
+== Required Column ==
+'''Required fields''' ensure that the [[Survey_Pilot_Participants#Participant Roles|enumerators]] cannot proceed without entering a response to a particular field (each question is a field). This prevents submissions of incomplete forms, and helps ensure that '''enumerators''' complete forms in the right order. A field is '''required''' if it has the '''"Yes"''' value in the '''''required''''' column.
-In the end you will have changes some things in your form so that they do not longer show up in the report, but it is perfectly normal that several things are still in the report when you launch the form in the field. Just make sure that you understand why they were listed and make sure that the risk that was the reason why they were listed does not apply to your case, or that it is a risk you are willing to take.
+It is common that respondents do not have an answer, or do not want to share an answer, to a question, but a missing value should never be used to represent such non-answers. Instead, the questionnaire should allow non-answers, for example, "''I do not know''" or "''Decline to answer''" as valid answers.
+Therefore, almost all fields should be required in an ODK survey while still being able to handle non-answers.
-== Instructions ==
+Note that only column types that show up when filling the form are affected by that value. For example, fields like '''begin_group''', '''end_repeat''', '''text_audit''' do not show up while filling the form, and so tests related to the '''''required''''' columns ignore these fields.
-These instructions are meant to help you understand the tests that this command runs on your SurveyCTO questionnaire form. For technical instructions on how to run the command in Stata see the help file by typing <code>help '''ietestform'''</code> in Stata.
-This command is very simple to use in Stata, you only need to specify your SurveyCTO form and where on your computer you want the command to write the report.
+<code>ietestform</code> runs two tests related to the required columns depending on whether they are '''note''' type or non-'''note''' type. Fields which are of the '''note''' type are those for which the '''enumerator''' does not have to enter any input. Instead, the enumerator only needs to read out a specific text note.
+=== Non-note fields: ''required'' ===
+<code>ietestform</code> tests to make sure that all fields that are not of '''note''' type have the value '''"Yes"''' in the '''''required''''' column, that is, they are '''required'''. The final report then lists all those fields not of type '''note''', but are not required.
-    ietestform , surveyform("/path/to/surveyform.xlsx") report("/path/to/report.csv")
+Even when some type of non-response by a [[Survey_Pilot_Participants#Participant Roles|respondent]], such as '''“Declined to answer”''', is acceptable, there should always be a valid method to record the reason for no response. The '''enumerator''' should not leave the input field empty in this case. The absence of a recorded answer should only mean that the enumerator did not ask the  question during the survey. In cases where it is acceptable to skip a question, you should use an appropriate '''relevance''' condition.
-== Explanation of tests  used in this command ==
+Fields that record GPS coordinates for instance, are some of the fields that may intentionally have a ''' "No" ''' value under the '''''required''''' column. Such fields often have their type as '''geopoint''', '''geoshape''', or '''geotrace'''. If you know that you will have no problem collecting GPS
-'''ietestform''' only outputs results from tests that identified an error or a potential bad practice. If a test does not find anything, the test is not at all mentioned in the report.
+coordinates, then you should have a '''"Yes"''' value in the '''''required''''' column to ensure that you get valid data points.
-----
+However, if GPS coordinates are difficult to collect, then it might be a good idea to not have a '''"Yes"''' value under the '''''required''''' column. This will allow the enumerator to complete the other fields and submit the survey even if it is not possible to record GPS coordinates. In this case, <code>ietestform</code> will still report these fields, but you can still proceed with [[Field Surveys#Survey Launch|launching the survey]] if it was an active decision you are happy with.
-===  Coding Practices ===
-This section describes tests related to best practices on how to use and how to not use features in the ODK programming language to reduce the risks of error that interrupts the field work and to ensure data quality. Note that these tests does not test if the ODK syntax is valid since this command is intended to be used '''after''' the form has passed the ODK syntax test on SurveyCTO's server.
-==== Required Column ====
+=== Note fields: not ''required'' ===
-The required column makes sure that the enumerator cannot proceed before a response have been filled in for that field. This is a great feature as it prevents incomplete forms to be submitted, and it helps  making sure that enumerators fill in the forms in the right order. A field that is required can not be passed until data has been recorded for it.
+While fields of the '''note''' type can have a '''"Yes"''' value in the '''''required''''' column, they cannot record an input. Therefore, if an '''enumerator''' comes across such a field during a [[Computer-Assisted Personal Interviews (CAPI)|live survey]], they cannot move past it. In this case, there is no way to continue with the interview, and the enumerator will not be able to submit the data already collected from previous questions. <code>ietestform</code> therefore reports a list of all fields that are of the '''note''' type and have a '''"Yes"''' value in the '''''required''''' column.
-Only field types with a ''view'', i.e. showing up when filling in a form, are affected by the value in the required column. Examples of fields without a view are begin_group, end_repeat, text_audit, caluclate, deviceid, caseid, etc. All fields without a view are ignored in the tests that related to the required column.
+Note that there are cases in which required '''note''' fields are useful. Since enumerators cannot move past these fields, you may combine them with a '''relevance''' condition so that they show up only if an earlier entry in the form is incorrect. This forces the enumerator to go back and correct the error before continuing with the interview.
-There are two tests in '''ietestform''' related to the required column.
+For example, enumerators often enter respondent IDs twice to make sure there is no typo in the ID. You may name the two entry fields '''id1''' and '''id2'''. You can then follow these fields with a '''required''' '''note''' field whose '''relevance''' expression is <code>${id1} != ${id2}</code>. In this case, the '''note''' field will only appear if the two entries are not identical. You can use the '''note''' text to inform the enumerator that the two ID fields are not identical, and that the enumerator must go back and change the values in order to continue.
-===== '''All Non-Note Fields Required''' =====
+== Matching begin_ and end_ ==
-This test tests that all fields that are not of type ''note'' (see the other Required Column test below) have the value "Yes" in the required column. This test outputs a list to the report for all fields that are not ''required'' and not of type ''note''.
+The <code>ietestform</code> command checks that all '''begin_group''' fields are matched by an '''end_group''', and that all '''begin_repeat''' fields are matched by an '''end_repeat'''. While the '''ODK syntax''' tester on the [[SurveyCTO Server Management|SurveyCTO server]] also tests for matching '''begin_''' and '''end_''' values, <code>ietestform</code> provides additional information that makes it faster and easier to solve this problem, especially when the [[Questionnaire Design|survey form]] (or questionnaire) is very large.
-Even for questions where it is sometimes expected that there is no answer to be recorded, it is much better practice to have a answer option that represents "No Answer" rather than having the enumerators leaving the field unanswered.
+For example, [https://opendatakit.org/ ODK] does not require '''end_group''' and '''end_repeat''' fields to have field names (only '''begin_group''' and '''begin_repeat''' fields are required to have names). This makes it difficult to identify where the error is in the underlying '''survey form'''. <code>ietestform</code> fills that gap because it also requires '''end_group''' and '''end_repeat''' fields to have names, and requires those names to match the corresponding '''begin_group''' or '''begin_repeat''' field. <code>ietestform</code> lists any missing names in the report, along with the row number (in the Excel form) of other ''non-valid'' '''begin_''' and '''end_''' pairs.
-Some fields that are commonly left ''not'' required intentionally are fields that require the GPS. Those fields are geoppoint, geoshape and geotrace. If you know that your the devices that will be used for data collection will have no problem collecting GPS coordinates, then keep those fields required to ensure you will get valid data points. But if you are working in a context where GPS coordinates will be difficult to collect, then it could be a good ides to not require these fields, so that the enumerator can complete the other fields and be able to submit the form even when it was impossible to record GPS coordinates.
+For a '''begin_''' and '''end_''' pair to be considered '''valid''' by  <code>ietestform</code>, the following three criteria must be met:
+# For each '''begin_''' field, there must be an '''end_''' field.
+# The corresponding '''end_''' field must be of the correct type. That is, a '''begin_group''' should not be closed by an '''end_repeat''', and a '''begin_repeat''' should not be closed by an '''end_group'''.
+# The names of the '''end_''' fields must match the names of the '''begin_''' fields. The '''SurveyCTO server''' already tests to make sure that the '''begin_''' names are unique, so each '''begin_''' and '''end_''' pair will also be unique if this condition is met.
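The three criteria above can be sketched with a simple stack. This is only an illustrative Python sketch, not the Stata code that <code>ietestform</code> actually runs; the function name <code>check_begin_end</code> and the <code>(row, type, name)</code> tuple layout are assumptions made for this example.

```python
def check_begin_end(rows):
    """Return a list of (row_number, message) for invalid begin_/end_ pairs.

    rows: (row_number, field_type, field_name) tuples in survey-sheet order.
    This layout is hypothetical and chosen only for the illustration.
    """
    problems = []
    stack = []  # currently open begin_ fields, innermost last
    for row, ftype, name in rows:
        if ftype in ("begin_group", "begin_repeat"):
            stack.append((row, ftype, name))
        elif ftype in ("end_group", "end_repeat"):
            if not stack:
                # Criterion 1: every end_ needs a begin_
                problems.append((row, "end_ without begin_"))
                continue
            b_row, b_type, b_name = stack.pop()
            if b_type.replace("begin_", "") != ftype.replace("end_", ""):
                # Criterion 2: group closes group, repeat closes repeat
                problems.append((row, "wrong end_ type for begin_ on row %d" % b_row))
            elif not name:
                # Criterion 3 (first part): end_ fields must be named
                problems.append((row, "end_ field has no name"))
            elif name != b_name:
                # Criterion 3 (second part): names must match
                problems.append((row, "name does not match begin_ on row %d" % b_row))
    for b_row, _, _ in stack:
        # Criterion 1: every begin_ needs an end_
        problems.append((b_row, "begin_ without end_"))
    return problems
```

Reporting the row number of both the '''begin_''' and the '''end_''' field is what makes the mismatch easy to locate in a large form.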
-===== '''No Note Fields Required''' =====
+== Naming and Labeling ==
-For fields of type ''note'' there is no way to record data, and there is therefore no way to pass a required note-field. If this happens in the field there is no way to pass this field, and therefore no way to complete and submit the data. See the exception below for a use case when it is great practice to use this feature for data quality assurance. This test writes a list to the report of all fields that are of type ''note'' and are ''required''.
+[https://opendatakit.org/ ODK] applies very few restrictions to '''field names''' and other inputs. Therefore, datasets created in '''ODK''' often contain variable names and labels that are not valid in [[Stata Coding Practices|Stata]] and will cause an error when the dataset is imported into Stata. For example, '''ODK''' only requires that all variable names be unique, and disallows a few special characters. The '''ODK syntax test''' on the [[SurveyCTO Server Management|SurveyCTO server]] tests only for these restrictions. <code>ietestform</code> performs additional tests to ensure that the datasets are valid and optimized for import into Stata.
+=== Stata-specific labels ===
+<code>ietestform</code> returns a flag if your survey form is not [[Questionnaire Programming|programmed]] to display Stata-specific labels.
+In '''SurveyCTO''', for instance, you can [[SurveyCTO Programming|program]] your form to display questions in multiple languages. This is done by creating label columns named '''label:english''', '''label:swahili''', '''label:hindi''', and so on. You can then choose which language to use for labels when exporting the dataset to Stata from SurveyCTO.
-While required notes will always be listed, there are cases when they are really useful. Since they are not possible to pass, they can be used together with a relevance condition so that they show up if something earlier in the form is not correct and the enumerator should be forced to go back and correct before continuing the data collection.
+You can use the same feature to create Stata-specific labels, by adding a label "language" called '''label:stata'''. You can also add and modify labels after importing the dataset to Stata. However, this is the simplest way to add Stata-specific labels. If this practice is not used, the dataset may end up being incorrectly labeled, or require labor-intensive re-labeling after importing to Stata. <code>ietestform</code> applies the same test to the '''''choices''''' sheet as well, to ensure that all labels in the '''''choices''''' sheet are '''optimized''' for importing into Stata.
-For example, enumerators are often asked to enter respondent IDs twice to be extra careful that there is no typo in the ID. Let's say those two double entry fields are <code>id1</code> and <code>id2</code>. Then they can be followed by  required note-field that has the relevance expression <code>${id1} != ${id2}</code> so that the note only show if the two IDs are not identical. The note label can then inform the enumerator that the two ID fields are not identical and that the enumerator must go back and change the values in order to continue.
+===  Length of variable labels  ===
+Stata restricts the length of '''variable labels''': they cannot be longer than 80 characters, and Stata truncates labels that are longer. <code>ietestform</code> checks for this by listing all fields whose entry in the Stata '''''label''''' column ('''label:stata''') is longer than 80 characters.
+=== Length of variable names ===
+Similarly, Stata restricts the length of '''variable names''' to 32 characters. If a name is longer than that, Stata will truncate it, or replace it with a generic name like '''var1''', '''var2''', etc. if the truncated name is no longer unique. While you can fix these names in Stata as well, it is much easier to solve these issues before starting with the [[Preparing for Field Data Collection|data collection]]. <code>ietestform</code> therefore flags all fields with variable names longer than 32 characters.
+=== Length of field names in repeat groups ===
+With respect to '''field names''' in '''repeat''' groups, <code>ietestform</code> lists two kinds of fields in the report. Firstly, it lists fields in '''repeat''' groups that have names that will be too long in the '''wide format''' after exporting to Stata. Secondly, it lists fields in '''repeat''' groups for which the risk of having names that are too long is high, but not certain.
-The same functionality could have been achieved using the constraint field when the ID is re-entered, but the label in the note field can be made more informative than the constraint message, and when the conditional test is more difficult than just testing that two fields are identical, then this method is easier by using intermediate calculate fields that are then used in the relevance column for the required note-field.
+It is important to remember that when you use the SurveyCTO-generated Stata '''do-file''', or export a dataset in wide format, a suffix is automatically added to the names of variables created inside '''repeat''' groups. For example, if a group of questions is repeated three times, the wide version of the resulting dataset will contain three variables for each question in the '''repeat''' group. Each of these three variables will have the same name, followed by the suffix '''_1''', '''_2''', or '''_3''', that is, '''varname_1''', '''varname_2''', and '''varname_3'''. Therefore, fields inside a single '''repeat''' group should not have names longer than 30 characters, so that the final length is not longer than 32 characters.
-==== Numeric Ranges ====
+Similarly, if the field is in a '''nested''' '''repeat''' group (a '''repeat''' group inside another '''repeat''' group), a suffix will be added once for each '''repeat''' group. In this case, the actual restriction on the length that will be used by <code>ietestform</code> is given by this formula:
-''not implemented yet in beta version''
+'''32 − (2 × depth of nested repeats)''', where the depth is the number of '''repeat''' groups the field sits inside. <code>ietestform</code> lists all variables whose names are longer than the number given by this formula.
-All numeric fields, integer fields or decimal fields should have ranges for acceptable values in the constraint column. Make this range wider than what you expect it! The range in the constraint column should be used to prevent typos, to prevent illogical values (like negative age) but ''not to force the data to be within your preexisting expectations''. Your preexisting expectation is a good starting point for this range, but make it much wider than that as we do not yet know what special cases your data collection will encounter in the field, and these outliers are very important to understand for your research.
+However, these restrictions assume that each '''repeat''' group is repeated no more than 9 times. If there are more than 9 repetitions, the suffixes will include '''_10''', '''_11''', etc., which take up three characters. For example, for the 10th repetition of a '''repeat''' group, the variable name would be '''varname_10'''. To cover this case, <code>ietestform</code> also lists all fields with names that are longer than '''32 − (3 × depth of nested repeats)'''. This is the second kind of listing, since it is uncertain whether these names will actually end up too long. However, if your field names are long enough to be reported by this test, consider shortening them.
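The two thresholds described above are simple arithmetic and can be sketched as follows. This is an illustrative Python sketch under the assumptions stated in the text (a 32-character limit and a two- or three-character suffix per level of nesting); the function names are hypothetical and are not part of <code>ietestform</code>.

```python
STATA_MAX_NAME = 32  # Stata's limit on variable name length


def max_name_length(depth, suffix_chars=2):
    """Longest safe field name at a given repeat depth.

    suffix_chars=2 covers suffixes _1 to _9; suffix_chars=3 covers _10 to _99.
    """
    return STATA_MAX_NAME - suffix_chars * depth


def classify_name(name, depth):
    """Classify a field name the way the two listings are described."""
    if len(name) > max_name_length(depth, 2):
        return "too long"           # certain to exceed 32 characters
    if len(name) > max_name_length(depth, 3):
        return "possibly too long"  # exceeds 32 only with 10+ repetitions
    return "ok"
```

For a field in a single '''repeat''' group (depth 1), the limits are 30 and 29 characters; for a field nested two levels deep, they are 28 and 26.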
-==== Matching begin_/end_ ====
+===Repeat group naming conflicts ===
-The main aspect of this test is done by the ODK syntax tester on SurveyCTO's server, but the error message for this error are not always useful, especially when the form is very large. One of the main reasons for this might be that ODK does not require the end_group and end_repeat to have field names. So the first part of this test is that all end_group and end_repeat fields are required to have a value in the field name column, and for the second part of this test the name has to be identical to the corresponding begin_group and begin_repeat field.
+<code>ietestform</code> also flags name conflicts that could result from '''repeat''' suffixes (like '''_1''', '''_2''') that are added to field names inside a '''repeat''' group. The '''ODK syntax''' test in '''SurveyCTO''' checks whether field names are unique. For example, the names '''myvar''' and '''myvar_1''' are both unique according to the '''ODK syntax''' test. But if '''myvar''' is a field in a '''repeat''' group, it will appear with a '''repeat''' suffix as '''myvar_1''' for the first repetition of the '''repeat''' group. This will then create a name conflict with the variable named '''myvar_1''' which lies outside the '''repeat''' group.
-The main part of this test is to test that all begin_group are matched by an end_group and that all begin_repeat are matched by an end_repeat. To be considered matched the following three criteria needs to be fulfilled:
+In such cases, <code>ietestform</code> flags all variables inside a '''repeat''' group that could possibly create such a naming conflict. For example, if there is a variable with the name '''myvar''', the command checks if there are any other variable names with the format '''myvar_#''',
-# for each begin_ there is an end_
+where '''#''' is one or more digits. Similarly, if the variable '''myvar''' is in a '''nested''' '''repeat''' group (a '''repeat''' group inside a '''repeat''' group), then <code>ietestform</code> checks for '''myvar_#''', '''myvar_#_#''' and so on.
-# that the corresponding _end is of the correct type so that a begin_group is not closed by a end_repeat
-# tests that the end_ names match the begin_ names. SurveyCTO's server makes sure that the begin_ names are unique, so each pair will be unique if this part of the test is passed
-----
+'''Note:''' If the fields '''myvar''' and '''myvar_1''' are both in the same non-nested '''repeat''' group, there will be no naming conflict, because both receive a '''repeat''' suffix and generate '''myvar_1''' and '''myvar_1_1'''. However, <code>ietestform</code> will still list these fields, as it may not be clear to someone going through the dataset that '''myvar_1''' comes from the field '''myvar''', and not from '''myvar_1'''.
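The pattern '''myvar_#''', '''myvar_#_#''' described above can be expressed as a regular expression. The following is a minimal Python sketch of that idea, not the command's actual implementation; the function name and the <code>repeat_fields</code> mapping (field name to nesting depth) are assumptions for illustration.

```python
import re


def repeat_name_conflicts(repeat_fields, all_names):
    """Find names that could clash with suffixed repeat-group variables.

    repeat_fields: dict mapping a field name to its repeat nesting depth (>= 1).
    all_names: every field name in the form.
    """
    conflicts = {}
    for name, depth in repeat_fields.items():
        # One "_<digits>" block per level of nesting, e.g. myvar_#, myvar_#_#
        pattern = re.compile(r"^%s(_\d+){1,%d}$" % (re.escape(name), depth))
        hits = sorted(n for n in all_names if n != name and pattern.match(n))
        if hits:
            conflicts[name] = hits
    return conflicts
```

As in the note above, a match is only a potential conflict: whether it is a real one depends on where the other field sits in the form.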
-=== Naming and Labeling Practices ===
+=== Leading and trailing spaces ===
-ODK have very few restrictions on names apart from all names must be unique and that there are a few characters that are not allowed. All of those restrictions are tested by the ODK syntax test on SurveyCTO's server. The tests in this section are mainly due to the additional rules for names that Stata has that comes into effect when importing your data to Stata.
+<code>ietestform</code> also reports any fields that have leading ('''" ABC"''') or trailing ('''"ABC "''') spaces, as these can cause unexpected problems. For example, consider a list in the '''''choices''''' sheet that is meant to be called '''"village"''', but whose cell actually contains '''"village "'''. In Excel you will not see this extra space unless you look closely. While some tools will treat this as '''"village"''', others might treat it as '''"village "''', and the two are not the same. <code>ietestform</code> flags these fields so you can prevent such errors.
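The difference between '''"village"''' and '''"village "''' is easy to demonstrate. The sketch below, in Python, shows one way such cells could be detected; the function name and the <code>(cell_reference, value)</code> layout are hypothetical, and note that <code>str.strip()</code> also removes tabs and line breaks, which is broader than plain spaces.

```python
def cells_with_outer_spaces(cells):
    """Return the (cell_reference, value) pairs whose value has leading
    or trailing whitespace.

    cells: iterable of (cell_reference, value) pairs, where value is a string.
    """
    return [(ref, value) for ref, value in cells if value != value.strip()]


# The two strings look alike in Excel but do not compare as equal
print("village" == "village ")
```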
-==== Field Name Length ====
+== Choice Lists ==
-Stata has a limit of 32 characters in the field name. Stata will truncate the name if the name is longer than that, and replace the name with a generic name on the format var1, var2 etc. if the name is no longer unique after being truncated. All of these cases can be resolved in Stata but it will be much simpler to solve this before starting to collect the data. This test list all fields with names longer than 32 characters.
+<code>ietestform</code> also tests '''choice lists''', that is, the lists created for '''select_one''' and '''select_multiple''' fields in the '''''choices''''' sheet of the Excel form. The '''''choices''''' sheet lists all response labels along with their corresponding integer values. The '''ODK syntax''' is very lenient when it comes to '''choice lists''', which are translated into value labels in Stata. This can lead to errors such as typographical errors, missing values, and duplicate values, which affect the datasets imported into Stata. <code>ietestform</code> flags issues like these that can arise due to coding errors in '''ODK-based''' platforms. For example, unused choice lists and duplicate labels could mean that the person [[SurveyCTO Coding Practices|coding the survey]] copied and pasted the elements of a list incompletely or incorrectly.
+=== Numeric ''value'' and ''name'' ===
+Stata usually stores categorical data by assigning '''integer''' (numeric) values to '''string''' (alphabetical) labels. For example, this means assigning a value of '''2''' to '''"Yes"''', '''1''' to '''"No"''', and '''0''' to '''"Declined to answer"'''. Although [[SurveyCTO Programming Work Flow|SurveyCTO]] allows string values for questions that have categorical responses, we recommend using integer values instead. This is because string values take up more memory, especially when importing large datasets, and many Stata functions that deal with categorical variables cannot handle string values. <code>ietestform</code> therefore reports all list items that have a non-numeric value in the '''''value''''' or '''''name''''' column.
-==== Repeat Group Field Name Length ====
+===Unused choice lists===
-This test has two parts, where the first part list fields that will have too long names in the wide format when importing to Stata and the second part lists fields where the risk is high that that will happen but it is not certain.
+<code>ietestform</code> checks that all '''choice lists''' defined in the '''''choices''''' sheet are actually used in at least one '''select_one''' or '''select_multiple''' field in the '''''survey''''' sheet. While it is not incorrect to have unused lists, they can be a sign that the choice lists are out of sync with an updated version of the [[Questionnaire Design|survey form]]. In such cases, unused choice lists can cause errors, or contain items that will not be displayed during the [[Field Surveys|survey]].
-When using SurveyCTO's Stata import do-file or when exporting the data set in wide format, all variables in a repeat group will have a suffix added to the variable name. If a repeat group is repeated three times, then in the wide data set any variable in that repeat group will generate three variables, with the names suffixed by _1, _2 and _3 respectively. This suffix is required to fit within the 32 characters limitation for variable names in Stata discussed in the previous test. So any variable in a repeat group may only have a 30 characters long field name. If the field is in a nested repeat group (a repeat group inside a repeat group) then it will be suffixed once for each repeat group. So the actual constraint used in this test is given by this formula: <code>32 - (2 * number of nested repeat groups for the field)</code>. This test list all variables that have longer names then that constraint.
+For example, imagine you have 10 villages in a choice list called '''village''', but you incorrectly type '''vilage''' for one of them. Then, according to '''ODK syntax''', you will have two lists: one called '''village''' with 9 items, and one called '''vilage''' with 1 item. In this case, it is likely that no '''select_one''' or '''select_multiple''' field uses the choice list called '''vilage''', so <code>ietestform</code> is a good way to spot a typographical error like this.
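The '''vilage''' example can be reproduced with a set difference between the lists defined in the '''''choices''''' sheet and the lists referenced in the '''''survey''''' sheet. This is a hedged Python sketch of the idea only; the function name and the inputs (plain lists of strings) are assumptions, and it ignores any extra parameters a real form might carry in the type column.

```python
def unused_choice_lists(survey_types, choice_list_names):
    """Return choice list names that no select field refers to.

    survey_types: values of the type column, e.g. "select_one village".
    choice_list_names: values of the list name column in the choices sheet
    (one entry per list item, so names repeat).
    """
    used = set()
    for field_type in survey_types:
        parts = field_type.split()
        if len(parts) >= 2 and parts[0] in ("select_one", "select_multiple"):
            used.add(parts[1])
    return sorted(set(choice_list_names) - used)


survey = ["text", "select_one village", "select_multiple crops"]
choices = ["village"] * 9 + ["vilage"] + ["crops"] * 3
print(unused_choice_lists(survey, choices))
```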
-In the first test we assume that there are not more than 9 iteration in each repeat group, but if there would be more than 9 then the suffix will be _10 which takes up three characters. So the second test list all fields that have a field name that is longer than <code>32 - (3 * number of nested repeat groups for the field)</code>. It is not sure that this will create an issue with long names, but if your names are so long that they might be caught in this test, then there is probably a best practice to try to make the name shorter.
+=== Duplicate ''value'' and ''label'' ===
+<code>ietestform</code> makes sure that no code is duplicated within a '''choice list'''. This test lists all items in the '''''choices''''' sheet that share the same combination of list name and code (the entry under the '''''value''''' or '''''name''''' column) with another item.
-==== Repeat Group Name Conflict ====
+<code>ietestform</code> also makes sure that no label is repeated within a given choice list, that is, that the same label is not listed twice with different codes. This test lists all list items that share the same list name and '''label'''. As an example of a duplicated code, suppose that for the choice list called '''village''', '''"Village A"''' and '''"Village B"''' both have the same code, that is, '''1''', under the '''''value''''' column. Then <code>ietestform</code> will list both '''"Village A"''' and '''"Village B"''' along with the name of the choice list, that is, '''village'''.
-This test for name conflicts that could be a result from the suffixes that are added to fields inside a repeat group. SurveyCTO's ODK syntax tester tests that all names are unique. The name ''myvar'' and ''myvar_1'' are not duplicates in the ODK syntax test, but if ''myvar'' is in a repeat field it will be suffixed with _1 for the first iteration of that variable, and that will create a name conflict with the variable created from field ''myvar_1''.
-This test lists all field inside a repeat group for which there is another field where there is a risk for this type of name conflict. For example, if there is a field with name ''myvar'' it tests if there is any variable on the format ''myvar_#'' where # is one or several digits.
+===Missing ''label'', ''value'', or ''name'' ===
+In the first part of this test, <code>ietestform</code> lists all items in a '''choice list''' that have an entry under the '''''label''''' column, but have nothing under the '''''value''''' or '''''name''''' column. In the second part of the test, it also lists cases where the exact opposite occurs. This can sometimes happen when the survey form is [[Questionnaire Programming|programmed]] in multiple languages, or when the coding is incomplete.
-If the variable ''myvar'' is in a nested repeat group (a repeat group inside a repeat group) then it is testing for ''myvar_#'', ''myvar_#_#'', ''myvar_#_#_#'' etc. for each level of nested repeat group, where # is one or several digits.
+== Outdated Syntax ==
+SurveyCTO updates the syntax of its [https://docs.surveycto.com/02-designing-forms/01-core-concepts/09.expressions.html expressions] over time, and newer versions tend to offer more advanced features than earlier ones. We recommend using the latest syntax to ensure full functionality of each expression and to avoid potential issues.
-'''Technical special case:''' If the fields ''myvar'' and ''myvar_1'' are both in a non-nested repeat group then there will be no name conflict as the first iteration of both fields will generate the variables ''myvar_1'' and ''myvar_1_1'' as the variables from both fields are suffixed. These fields are still listed by this test as it will be confusing that the variable ''myvar_1'' is from field ''myvar'' and not from the ''myvar_1'' that has the same name, even though this is technically not a name conflict.
+<code>ietestform</code> tests to make sure the latest syntax is being used in the survey form. This includes
-==== Stata Labels Columns ====
+# cases where the outdated '''position()''' is used instead of '''index()'''
-In a SurveyCTO for you can program your form so that multiple languages can be displayed when filling in a form. This is done by having multiple label columns named ''label:english'', ''label:swahili'', ''label:hindi'' etc. When you export your data using SurveyCTO Sync you can choose which language you want to use for labels.
+# cases where the outdated '''jr:choice-name()''' is used instead of '''choice-label()'''
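A scan for the two outdated functions named above could look like the following Python sketch. It is only an illustration of the idea and makes no claim about how <code>ietestform</code> detects them; the function name, the dictionary, and the sample expressions are assumptions.

```python
import re

# Outdated function calls named on this page, keyed by the label to report.
OUTDATED_FUNCTIONS = {
    "position()": re.compile(r"(?<![\w:-])position\("),
    "jr:choice-name()": re.compile(r"jr:choice-name\("),
}


def outdated_functions(expression):
    """Return the labels of outdated functions found in one expression."""
    return [
        label
        for label, pattern in OUTDATED_FUNCTIONS.items()
        if pattern.search(expression)
    ]
```

Running this over every expression column of the form (for example calculation, relevance, and constraint) would give a list of rows to update.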
-The same feature can be used to create Stata labels by adding a label language called ''label:stata''. Labels can obviously be added and modified once the data set has been imported to Stata. However, our experience is that this is the simplest way to add them, and if this practice is not used, the data set is often never properly labeled.
+== Encryption ==
+[https://dimewiki.worldbank.org/wiki/Encryption Encryption] of survey forms is an integral part of reducing the risk of exposing confidential or personally identifiable data. You can learn how to [https://dimewiki.worldbank.org/wiki/Encryption#Encryption_with_SurveyCTO_Data encrypt your form] on SurveyCTO [https://github.com/worldbank/dime-standards/blob/master/dime-research-standards/pillar-4-data-security/data-security-resources/surveycto-encryption-guidelines.md here].
-If you do not use this practice, but still use SurveyCTO's Stata code for importing data sets to Stata, you will end up having the labels displayed in the questionnaire as labels for your Stata variable. That is much better than nothing, but those labels will not be very good labels for labeling variables in Stata.
+== Related Pages ==
+[[Special:WhatLinksHere/Ietestform|Click here for pages that link to this topic.]]<br>
+This page is part of the topic <code>[[iefieldkit]]</code>.
-===== '''Survey Sheet Stata Labels''' =====
+==Additional Resources==
-In Stata there is a restriction that the variable label is not longer than 80 characters. If you are trying to apply a label longer than that it will be truncated. For that reason, this test lists all fields with a label in the Stata label column that is longer than 80 characters.
+Other frameworks for testing '''ODK'''-based or '''SurveyCTO''' forms include:
+* IPA, <code>[https://github.com/PovertyAction/ipacheckscto ipacheckscto]</code>
-===== '''Choice Sheet Stata Labels''' =====
+* PMA2020, <code>[http://xform-test-docs.pma2020.org/ xform-test]</code>
-There are not specific tests to the Stata label column in the choice sheet other than that it exists.
-==== Leading and Trailing Spaces ====
-In computer science there is a difference between the string <code>"ABC"</code> and <code>"ABC "</code>. This difference does not show in Excel and when uploading your form to SurveyCTO's server the form checker is programmed to handle this. However, when you import your form to Stata, as '''ietestform''' and several other commands does, it makes a difference.
-For example, if you have a list in the choice sheet called ''village'' but the actual content of the cell is <code>"village "</code>. In Excel you will not see this extra space unless you really look for it. This means that some tools, probably most of them, will treat this as <code>"village"</code>, but other tools might treat it as <code>"village "</code> which when compared are not the same.
-What would be even worse is if some list item in the ''village'' list has the list name value  <code>"village"</code> and some has the value  <code>"village "</code>. This is very difficult to spot in Excel but some tools might treat these as different.
-Leading (<code>" ABC"</code>) or trailing (<code>"ABC "</code>) spaces are not difficult to deal with and most tools, '''iestestform''' included, deals well with them, but there is no guarantee that all of them do, and to reduce the risk of errors in whatever tools you use on your data in the future, leading and trailing spaces should be removed.
-----
-=== Choice Lists ===
-These are all tests related to the choice lists used in select_one and in select_multiple types of fields.
-==== Unused Choice Lists ====
-This test makes sure that all lists defined in the choices list sheet are actually used in at least one select_one or select_multiple field in the survey sheet. It is not incorrect to have unused lists, but it is likely a sign of something that is not kept up to date in your choice lists and might therefore cause an error, an expected behavior, or list items not being displayed during the survey.
-==== Value/Name Numeric ====
-In Stata, categorical data is best and most efficiently stored as a number with a value label. The easiest way to ensure that is the case with data collected by SurveyCTO is to use the Stata data import file they provide through SurveyCTO Sync, but that only works if there values in the value/name column in the choices sheet are numeric. It is not incorrect to use string var, but you will have to spend more time cleaning your data set to follow Stata best practices. This test lists all list items that has a non-numeric value in the value/name column.
-==== Duplicated List Code ====
-This test makes sure that there are no duplicates in list names and codes in the choice sheet. This test lists all list items that have other list items with the same two values in the name and code columns.
-==== Duplicated List Labels ====
-This test that there is no labels in the same list that is identical, i.e. one label that is listed twice for the same choice list but with different codes. This test lists all list items that have other list items with the same two values in the name and label columns.
-==== Missing Labels or Value/Name in Choice Lists ====
-The first part of this test makes sure that there is no list items that have a value in the label column but no value in the value/name column. The second part of this tests makes sure the opposite does not happen. This is extra likely to occur when a form is programmed in multiple languages. This test lists all list items caught by either if these two tests.
-== Back to Parent ==
-This article is part of the topic [[Stata_Coding_Practices#iefieldkit|iefieldkit]]
 [[Category: Stata ]]
