Stata Coding Practices: Programming (Ado-files)
Revision as of 21:43, 24 November 2020
Programs and ado-files are the main methods by which Stata code is condensed and generalized. By writing versions of code that apply to arbitrary inputs and saving that code in a separate file, the application of the code is cleaner in the main do-file and it becomes easier to re-use the same analytical process on other datasets in the future. Stata has special commands that enable this functionality. All commands on SSC are written as ado-files by other programmers; it is also possible to embed programs in ordinary do-files to save space and improve organization of code.
==Read First==
This article will refer somewhat interchangeably to the concepts of "programming", "ado-files", and "user-written commands". This is in contrast to ordinary programming of do-files. The article does not assume that you are actually writing an ado-file (as opposed to a program definition in an ordinary do-file), and it does not assume you are writing a command for distribution. That said, Stata programming functionality is achieved using several core features:
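- The <syntaxhighlight lang="stata" inline>program</syntaxhighlight> command sets up the code environment for writing a program into memory.
- The <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> command parses inputs into a program as macros that can be used within the scope of that program execution.
- The <syntaxhighlight lang="stata" inline>tempvar</syntaxhighlight>, <syntaxhighlight lang="stata" inline>tempfile</syntaxhighlight>, and <syntaxhighlight lang="stata" inline>tempname</syntaxhighlight> commands all create objects that can be used within the scope of program execution to avoid any conflict with arbitrary data structures.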
==The <syntaxhighlight lang="stata" inline>program</syntaxhighlight> command==
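The <syntaxhighlight lang="stata" inline>program</syntaxhighlight> command defines the scope of a Stata program inside a do-file or ado-file. When a <syntaxhighlight lang="stata" inline>program</syntaxhighlight> command block is executed, Stata stores (until the end of the session) the sequence of commands written inside the block and assigns them to the command name used in the <syntaxhighlight lang="stata" inline>program</syntaxhighlight> command. Running <syntaxhighlight lang="stata" inline>capture program drop</syntaxhighlight> with that name before the block ensures that the command name is available, because Stata will not overwrite a program that is already in memory. For example, we might write the following program in an ordinary do-file: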
cap prog drop autoreg
prog def autoreg
    reg price mpg i.foreign
end
After executing this command block (note that <syntaxhighlight lang="stata" inline>end</syntaxhighlight> tells Stata where to stop reading), we could run:
sysuse auto.dta , clear
autoreg
If we did this, Stata would output:
. autoreg
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
             |
     foreign |
    Foreign  |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------
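All this is to say that Stata has stored the command <syntaxhighlight lang="stata" inline>reg price mpg i.foreign</syntaxhighlight> and will execute it whenever <syntaxhighlight lang="stata" inline>autoreg</syntaxhighlight> is run, as if it were an ordinary command.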
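To see what Stata is holding in memory, the built-in `program dir` and `program list` commands can be used. The sketch below is only an illustration of the session-level storage described above. It re-defines `autoreg`, inspects it, and then drops it again.

```stata
// Define the program (dropping any earlier definition first)
cap prog drop autoreg
prog def autoreg
    reg price mpg i.foreign
end

// List the names of all programs currently stored in memory
program dir

// Print the commands stored under the name autoreg
program list autoreg

// Remove the program; after this, autoreg is no longer a recognized command
program drop autoreg
cap autoreg
di "Return code after dropping: " _rc
```

The nonzero return code at the end shows that the command exists only while its definition is in memory. This is why the definition must be executed in each new session before the command is used.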
As a first extension, we might try writing a command that is not dependent on the data, such as one that would list all the values of each variable for us. Such a program might look like the following:
cap prog drop levelslist
prog def levelslist
    foreach var of varlist * {
        qui levelsof `var' , local(levels)
        di "Levels of `var': `: var label `var''"
        foreach word in `levels' {
            di " `word'"
        }
    }
end
We could then run:
sysuse auto.dta , clear
levelslist
Similarly, we could use any other dataset in place of <syntaxhighlight lang="stata" inline>auto.dta</syntaxhighlight>. This means we would now have a useful piece of code that we could execute with any dataset open, without re-writing what is a mildly complex loop each time. When we want to save such a snippet, we usually write an ado-file: we name the file <syntaxhighlight lang="stata" inline>levelslist.ado</syntaxhighlight> and we add a starbang line and some comments with some metadata about the code. The full file would look something like this:
*! Version 0.1 published 24 November 2020
*! by Benjamin Daniels bbdaniels@gmail.com
// A program to print all levels of variables
cap prog drop levelslist
prog def levelslist
    // Loop over variables
    foreach var of varlist * {
        // Get levels and display name and label of variable
        qui levelsof `var' , local(levels)
        di "Levels of `var': `: var label `var''"
        // Print the value of each level for the current variable
        foreach word in `levels' {
            di " `word'"
        }
    }
end
The file would then just need to be run using <syntaxhighlight lang="stata" inline>run levelslist.ado</syntaxhighlight> in the runfile for the reproducibility package to ensure that the command <syntaxhighlight lang="stata" inline>levelslist</syntaxhighlight> would be available to all do-files in that package (since programs have a global scope in Stata). However, this command is not very useful at this stage: it prints every level of every variable, which is far too much output when variables take integer or continuous values with many levels. The next section will introduce code that allows such commands to be customized for each context in which you want to use them.
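A runfile that loads the program before calling other do-files might be organized roughly as follows. This is a sketch only: `cleaning.do` and `analysis.do` are hypothetical file names, and it assumes `levelslist.ado` is in the current working directory.

```stata
// Runfile sketch: file names other than levelslist.ado are hypothetical

// Start from a clean session (clear all also drops programs from memory)
clear all

// Load the program definition; run executes the file without echoing it
run "levelslist.ado"

// Every do-file executed from here on can call levelslist
do "cleaning.do"
do "analysis.do"
```

Using `run` rather than `do` keeps the log free of the program's source code while still storing the definition in memory.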
==The <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> command==
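The <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> command lets a program accept inputs that vary with the context in which it is executed. The <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> command enables all the main features of Stata that appear in ordinary commands, including input lists (such as variable lists or file names), <syntaxhighlight lang="stata" inline>if</syntaxhighlight> and <syntaxhighlight lang="stata" inline>in</syntaxhighlight> restrictions, <syntaxhighlight lang="stata" inline>using</syntaxhighlight> targets, <syntaxhighlight lang="stata" inline>=exp</syntaxhighlight> expressions, weights, and options (after the option comma in the command).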
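The help file for the <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> command is extensive, and the command supports many automated checks and advanced features, particularly for modern features like factor variables and time series (<syntaxhighlight lang="stata" inline>fv</syntaxhighlight> and <syntaxhighlight lang="stata" inline>ts</syntaxhighlight>). For advanced applications, always consult the <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> help file to see how to accomplish your objective. For now, we will take a simple tour of how <syntaxhighlight lang="stata" inline>syntax</syntaxhighlight> creates an adaptive command.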
First, let's add simple syntax allowing the user to select the variables and observations they want to include. We might write:
cap prog drop levelslist
prog def levelslist
    syntax anything [if]
    // Implement [if]
    preserve
    marksample touse
    qui keep if `touse' == 1
    // Main program loops
    foreach var of varlist `anything' {
        qui levelsof `var' , local(levels)
        di " "
        di "Levels of `var': `: var label `var''"
        foreach word in `levels' {
            di " `word'"
        }
    }
end
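One possible next step, sketched here as a variant rather than as the article's own code, is to let the user cap the number of levels printed per variable. The option name `MAXlevels()` and its default of 10 are assumptions made for this illustration. This variant also swaps `anything` for `varlist`, so that Stata validates the variable names, and applies the `if` restriction directly inside `levelsof` instead of using `preserve` and `keep`.

```stata
cap prog drop levelslist
prog def levelslist

    // MAXlevels() is a hypothetical option; it can be abbreviated to max()
    syntax varlist [if] [in] [, MAXlevels(integer 10)]

    // novarlist keeps observations with missing values in the varlist
    marksample touse , novarlist

    foreach var of varlist `varlist' {
        qui levelsof `var' if `touse' , local(levels)

        // Skip variables with more distinct values than requested
        local nlevels : word count `levels'
        if `nlevels' > `maxlevels' {
            di "Skipping `var': more than `maxlevels' levels"
            continue
        }

        di " "
        di "Levels of `var': `: var label `var''"
        foreach word in `levels' {
            di " `word'"
        }
    }

end

// Example call: price has many distinct values, so it is skipped
sysuse auto.dta , clear
levelslist foreign rep78 price if price > 5000 , max(5)
```

Because the data are never altered in this version, there is nothing to restore when the program ends. The trade-off is that every command inside the program must remember to include the `touse` condition.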