Stata Coding Practices: Debugging

Debugging is the process of fixing runtime errors or other unexpected behaviors in Stata code. Unlike normal code execution, debugging involves intentionally preventing code from running completely so the user can investigate the current state of data or memory and determine what code would produce the desired outputs in a complete execution of the code.

Read First

Stata code is typically debugged using four tools: trace, pause, display, and capture. Understanding how to use these tools interactively will allow you to pinpoint and correct errors in code.

The trace command requests that Stata display additional levels of detail as it executes code. Nearly all Stata code and commands use other commands as part of their execution, and an appropriately set trace readout will allow you to pinpoint the line or command where an issue is occurring.
The pause command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished. (Alternatively, exit can be used to simply halt execution in place, although this will not work in some circumstances, such as when preserve is active.)
The display command (or other output command) is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution.
The capture command is used to extract specific information about errors resulting from commands while still allowing code to continue executing past a point that would otherwise result in Stata breaking and refusing to continue.

Principles of code debugging in Stata

When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid.

Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with trace or display. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a pause or display command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made.

Locating code issues with trace

When a command is executed in Stata and cannot be successfully run through to its endpoint, the error messages that are displayed are often uninformative, particularly for user-written commands. Thankfully, Stata provides a useful set of commands for users to examine code execution line-by-line, and Stata makes the data and memory state accessible to users before and after the execution of every single command.

When there is an error in code, Stata will break during that command and report its error state, and refuse to continue executing any more code. This usually makes it easy to detect the exact command that is unsuccessful and begin to understand and correct the error. However, when do-files execute other do-files, or when commands are called, Stata by default suppresses reporting the line-by-line evaluation and execution of these commands (displaying line-by-line results is relatively slow in Stata and should be avoided by using quietly where possible).

To detect the location of errors in code that is called from a single do-file, the trace setting needs to be invoked. By running set trace on and set tracedepth # (where # is an integer), and then re-running the code, Stata will report detailed evaluations of the code up to the level requested. Using set tracedepth 1 does nothing. Using set tracedepth 2 will expand the evaluation of each command called by the main code, allowing detection of erroneous inputs to those commands by showing where those commands have stopped executing. Using set tracedepth 3 will expand those subcommands, and so on. This gets voluminous very quickly, so it is very important to set trace off when finished.

For example, running:

local bad BAD
sysuse auto`bad'.dta

returns:

. sysuse auto`bad'.dta
file "autoBAD.dta" not found
r(601);

and running it after set tracedepth 2 returns:

. sysuse auto`bad'.dta
  --------------------------------------------------------------------------- begin sysuse ---
  - version 8
  - gettoken first : 0, parse(" ,") quotes
  - if `"`first'"'=="dir" {
  = if `"autoBAD.dta"'=="dir" {
    gettoken first 0 : 0, parse(" ,")
    sysusedir `0'
    exit
    }
  - local 0 `"using `0'"'
  = local 0 `"using autoBAD.dta"'
  - syntax using/ [, CLEAR REPLACE]
  - local clear = cond("`replace'"!="", "clear", "`clear'")
  = local clear = cond(""!="", "clear", "")
  - if bsubstr(`"`using'"',-4,.)!=".dta" {
  = if bsubstr(`"autoBAD.dta"',-4,.)!=".dta" {
    local using `"`using'.dta"'
    }
  - quietly findfile `"`using'"'
  = quietly findfile `"autoBAD.dta"' file "autoBAD.dta" not found
  ----------------------------------------------------------------------------- end sysuse ---
r(601);

This result makes the error more explicit and, among other things, helps detect that the contents of the macro `bad' are the problem. When finished, set trace off will return Stata to its normal display behavior. The trace help file details various other functionalities that help quickly parse through the material that is returned. You may also need to check settings of any code that is executed quietly to ensure it is correctly displayed in the Results window.

Locating code issues with display

While trace is an invaluable tool for locating issues in code by manual inspection, its outputs are voluminous and can be hard to search and parse, regardless of whether you are doing so in the Results window directly or printing to a log file. In many situations, Stata's default error behavior of breaking at a failed command make trace unnecessary to find the location of the issue. In other cases, Stata code will break inside a loop, and while you could diagnose such an issue by having trace report all the calculations inside the loop, it is often much simpler only to report the information you need to find the bug, using display. (The workflow detailed here, by the way, is easily adapted for any output command, such as matlist.)

The display command is simple: it prints the requested information to the Results window. Thanks to the large amount of information stored by Stata, this is an invaluable access point to information about the state of the program just before the code breaks. For example, you can write display commands like:

display `" This is iteration `i'. The dependent variable is `yvar'. "' 
display `" It is labelled "`: var lab `yvar''". Output to {browse `filepath'}. "' 
display `" There are `c(N)' observations and `c(k)' variables in the dataset. "' 
display `" The last regression coefficient was `=_b[mpg]'. "' 
display `" Average cluster size was `=e(N)/e(N_clust)'. "'

Debugging with display should make liberal use of the built-in Stata features to access and handle information and results. These include macros, whether from loops or organically created; extended functions (documentation at extended_fcn); SMCL code to create clickable links (smcl); on-the-fly evaluation with `=' and `:'; and system information from return, ereturn, and creturn. Finally, display, like most other string-handling commands, should almost always be enclosed in compound double quotes `" "' to ensure it handles any possible input correctly.

Simple SMCL formatting may improve output:

di `"  This is a demonstration {break}"' /// 
   `"    of breaking lines in {help display}."'

The as err appendix will almost certainly ensure output is displayed, regardless of quietly settings in other code.

di as err `"  Ran through iteration `i'/`N'! "'

Diagnosing code issues with pause

The pause command is an atypical command from both the generic programming perspective and the Stata coding perspective, yet it is vitally important to debugging Stata code. The pause command, inserted in a do-file, temporarily stops code execution and allows the user to interact with the data state as it exists at that line of code. This allows you to test how different commands would work at the exact data state, in order to figure out how to resolve issues in the line you have identified.

The typical use of pause is done by inserting the following code directly before the line in which code breaks:

pause on
pause [optional message]

If you simply want to interact with the data, no optional message is required. However, it can be very useful to insert this message, because Stata will still not allow things like local macros to exceed the scope of do-file execution. This means that you will not be able to directly access their contents at this point. For example, if we run:

local bad BAD
  
pause on
pause `bad'

we get:

pause:  BAD
-> .

but if we then use the pause prompt (->) to enter:

-> . di "`bad'"

Stata returns nothing, because `bad' is not defined in the current scope.

Once Stata is paused, you can interact with the data as normal. When you are finished, you have two choices. The first option is to have Stata continue execution of the do-file where it left off. To do this, simply enter q at the pause prompt. This will continue to incorporate any interactive changes you made to the data -- it will not revert Stata to the pre-paused state -- so if you make changes that you want to permanently incorporate you will typically prefer to code those changes into the do-file in question and re-run it from the top.

To terminate the do-file's execution so it can be restarted, enter BREAK (in all caps) at the pause prompt. The usual --Break-- button will have no effect. If you forget to do this before re-running the do-file, you can end up with several pause states active at the same time and you will have to BREAK them all to get back to the ordinary state. Try to avoid entering this state, as it disables some key Stata features such as overwriting certain types of program inputs such as ado-files and can lead to unexpected behavior.

You can create a handy shortcut by coding:

pause on
di "{stata BREAK: Click to {red:BREAK}}"
pause

Note that BREAK is an unrecognized command in all other settings and will do nothing if you enter it in a non-pause state. For this reason, trying to circumvent the need to manually BREAK the pause state by, say, writing cap BREAK at the beginning of your do-file will not have any effect.

Diagnosing code issues with capture

The capture command has a unique place in Stata programming. Like quietly, it is invoked as a prefix to another command to modify its result. In particular, capture instructs Stata to continue the execution of code as if no error had occurred, even in case the underlying command failed to execute and returned an error.

First, a warning. While you can use capture in ordinary programming, and it may seem very convenient, it is highly discouraged because it often leads to extremely unexpected results. This occurs because Stata code, particularly user-written commands, is very inconsistent in whether or not it validates and executes functionality jointly or piecewise. In other words, if you capture a command which has multiple separable elements, and one or more of them fails, it will either: perform all the parts that it can even if some parts fail; or it will perform none of them; or it will perform some of the working parts and halt, perhaps arbitrarily, when it encounters the first breaking element and leave the rest undone. How a particular command will behave is almost impossible to predict without extensive knowledge and testing; your results should therefore never depend on this behavior.

In programming and debugging, however, capture is invaluable, because it allows you to log a series of errors without tediously documenting and resolving each one in turn. When a loop is returning an error or issue, for example, capture can be used to diagnose its cause much more effectively than the other methods.

This is done by taking advantage of the built-in variable _rc, which is the "return code" for the Stata state. It takes a value of 0 when the most recent command ran without error and it takes some other value otherwise. A capture prefix always results in the value of _rc being set to some nonzero value if the corresponding command fails, which we can reference in the next line of code. For example, we can write:

sysuse census.dta, clear
  
local errors ""
foreach var of varlist * {
  cap egen `var'_mean = mean(`var')
  if _rc != 0 local errors `" `errors' {break}  `var'"'
  cap lab var `var'_mean "Mean of `: var lab `var''"
}
  
di `"These variables failed: `errors'"'

and Stata will return:

These variables failed:    
  state 
  state2

This functionality allows us to quickly understand commonalities across errors in iterations of our loops and diagnose what causes those cases to fail as a group, rather than on an individual basis.

You could, theoretically, use the exact error codes described in the manual to diagnose issues even more explicitly by writing something like:

if _rc != 0 local errors `" `errors' {break}  {red:(`=_rc')} `var' "'

Unfortunately, this is not very helpful in most cases. The error codes included in Stata are not very specific, user-written commands are typically lax in coding these carefully, and it is not possible to record an error message from a program to review later. It is much more useful to simply record additional information. Note, as below, that you must use if _rc != 0 { ... } to do this, as repeated calls will not work (_rc will be reset to 0 when the next command succeeds).

if _rc != 0 {
  local type : type `var'
  local errors `" `errors' {break}  {red:(`type')} {tab} `var' "'
}

This returns:

These variables failed:    
  (str14)        state  
  (str2)         state2