Difference between revisions of "Stata Coding Practices: Debugging"
Line 2: | Line 2: | ||
==Read First== | ==Read First== | ||
Stata code is typically debugged using | Stata code is typically debugged using four tools: <code>trace</code>, <code>pause</code>, <code>display</code>, and <code>capture</code>. Understanding how to use these tools interactively will allow you to pinpoint and correct errors in code. | ||
* The <code>trace</code> command requests that Stata display additional levels of detail as it executes code. Nearly all Stata code and commands use other commands as part of their execution, and an appropriately set <code>trace</code> readout will allow you to pinpoint the line or command where an issue is occurring. | * The <code>trace</code> command requests that Stata display additional levels of detail as it executes code. Nearly all Stata code and commands use other commands as part of their execution, and an appropriately set <code>trace</code> readout will allow you to pinpoint the line or command where an issue is occurring. | ||
* The <code>pause</code> command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished. | * The <code>pause</code> command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished. | ||
* The <code>display</code> command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution. | * The <code>display</code> command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution. | ||
* The <code>capture</code> command is used to extract specific information about errors resulting from commands while still allowing code to continue executing past a point that would otherwise result in Stata breaking and refusing to continue. | |||
==Principles of | ==Principles of code debugging in Stata== | ||
When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid. | When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid. | ||
Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with <code>trace</code> or <code>display</code>. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a <code>pause</code> or <code>display</code> command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made. | Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with <code>trace</code> or <code>display</code>. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a <code>pause</code> or <code>display</code> command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made. | ||
==Detecting the location of code errors== | |||
When a command is executed in Stata and cannot be successfully run through to its endpoint, the error messages that are displayed are often uninformative, particularly for user-written commands. Thankfully, Stata provides a useful set of commands for users to examine code execution line-by-line, and Stata makes the data and memory state accessible to users before and after the execution of every single command. | |||
When there is an error in code, Stata will break during that command and report its error state, and refuse to continue executing any more code. This usually makes it easy to detect the exact command that is unsuccessful and begin to understand and correct the error. However, when do-files execute other do-files, or when user-written commands are called, Stata by default suppresses reporting the line-by-line execution of these commands (in fact, displaying line-by-line results is relatively slow in Stata and should be avoided when possible by using <code>quietly</code> where possible). |
Revision as of 21:18, 21 October 2020
Debugging is the process of fixing runtime errors or other unexpected behaviors in Stata code. Unlike normal code execution, debugging involves intentionally preventing code from running completely so the user can investigate the current state of data or memory and determine what code would produce the desired outputs in a complete execution of the code.
Read First
Stata code is typically debugged using four tools: trace
, pause
, display
, and capture
. Understanding how to use these tools interactively will allow you to pinpoint and correct errors in code.
- The
trace
command requests that Stata display additional levels of detail as it executes code. Nearly all Stata code and commands use other commands as part of their execution, and an appropriately settrace
readout will allow you to pinpoint the line or command where an issue is occurring. - The
pause
command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished. - The
display
command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution. - The
capture
command is used to extract specific information about errors resulting from commands while still allowing code to continue executing past a point that would otherwise result in Stata breaking and refusing to continue.
Principles of code debugging in Stata
When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid.
Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with trace
or display
. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a pause
or display
command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made.
Detecting the location of code errors
When a command is executed in Stata and cannot be successfully run through to its endpoint, the error messages that are displayed are often uninformative, particularly for user-written commands. Thankfully, Stata provides a useful set of commands for users to examine code execution line-by-line, and Stata makes the data and memory state accessible to users before and after the execution of every single command.
When there is an error in code, Stata will break during that command and report its error state, and refuse to continue executing any more code. This usually makes it easy to detect the exact command that is unsuccessful and begin to understand and correct the error. However, when do-files execute other do-files, or when user-written commands are called, Stata by default suppresses reporting the line-by-line execution of these commands (in fact, displaying line-by-line results is relatively slow in Stata and should be avoided when possible by using quietly
where possible).