Difference between revisions of "Stata Coding Practices: Debugging"

Jump to: navigation, search
(Created page with "Debugging is the process of fixing runtime errors or other unexpected behaviors in Stata code. Unlike normal code execution, debugging involves intentionally preventing code f...")
 
Line 6: Line 6:
* The <code>pause</code> command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished.
* The <code>pause</code> command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished.
* The <code>display</code> command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution.
* The <code>display</code> command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution.
==Principles of Code Debugging in Stata==
When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid.
Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with <code>trace</code> or <code>display</code>. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a <code>pause</code> or <code>display</code> command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made.

Revision as of 21:00, 21 October 2020

Debugging is the process of fixing runtime errors or other unexpected behaviors in Stata code. Unlike normal code execution, debugging involves intentionally preventing code from running completely so the user can investigate the current state of data or memory and determine what code would produce the desired outputs in a complete execution of the code.

Read First

Stata code is typically debugged using three tools: trace, pause, and display. Understanding how to use these tools interactively will allow you to pinpoint and correct errors in code.

  • The trace command requests that Stata display additional levels of detail as it executes code. Nearly all Stata code and commands use other commands as part of their execution, and an appropriately set trace readout will allow you to pinpoint the line or command where an issue is occurring.
  • The pause command freezes the execution of Stata code in the designated location and optionally displays information about Stata's memory state. While paused, Stata allows the user to interact with the data as it is at that point in code execution and to permit code execution to continue when finished.
  • The display command is used to extract specific information about Stata's current state without pausing or tracing all code execution. Since Stata has rich information on hand at all times about macros and data in memory, it is possible to detect problems when unexpected or inappropriate values are displayed during code execution.

Principles of Code Debugging in Stata

When code execution produces errors or unexpected results, it is typically very difficult to determine what is causing an issue simply by looking at the code and results. Such a bug usually results from an intermediate state arising from a complex combination of code, which is not easily foreseeable when writing the code. Typical examples occur when code is generalized or looped such that code that is written or tested in a given environment is then applied to situations where it was not originally envisioned or tested. For example, a loop that includes conditional statements or data subsetting may result in empty datasets or undefined macros or expressions and cause later code to be invalid.

Debugging Stata code follows a three-step process each time such an issue is encountered. First, the author must determine where the issue is occurring, by finding the literal line of code that is producing the error or unexpected result. This is usually done with trace or display. Next, the author must determine why the input to that line is different than what they had anticipated, which typically involves inserting a pause or display command to assess the state of various memory elements at that point. Finally, the author needs to address the problem, which typically involves following the same process upstream in the code to find the line where the incorrect input is being generated. This process is iterated until the root cause is found and a code correction can be made.