Ieboilstart

Latest revision as of 20:15, 4 June 2019

ieboilstart is a Stata command that standardizes version, memory, and other Stata settings across all users for a project. Such code is usually referred to as boilerplate code, hence the command name. Research teams should standardize settings via boilerplate code throughout the course of a project – and especially during randomization – to ensure that code behaves in a replicable manner between users. This page describes how to use ieboilstart.

Read First

  • ieboilstart does not guard against discrepancies caused by differences in computer setup or Stata type. It is the users’ responsibility to ensure that they are running the same type of Stata (Small/IC/SE/MP) on the same computer setup (PC/Mac/Linux).
  • ieboilstart standardizes some settings by default (e.g. set more off, set varabbrev off) and allows users to specify additional settings through options. For detailed instructions on how to implement the command and its options in Stata, type help ieboilstart in Stata.
  • This command is part of the package ietoolkit. To install all commands in this package, including ieboilstart, type ssc install ietoolkit in Stata.
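The installation and help steps listed above can be run as the short sketch below. The replace option and the which check are additions for illustration and are not required by ieboilstart.

```stata
* Install (or update) the ietoolkit package from SSC
ssc install ietoolkit, replace

* Confirm that the command is found on this computer
which ieboilstart

* Open the help file for the full list of options
help ieboilstart
```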

Overview

ieboilstart is a Stata command that standardizes version, memory, and other Stata settings across all users for a project. The command standardizes some settings by default and also allows users to specify additional settings through options. ieboilstart should be run at the top of all do-files to ensure identical results for all users. If a project consists of many do-files that are run from a master do-file, then it is only necessary to run ieboilstart 1) at the top of master do-files that run other do-files, and 2) at the top of do-files that include randomization.
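As a minimal sketch of this layout, a master do-file might look like the following. The folder path and do-file names are hypothetical placeholders, and the version number is only an example.

```stata
* Master do-file (sketch). Folder and file names below are hypothetical.
ieboilstart, version(14.0)
`r(version)'

* Hypothetical project folder; each user would point this to their own copy
global project "C:/Users/username/project"

* Do-files run from here inherit the settings made above
do "${project}/dofiles/cleaning.do"

* This do-file includes randomization, so it would also run ieboilstart at its own top
do "${project}/dofiles/randomization.do"

do "${project}/dofiles/analysis.do"
```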

Disclaimer: Ensuring Identical Results

For technical reasons, ieboilstart cannot guarantee that different types of Stata (Small/IC/SE/MP or PC/Mac/Linux) work exactly the same in every possible context. The command does not guard against discrepancies in Stata itself or in user-contributed commands: it is solely a collection of common practices that reduce the risk of the same code running differently for different users.

Implementation

ieboilstart sets three types of settings: version settings, memory settings and other settings.

Version Settings

As impact evaluations and other research projects often span many years, users will likely run the same code in different versions of Stata over time. This may introduce discrepancies. For example, randomization is extremely sensitive to the Stata version, since the randomization algorithm is often updated between versions of Stata. As such, research teams must use the same version setting and, as discussed above, ideally the same computer setup to ensure that the code behaves identically across users. To correctly set the version:

  1. Run ieboilstart with the version option (line 1)
  2. Call one of the returned values (line 2)
ieboilstart, version(14.0)
`r(version)'

Setting the version is best practice and perhaps the most important setting to establish via ieboilstart.
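To illustrate why the version setting matters for randomization, here is a minimal sketch of a do-file that sets the version before drawing random numbers. The seed value, the variable names and the number of observations are hypothetical, and a real randomization would also need a stable sort on a unique ID.

```stata
* Set the version first so that the random-number generator is the same for all users
ieboilstart, version(14.0)
`r(version)'

clear
set obs 100
generate id = _n                  // hypothetical unit identifier

* Hypothetical seed, chosen only for this illustration
set seed 650901

* Draw a random number per unit and assign treatment to draws below one half
generate random_draw = runiform()
generate treatment = (random_draw < 0.5)
tabulate treatment
```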

Memory Settings

In Stata versions before Stata 12, memory is assigned statically. In other words, a fixed amount of memory is assigned to Stata; if a task exceeds this amount, for example when expanding a dataset or running a complex calculation, Stata crashes. In Stata 12 and later, memory is assigned dynamically. In other words, a small amount of memory is assigned to Stata when it starts, and this amount is increased as needed. The only memory limit for Stata 12 and above is the one dictated by the computer's hardware.

ieboilstart can set the fixed memory in Stata 11 with the option setmem. This option is simply ignored in Stata 12 or later. For Stata 12 and later, the dynamic memory can be fine-tuned through the commands set min_memory, set max_memory, set niceness, and set segmentsize. However, even highly advanced users rarely have to worry about these settings as long as they are set to the recommended default values, which ieboilstart ensures.
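For users who want to see what the dynamic memory settings currently are, the sketch below inspects them in Stata 12 or later. It only reads the settings and does not change anything; it is not part of ieboilstart itself.

```stata
* Show all current memory settings in one table (Stata 12 and later)
query memory

* The same settings are also stored as c-class values
display c(min_memory)
display c(max_memory)     // shown as missing (.) when no upper limit is set
display c(niceness)
display c(segmentsize)
```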

Other Settings

Other settings are standardized via ieboilstart as they are either very commonly preferred or reduce the risk of errors between users. These settings can be reverted to personal preferences after running ieboilstart or by using the custom() option.
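As a sketch of the first approach, a user can change a setting back to a personal preference on the lines after ieboilstart. The choice of settings below is only an example; the custom() option is not shown here, so see help ieboilstart for its syntax.

```stata
ieboilstart, version(14.0)
`r(version)'

* Personal preferences applied after the boilerplate; these override the defaults above
set more on
pause off
```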

set more off

With set more off, which is ieboilstart’s default, Stata keeps running until the output is complete, rather than pausing each time the results fill the window and waiting for the user to tell it to continue.

pause on

With pause on, which is ieboilstart's default, users can take advantage of Stata’s pause command. This is a great debugging tool. Type help pause in Stata for more details.
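A minimal sketch of how pause can be used for debugging is shown below. It uses the auto dataset that ships with Stata, and the loop and message are hypothetical. When execution pauses, typing q resumes it and typing BREAK stops the do-file.

```stata
pause on
sysuse auto, clear

foreach var of varlist price mpg weight {
    summarize `var'
    * Execution stops here so the results can be inspected interactively
    pause Check the summary of `var' before continuing
}
```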

set varabbrev off

With set varabbrev off, which is ieboilstart's default, variable abbreviation is turned off to avoid errors. When abbreviation is on, Stata lets you refer to a variable by any unambiguous prefix of its name: if you have a variable called harvest, you can call it by typing just harv, provided no other variable name starts with harv. Copy and paste the code below and run it in Stata to see for yourself.

clear
set obs 100
set varabbrev on

// Generate a tomato harvest variable and summarize it using variable abbreviation
generate harvest_tomato = runiform()
summarize harv

// Generate a potato harvest variable and try to summarize it using the same abbreviation.
// This fails because harv now matches two variables and is ambiguous.
generate harvest_potato = runiform()
summarize harv

Variable abbreviation is prone to strange errors, especially when several people collaborate on code. Thus, ieboilstart turns variable abbreviation off by default.
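For comparison, the sketch below shows what happens to the same abbreviation once variable abbreviation is off. It uses capture so that the do-file keeps running and the return code can be inspected; the variable name is the same illustrative one as above.

```stata
clear
set obs 100
set varabbrev off

generate harvest_tomato = runiform()

* With abbreviation off, harv no longer resolves to harvest_tomato
capture summarize harv
display "Return code with abbreviation: " _rc    // nonzero, because harv is not a variable

* The full variable name still works
summarize harvest_tomato
```

Because the abbreviated name fails immediately, a typo or a later-added variable with a similar name cannot silently change which variable the code uses.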

Back to Parent

This article is part of the topic ietoolkit
