# Difference between revisions of "Stata Coding Practices: Visualization"

## Revision as of 21:52, 9 November 2020

(This page is under construction.)

Modern Stata versions have extremely powerful graphics capabilities which allow the rapid creation of publication-quality graphics from almost any kind of tabular data. Although the default graphical commands and settings leave much to be desired, the customizability and interoperability of Stata's visualization tools mean that almost any imaginable output can be rendered using Stata's built-in graphics engine.

## Read First

Stata graphics are typically created using one of four command types. Each has specific use cases, strengths, and weaknesses, and it is important to be familiar with the abilities and limitations of each when considering which to use to create a particular visualization. All four methods (except some user-written commands) use the same basic styling syntax discussed in this article.

- The `graph` command suite creates pre-packaged visualizations, typically based on Stata's native `collapse` syntax and statistics.
- The `twoway` suite, which is the most commonly used tool, allows a flexible and open-ended approach to visualizing any amount of information in an abstract set of axes.
- Built-in graphical commands (such as `lowess`) offer pre-packaged visualizations that do not follow the `graph` style. These commands are typically better used within a `twoway` environment and may behave differently when used independently.
- User-written commands (such as `iegraph` or `spmap`) create custom visualizations, but typically have unique purpose-built syntaxes and cannot be integrated in a `twoway` environment.
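
As a rough illustration of the first three command types, the following sketch uses the `auto` example dataset that ships with Stata. The dataset, variables, and graph names are chosen only for illustration and are not part of the original examples on this page; user-written commands are omitted because each has its own syntax.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* graph suite: bars of a collapse-style statistic (mean price by origin)
graph bar (mean) price, over(foreign) name(bar_demo, replace)

* twoway suite: layer a scatterplot and a linear fit on the same axes
twoway (scatter price mpg) (lfit price mpg), name(twoway_demo, replace)

* Built-in graphical command used on its own
lowess price mpg, name(lowess_demo, replace)

* The same smoother used as a plot type inside twoway
twoway (scatter price mpg) (lowess price mpg), name(lowess_tw, replace)
```

Comparing the last two graphs shows how a built-in command can produce different default titles and styling when it is called on its own rather than inside `twoway`.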

## General Graphics Tools

### Graphics options

```
local bad BAD
sysuse auto`bad'.dta
```

### Graphical schemes

Graphical schemes apply a large number of these options simultaneously, and in doing so they provide one of the highest degrees of cross-system consistency that is possible in creating graphs. Stata includes several built-in graphical schemes; the familiar "Stata blue" graphs are created using the `s2color` scheme.

The graph scheme can be changed using the `set scheme` command. Stata will use the `sysdir` path to search for matching graph schemes, so for example a third-party scheme file (like Uncluttered) might be included in the top-level directory of a repository and applied in the run file by writing:

```
sysdir set PERSONAL "${directory}/"
set scheme uncluttered
```

This directs Stata to search for `scheme-uncluttered.scheme` and apply it to all graphics created while Stata remains open. This is a simple scheme which incorporates many of the universally applicable options above for all graphs, particularly region coloring and axis marking. As with any third-party scheme, you should read the documentation; notably, this scheme provides a specific color palette and turns off the legend by default.

One thing that schemes apparently cannot do is control the default graphics font. This can be done using `graph set`, as in `graph set window fontface "Helvetica"`.
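
A minimal sketch of the scheme and font settings together, using the built-in `s1color` scheme so that no third-party file is needed. The dataset and the choice of scheme are illustrative assumptions, and the Helvetica font must be installed on the system for the font setting to have a visible effect.

```stata
* Remember the current scheme so it can be restored afterwards
local old_scheme "`c(scheme)'"

* Apply a built-in scheme to all graphs drawn from here on
set scheme s1color

* Set the font used in the Graph window (assumes Helvetica is installed)
graph set window fontface "Helvetica"

* Any graph drawn now picks up both settings
sysuse auto, clear
scatter price mpg, name(scheme_demo, replace)

* Restore the default font and the previous scheme
graph set window fontface default
set scheme `old_scheme'
```

Placing these settings in the run file, as described above, means every graph in the project is drawn with the same scheme and font regardless of which machine runs the code.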

### Combining Stata graphics

Combining multiple graphs into a single image is an excellent way to present various aspects of a single analysis at the same time. Combining graphs is especially useful when facing constraints on the number of allowable exhibits, or when one or more graphical elements are very simple but important.

There are two main approaches to combining graphs: overlaying multiple pieces of information on the same set of axes, or combining multiple visualizations into a single image with multiple panels (either aligned or not, although Stata handles alignment somewhat poorly).

Overlaying graphics is accomplished using `twoway` syntax. In `twoway`, the graph axes are abstract, so with some abuse of notation it is possible to draw just about anything. Starting from the first axis, and proceeding in order of the commands written, Stata will layer graphs on top of each other on the same set of axes. Including a second (possibly invisible) axis allows further possibilities. For example, with the Uncluttered scheme applied and Helvetica set as the graph font, we might write the following `twoway` command:

```
twoway ///
/// Stacked histogram using total/subset approach
(histogram date ///
, freq yaxis(2) fc(gs14) ls(none) start(19997) width(7) barwidth(6) ) ///
(histogram date if voucher_use == 0 ///
, freq yaxis(2) fc(gs10) ls(none) start(19997) width(7) barwidth(6) ) ///
/// Positivity
(lpoly mtb date if voucher_use == 0 , lc(black) lw(thick) lp(solid)) ///
(lpoly mtb date if voucher_use == 1 , lc(red) lw(thick) lp(solid)) ///
(lpoly rifres date if voucher_use == 0 , lc(black) lw(thick) lp(dash)) ///
(lpoly rifres date if voucher_use == 1 , lc(red) lw(thick) lp(dash)) ///
/// Data collection
(function 0.8 , lc(black) range(20193 20321)) ///
(scatteri 0.8 20193 "Round 1" , mlabcolor(black) m(none) mlabpos(1)) ///
(function 0.8 , lc(black) range(20814 20877)) ///
(scatteri 0.8 20814 "Round 2" , mlabcolor(black) m(none) mlabpos(1)) ///
/// Overall options
, legend(on size(vsmall) pos(12) ///
order( ///
2 "TB Tests Done, non-PPIA" ///
1 "TB Tests Done, PPIA" ///
3 "TB Positive Rate, non-PPIA" ///
4 "TB Positive Rate, PPIA" ///
5 "Rifampicin Resistance, non-PPIA" ///
6 "Rifampicin Resistance, PPIA" )) ///
${hist_opts} xoverhang ///
ylab(${pct}) ytit("Weekly Tests (Histogram)", axis(2)) ///
xtit(" ") xlab(,labsize(small) format(%tdMon_CCYY))
```

If we did, we would obtain something like:
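
The command above depends on project data and global macros that are not shown here. A smaller, self-contained sketch of the same layering idea, assuming the `auto` example dataset (with variables and labels chosen only for illustration), might look like this:

```stata
sysuse auto, clear

twoway ///
    /// Histogram drawn first, on a second y-axis, so it sits in the background
    (histogram mpg, freq yaxis(2) fc(gs14) lc(none) width(2)) ///
    /// Smoothed price curves layered on top, on the first y-axis
    (lpoly price mpg if foreign == 0, lc(black) lw(thick)) ///
    (lpoly price mpg if foreign == 1, lc(red) lw(thick)) ///
    /// Overall options
    , legend(on order(1 "Number of cars" 2 "Price, domestic" 3 "Price, foreign")) ///
      ytitle("Number of cars (histogram)", axis(2)) xtitle("Mileage (mpg)") ///
      name(overlay_demo, replace)
```

The numbers in `order()` refer to the plots in the order they are written, which is also the order in which Stata layers them.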

Alternatively, we might like to display information in panels that would not layer well together, or from commands which cannot be combined by `twoway`. For example, after creating some graphs with user-written commands (and including their panel titles), we might write:

```
graph combine ///
"${git}/outputs/f-discontinuity-1.gph" ///
"${git}/outputs/f-discontinuity-2.gph" ///
"${git}/outputs/f-discontinuity-3.gph" ///
"${git}/outputs/f-discontinuity-4.gph" ///
, altshrink
```

And we would obtain something like:

The `graph combine` command provides many options for customizing the layout and alignment of the graphs included. The user-written `grc1leg` command may also be useful when all of the visualizations included in the final image are intended to share a common legend.

To save processing time when combining graphs, consider creating the underlying graphs with the `nodraw` option, which suppresses their display so that only the combined graph is drawn. Rendering the Graph window is computationally costly in Stata and is best avoided whenever possible.
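
A minimal sketch of this workflow, assuming the `auto` example dataset and graphs held in memory under illustrative names rather than saved as `.gph` files:

```stata
sysuse auto, clear

* Build each panel in memory without drawing it
scatter price mpg, name(panel_a, replace) nodraw ///
    title("Price vs. mileage")
histogram price, freq name(panel_b, replace) nodraw ///
    title("Price distribution")

* Only the combined graph is rendered in the Graph window
graph combine panel_a panel_b, cols(2) name(combined_demo, replace)
```

The same pattern applies when the panels are saved to disk with `saving()` and later combined by file name, as in the `graph combine` example above.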