1-D Scatter Plots with Confidence Intervals

stripChart is a modification of the R function stripchart. It is a generic function used to produce one dimensional scatter plots (or dot plots) of the given data, along with text indicating sample size and estimates of location (mean or median) and scale (standard deviation or interquartile range), as well as confidence intervals for the population location parameter. One dimensional scatterplots are a good alternative to boxplots when sample sizes are small or moderate. The function invokes particular methods which depend on the class of the first argument.

stripChart(x, ...)

# S3 method for class 'formula'
stripChart(x, data = NULL, dlab = NULL, 
    subset, na.action = NULL, ...)

# Default S3 method
stripChart(x, 
    method = ifelse(paired && paired.lines, "overplot", "stack"), 
    seed = 47, jitter = 0.1 * cex, offset = 1/2, vertical = TRUE, 
    group.names, group.names.cex = cex, drop.unused.levels = TRUE, 
    add = FALSE, at = NULL, xlim = NULL, ylim = NULL, ylab = NULL, 
    xlab = NULL, dlab = "", glab = "", log = "", pch = 1, col = par("fg"), 
    cex = par("cex"), points.cex = cex, axes = TRUE, frame.plot = axes, 
    show.ci = TRUE, location.pch = 16, location.cex = cex, 
    conf.level = 0.95, min.n.for.ci = 2, 
    ci.offset = 3/ifelse(n > 2, (n-1)^(1/3), 1), ci.bar.lwd = cex, 
    ci.bar.ends = TRUE, ci.bar.ends.size = 0.5 * cex, ci.bar.gap = FALSE, 
    n.text = "bottom", n.text.line = ifelse(n.text == "bottom", 2, 0), 
    n.text.cex = cex, location.scale.text = "top", 
    location.scale.digits = 1, nsmall = location.scale.digits, 
    location.scale.text.line = ifelse(location.scale.text == "top", 0, 3.5), 
    location.scale.text.cex = 
      cex * 0.8 * ifelse(n > 6, max(0.4, 1 - (n-6) * 0.06), 1), 
    p.value = FALSE, p.value.digits = 3, p.value.line = 2, p.value.cex = cex, 
    group.difference.ci = p.value, group.difference.conf.level = 0.95, 
    group.difference.digits = location.scale.digits, 
    ci.and.test = "parametric", ci.arg.list = NULL, test.arg.list = NULL, 
    alternative = "two.sided", plot.diff = FALSE, diff.col = col[1], 
    diff.method = "stack", diff.pch = pch[1], paired = FALSE, paired.lines = paired, 
    paired.lty = 1:6, paired.lwd = 1, paired.pch = 1:14, paired.col = NULL, 
    diff.name = NULL, diff.name.cex = group.names.cex, sep.line = TRUE, 
    sep.lty = 2, sep.lwd = cex, sep.col = "gray", diff.lim = NULL, 
    diff.at = NULL, diff.axis.label = NULL, 
    plot.diff.mar = c(5, 4, 4, 4) + 0.1, ...)

Arguments

x

the data from which the plots are to be produced. In the default method the data can be specified as a list or data frame where each component is numeric, a numeric matrix, or a numeric vector. In the formula method, a symbolic specification of the form y ~ g can be given, indicating the observations in the vector y are to be grouped according to the levels of the factor g (the form y ~ 1 indicates no grouping). NAs are allowed in the data.
NOTE: When the formula method is used and the argument paired=TRUE (see below), the data in the vector y must have the same number of observations for each level of the factor g, and within each level the observations must be sorted in the same order according to the pairing variable.

data

for the formula method, a data.frame (or list) from which the variables in x should be taken.

subset

for the formula method, an optional vector specifying a subset of observations to be used for plotting.

na.action

for the formula method, a function which indicates what should happen when the data contain NAs. The default is to ignore missing values in either the response or the group.

...

additional parameters passed to the default method, or by it to plot, points, axis, and title to control the appearance of the plot.

method

the method to be used to separate coincident points. When method="stack" coincident points are stacked, when method="jitter" coincident points are jittered, and when method="overplot" coincident points are overplotted. When there are 2 groups and paired=TRUE and paired.lines=TRUE the default value is method="overplot", otherwise the default value is method="stack" (which differs from the default value for the R function stripchart, which uses method="overplot" by default).

seed

when method="jitter" is used, the argument seed is passed to the R function set.seed. Since jittering depends on the R random number generator, using the same value of seed each time the same data are plotted with stripChart ensures that the resulting plot is the same.
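
The effect of seed can be illustrated with base R alone. The sketch below assumes that jittering adds uniform random noise to each point's position on the group axis (as the base R function stripchart does); the helper jitter_positions is hypothetical and is not part of stripChart, whose internals may differ.

```r
# Hypothetical helper: jittered group-axis positions for n_points points
# drawn at location `at`, using a fixed seed for reproducibility.
jitter_positions <- function(n_points, at = 1, jitter = 0.1, seed = 47) {
  set.seed(seed)                      # same seed -> same random offsets
  at + runif(n_points, -jitter, jitter)
}

first  <- jitter_positions(5)             # seed = 47
second <- jitter_positions(5)             # seed = 47 again
other  <- jitter_positions(5, seed = 48)  # a different seed

identical(first, second)  # same seed reproduces the same positions
identical(first, other)   # a different seed gives different positions
```

Re-using one seed is what makes a jittered plot reproducible from run to run.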

jitter

when method="jitter" is used, jitter gives the amount of jittering applied.

offset

when stacking is used, points are stacked this many line-heights (symbol widths) apart.

vertical

when vertical=TRUE (the default), the plots are drawn vertically rather than horizontally.

group.names

Optional argument (forced to be a character string) explicitly providing the group labels that will be printed alongside (or underneath) each plot. When group.names is provided, it must have the same number of elements as the number of groups. When group.names is not provided, if the groups created based on the argument x have a names attribute, then that is used for the group names. Otherwise, the group names are set to the integers 1 to the number of groups (e.g., 1:3 for three groups).

group.names.cex

numeric scalar indicating the amount by which the group labels should be scaled relative to the default (see the help file for plot.default). The default is the current value of the graphics parameter cex.

drop.unused.levels

when drop.unused.levels=TRUE, groups with no observations are dropped.

add

logical, if true add the chart to the current plot.

at

numeric vector giving the locations where the charts should be drawn, particularly when add=TRUE; defaults to 1:n where n is the number of groups.

xlim, ylim

plot limits: see plot.window.

ylab, xlab

labels: see title.

dlab, glab

alternate way to specify axis labels. The dlab and glab labels may be used instead of xlab and ylab if those are not specified. dlab applies to the continuous data axis (the \(y\)-axis unless vertical=FALSE), and glab to the group axis.

log

on which axes to use a log scale: see plot.default.

pch, col, cex

Graphical parameters: see par.

points.cex

Sets the cex value for the points plotted.

axes, frame.plot

Axis control: see plot.default.

show.ci

logical scalar indicating whether to plot the confidence interval. The default is show.ci=TRUE.

location.pch

integer indicating which plotting character to use to indicate the estimate of location (mean or median) for each group (see the help file for plot.default). The default is location.pch=16, a filled circle.

location.cex

numeric scalar giving the amount by which the plotting characters indicating the estimate of location for each group should be scaled relative to the default (see the help file for plot.default). The default is the current value of the graphics parameter cex.

conf.level

numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the group location (population mean or median). The default value is conf.level=0.95.

min.n.for.ci

integer indicating the minimum sample size required in order to plot a confidence interval for the group location. The default value is min.n.for.ci=2.

ci.offset

numeric scalar or vector of length equal to the number of groups (n), in units of cex, indicating the amount of space between the line showing the confidence interval and the tick mark associated with a particular group. The default value depends on the number of groups and is given by 3/ifelse(n > 2, (n-1)^(1/3), 1).
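
To see how this default shrinks as groups are added, the expression can be evaluated directly. This is a small sketch; the wrapper name ci_offset_default is hypothetical and simply restates the formula above.

```r
# Hypothetical wrapper around the documented default for ci.offset,
# where n is the number of groups.
ci_offset_default <- function(n) {
  3 / ifelse(n > 2, (n - 1)^(1/3), 1)
}

n_groups <- c(1, 2, 3, 9)
offsets  <- ci_offset_default(n_groups)
data.frame(n = n_groups, ci.offset = offsets)
```

For one or two groups the offset is 3; with nine groups it is 3 / 8^(1/3) = 1.5, so the confidence interval bars sit closer to the points as the plot gets more crowded.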

ci.bar.lwd

numeric scalar indicating the line width for the confidence interval bars. The default is the current value of the graphics parameter cex.

ci.bar.ends

logical scalar indicating whether to add flat ends to the confidence interval bars. The default value is ci.bar.ends=TRUE.

ci.bar.ends.size

numeric scalar in units of cxy indicating the size of confidence interval bar ends. The default value is half of the current value of cex.

ci.bar.gap

logical scalar indicating whether to add a gap between the estimate of group location and the confidence interval bar. The default value is ci.bar.gap=FALSE.

n.text

character string indicating whether and where to indicate the sample size for each group. Possible values are "bottom" (the default), "top", and "none".

n.text.line

integer indicating on which plot margin line to show the sample sizes for each group. The default value is n.text.line=2 when n.text="bottom" and 0 otherwise.

n.text.cex

numeric scalar giving the amount by which the text indicating the sample size for each group should be scaled relative to the default (see the help file for plot.default). The default is the current value of the graphics parameter cex.

location.scale.text

character string indicating whether and where to indicate the estimates of location (mean or median) and scale (standard deviation or interquartile range) for each group. Possible values are "top" (the default), "bottom", and "none".

location.scale.digits

integer indicating the number of digits to round the estimates of location and scale. The default value is location.scale.digits=1.

nsmall

integer passed to the function format indicating the minimum number of digits to the right of the decimal point for the estimates of location and scale. The default value is the value of location.scale.digits, which forces all estimates of location and scale to have the same number of digits to the right of the decimal point (including, possibly, trailing zeros). To omit trailing zeros, set nsmall=0.
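
The interaction between rounding and nsmall can be checked with the base R functions round and format. The sketch below assumes the estimates are rounded first and then formatted; the variable names are illustrative only.

```r
estimates <- c(3.14159, 2.0)
digits    <- 1   # plays the role of location.scale.digits

rounded <- round(estimates, digits)

# nsmall = digits keeps trailing zeros so all labels have the same width.
padded <- sapply(rounded, format, nsmall = digits)

# nsmall = 0 drops trailing zeros.
unpadded <- sapply(rounded, format, nsmall = 0)

padded    # "3.1" "2.0"
unpadded  # "3.1" "2"
```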

location.scale.text.line

integer indicating on which plot margin line to show the estimates of location and scale for each group. The default value is location.scale.text.line=0 when location.scale.text="top" and 3.5 otherwise.

location.scale.text.cex

numeric scalar giving the amount by which the text indicating the estimates of location and scale for each group should be scaled relative to the default (see the help file for plot.default). The default depends on the number of groups and is given by cex * 0.8 * ifelse(n > 6, max(0.4, 1 - (n-6) * 0.06), 1), where cex denotes the current value of the graphics parameter cex.
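
The default can be evaluated for a few group counts to see where the text starts to shrink and where it bottoms out. This is a sketch that restates the formula above; the function name text_cex_default is hypothetical.

```r
# Hypothetical wrapper around the documented default for
# location.scale.text.cex, for a single number of groups n.
text_cex_default <- function(n, cex = 1) {
  cex * 0.8 * ifelse(n > 6, max(0.4, 1 - (n - 6) * 0.06), 1)
}

# Evaluate one n at a time because max() is not vectorized over n.
n_groups <- c(2, 6, 10, 20)
sizes    <- sapply(n_groups, text_cex_default)
data.frame(n = n_groups, text.cex = sizes)
```

Up to six groups the text is drawn at 0.8 * cex; beyond that it shrinks by 6% of that size per extra group, with a floor of 0.4 * 0.8 * cex = 0.32 * cex.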

p.value

logical scalar indicating whether to show the p-value associated with testing whether all groups have the same population location. The default value is p.value=FALSE. The p-value is displayed at the top of the graph.

p.value.digits

integer indicating the number of digits to round to when displaying the p-value associated with the test of equal group locations. The default value is
p.value.digits=3.

p.value.line

integer indicating on which plot margin line to show the p-value associated with the test of equal group locations. The default value is p.value.line=2.

p.value.cex

numeric scalar giving the amount by which the text indicating the p-value associated with the test of equal group locations should be scaled relative to the default (see the help file for plot.default). The default is the current value of the graphics parameter cex.

group.difference.ci

for the case when there are just 2 groups, a logical scalar indicating whether to display the confidence interval for the difference between group locations. The default is the value of the p.value argument. The confidence interval is displayed at the top of the graph in the format [Lower CI, Upper CI].

group.difference.conf.level

for the case when there are just 2 groups, a numeric scalar between 0 and 1 indicating the confidence level associated with the confidence interval for the difference between group locations. The default is group.difference.conf.level=0.95.

group.difference.digits

for the case when there are just 2 groups, an integer indicating the number of digits to round to when displaying the confidence interval for the difference between group locations. The default value is
group.difference.digits=location.scale.digits.

ci.and.test

character string indicating whether confidence intervals and tests should be based on parametric or nonparametric (ci.and.test="nonparametric") methods. When ci.and.test="parametric" (the default), confidence intervals for the population mean are based on the one-sample t-test (see t.test), and the test of group differences is based on the two-sample t-test if there are two groups and the F-test (i.e., one-way analysis of variance, see aov) if there are three or more groups. When ci.and.test="nonparametric", confidence intervals for the population pseudo-median are based on the Wilcoxon signed rank test (see wilcox.test and page 56 of Hollander and Wolfe, 1999), and the test of group differences is based on the Wilcoxon rank sum test if there are two groups (see wilcox.test) and the Kruskal-Wallis test (see kruskal.test) if there are three or more groups.
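
The base R functions named above can be called directly to see what each setting corresponds to. The sketch below uses simulated data (an assumption made purely for illustration) and does not reproduce the internals of stripChart.

```r
set.seed(47)
g <- factor(rep(c("A", "B", "C"), each = 8))
y <- rnorm(24, mean = rep(c(10, 12, 11), each = 8))
a <- y[g == "A"]
b <- y[g == "B"]

# Parametric: t-based CI for a mean, two-sample t-test, one-way ANOVA F-test.
ci_mean <- t.test(a, conf.level = 0.95)$conf.int
p_t     <- t.test(a, b, var.equal = TRUE)$p.value
p_anova <- summary(aov(y ~ g))[[1]][["Pr(>F)"]][1]

# Nonparametric: signed rank CI for the pseudo-median,
# rank sum test for two groups, Kruskal-Wallis for three or more.
ci_pseudomedian <- wilcox.test(a, conf.int = TRUE, conf.level = 0.95)$conf.int
p_ranksum       <- wilcox.test(a, b)$p.value
p_kw            <- kruskal.test(y ~ g)$p.value
```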

ci.arg.list

an optional list of arguments to pass to the function used to compute confidence intervals. The default value is ci.arg.list=NULL.

test.arg.list

an optional list of arguments to pass to the function used to test for group differences in location. The default value is test.arg.list=NULL. In particular, in the case when there are two groups, ci.and.test="parametric", and test.arg.list is NULL or does not contain a component specifying the value for var.equal, this argument is updated to include the component var.equal=TRUE, which is not the default behavior of t.test.
NOTE: If test.arg.list contains a component named "paired", the value of that component is set to the value of the argument paired (see below).
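
The practical difference between var.equal=TRUE and the t.test default can be seen by calling t.test both ways. This sketch uses simulated data (an assumption for illustration only).

```r
set.seed(47)
x <- rnorm(10, mean = 0, sd = 1)
y <- rnorm(10, mean = 1, sd = 3)

# Pooled-variance test: degrees of freedom are n1 + n2 - 2.
pooled <- t.test(x, y, var.equal = TRUE)

# t.test default (var.equal = FALSE): Welch test with
# Satterthwaite-approximated degrees of freedom.
welch <- t.test(x, y)

pooled$parameter  # df = 18
welch$parameter   # df no larger than 18, usually not an integer
```

Passing test.arg.list = list(var.equal = FALSE) to stripChart requests the second form.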

alternative

character string describing the alternative hypothesis for the test of group differences in the case when there are two groups. Possible values are "two.sided" (the default), "less", and "greater".
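
A one-sided alternative produces a one-sided confidence interval in t.test, and its finite bound matches the corresponding bound of a two-sided interval at a lower confidence level. The sketch below checks this with the TCE values printed in the Examples section; the vectors before and after are typed in by hand here rather than taken from the packaged data frame.

```r
before <- c(20.9, 9.17, 5.96, 41.5, 34.3, 19.7, 38.9, 8.18, 9.13, 28.5)
after  <- c(0.917, 8.77, 4.37, 4.34, 10.7, 1.48, 0.272, 0.52, 3.06, 1.9)

# One-sided 95% CI for the mean paired difference (after - before).
one_sided <- t.test(after, before, paired = TRUE,
                    alternative = "less", conf.level = 0.95)$conf.int

# Two-sided 90% CI for the same quantity.
two_sided <- t.test(after, before, paired = TRUE,
                    conf.level = 0.90)$conf.int

one_sided[1]                           # -Inf: no lower bound
all.equal(one_sided[2], two_sided[2])  # upper bounds agree
```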

plot.diff

applicable only to the case when there are two groups:
logical scalar indicating whether to plot the confidence interval for the difference between the groups. The default is plot.diff=FALSE.

When plot.diff=TRUE and paired=FALSE, the confidence interval for the difference between the two locations is displayed and the right-hand axis (when vertical=TRUE) or top axis (when vertical=FALSE) is labeled in units of the confidence interval for the difference between the two locations. If ci.and.test="parametric", the confidence interval for the difference between the two means is displayed. If ci.and.test="nonparametric", the confidence interval for the median of the difference between a sample from the first group and a sample from the second group is displayed (see the help file for wilcox.test).

When plot.diff=TRUE and paired=TRUE, the paired differences are displayed and the right-hand axis (when vertical=TRUE) or top axis (when vertical=FALSE) is labeled in units of the paired differences. In addition, if show.ci=TRUE, the confidence interval based on the paired differences is displayed. In this case, if ci.and.test="parametric" the confidence interval for the mean of the paired differences is displayed, and if ci.and.test="nonparametric" the confidence interval for the pseudomedian is displayed (see the help file for wilcox.test).

diff.col

applicable only to the case when there are two groups and plot.diff=TRUE:
numeric or character scalar indicating what color to use for the confidence interval for the difference in locations between the two groups. When paired=TRUE, this argument also controls the color of the paired differences. The default is diff.col=col[1].

diff.method

applicable only to the case when there are two groups, plot.diff=TRUE, and paired=TRUE:
the method to be used to separate coincident points for the paired differences. The default value is diff.method="stack". Other options are
diff.method="jitter" and diff.method="overplot". See the explanation for the argument method above.

diff.pch

applicable only to the case when there are two groups, plot.diff=TRUE, and paired=TRUE:
numeric or character scalar indicating what plotting symbol to use for the paired differences. The default is diff.pch=pch[1].

paired

applicable only to the case when there are two groups:
logical scalar indicating whether the observations in the first group are paired with those in the second group. The default is paired=FALSE.
NOTE 1: When the formula method for the argument x is used (see above) and the argument paired=TRUE, the data in the vector y must have the same number of observations for each level of the factor g, and within each level the observations must be sorted in the same order according to the pairing variable.
NOTE 2: If the argument test.arg.list (see above) contains a component named "paired", the value of that component is set to the value of the argument paired.
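
One way to satisfy the sorting requirement is to order the data frame by group and then by the pairing variable before plotting. The sketch below rebuilds the TCE data from the listing in the Examples section (the data frame tce and the factor level order Before, After are assumptions made for illustration), shuffles the rows, and restores a valid order.

```r
tce <- data.frame(
  TCE.mg.per.L = c(20.9, 9.17, 5.96, 41.5, 34.3, 19.7, 38.9, 8.18, 9.13, 28.5,
                   0.917, 8.77, 4.37, 4.34, 10.7, 1.48, 0.272, 0.52, 3.06, 1.9),
  Well   = rep(1:10, times = 2),
  Period = factor(rep(c("Before", "After"), each = 10),
                  levels = c("Before", "After"))
)

# Simulate rows arriving in arbitrary order.
set.seed(1)
shuffled <- tce[sample(nrow(tce)), ]

# Sort by group, then by the pairing variable within each group.
sorted <- shuffled[order(shuffled$Period, shuffled$Well), ]

# Check the requirement: equal counts per group, matching pair order.
counts <- table(sorted$Period)
before <- sorted[sorted$Period == "Before", ]
after  <- sorted[sorted$Period == "After", ]
pairs_aligned <- identical(before$Well, after$Well)

paired_diff <- after$TCE.mg.per.L - before$TCE.mg.per.L
```

The sorted data frame can then be passed as the data argument together with paired=TRUE.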

paired.lines

applicable only to the case when there are two groups and paired=TRUE:
logical scalar indicating whether to join the paired observations with lines. The default value is the value of the argument paired.

paired.lty

applicable only to the case when there are two groups, paired=TRUE, and
paired.lines=TRUE:
numeric vector indicating the line types to use to join the paired observations with lines. The default value is paired.lty=1:6.

paired.lwd

applicable only to the case when there are two groups, paired=TRUE, and
paired.lines=TRUE:
numeric vector indicating the widths of the lines used to join the paired observations with lines. The default value is paired.lwd=1.

paired.pch

applicable only to the case when there are two groups, paired=TRUE, and
paired.lines=TRUE:
numeric vector indicating the plotting characters to use at each end of the lines used to join the paired observations with lines. The default value is
paired.pch=1:14.

paired.col

applicable only to the case when there are two groups, paired=TRUE, and
paired.lines=TRUE:
character or numeric vector indicating the colors for the lines (and plotting characters) used to join the paired observations with lines. The default value is paired.col=NULL, in which case the vector becomes
c("black", "red", "green3", "blue", "magenta", "darkgreen",
"purple", "orange", "darkolivegreen", "steelblue", "darkgray").

diff.name

applicable only to the case when there are two groups and plot.diff=TRUE:
character scalar indicating the label to use for the confidence interval for the difference between groups. For the case when paired=TRUE, this label also describes the paired differences. The default value is diff.name=NULL, in which case the label is "group 2 - group 1", where group 1 and group 2 denote the names of the two groups. For example, if group 1 is labeled "A" and group 2 is labeled "B", then the default value is diff.name="B-A".

diff.name.cex

applicable only to the case when there are two groups and plot.diff=TRUE:
numeric scalar indicating the amount by which the label for group differences should be scaled relative to the default (see the help file for plot.default). The default value is diff.name.cex=group.names.cex.

sep.line

applicable only to the case when there are two groups and plot.diff=TRUE:
logical scalar indicating whether to draw a line between the strip charts for the two groups and the confidence interval for the difference between the two groups (and paired differences when paired=TRUE). The default value is
sep.line=TRUE.

sep.lty

applicable only to the case when there are two groups, plot.diff=TRUE, and sep.line=TRUE:
numeric scalar indicating the line type to use for the line drawn between the strip charts for the two groups and the confidence interval for the difference between the two groups. The default value is sep.lty=2.

sep.lwd

applicable only to the case when there are two groups, plot.diff=TRUE, and sep.line=TRUE:
numeric scalar indicating the line width to use for the line drawn between the strip charts for the two groups and the confidence interval for the difference between the two groups. The default value is the current value of the graphics parameter cex.

sep.col

applicable only to the case when there are two groups, plot.diff=TRUE, and sep.line=TRUE:
numeric or character scalar indicating the color of the line drawn between the strip charts for the two groups and the confidence interval for the difference between the two groups. The default value is sep.col="gray".

diff.lim

applicable only to the case when there are two groups and plot.diff=TRUE:
numeric vector of length 2 indicating the limits to use for the axis associated with the confidence interval for the difference between the two groups. When paired=FALSE, the default value is the range of the y-axis, but centered at the mean of the confidence interval for the difference in locations. When
paired=TRUE, the default value is range(pretty(c(X, range(CI)))) where X denotes the vector containing the paired differences.
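
For the paired case, the default limits can be reproduced with base R. The sketch below uses the TCE values printed in the Examples section, typed in by hand, and assumes a parametric 95% confidence interval from t.test; the sign convention (after minus before) is also an assumption.

```r
before <- c(20.9, 9.17, 5.96, 41.5, 34.3, 19.7, 38.9, 8.18, 9.13, 28.5)
after  <- c(0.917, 8.77, 4.37, 4.34, 10.7, 1.48, 0.272, 0.52, 3.06, 1.9)

X  <- after - before                               # paired differences
CI <- t.test(after, before, paired = TRUE)$conf.int  # CI for their mean

# Round axis limits that cover both the differences and the CI.
diff_lim <- range(pretty(c(X, range(CI))))
diff_lim
```

Because pretty returns round values that cover its input, the resulting limits always enclose every paired difference as well as both ends of the interval.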

diff.at

applicable only to the case when there are two groups and plot.diff=TRUE:
numeric vector indicating the locations of the tick marks for the axis associated with the confidence interval for the difference between groups (see the explanation for the argument at in the help file for axis). The default value is diff.at=NULL, in which case default values are used for the locations of the tick marks.

diff.axis.label

applicable only to the case when there are two groups and plot.diff=TRUE:
character string indicating the label to use for the axis associated with the confidence interval for the difference between groups. When paired=FALSE the default value is "Difference Between Groups", and when paired=TRUE the default value is "Paired Difference".

plot.diff.mar

applicable only to the case when there are two groups, plot.diff=TRUE, and add=FALSE:
numeric vector of length 4 indicating the number of lines in the plotting margins (see the explanation for the argument mar in the help file for par). The default value is plot.diff.mar = c(5, 4, 4, 4) + 0.1.

Value

stripChart invisibly returns a list with the following components:

group.centers: numeric vector of values on the group axis (the \(x\)-axis unless vertical=FALSE) indicating the centers of the groups.
group.stats: a matrix with the number of rows equal to the number of groups and six columns indicating the sample size of the group (N), the estimate of the group location parameter (Mean or Median), the estimate of the group scale (SD or IQR), the lower confidence limit for the group location parameter (LCL), the upper confidence limit for the group location parameter (UCL), and the confidence level associated with the confidence interval (Conf.Level).

In addition, if the argument p.value=TRUE, or if there are two groups and plot.diff=TRUE, the list also includes these components:

group.difference.p.value: numeric scalar indicating the p-value associated with the test of equal group locations.
group.difference.conf.int: numeric vector of two elements indicating the confidence interval for the difference between the group locations. Only present when there are two groups.

References

Hollander, M., and D.A. Wolfe. (1999). Nonparametric Statistical Methods. Second Edition. John Wiley and Sons, New York.

Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.

Author

Steven P. Millard (EnvStats@ProbStatInfo.com)

Examples

  #------------------------
  # Two Independent Samples
  #------------------------

  # The guidance document USEPA (1994b, pp. 6.22--6.25) 
  # contains measures of 1,2,3,4-Tetrachlorobenzene (TcCB) 
  # concentrations (in parts per billion) from soil samples 
  # at a Reference area and a Cleanup area.  These data are stored 
  # in the data frame EPA.94b.tccb.df.  
  #
  # First create one-dimensional scatterplots to compare the 
  # TcCB concentrations between the areas and use a nonparametric 
  # test to test for a difference between areas.

  dev.new()
  stripChart(TcCB ~ Area, data = EPA.94b.tccb.df, col = c("red", "blue"), 
    p.value = TRUE, ci.and.test = "nonparametric", 
    ylab = "TcCB (ppb)")
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact confidence interval with ties

  #----------

  # Now log-transform the TcCB data and use a parametric test
  # to compare the areas.

  dev.new()
  stripChart(log10(TcCB) ~ Area, data = EPA.94b.tccb.df, col = c("red", "blue"), 
    p.value = TRUE, ylab = "log10 [ TcCB (ppb) ]")

  #----------

  # Repeat the above procedure, but also plot the confidence interval  
  # for the difference between the means.

  dev.new()
  stripChart(log10(TcCB) ~ Area, data = EPA.94b.tccb.df, col = c("red", "blue"), 
    p.value = TRUE, plot.diff = TRUE, diff.col = "black", 
    ylab = "log10 [ TcCB (ppb) ]")

  #----------

  # Repeat the above procedure, but allow the variances to differ.

  dev.new()
  stripChart(log10(TcCB) ~ Area, data = EPA.94b.tccb.df, col = c("red", "blue"),
    p.value = TRUE, plot.diff = TRUE, diff.col = "black", 
    ylab = "log10 [ TcCB (ppb) ]", test.arg.list = list(var.equal = FALSE))

  #----------

  # Repeat the above procedure, but jitter the points instead of 
  # stacking them.

  dev.new()
  stripChart(log10(TcCB) ~ Area, data = EPA.94b.tccb.df, col = c("red", "blue"),
    p.value = TRUE, plot.diff = TRUE, diff.col = "black", 
    ylab = "log10 [ TcCB (ppb) ]", test.arg.list = list(var.equal = FALSE), 
    method = "jitter", ci.offset = 4)

  #---------- 

  # Clean up
  #---------
  graphics.off()

  #====================

  #--------------------
  # Paired Observations
  #--------------------

  # The data frame ACE.13.TCE.df contains paired observations of 
  # trichloroethylene (TCE; mg/L) at 10 groundwater monitoring wells 
  # before and after remediation.
  #
  # Create one-dimensional scatterplots to compare TCE concentrations 
  # before and after remediation and use a paired t-test to 
  # test for a difference between periods.

  ACE.13.TCE.df
#>    TCE.mg.per.L Well Period
#> 1        20.900    1 Before
#> 2         9.170    2 Before
#> 3         5.960    3 Before
#> 4        41.500    4 Before
#> 5        34.300    5 Before
#> 6        19.700    6 Before
#> 7        38.900    7 Before
#> 8         8.180    8 Before
#> 9         9.130    9 Before
#> 10       28.500   10 Before
#> 11        0.917    1  After
#> 12        8.770    2  After
#> 13        4.370    3  After
#> 14        4.340    4  After
#> 15       10.700    5  After
#> 16        1.480    6  After
#> 17        0.272    7  After
#> 18        0.520    8  After
#> 19        3.060    9  After
#> 20        1.900   10  After

  dev.new()
  stripChart(TCE.mg.per.L ~ Period, data = ACE.13.TCE.df, 
    col = c("brown", "green"), p.value = TRUE, paired = TRUE, 
    ylab = "TCE (mg/L)")

  #----------

  # Repeat the above procedure, but also plot the confidence interval  
  # for the mean of the paired differences.

  dev.new()
  stripChart(TCE.mg.per.L ~ Period, data = ACE.13.TCE.df, 
    col = c("brown", "green"), p.value = TRUE, paired = TRUE, 
    ylab = "TCE (mg/L)", plot.diff = TRUE, diff.col = "blue")


  #==========

  # Repeat the last two examples, but use a one-sided alternative since 
  # remediation should decrease TCE concentration.

  dev.new()
  stripChart(TCE.mg.per.L ~ Period, data = ACE.13.TCE.df, 
    col = c("brown", "green"), p.value = TRUE, paired = TRUE, 
    ylab = "TCE (mg/L)", alternative = "less", 
    group.difference.digits = 2)

  #----------

  # Repeat the above procedure, but also plot the confidence interval  
  # for the mean of the paired differences.
  #
  # NOTE: Although stripChart can *report* one-sided confidence intervals 
  #       for the difference between two groups (see above example), 
  #       when *plotting* the confidence interval for the difference, 
  #       only two-sided CIs are allowed.  
  #       Here, we will set the confidence level of the confidence 
  #       interval for the mean of the paired differences to 90%, 
  #       so that the upper bound of the CI corresponds to the upper 
  #       bound of a 95% one-sided CI.

  dev.new()
  stripChart(TCE.mg.per.L ~ Period, data = ACE.13.TCE.df, 
    col = c("brown", "green"), p.value = TRUE, paired = TRUE, 
    ylab = "TCE (mg/L)", group.difference.digits = 2, 
    plot.diff = TRUE, diff.col = "blue", group.difference.conf.level = 0.9)

 #---------- 

  # Clean up
  #---------
  graphics.off()

  #==========

  # The data frame Helsel.Hirsch.02.Mayfly.df contains paired counts
  # of mayfly nymphs above and below industrial outfalls in 12 streams.  
  #
  # Create one-dimensional scatterplots to compare the 
  # counts between locations and use a nonparametric test 
  # to compare counts above and below the outfalls.

  Helsel.Hirsch.02.Mayfly.df
#>    Mayfly.Count Stream Location
#> 1            12      1    Above
#> 2            15      2    Above
#> 3            11      3    Above
#> 4            41      4    Above
#> 5           106      5    Above
#> 6            63      6    Above
#> 7           296      7    Above
#> 8            53      8    Above
#> 9            20      9    Above
#> 10          110     10    Above
#> 11          429     11    Above
#> 12          185     12    Above
#> 13            9      1    Below
#> 14            9      2    Below
#> 15           38      3    Below
#> 16           24      4    Below
#> 17           48      5    Below
#> 18           17      6    Below
#> 19           11      7    Below
#> 20           41      8    Below
#> 21           14      9    Below
#> 22           60     10    Below
#> 23           53     11    Below
#> 24          124     12    Below

  dev.new()
  stripChart(Mayfly.Count ~ Location, data = Helsel.Hirsch.02.Mayfly.df, 
    col = c("green", "brown"), p.value = TRUE, paired = TRUE, 
    ci.and.test = "nonparametric", ylab = "Number of Mayfly Nymphs")
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact confidence interval with ties
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact confidence interval with ties

  #---------- 
 
  # Repeat the above procedure, but also plot the confidence interval  
  # for the pseudomedian of the paired differences.

  dev.new()
  stripChart(Mayfly.Count ~ Location, data = Helsel.Hirsch.02.Mayfly.df, 
    col = c("green", "brown"), p.value = TRUE, paired = TRUE, 
    ci.and.test = "nonparametric", ylab = "Number of Mayfly Nymphs", 
    plot.diff = TRUE, diff.col = "blue")
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact confidence interval with ties
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact confidence interval with ties

 #---------- 

  # Clean up
  #---------
  graphics.off()