stat_test_text.RdFor a strip plot or scatterplot produced using the package ggplot2
(e.g., with geom_point),
add text indicating the results of a hypothesis test comparing locations
betweeen groups, where the groups are defined based on the unique \(x\)-values.
stat_test_text(mapping = NULL, data = NULL,
geom = ifelse(text.box, "label", "text"), position = "identity",
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
y.pos = NULL, y.expand.factor = 0.35, test = "parametric",
paired = FALSE, test.arg.list = list(), two.lines = TRUE,
p.value.digits = 3, p.value.digit.type = "round",
location.digits = 1, location.digit.type = "round",
nsmall = ifelse(location.digit.type == "round", location.digits, 0),
text.box = FALSE, alpha = 1, angle = 0, color = "black",
family = "", fontface = "plain", hjust = 0.5,
label.padding = ggplot2::unit(0.25, "lines"),
label.r = ggplot2::unit(0.15, "lines"), label.size = 0.25,
lineheight = 1.2, size = 4, vjust = 0.5, ...)See the help file for geom_text.
Character string indicating which geom to use to display the text.
Setting geom="text" will use geom_text to display the text, and
setting geom="label" will use geom_label to display the text.
The default value is geom="text" unless the user sets text.box=TRUE.
Numeric scalar indicating the \(y\)-position of the text (i.e., the value of the
argument y that will be used in the call to geom_text or
geom_label). The default value is y.pos=NULL, in which
case y.pos is set to the maximum value of all \(y\)-values plus
a proportion of the range of all \(y\)-values, where the proportion is
determined by the argument y.expand.factor (see below).
For the case when y.pos=NULL, a numeric scalar indicating the proportion
by which the range of all \(y\)-values should be multiplied by before adding
this value to the maximum value of all \(y\)-values in order to compute the
value of the argument y.pos (see above).
The default value is y.expand.factor=0.35.
A character string indicating whether to use a standard parametric test
(test="parametric", the default) or nonparametric test
(test="nonparametric") to compare groups.
For the case of two groups, a logical scalar indicating whether the data
should be considered to be paired. The default value is paired=FALSE.
NOTE: if the argument test.arg.list is supplied and it includes a
component named paired, the value of that component
is overriden by the value of the argument paired.
An optional list of arguments to pass to the function used to test for
group differences in location. The default value is an empty list:
test.arg.list=list(). In particular, when there are two groups,
ci.and.test="parametric", and ci.arg.list does not contain
a component specifying the value for var.equal, this argument is
updated to include the component var.equal=TRUE, which is not
the default behavior of t.test.
NOTE: If test.arg.list contains a component named "paired",
the value of that component is set to the value of the argument paired
(see above).
For the case of one or two groups, a logical scalar indicating whether the
associated confidence interval should be be displayed on a second line
instead of on the same line as the p-value. The default is two.lines=TRUE.
An integer indicating the number of digits to use for displaying the
p-value. When p.value.digit.type="round" (see below)
the argument p.value.digits indicates the number of digits to round to,
and when p.value.digit.type="signif" the argument p.value.digits
indicates the number of significant digits to display.
The default value is p.value.digits=3.
A character string indicating whether the p.value.digits argument (see above)
refers to significant digits (p.value.digit.type="signif"), or how many decimal
places to round to (p.value.digit.type="round", the default).
For the case of one or two groups, an integer indicating the number of digits
to use for displaying the associated confidence interval.
When location.digit.type="round" (see below)
the argument location.digits indicates the number of digits to round to,
and when location.digit.type="signif" the argument location.digits
indicates the number of significant digits to display.
The default value is location.digits=1.
For the case of one or two groups, a character string indicating
whether the location.digits argument (see above)
refers to significant digits
(location.digit.type="signif"), or how many decimal
places to round to (location.digit.type="round"; the default).
For the case of one or two groups, an integer passed to the function
format indicating the the minimum number of digits to use
to the right of the decimal point for the associated confidence interval.
The default value is nsmall=digits when digit.type="round" and
nsmall=0 when digit.type="signif". When nsmall is greater than 0,
the two confidence limits will have the same number of digits to the
right of the decimal point (including, possibly, trailing zeros).
To omit trailing zeros, set nsmall=0.
Logical scalar indicating whether to surround the text with a text box (i.e.,
whether to use geom_label instead of
geom_text). This argument can be overridden by simply
specifying the argument geom.
See the help file for geom_text and
the vignette Aesthetic specifications at
https://cran.r-project.org/package=ggplot2/vignettes/ggplot2-specs.html.
See the help file for geom_text.
Other arguments passed on to layer.
The table below shows which hypothesis tests are performed based on the number of groups
and the values of the arguments test and paired.
| Function | ||||
| # Groups | test | paired | Name | Called |
| 1 | "parametric" | One-Sample t-test | t.test | |
"nonparametric" | Wilcoxon Signed Rank Test | wilcox.test | ||
| 2 | "parametric" | FALSE | Two-Sample t-test | t.test |
TRUE | Paired t-test | t.test | ||
"nonparametric" | FALSE | Wilcoxon Rank Sum Test | wilcox.test | |
TRUE | Wilcoxon Signed Rank Test | wilcox.test | ||
| on Paired Differences | ||||
| \(\ge\) 3 | "parametric" | Analysis of Variance | aov | |
summary.aov | ||||
"nonparametric" | Kruskal-Wallis Test | kruskal.test |
See the help file for geom_text for details about how
geom_text and geom_label work.
See the vignette Extending ggplot2 at https://cran.r-project.org/package=ggplot2/vignettes/extending-ggplot2.html for information on how to create a new stat.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis (Use R!). Second Edition. Springer.
The function stat_test_text is called by the function geom_stripchart.
# First, load and attach the ggplot2 package.
#--------------------------------------------
library(ggplot2)
#==========
# Example 1:
# Using the built-in data frame mtcars,
# plot miles per gallon vs. number of cylinders
# using different colors for each level of the number of cylinders.
#------------------------------------------------------------------
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl))) +
theme(legend.position = "none")
p + geom_point(show.legend = FALSE) +
labs(x = "Number of Cylinders", y = "Miles per Gallon")
# Now add text indicating the sample size and
# mean and standard deviation for each level of cylinder, and
# test for the difference in means between groups.
#------------------------------------------------------------
dev.new()
p + geom_point() +
stat_n_text() + stat_mean_sd_text() +
stat_test_text() +
labs(x = "Number of Cylinders", y = "Miles per Gallon")
#==========
# Example 2:
# Repeat Example 1, but show text indicating the median and IQR,
# and use the nonparametric test.
#---------------------------------------------------------------
dev.new()
p + geom_point() +
stat_n_text() + stat_median_iqr_text() +
stat_test_text(test = "nonparametric") +
labs(x = "Number of Cylinders", y = "Miles per Gallon")
#==========
# Example 3:
# Repeat Example 1, but use only the groups with
# 4 and 8 cylinders.
#-----------------------------------------------
p <- ggplot(subset(mtcars, cyl %in% c(4, 8)),
aes(x = factor(cyl), y = mpg, color = cyl)) +
theme(legend.position = "none")
dev.new()
p + geom_point() +
stat_n_text() + stat_mean_sd_text() +
stat_test_text() +
labs(x = "Number of Cylinders", y = "Miles per Gallon")
#==========
# Example 4:
# Repeat Example 3, but
# 1) facet by transmission type,
# 2) make the text smaller,
# 3) put the text for the test results in a text box
# and make them blue.
#---------------------------------------------------
dev.new()
p + geom_point() +
stat_n_text(size = 3) + stat_mean_sd_text(size = 3) +
stat_test_text(size = 3, text.box = TRUE, color = "blue") +
facet_wrap(~ am, labeller = label_both) +
labs(x = "Number of Cylinders", y = "Miles per Gallon")
#==========
# Clean up
#---------
graphics.off()
rm(p)