tTestN.Rd
Compute the sample size necessary to achieve a specified power for a one- or two-sample t-test, given the scaled difference and significance level.
numeric vector specifying the ratio of the true difference \(\delta\) (\(\delta = \mu - \mu_0\) for the one-sample case and \(\delta = \mu_1 - \mu_2\) for the two-sample case) to the population standard deviation (\(\sigma\)). This is also called the “scaled difference”.
numeric vector of numbers between 0 and 1 indicating the Type I error level associated with the hypothesis test. The default value is alpha=0.05.
numeric vector of numbers between 0 and 1 indicating the power associated with the hypothesis test. The default value is power=0.95.
character string indicating whether to compute the sample size based on a one-sample or two-sample hypothesis test. When sample.type="one.sample", the computed sample size is based on a hypothesis test for a single mean. When sample.type="two.sample", the computed sample size is based on a hypothesis test for the difference between two means. The default value is sample.type="one.sample" unless the argument n2 is supplied.
character string indicating the kind of alternative hypothesis. The possible values are:
"two.sided" (the default). \(H_a: \mu \ne \mu_0\) for the one-sample case and \(H_a: \mu_1 \ne \mu_2\) for the two-sample case.
"greater". \(H_a: \mu > \mu_0\) for the one-sample case and \(H_a: \mu_1 > \mu_2\) for the two-sample case.
"less". \(H_a: \mu < \mu_0\) for the one-sample case and \(H_a: \mu_1 < \mu_2\) for the two-sample case.
logical scalar indicating whether to compute the power based on an approximation to the non-central t-distribution. The default value is FALSE.
numeric vector of sample sizes for group 2. The default value is NULL, in which case it is assumed that the sample sizes for groups 1 and 2 are equal. This argument is ignored when sample.type="one.sample". Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.
logical scalar indicating whether to round up the computed sample size(s) to the next largest integer. The default value is TRUE.
positive integer greater than 1 indicating the maximum sample size when sample.type="one.sample" or the maximum sample size for group 1 when sample.type="two.sample". The default value is n.max=5000.
numeric scalar indicating the tolerance to use in the uniroot search algorithm. The default value is tol=1e-7.
positive integer indicating the maximum number of iterations to pass to the uniroot function. The default value is maxiter=1000.
Formulas for the power of the t-test for specified values of the sample size, scaled difference, and Type I error level are given in the help file for tTestPower. The function tTestN uses the uniroot search algorithm to determine the required sample size(s) for specified values of the power, scaled difference, and Type I error level.
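The search described above can be illustrated with a minimal sketch in base R. This is not the EnvStats implementation: the helper names power_one_sample and n_one_sample are hypothetical, and the sketch assumes the standard non-central t power formula for a one-sample, two-sided test with non-centrality parameter sqrt(n) * delta/sigma.

```r
# Hypothetical helper (not part of EnvStats): power of a one-sample,
# two-sided t-test, treating the sample size n as a continuous value.
power_one_sample <- function(n, delta.over.sigma, alpha = 0.05) {
  df <- n - 1
  ncp <- sqrt(n) * delta.over.sigma      # non-centrality parameter
  t.crit <- qt(1 - alpha / 2, df)        # two-sided critical value
  # Probability that the test statistic falls in either rejection region
  pt(t.crit, df, ncp = ncp, lower.tail = FALSE) + pt(-t.crit, df, ncp = ncp)
}

# Hypothetical helper: solve power(n) = target with uniroot, then round up.
n_one_sample <- function(delta.over.sigma, power = 0.95, alpha = 0.05,
                         n.max = 5000, tol = 1e-7, maxiter = 1000) {
  root <- uniroot(
    function(n) power_one_sample(n, delta.over.sigma, alpha) - power,
    interval = c(2, n.max), tol = tol, maxiter = maxiter
  )$root
  ceiling(root)
}

n.needed <- n_one_sample(0.5, power = 0.9)
n.needed
```

Because power increases with n for a fixed scaled difference, the root is unique on the search interval, and rounding it up gives the smallest integer sample size that meets the target power under these assumptions.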
When sample.type="one.sample", tTestN returns a numeric vector of sample sizes. When sample.type="two.sample" and n2 is not supplied, equal sample sizes for each group are assumed and tTestN returns a numeric vector of sample sizes indicating the required sample size for each group. When sample.type="two.sample" and n2 is supplied, tTestN returns a list with two components called n1 and n2, specifying the sample sizes for each group.
See tTestPower.
# Look at how the required sample size for the one-sample t-test
# increases with increasing required power:
seq(0.5, 0.9, by = 0.1)
#> [1] 0.5 0.6 0.7 0.8 0.9
tTestN(delta.over.sigma = 0.5, power = seq(0.5, 0.9, by = 0.1))
#> [1] 18 22 27 34 44
#----------
# Repeat the last example, but compute the sample size based on the
# approximation to the power instead of the exact method:
tTestN(delta.over.sigma = 0.5, power = seq(0.5, 0.9, by = 0.1),
  approx = TRUE)
#> [1] 18 22 27 34 45
#==========
# Look at how the required sample size for the two-sample t-test
# decreases with increasing scaled difference:
seq(0.5, 2, by = 0.5)
#> [1] 0.5 1.0 1.5 2.0
tTestN(delta.over.sigma = seq(0.5, 2, by = 0.5), sample.type = "two")
#> [1] 105 27 13 8
#----------
# Look at how the required sample size for the two-sample t-test decreases
# with increasing values of Type I error:
tTestN(delta.over.sigma = 0.5, alpha = c(0.001, 0.01, 0.05, 0.1),
  sample.type = "two")
#> [1] 198 145 105 88
#----------
# For the two-sample t-test, compare the total sample size required to
# detect a scaled difference of 1 for equal sample sizes versus the case
# when the sample size for the second group is constrained to be 20.
# Assume a 5% significance level and 95% power. Note that for the case
# of equal sample sizes, a total of 54 samples (27+27) are required,
# whereas when n2 is constrained to be 20, a total of 62 samples
# (42 + 20) are required.
tTestN(1, sample.type="two")
#> [1] 27
tTestN(1, n2 = 20)
#> $n1
#> [1] 42
#>
#> $n2
#> [1] 20
#>
#==========
# Modifying the example on pages 21-4 to 21-5 of USEPA (2009), determine the
# required sample size to detect a mean aldicarb level greater than the MCL
# of 7 ppb at the third compliance well with a power of 95%, assuming the
# true mean is 10 or 14. Use the estimated standard deviation from the
# first four months of data to estimate the true population standard
# deviation, use a Type I error level of alpha=0.01, and assume an
# upper one-sided alternative (third compliance well mean larger than 7).
# (The data are stored in EPA.09.Ex.21.1.aldicarb.df.)
# Note that the required sample size changes from 11 to 5 as the true mean
# increases from 10 to 14.
EPA.09.Ex.21.1.aldicarb.df
#> Month Well Aldicarb.ppb
#> 1 1 Well.1 19.9
#> 2 2 Well.1 29.6
#> 3 3 Well.1 18.7
#> 4 4 Well.1 24.2
#> 5 1 Well.2 23.7
#> 6 2 Well.2 21.9
#> 7 3 Well.2 26.9
#> 8 4 Well.2 26.1
#> 9 1 Well.3 5.6
#> 10 2 Well.3 3.3
#> 11 3 Well.3 2.3
#> 12 4 Well.3 6.9
sigma <- with(EPA.09.Ex.21.1.aldicarb.df,
  sd(Aldicarb.ppb[Well == "Well.3"]))
sigma
#> [1] 2.101388
tTestN(delta.over.sigma = (c(10, 14) - 7)/sigma,
  alpha = 0.01, sample.type = "one", alternative = "greater")
#> [1] 11 5
# Clean up
#---------
rm(sigma)
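
As an independent sanity check on the two-sample example with a scaled difference of 1, the same question can be put to power.t.test from the base R stats package. This is only a hedged cross-check: power.t.test is implemented separately from tTestN, so the result should land on or near the 27 per group reported above, but small differences are possible. The variable names res and n.per.group are illustrative.

```r
# Cross-check using stats::power.t.test (base R, not EnvStats).
# With sd = 1, delta is already the scaled difference delta/sigma.
res <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.95,
                    type = "two.sample", alternative = "two.sided")

# power.t.test returns a fractional n per group; round up to an integer.
n.per.group <- ceiling(res$n)
n.per.group

# Clean up
rm(res, n.per.group)
```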