varGroupTest.Rd
Test the null hypothesis that the variances of two or more normal distributions are the same using Levene's or Bartlett's test.
varGroupTest(object, ...)
# S3 method for class 'formula'
varGroupTest(object, data = NULL, subset,
na.action = na.pass, ...)
# Default S3 method
varGroupTest(object, group, test = "Levene",
correct = TRUE, data.name = NULL, group.name = NULL,
parent.of.data = NULL, subset.expression = NULL, ...)
# S3 method for class 'data.frame'
varGroupTest(object, ...)
# S3 method for class 'matrix'
varGroupTest(object, ...)
# S3 method for class 'list'
varGroupTest(object, ...)
an object containing data for 2 or more groups whose variances are to be compared.
In the default method, the argument object
must be a numeric vector.
When object
is a data frame, all columns must be numeric.
When object
is a matrix, it must be a numeric matrix.
When object
is a list, all components must be numeric vectors.
In the formula method, a symbolic specification of the form y ~ g
can be given, indicating the observations in the vector y
are to be grouped
according to the levels of the factor g
. Missing (NA
), undefined (NaN
),
and infinite (Inf
, -Inf
) values are allowed but will be removed.
when object
is a formula, data
specifies an optional data frame, list or
environment (or object coercible by as.data.frame
to a data frame) containing the
variables in the model. If not found in data
, the variables are taken from
environment(formula)
, typically the environment from which summaryStats
is called.
when object
is a formula, subset
specifies an optional vector specifying
a subset of observations to be used.
when object
is a formula, na.action
specifies a function which indicates
what should happen when the data contain NA
s. The default is na.pass
.
when object
is a numeric vector, group
is a factor or character vector
indicating which group each observation belongs to. When object
is a matrix or data frame
this argument is ignored and the columns define the groups. When object
is a list
this argument is ignored and the components define the groups. When object
is a formula,
this argument is ignored and the right-hand side of the formula specifies the grouping variable.
character string indicating which test to use. The possible values are "Levene"
(Levene's test; the default) and "Bartlett"
(Bartlett's test).
logical scalar indicating whether to use the correction factor for Bartlett's test.
The default value is correct=TRUE
. This argument is ignored if test="Levene"
.
character string indicating the name of the data used for the group variance test.
The default value is data.name=deparse(substitute(object))
.
character string indicating the name of the data used to create the groups.
The default value is group.name=deparse(substitute(group))
.
character string indicating the source of the data used for the group variance test.
character string indicating the expression used to subset the data.
additional arguments affecting the group variance test.
The function varGroupTest
performs Levene's or Bartlett's test for
homogeneity of variance among two or more groups. The R function var.test
compares two variances.
Bartlett's test is very sensitive to the assumption of normality and will tend to give significant results even when the null hypothesis is true if the underlying distributions have long tails (e.g., are leptokurtic). Levene's test is almost as powerful as Bartlett's test when the underlying distributions are normal, and unlike Bartlett's test it tends to maintain the assumed \(alpha\)-level when the underlying distributions are not normal (Snedecor and Cochran, 1989, p.252; Milliken and Johnson, 1992, p.22; Conover et al., 1981). Thus, Levene's test is generally recommended over Bartlett's test.
a list of class "htest"
containing the results of the group variance test.
Objects of class "htest"
have special printing and plotting methods.
See the help file for htest.object
for details.
Conover, W.J., M.E. Johnson, and M.M. Johnson. (1981). A Comparative Study of Tests for Homogeneity of Variances, with Applications to the Outer Continental Shelf Bidding Data. Technometrics 23(4), 351-361.
Davis, C.B. (1994). Environmental Regulatory Statistics. In Patil, G.P., and C.R. Rao, eds., Handbook of Statistics, Vol. 12: Environmental Statistics. North-Holland, Amsterdam, a division of Elsevier, New York, NY, Chapter 26, 817-865.
Milliken, G.A., and D.E. Johnson. (1992). Analysis of Messy Data, Volume I: Designed Experiments. Chapman & Hall, New York.
Snedecor, G.W., and W.G. Cochran. (1989). Statistical Methods, Eighth Edition. Iowa State University Press, Ames Iowa.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.
Chapter 11 of USEPA (2009) discusses using Levene's test to test the assumption of equal variances between monitoring wells or to test that the variance is stable over time when performing intrawell tests.
# Example 11-2 of USEPA (2009, page 11-7) gives an example of
# testing the assumption of equal variances across wells for arsenic
# concentrations (ppb) in groundwater collected at 6 monitoring
# wells over 4 months. The data for this example are stored in
# EPA.09.Ex.11.1.arsenic.df.
head(EPA.09.Ex.11.1.arsenic.df)
#> Arsenic.ppb Month Well
#> 1 22.9 1 1
#> 2 3.1 2 1
#> 3 35.7 3 1
#> 4 4.2 4 1
#> 5 2.0 1 2
#> 6 1.2 2 2
# Arsenic.ppb Month Well
#1 22.9 1 1
#2 3.1 2 1
#3 35.7 3 1
#4 4.2 4 1
#5 2.0 1 2
#6 1.2 2 2
longToWide(EPA.09.Ex.11.1.arsenic.df, "Arsenic.ppb", "Month", "Well",
paste.row.name = TRUE, paste.col.name = TRUE)
#> Well.1 Well.2 Well.3 Well.4 Well.5 Well.6
#> Month.1 22.9 2.0 2.0 7.8 24.9 0.3
#> Month.2 3.1 1.2 109.4 9.3 1.3 4.8
#> Month.3 35.7 7.8 4.5 25.9 0.8 2.8
#> Month.4 4.2 52.0 2.5 2.0 27.0 1.2
# Well.1 Well.2 Well.3 Well.4 Well.5 Well.6
#Month.1 22.9 2.0 2.0 7.8 24.9 0.3
#Month.2 3.1 1.2 109.4 9.3 1.3 4.8
#Month.3 35.7 7.8 4.5 25.9 0.8 2.8
#Month.4 4.2 52.0 2.5 2.0 27.0 1.2
varGroupTest(Arsenic.ppb ~ Well, data = EPA.09.Ex.11.1.arsenic.df)
#>
#> Results of Hypothesis Test
#> --------------------------
#>
#> Null Hypothesis: Ratio of each pair of variances = 1
#>
#> Alternative Hypothesis: At least one variance differs
#>
#> Test Name: Levene's Test for
#> Homogenity of Variance
#>
#> Estimated Parameter(s): Well.1 = 246.8158
#> Well.2 = 592.6767
#> Well.3 = 2831.4067
#> Well.4 = 105.2967
#> Well.5 = 207.4467
#> Well.6 = 3.9025
#>
#> Data: Arsenic.ppb
#>
#> Grouping Variable: Well
#>
#> Data Source: EPA.09.Ex.11.1.arsenic.df
#>
#> Sample Sizes: Well.1 = 4
#> Well.2 = 4
#> Well.3 = 4
#> Well.4 = 4
#> Well.5 = 4
#> Well.6 = 4
#>
#> Test Statistic: F = 4.564176
#>
#> Test Statistic Parameters: num df = 5
#> denom df = 18
#>
#> P-value: 0.007294084
#>
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: Ratio of each pair of variances = 1
#
#Alternative Hypothesis: At least one variance differs
#
#Test Name: Levene's Test for
# Homogenity of Variance
#
#Estimated Parameter(s): Well.1 = 246.8158
# Well.2 = 592.6767
# Well.3 = 2831.4067
# Well.4 = 105.2967
# Well.5 = 207.4467
# Well.6 = 3.9025
#
#Data: Arsenic.ppb
#
#Grouping Variable: Well
#
#Data Source: EPA.09.Ex.11.1.arsenic.df
#
#Sample Sizes: Well.1 = 4
# Well.2 = 4
# Well.3 = 4
# Well.4 = 4
# Well.5 = 4
# Well.6 = 4
#
#Test Statistic: F = 4.564176
#
#Test Statistic Parameters: num df = 5
# denom df = 18
#
#P-value: 0.007294084