geoMean.Rd
Compute the sample geometric mean.
geoMean(x, na.rm = FALSE)
numeric vector of observations.
logical scalar indicating whether to remove missing values from x.
If na.rm=FALSE (the default) and x contains missing values,
then a missing value (NA) is returned. If na.rm=TRUE,
missing values are removed from x prior to computing the geometric mean.
If x contains any non-positive values (values less than or equal to 0),
geoMean returns NA and issues a warning.
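The rules above for missing and non-positive values can be sketched in base R. The helper geo_mean_sketch below is hypothetical and written only to illustrate the documented behavior; it is not the source of geoMean, and the warning text is an assumption.

```r
# Hypothetical helper that mirrors the documented rules; not the geoMean source.
geo_mean_sketch <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x))
  if (na.rm) {
    x <- x[!is.na(x)]            # drop missing values before computing
  } else if (anyNA(x)) {
    return(NA_real_)             # missing values propagate when na.rm = FALSE
  }
  if (any(x <= 0)) {
    # Assumed warning wording; the real message may differ.
    warning("Non-positive values in 'x'; returning NA")
    return(NA_real_)
  }
  exp(mean(log(x)))              # antilog of the mean of the logs
}

geo_mean_sketch(c(1, 10, 100))                # 10, up to floating-point rounding
geo_mean_sketch(c(1, 10, NA))                 # NA
geo_mean_sketch(c(1, 100, NA), na.rm = TRUE)  # 10, up to floating-point rounding
```

Checking for missing values before checking for non-positive values is a design choice of this sketch; the documentation above does not say which check geoMean performs first.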
Let \(\underline{x}\) denote a vector of \(n\) observations from some distribution. The sample geometric mean is a measure of central tendency. It is defined as: $$\bar{x}_G = \sqrt[n]{x_1 x_2 \ldots x_n} = [\prod_{i=1}^n x_i]^{1/n} \;\;\;\;\;\; (1)$$ that is, it is the \(n\)'th root of the product of all \(n\) observations.
An equivalent way to define the geometric mean is by: $$\bar{x}_G = \exp[\frac{1}{n} \sum_{i=1}^n \log(x_i)] = e^{\bar{y}} \;\;\;\;\;\; (2)$$ where $$\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i \;\;\;\;\;\; (3)$$ $$y_i = \log(x_i), \;\; i = 1, 2, \ldots, n \;\;\;\;\;\; (4)$$ That is, the sample geometric mean is the antilog of the sample mean of the log-transformed observations.
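As a quick numerical check of the equivalence between equations (1) and (2), the following base R sketch computes both forms on a small made-up vector. It also illustrates one practical reason to prefer the log form: the raw product in equation (1) can overflow for large values. The variable names and data are illustrative only.

```r
x <- c(2, 8, 4)                       # made-up positive observations
n <- length(x)

gm_product <- prod(x)^(1 / n)         # equation (1): n'th root of the product
gm_log     <- exp(mean(log(x)))       # equation (2): antilog of the mean log

all.equal(gm_product, gm_log)         # TRUE; both are 4 up to rounding

# With large values the product overflows double precision,
# while the log form stays finite.
big <- rep(1e200, 4)
gm_big_product <- prod(big)^(1 / 4)   # Inf, because prod(big) overflows
gm_big_log     <- exp(mean(log(big))) # approximately 1e200
```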
The geometric mean is only defined for positive observations. It can be shown that the geometric mean is less than or equal to the sample arithmetic mean with equality only when all of the observations are the same value.
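The inequality between the geometric and arithmetic means can be illustrated with two small made-up vectors in base R, one with varying values and one with identical values. This is a numerical illustration, not a proof.

```r
# Varying observations: geometric mean is strictly below the arithmetic mean.
x_varied  <- c(1, 4, 16)
gm_varied <- exp(mean(log(x_varied)))   # 4, up to floating-point rounding
am_varied <- mean(x_varied)             # 7

# Identical observations: the two means coincide.
x_equal  <- rep(5, 3)
gm_equal <- exp(mean(log(x_equal)))     # 5, up to floating-point rounding
am_equal <- mean(x_equal)               # 5

gm_varied < am_varied                   # TRUE
isTRUE(all.equal(gm_equal, am_equal))   # TRUE
```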
A numeric scalar – the sample geometric mean.
Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers, Second Edition. Lewis Publishers, Boca Raton, FL.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, NY.
Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL.
Taylor, J.K. (1990). Statistical Techniques for Data Analysis. Lewis Publishers, Boca Raton, FL.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.
The geometric mean is sometimes used to average ratios and percent changes
(Zar, 2010). For the lognormal distribution, the geometric mean is the
maximum likelihood estimator of the median of the distribution,
although it is sometimes used incorrectly to estimate the mean of the
distribution (see the NOTE section in the help file for elnormAlt).
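The point about the lognormal distribution can be illustrated by simulation. The sketch below uses base R's rlnorm rather than any package function, with arbitrarily chosen parameters meanlog = 1 and sdlog = 1 (both are assumptions made for illustration). For these parameters the population median is exp(1), about 2.718, and the population mean is exp(1.5), about 4.482.

```r
set.seed(1)                              # only so the run can be reproduced
meanlog <- 1                             # assumed log-scale mean
sdlog   <- 1                             # assumed log-scale standard deviation
x <- rlnorm(1e5, meanlog = meanlog, sdlog = sdlog)

gm  <- exp(mean(log(x)))                 # sample geometric mean
med <- median(x)                         # sample median
am  <- mean(x)                           # sample arithmetic mean

true_median <- exp(meanlog)              # population median of the lognormal
true_mean   <- exp(meanlog + sdlog^2 / 2)  # population mean of the lognormal

# The geometric mean tracks the median, not the mean.
c(geometric = gm, median = med, arithmetic = am)
c(true_median = true_median, true_mean = true_mean)
```

With a sample this large, the geometric mean should land close to the population median and well below the population mean, which is why using it as an estimate of the mean understates the mean for right-skewed data.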
# Generate 20 observations from a lognormal distribution with parameters
# mean=10 and cv=2, and compute the mean, median, and geometric mean.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rlnormAlt(20, mean = 10, cv = 2)
mean(dat)
#> [1] 5.339273
median(dat)
#> [1] 3.692091
geoMean(dat)
#> [1] 4.095127
#----------
# Clean up
rm(dat)