Compute the sample geometric standard deviation.

geoSD(x, na.rm = FALSE, sqrt.unbiased = TRUE)

Arguments

x

numeric vector of observations.

na.rm

logical scalar indicating whether to remove missing values from x.
If na.rm=FALSE (the default) and x contains missing values, then a missing value (NA) is returned. If na.rm=TRUE, missing values are removed from x prior to computing the geometric standard deviation.

sqrt.unbiased

logical scalar specifying which method to use to compute the sample standard deviation of the log-transformed observations. If sqrt.unbiased=TRUE (the default), the square root of the unbiased estimator of variance is used; otherwise, the method of moments estimator of standard deviation is used. See the Details section for more information.

Details

If x contains any non-positive values (values less than or equal to 0), geoSD returns NA and issues a warning.

Let \(\underline{x}\) denote a vector of \(n\) observations from some distribution. The sample geometric standard deviation is a measure of variability. It is defined as: $$s_G = \exp(s_y) \;\;\;\;\;\; (1)$$ where $$s_y = \left[\frac{1}{n-1} \sum_{i=1}^n (y_i - \bar{y})^2\right]^{1/2} \;\;\;\;\;\; (2)$$ $$y_i = \log(x_i), \;\; i = 1, 2, \ldots, n \;\;\;\;\;\; (3)$$ That is, the sample geometric standard deviation is the antilog of the sample standard deviation of the log-transformed observations.

The sample standard deviation of the log-transformed observations shown in Equation (2) is the square root of the unbiased estimator of variance. (Note that this estimator of standard deviation is not itself an unbiased estimator.) Sometimes, the square root of the method of moments estimator of variance is used instead: $$s_y = \left[\frac{1}{n} \sum_{i=1}^n (y_i - \bar{y})^2\right]^{1/2} \;\;\;\;\;\; (4)$$ This is the estimator used in Equation (1) when sqrt.unbiased=FALSE.
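
Equations (1) through (4) can be sketched in base R as follows. This is only an illustration of the formulas, not the EnvStats source: the helper name geo_sd_sketch and the wording of its warning are hypothetical, and the handling of missing and non-positive values simply mirrors the behavior described in this help page.

```r
# Hypothetical helper illustrating Equations (1)-(4) with base R only.
# It is NOT the EnvStats implementation of geoSD().
geo_sd_sketch <- function(x, na.rm = FALSE, sqrt.unbiased = TRUE) {
  if (na.rm) x <- x[!is.na(x)]          # drop missing values if requested
  if (anyNA(x)) return(NA_real_)        # otherwise propagate NA
  if (any(x <= 0)) {                    # GSD is undefined for x <= 0
    warning("Non-positive values in 'x'")
    return(NA_real_)
  }
  y  <- log(x)                          # Equation (3)
  n  <- length(y)
  ss <- sum((y - mean(y))^2)            # sum of squared deviations
  s_y <- if (sqrt.unbiased) {
    sqrt(ss / (n - 1))                  # Equation (2)
  } else {
    sqrt(ss / n)                        # Equation (4)
  }
  exp(s_y)                              # Equation (1)
}

x <- c(1, 2, 4, 8)
gsd_unbiased <- geo_sd_sketch(x)                         # same as exp(sd(log(x)))
gsd_mom      <- geo_sd_sketch(x, sqrt.unbiased = FALSE)  # divisor n instead of n - 1
```

Because the divisor in Equation (4) is larger than in Equation (2), the method of moments version is always slightly smaller than the default for the same data; the two converge as \(n\) grows.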

Value

A numeric scalar – the sample geometric standard deviation.

References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers, Second Edition. Lewis Publishers, Boca Raton, FL.

Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, NY.

Leidel, N.A., K.A. Busch, and J.R. Lynch. (1977). Occupational Exposure Sampling Strategy Manual. U.S. Department of Health, Education, and Welfare, Public Health Service, Center for Disease Control, National Institute for Occupational Safety and Health, Cincinnati, Ohio 45226, January, 1977, pp.102–103.

Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL.

Taylor, J.K. (1990). Statistical Techniques for Data Analysis. Lewis Publishers, Boca Raton, FL.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.

Author

Steven P. Millard (EnvStats@ProbStatInfo.com)

Note

The geometric standard deviation is only defined for positive observations. It is usually computed only for observations that are assumed to have come from a lognormal distribution.
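
As a rough illustration of the lognormal case, the sketch below compares the population geometric standard deviation with a sample estimate. It relies on the standard lognormal relationship between the coefficient of variation and the log-scale standard deviation, \(\sigma = \sqrt{\log(1 + cv^2)}\), and uses base R's rlnorm rather than any EnvStats function; the seed, sample size, and variable names are arbitrary choices for this example.

```r
# Assumption: data are lognormal, so the population GSD is exp(sdlog),
# where sdlog = sqrt(log(1 + cv^2)).
cv      <- 1
sdlog   <- sqrt(log(1 + cv^2))   # log-scale standard deviation
pop_gsd <- exp(sdlog)            # population geometric standard deviation

set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rlnorm(2000, meanlog = 0, sdlog = sdlog)

sample_gsd <- exp(sd(log(x)))    # Equation (1) with the default estimator
c(population = pop_gsd, sample = sample_gsd)
```

For cv = 1 the population value is approximately 2.30, so a sample estimate from a few thousand lognormal observations should land close to that figure.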

Examples

  # Generate 2000 observations from a lognormal distribution with parameters 
  # mean=10 and cv=1, which implies the standard deviation (on the original 
  # scale) is 10.  Compute the mean, geometric mean, standard deviation, 
  # and geometric standard deviation. 
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  set.seed(250) 
  dat <- rlnormAlt(2000, mean = 10, cv = 1) 

  mean(dat) 
#> [1] 10.23417
 
  geoMean(dat) 
#> [1] 7.160154
 
  sd(dat) 
#> [1] 9.786493
 
  geoSD(dat) 
#> [1] 2.334358

  #----------
  # Clean up
  rm(dat)