Construct a \(\beta\)-content or \(\beta\)-expectation tolerance interval for a normal distribution.

tolIntNorm(x, coverage = 0.95, cov.type = "content", 
    ti.type = "two-sided", conf.level = 0.95, method = "exact")

Arguments

x

numeric vector of observations, or an object resulting from a call to an estimating function that assumes a normal (Gaussian) distribution (i.e., enorm or enormCensored). If x is a numeric vector, missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.

coverage

a scalar between 0 and 1 indicating the desired coverage of the tolerance interval. The default value is coverage=0.95. For cov.type="expectation", this is the expected (average) coverage of the interval.

cov.type

character string specifying the coverage type for the tolerance interval. The possible values are "content" (\(\beta\)-content; the default), and "expectation" (\(\beta\)-expectation). See the DETAILS section for more information.

ti.type

character string indicating what kind of tolerance interval to compute. The possible values are "two-sided" (the default), "lower", and "upper".

conf.level

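a scalar between 0 and 1 indicating the confidence level associated with the tolerance interval. The default value is conf.level=0.95. A \(\beta\)-expectation tolerance interval has no explicit confidence level, so this argument is not used when cov.type="expectation".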

method

for the case of a two-sided tolerance interval, a character string specifying the method for constructing the tolerance interval. This argument is ignored if ti.type="lower" or ti.type="upper". The possible values are "exact" (the default) and "wald.wolfowitz" (the Wald-Wolfowitz approximation). See the DETAILS section for more information.

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

A tolerance interval for some population is an interval on the real line constructed so as to contain \(100 \beta \%\) of the population (i.e., \(100 \beta \%\) of all future observations), where \(0 < \beta < 1\). The quantity \(100 \beta \%\) is called the coverage.

There are two kinds of tolerance intervals (Guttman, 1970):

  • A \(\beta\)-content tolerance interval with confidence level \(100(1-\alpha)\%\) is constructed so that it contains at least \(100 \beta \%\) of the population (i.e., the coverage is at least \(100 \beta \%\)) with probability \(100(1-\alpha)\%\), where \(0 < \alpha < 1\). The quantity \(100(1-\alpha)\%\) is called the confidence level or confidence coefficient associated with the tolerance interval.

  • A \(\beta\)-expectation tolerance interval is constructed so that the average coverage of the interval is \(100 \beta \%\).
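The \(\beta\)-content definition can be checked by simulation. The following is a minimal base-R sketch, not part of tolIntNorm: it assumes the standard noncentral-t factor for a one-sided upper limit on normal data, and the helper name `k_upper` and the simulation settings are illustrative only. It estimates how often the true coverage of the interval reaches \(\beta\).

```r
# Illustrative helper (hypothetical name): one-sided beta-content factor K
# from the noncentral t distribution, a standard normal-theory result.
k_upper <- function(n, coverage = 0.95, conf.level = 0.95) {
  qt(conf.level, df = n - 1, ncp = qnorm(coverage) * sqrt(n)) / sqrt(n)
}

set.seed(1)
n <- 20
beta <- 0.95
n.sim <- 5000
K <- k_upper(n, coverage = beta, conf.level = 0.95)

# For each simulated N(0, 1) sample, the true coverage of the upper
# tolerance interval (-Inf, xbar + K*s] is pnorm(xbar + K*s).
true.coverage <- replicate(n.sim, {
  x <- rnorm(n)
  pnorm(mean(x) + K * sd(x))
})

# Proportion of simulated intervals whose coverage is at least beta;
# under these assumptions it should be close to the confidence level.
prop.content <- mean(true.coverage >= beta)
prop.content
```

The proportion printed at the end is a Monte Carlo estimate, so it will differ slightly from 0.95 and will change with the seed.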

Note: A \(\beta\)-expectation tolerance interval with coverage \(100 \beta \%\) is equivalent to a prediction interval for one future observation with associated confidence level \(100 \beta \%\). There is no explicit confidence level associated with a \(\beta\)-expectation tolerance interval. If a \(\beta\)-expectation tolerance interval is treated as a \(\beta\)-content tolerance interval, the associated confidence level is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).
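The equivalence with a prediction interval can be sketched in base R. This example is an illustration rather than the tolIntNorm implementation: it assumes the usual normal-theory two-sided prediction interval for one future observation, and the helper name `k_expectation` is hypothetical. It computes the multiplier and then checks by simulation that the average coverage is close to \(\beta\).

```r
# Illustrative helper (hypothetical name): multiplier for a two-sided
# beta-expectation interval, taken from the normal-theory prediction
# interval for one future observation.
k_expectation <- function(n, coverage = 0.95) {
  qt((1 + coverage) / 2, df = n - 1) * sqrt(1 + 1 / n)
}

set.seed(1)
n <- 20
beta <- 0.95
n.sim <- 5000
K.exp <- k_expectation(n, coverage = beta)

# True coverage of [xbar - K*s, xbar + K*s] for each simulated N(0, 1) sample.
cov.sim <- replicate(n.sim, {
  x <- rnorm(n)
  m <- mean(x)
  s <- sd(x)
  pnorm(m + K.exp * s) - pnorm(m - K.exp * s)
})

# Average coverage across samples; should be close to beta.
avg.coverage <- mean(cov.sim)

# Proportion of intervals with coverage of at least beta, i.e. the
# confidence level if the interval were read as a beta-content interval.
prop.at.least <- mean(cov.sim >= beta)

c(K = K.exp, avg.coverage = avg.coverage, prop.at.least = prop.at.least)
```

Both simulated quantities are Monte Carlo estimates and vary with the seed and the number of replications.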

For a normal distribution, the form of a two-sided \(100(1-\alpha)\%\) tolerance interval is: $$[\bar{x} - Ks, \, \bar{x} + Ks]$$ where \(\bar{x}\) denotes the sample mean, \(s\) denotes the sample standard deviation, and \(K\) denotes a constant that depends on the sample size \(n\), the coverage, and, for a \(\beta\)-content tolerance interval (but not a \(\beta\)-expectation tolerance interval), the confidence level.

Similarly, the form of a one-sided lower tolerance interval is: $$[\bar{x} - Ks, \, \infty)$$ and the form of a one-sided upper tolerance interval is: $$(-\infty, \, \bar{x} + Ks]$$ The value of \(K\) differs for one-sided versus two-sided tolerance intervals. The derivation of the constant \(K\) is explained in the help file for tolIntNormK.
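As a worked instance of the one-sided form, the sketch below recomputes the upper tolerance limit for the background chrysene data used in the Examples section, using base R only. It assumes the standard noncentral-t expression for the one-sided \(\beta\)-content factor; the help file for tolIntNormK remains the reference for how \(K\) is actually derived. The data values are typed in from the EPA.09.Ex.17.3.chrysene.df listing (Wells 1 and 2).

```r
# Background chrysene concentrations (ppb) from Wells 1 and 2, copied from
# the listing of EPA.09.Ex.17.3.chrysene.df in the Examples section.
chrysene <- c(19.7, 39.2, 7.8, 12.8, 10.2, 7.2, 16.1, 5.7)

# Work on the log scale, where the data are assumed to be normal.
y <- log(chrysene)
n <- length(y)

# One-sided beta-content factor (95% coverage, 95% confidence), assuming
# the standard noncentral-t result for normal tolerance limits.
K <- qt(0.95, df = n - 1, ncp = qnorm(0.95) * sqrt(n)) / sqrt(n)

# Upper tolerance limit xbar + K*s on the log scale, then back-transformed.
utl.log <- mean(y) + K * sd(y)
utl <- exp(utl.log)

c(K = K, UTL.log = utl.log, UTL = utl)
```

Under these assumptions the log-scale limit should agree with the UTL of 4.510032 reported by tolIntNorm in the Examples section, and the back-transformed limit with 90.9247.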

Value

If x is a numeric vector, tolIntNorm returns a list of class "estimate" containing the estimated parameters, a component called interval containing the tolerance interval information, and other information. See estimate.object for details.

If x is the result of calling an estimation function, tolIntNorm returns a list whose class is the same as x. The list contains the same components as x. If x already has a component called interval, this component is replaced with the tolerance interval information.

References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton.

Draper, N., and H. Smith. (1998). Applied Regression Analysis. Third Edition. John Wiley and Sons, New York.

Ellison, B.E. (1964). On Two-Sided Tolerance Intervals for a Normal Distribution. Annals of Mathematical Statistics 35, 762-772.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.

Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.

Hahn, G.J. (1970b). Statistical Intervals for a Normal Population, Part I: Tables, Examples and Applications. Journal of Quality Technology 2(3), 115-125.

Hahn, G.J. (1970c). Statistical Intervals for a Normal Population, Part II: Formulas, Assumptions, Some Derivations. Journal of Quality Technology 2(4), 195-206.

Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.

Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.

Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.

Odeh, R.E., and D.B. Owen. (1980). Tables for Normal Tolerance Limits, Sampling Plans, and Screening. Marcel Dekker, New York.

Owen, D.B. (1962). Handbook of Statistical Tables. Addison-Wesley, Reading, MA.

Singh, A., R. Maichle, and N. Armbya. (2010a). ProUCL Version 4.1.00 User Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.

USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.

Wald, A., and J. Wolfowitz. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics 17, 208-215.

Author

Steven P. Millard (EnvStats@ProbStatInfo.com)

Note

Tolerance intervals have long been applied to quality control and life testing problems (Hahn, 1970b,c; Hahn and Meeker, 1991; Krishnamoorthy and Mathew, 2009). References that discuss tolerance intervals in the context of environmental monitoring include: Berthouex and Brown (2002, Chapter 21), Gibbons et al. (2009), Millard and Neerchal (2001, Chapter 6), Singh et al. (2010b), and USEPA (2009).

Examples

  # Generate 20 observations from a normal distribution with parameters 
  # mean=10 and sd=2, then create a tolerance interval. 
  # (Note: the call to set.seed simply allows you to reproduce this 
  # example.)

  set.seed(250) 
  dat <- rnorm(20, mean = 10, sd = 2) 
  tolIntNorm(dat)
#> 
#> Results of Distribution Parameter Estimation
#> --------------------------------------------
#> 
#> Assumed Distribution:            Normal
#> 
#> Estimated Parameter(s):          mean = 9.861160
#>                                  sd   = 1.180226
#> 
#> Estimation Method:               mvue
#> 
#> Data:                            dat
#> 
#> Sample Size:                     20
#> 
#> Tolerance Interval Coverage:     95%
#> 
#> Coverage Type:                   content
#> 
#> Tolerance Interval Method:       Exact
#> 
#> Tolerance Interval Type:         two-sided
#> 
#> Confidence Level:                95%
#> 
#> Tolerance Interval:              LTL =  6.603328
#>                                  UTL = 13.118993
#> 


  #----------

  # Clean up
  rm(dat)
  
  #--------------------------------------------------------------------

  # Example 17-3 of USEPA (2009, p. 17-17) shows how to construct a 
  # beta-content upper tolerance limit with 95% coverage and 95% 
  # confidence using chrysene data and assuming a lognormal distribution.  
  # The data for this example are stored in EPA.09.Ex.17.3.chrysene.df, 
  # which contains chrysene concentration data (ppb) found in water 
  # samples obtained from two background wells (Wells 1 and 2) and 
  # three compliance wells (Wells 3, 4, and 5).  The tolerance limit 
  # is based on the data from the background wells.

  # Here we will first take the log of the data and  
  # then construct the tolerance interval; note however that it is 
  # easier to call the function tolIntLnorm instead using the 
  # original data.

  head(EPA.09.Ex.17.3.chrysene.df)
#>   Month   Well  Well.type Chrysene.ppb
#> 1     1 Well.1 Background         19.7
#> 2     2 Well.1 Background         39.2
#> 3     3 Well.1 Background          7.8
#> 4     4 Well.1 Background         12.8
#> 5     1 Well.2 Background         10.2
#> 6     2 Well.2 Background          7.2

  longToWide(EPA.09.Ex.17.3.chrysene.df, "Chrysene.ppb", "Month", "Well")
#>   Well.1 Well.2 Well.3 Well.4 Well.5
#> 1   19.7   10.2   68.0   26.8   47.0
#> 2   39.2    7.2   48.9   17.7   30.5
#> 3    7.8   16.1   30.1   31.9   15.0
#> 4   12.8    5.7   38.1   22.2   23.4

  tol.int.list <- with(EPA.09.Ex.17.3.chrysene.df, 
    tolIntNorm(log(Chrysene.ppb[Well.type == "Background"]), 
    ti.type = "upper", coverage = 0.95, conf.level = 0.95))

  tol.int.list
#> 
#> Results of Distribution Parameter Estimation
#> --------------------------------------------
#> 
#> Assumed Distribution:            Normal
#> 
#> Estimated Parameter(s):          mean = 2.5085773
#>                                  sd   = 0.6279479
#> 
#> Estimation Method:               mvue
#> 
#> Data:                            log(Chrysene.ppb[Well.type == "Background"])
#> 
#> Sample Size:                     8
#> 
#> Tolerance Interval Coverage:     95%
#> 
#> Coverage Type:                   content
#> 
#> Tolerance Interval Method:       Exact
#> 
#> Tolerance Interval Type:         upper
#> 
#> Confidence Level:                95%
#> 
#> Tolerance Interval:              LTL =     -Inf
#>                                  UTL = 4.510032
#> 


  # Compute the upper tolerance limit on the original scale
  # by exponentiating the upper tolerance limit on the log scale:

  exp(tol.int.list$interval$limits["UTL"])
#>     UTL 
#> 90.9247 

  #----------

  # Clean up

  rm(tol.int.list)