Estimate the location and scale parameters of an extreme value distribution, and optionally construct a confidence interval for one of the parameters.

eevd(x, method = "mle", pwme.method = "unbiased", 
    plot.pos.cons = c(a = 0.35, b = 0), ci = FALSE, 
    ci.parameter = "location", ci.type = "two-sided", 
    ci.method = "normal.approx", conf.level = 0.95)

Arguments

x

numeric vector of observations.

method

character string specifying the method of estimation. Possible values are "mle" (maximum likelihood; the default), "mme" (methods of moments), "mmue" (method of moments based on the unbiased estimator of variance), and "pwme" (probability-weighted moments). See the DETAILS section for more information on these estimation methods.

pwme.method

character string specifying what method to use to compute the probability-weighted moments when method="pwme". The possible values are "ubiased" (method based on the U-statistic; the default), or "plotting.position" (method based on the plotting position formula). See the DETAILS section in this help file and the help file for pwMoment for more information. This argument is ignored if method is not equal to "pwme".

plot.pos.cons

numeric vector of length 2 specifying the constants used in the formula for the plotting positions when method="pwme" and
pwme.method="plotting.position". The default value is
plot.pos.cons=c(a=0.35, b=0). If this vector has a names attribute with the value c("a","b") or c("b","a"), then the elements will be matched by name in the formula for computing the plotting positions. Otherwise, the first element is mapped to the name "a" and the second element to the name "b". See the DETAILS section in this help file and the help file for pwMoment for more information. This argument is ignored if method is not equal to "pwme" or if pwme.method="ubiased".

ci

logical scalar indicating whether to compute a confidence interval for the location or scale parameter. The default value is FALSE.

ci.parameter

character string indicating the parameter for which the confidence interval is desired. The possible values are "location" (the default) and "scale". This argument is ignored if ci=FALSE.

ci.type

character string indicating what kind of confidence interval to compute. The possible values are "two-sided" (the default), "lower", and "upper". This argument is ignored if ci=FALSE.

ci.method

character string indicating what method to use to construct the confidence interval for the location or scale parameter. Currently, the only possible value is "normal.approx" (the default). See the DETAILS section for more information. This argument is ignored if ci=FALSE.

conf.level

a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is conf.level=0.95. This argument is ignored if ci=FALSE.

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

Let \(\underline{x} = (x_1, x_2, \ldots, x_n)\) be a vector of \(n\) observations from an extreme value distribution with parameters location=\(\eta\) and scale=\(\theta\).

Estimation

Maximum Likelihood Estimation (method="mle")
The maximum likelihood estimators (mle's) of \(\eta\) and \(\theta\) are the solutions of the simultaneous equations (Forbes et al., 2011): $$\hat{\eta}_mle = \hat{\theta}_mle \, log[\frac{1}{n} \sum_{i=1}^{n} exp(\frac{-x_i}{\hat{\theta}_mle})]$$ $$\hat{\theta}_mle = \bar{x} - \frac{\sum_{i=1}^{n} x_i exp(\frac{-x_i}{\hat{\theta}_mle})}{\sum_{i=1}^{n} exp(\frac{-x_i}{\hat{\theta}_mle})}$$ where $$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$$.

Method of Moments Estimation (method="mme")
The method of moments estimators (mme's) of \(\eta\) and \(\theta\) are given by (Johnson et al., 1995, p.27): $$\hat{\eta}_{mme} = \bar{x} - \epsilon \hat{\theta}_{mme}$$ $$\hat{\theta}_{mme} = \frac{\sqrt{6}}{\pi} s_m$$ where \(\epsilon\) denotes Euler's constant and \(s_m\) denotes the square root of the method of moments estimator of variance: $$s_m^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2$$

Method of Moments Estimators Based on the Unbiased Estimator of Variance (method="mmue")
These estimators are the same as the method of moments estimators except that the method of moments estimator of variance is replaced with the unbiased estimator of variance: $$s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2$$

Probability-Weighted Moments Estimation (method="pwme")
Greenwood et al. (1979) show that the relationship between the distribution parameters \(\eta\) and \(\theta\) and the probability-weighted moments is given by: $$\eta = M(1, 0, 0) - \epsilon \theta$$ $$\theta = \frac{M(1, 0, 0) - 2M(1, 0, 1)}{log(2)}$$ where \(M(i, j, k)\) denotes the \(ijk\)'th probability-weighted moment and \(\epsilon\) denotes Euler's constant. The probability-weighted moment estimators (pwme's) of \(\eta\) and \(\theta\) are computed by simply replacing the \(M(i,j,k)\)'s in the above two equations with estimates of the \(M(i,j,k)\)'s (and for the estimate of \(\eta\), replacing \(\theta\) with its estimated value). See the help file for pwMoment for more information on how to estimate the \(M(i,j,k)\)'s. Also, see Landwehr et al. (1979) for an example of this method of estimation using the unbiased (U-statistic type) probability-weighted moment estimators. Hosking et al. (1985) note that this method of estimation using the U-statistic type probability-weighted moments is equivalent to Downton's (1966) linear estimates with linear coefficients.

Confidence Intervals
When ci=TRUE, an approximate \((1-\alpha)\)100% confidence intervals for \(\eta\) can be constructed assuming the distribution of the estimator of \(\eta\) is approximately normally distributed. A two-sided confidence interval is constructed as: $$[\hat{\eta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}, \, \hat{\eta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}]$$ where \(t(\nu, p)\) is the \(p\)'th quantile of Student's t-distribution with \(\nu\) degrees of freedom, and the quantity $$\hat{\sigma}_{\hat{\eta}}$$ denotes the estimated asymptotic standard deviation of the estimator of \(\eta\).

Similarly, a two-sided confidence interval for \(\theta\) is constructed as: $$[\hat{\theta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}, \, \hat{\theta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}]$$

One-sided confidence intervals for \(\eta\) and \(\theta\) are computed in a similar fashion.

Maximum Likelihood (method="mle")
Downton (1966) shows that the estimated asymptotic variances of the mle's of \(\eta\) and \(\theta\) are given by: $$\hat{\sigma}_{\hat{\eta}_mle}^2 = \frac{\hat{\theta}_mle^2}{n} [1 + \frac{6(1 - \epsilon)^2}{\pi^2}] = \frac{1.10867 \hat{\theta}_mle^2}{n}$$ $$\hat{\sigma}_{\hat{\theta}_mle}^2 = \frac{6}{\pi^2} \frac{\hat{\theta}_mle^2}{n} = \frac{0.60793 \hat{\theta}_mle^2}{n}$$ where \(\epsilon\) denotes Euler's constant.

Method of Moments (method="mme" or method="mmue")
Tiago de Oliveira (1963) and Johnson et al. (1995, p.27) show that the estimated asymptotic variance of the mme's of \(\eta\) and \(\theta\) are given by: $$\hat{\sigma}_{\hat{\eta}_mme}^2 = \frac{\hat{\theta}_mme^2}{n} [\frac{\pi^2}{6} + \frac{\epsilon^2}{4}(\beta_2 - 1) - \frac{\pi \epsilon \sqrt{\beta_1}}{\sqrt{6}}] = \frac{1.1678 \hat{\theta}_mme^2}{n}$$ $$\hat{\sigma}_{\hat{\theta}_mme}^2 = \frac{\hat{\theta}_mle^2}{n} \frac{(\beta_2 - 1)}{4} = \frac{1.1 \hat{\theta}_mme^2}{n}$$ where the quantities $$\sqrt{\beta_1}, \; \beta_2$$ denote the skew and kurtosis of the distribution, and \(\epsilon\) denotes Euler's constant.

The estimated asymptotic variances of the mmue's of \(\eta\) and \(\theta\) are the same, except replace the mme of \(\theta\) in the above equations with the mmue of \(\theta\).

Probability-Weighted Moments (method="pwme")
As stated above, Hosking et al. (1985) note that this method of estimation using the U-statistic type probability-weighted moments is equivalent to Downton's (1966) linear estimates with linear coefficients. Downton (1966) provides exact values of the variances of the estimates of location and scale parameters for the smallest extreme value distribution. For the largest extreme value distribution, the formula for the estimate of scale is the same, but the formula for the estimate of location must be modified. Thus, Downton's (1966) equation (3.4) is modified to: $$\hat{\eta}_pwme = \frac{(n-1)log(2) + (n+1)\epsilon}{n(n-1)log(2)} v - \frac{2 \epsilon}{n(n-1)log(2)} w$$ where \(\epsilon\) denotes Euler's constant, and \(v\) and \(w\) are defined in Downton (1966, p.8). Using Downton's (1966) equations (3.9)-(3.12), the exact variance of the pwme of \(\eta\) can be derived. Note that when method="pwme" and pwme.method="plotting.position", these are only the asymptotically correct variances.

Value

a list of class "estimate" containing the estimated parameters and other information. See
estimate.object for details.

References

Castillo, E. (1988). Extreme Value Theory in Engineering. Academic Press, New York, pp.184–198.

Downton, F. (1966). Linear Estimates of Parameters in the Extreme Value Distribution. Technometrics 8(1), 3–17.

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Greenwood, J.A., J.M. Landwehr, N.C. Matalas, and J.R. Wallis. (1979). Probability Weighted Moments: Definition and Relation to Parameters of Several Distributions Expressible in Inverse Form. Water Resources Research 15(5), 1049–1054.

Hosking, J.R.M., J.R. Wallis, and E.F. Wood. (1985). Estimation of the Generalized Extreme-Value Distribution by the Method of Probability-Weighted Moments. Technometrics 27(3), 251–261.

Johnson, N. L., S. Kotz, and N. Balakrishnan. (1995). Continuous Univariate Distributions, Volume 2. Second Edition. John Wiley and Sons, New York.

Landwehr, J.M., N.C. Matalas, and J.R. Wallis. (1979). Probability Weighted Moments Compared With Some Traditional Techniques in Estimating Gumbel Parameters and Quantiles. Water Resources Research 15(5), 1055–1064.

Tiago de Oliveira, J. (1963). Decision Results for the Parameters of the Extreme Value (Gumbel) Distribution Based on the Mean and Standard Deviation. Trabajos de Estadistica 14, 61–81.

Author

Steven P. Millard (EnvStats@ProbStatInfo.com)

Note

There are three families of extreme value distributions. The one described here is the Type I, also called the Gumbel extreme value distribution or simply Gumbel distribution. The name “extreme value” comes from the fact that this distribution is the limiting distribution (as \(n\) approaches infinity) of the greatest value among \(n\) independent random variables each having the same continuous distribution.

The Gumbel extreme value distribution is related to the exponential distribution as follows. Let \(Y\) be an exponential random variable with parameter rate=\(\lambda\). Then \(X = \eta - log(Y)\) has an extreme value distribution with parameters location=\(\eta\) and scale=\(1/\lambda\).

The distribution described above and assumed by eevd is the largest extreme value distribution. The smallest extreme value distribution is the limiting distribution (as \(n\) approaches infinity) of the smallest value among \(n\) independent random variables each having the same continuous distribution. If \(X\) has a largest extreme value distribution with parameters location=\(\eta\) and scale=\(\theta\), then \(Y = -X\) has a smallest extreme value distribution with parameters location=\(-\eta\) and scale=\(\theta\). The smallest extreme value distribution is related to the Weibull distribution as follows. Let \(Y\) be a Weibull random variable with parameters shape=\(\beta\) and scale=\(\alpha\). Then \(X = log(Y)\) has a smallest extreme value distribution with parameters location=\(log(\alpha)\) and scale=\(1/\beta\).

The extreme value distribution has been used extensively to model the distribution of streamflow, flooding, rainfall, temperature, wind speed, and other meteorological variables, as well as material strength and life data.

Examples

  # Generate 20 observations from an extreme value distribution with 
  # parameters location=2 and scale=1, then estimate the parameters 
  # and construct a 90% confidence interval for the location parameter. 
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  set.seed(250) 
  dat <- revd(20, location = 2) 
  eevd(dat, ci = TRUE, conf.level = 0.9) 
#> 
#> Results of Distribution Parameter Estimation
#> --------------------------------------------
#> 
#> Assumed Distribution:            Extreme Value
#> 
#> Estimated Parameter(s):          location = 1.9684093
#>                                  scale    = 0.7481955
#> 
#> Estimation Method:               mle
#> 
#> Data:                            dat
#> 
#> Sample Size:                     20
#> 
#> Confidence Interval for:         location
#> 
#> Confidence Interval Method:      Normal Approximation
#>                                  (t Distribution)
#> 
#> Confidence Interval Type:        two-sided
#> 
#> Confidence Level:                90%
#> 
#> Confidence Interval:             LCL = 1.663809
#>                                  UCL = 2.273009
#> 

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Extreme Value
  #
  #Estimated Parameter(s):          location = 1.9684093
  #                                 scale    = 0.7481955
  #
  #Estimation Method:               mle
  #
  #Data:                            dat
  #
  #Sample Size:                     20
  #
  #Confidence Interval for:         location
  #
  #Confidence Interval Method:      Normal Approximation
  #                                 (t Distribution)
  #
  #Confidence Interval Type:        two-sided
  #
  #Confidence Level:                90%
  #
  #Confidence Interval:             LCL = 1.663809
  #                                 UCL = 2.273009

  #----------

  #Compare the values of the different types of estimators: 

  eevd(dat, method = "mle")$parameters 
#>  location     scale 
#> 1.9684093 0.7481955 
  # location     scale 
  #1.9684093 0.7481955

  eevd(dat, method = "mme")$parameters 
#>  location     scale 
#> 1.9575980 0.8339256 
  # location     scale 
  #1.9575980 0.8339256 

  eevd(dat, method = "mmue")$parameters 
#>  location     scale 
#> 1.9450932 0.8555896 
  # location     scale 
  #1.9450932 0.8555896 

  eevd(dat, method = "pwme")$parameters 
#>  location     scale 
#> 1.9434922 0.8583633 
  # location     scale 
  #1.9434922 0.8583633

  #----------

  # Clean up
  #---------
  rm(dat)