evNormOrdStats.Rd
Compute the expected values of order statistics for a random sample from a standard normal distribution.
evNormOrdStats(n = 1,
method = "royston", lower = -9, inc = 0.025, warn = TRUE,
alpha = 3/8, nmc = 2000, seed = 47, approximate = NULL)
evNormOrdStatsScalar(r = 1, n = 1,
method = "royston", lower = -9, inc = 0.025, warn = TRUE,
alpha = 3/8, nmc = 2000, conf.level = 0.95, seed = 47, approximate = NULL)
positive integer indicating the sample size.
positive integer between 1
and n
specifying the order statistic
for which to compute the expected value.
character string indicating what method to use. The possible values are:
"royston"
. Method based on approximating the exact integral as
given in Royston (1982).
"blom"
. Method based on the approximation formula proposed by
Blom (1958).
"mc"
. Method based on Monte Carlo simulation.
See the DETAILS section below.
numeric scalar \(\le -9\) defining the lower bound used for approximating the
integral when method="royston"
. The upper bound is automatically set
to -lower
. The default value is lower=-9
.
numeric scalar between .Machine$double.eps
and 0.025
that determines
the width of each subdivision used to approximate the integral when method="royston"
. The default value is inc=0.025
.
logical scalar indicating whether to issue a warning when
method="royston"
and the sample size is greater than 2000.
The default value is warn=TRUE
.
numeric scalar between 0 and 0.5 that determines the constant used when method="blom"
. The default value is alpha=3/8
.
integer \(\ge 100\) denoting the number of Monte Carlo simulations to use
when method="mc"
. The default value is nmc=2000
.
numeric scalar between 0 and 1 denoting the confidence level of
the confidence interval for the expected value of the normal
order statistic when method="mc"
.
The default value is conf.level=0.95
.
integer between \(-(2^31 - 1)\) and \(2^31 - 1\) specifying
the argument to set.seed
(the random number seed)
when method="mc"
. The default value is seed=47
.
logical scalar included for backwards compatibility with versions of
EnvStats prior to version 2.3.0.
When method
is not supplied and approximate=FALSE
, method
is set to method="royston"
.
When method
is not supplied and approximate=TRUE
,
method
is set to method="blom"
.
This argument is ignored if method
is supplied and/or
approxmiate=NULL
(the default).
Let \(\underline{z} = z_1, z_2, \ldots, z_n\) denote a vector of \(n\)
observations from a normal distribution with parameters
mean=0
and sd=1
. That is, \(\underline{z}\) denotes a vector of
\(n\) observations from a standard normal distribution. Let
\(z_{(r)}\) denote the \(r\)'th order statistic of \(\underline{z}\),
for \(r = 1, 2, \ldots, n\). The probability density function of
\(z_{(r)}\) is given by:
$$f_{r,n}(t) = \frac{n!}{(r-1)!(n-r)!} [\Phi(t)]^{r-1} [1 - \Phi(t)]^{n-r} \phi(t) \;\;\;\;\;\; (1)$$
where \(\Phi\) and \(\phi\) denote the cumulative distribution function and
probability density function of the standard normal distribution, respectively
(Johnson et al., 1994, p.93). Thus, the expected value of \(z_{(r)}\) is given by:
$$E(r, n) = E[z_{(r)}] = \int_{-\infty}^{\infty} t f_{r,n}(t) dt \;\;\;\;\;\; (2)$$
It can be shown that if \(n\) is odd, then
$$E[(n+1)/2, n] = 0 \;\;\;\;\;\; (3)$$
Also, for all values of \(n\),
$$E(r, n) = -E(n-r+1, n) \;\;\;\;\;\; (4)$$
The function evNormOrdStatsScalar
computes the value of \(E(r,n)\) for
user-specified values of \(r\) and \(n\).
The function evNormOrdStats
computes the values of \(E(r,n)\) for all
values of \(r\) (i.e., for \(r = 1, 2, \ldots, n\))
for a user-specified value of \(n\).
Exact Method Based on Royston's Approximation to the Integral (method="royston"
)
When method="royston"
, the integral in Equation (2) above is approximated by
computing the value of the integrand between the values of lower
and
-lower
using increments of inc
, then summing these values and
multiplying by inc
. In particular, the integrand is restructured as:
$$t \; f_{r,n}(t) = t \; exp\{log(n!) - log[(r-1)!] - log[(n-r)!] + (r-1)log[\Phi(t)] + (n-r)log[1 - \Phi(t)] + log[\phi(t)]\} \;\;\; (5)$$
By default, as per Royston (1982), the integrand is evaluated between -9 and 9 in
increments of 0.025. The approximation is computed this way for values of
\(r\) between \(1\) and \([n/2]\), where \([x]\) denotes the floor of \(x\).
If \(r > [n/2]\), then the approximation is computed for \(E(n-r+1, n)\) and
Equation (4) is used.
Note that Equation (1) in Royston (1982) differs from Equations (1) and (2) above because Royston's paper is based on the \(r^{th}\) largest value, not the \(r^{th}\) order statistic.
Royston (1982) states that this algorithm “is accurate to at least seven decimal
places on a 36-bit machine,” that it has been validated up to a sample size
of \(n=2000\), and that the accuracy for \(n > 2000\) may be improved by
reducing the value of the argument inc
. Note that making
inc
smaller will increase the computation time.
Approxmation Based on Blom's Method (method="blom"
)
When method="blom"
, the following approximation to \(E(r,n)\),
proposed by Blom (1958, pp. 68-75), is used:
$$E(r, n) \approx \Phi^{-1}(\frac{r - \alpha}{n - 2\alpha + 1}) \;\;\;\;\;\; (5)$$
By default, \(\alpha = 3/8 = 0.375\). This approximation is quite accurate.
For example, for \(n \ge 2\), the approximation is accurate to the first decimal place,
and for \(n \ge 9\) it is accurate to the second decimal place.
Harter (1961) discusses appropriate values of \(\alpha\) for various sample sizes
\(n\) and values of \(r\).
Approximation Based on Monte Carlo Simulation (method="mc"
)
When method="mc"
, Monte Carlo simulation is used to estmate the expected value
of the \(r^{th}\) order statistic. That is, \(N =\) nmc
trials are run in which,
for each trial, a random sample of \(n\) standard normal observations is
generated and the \(r^{th}\) order statistic is computed. Then, the average value
of this order statistic over all \(N\) trials is computed, along with a
confidence interval for the expected value, assuming an approximately
normal distribution for the mean of the order statistic (the confidence interval
is computed by supplying the simulated values of the \(r^{th}\) order statistic
to the function enorm
).
NOTE: This method has not been optimized for large sample sizes \(n\)
(i.e., large values of the argument n
) and/or a large number of
Monte Carlo trials \(N\) (i.e., large values of the argument nmc
) and
may take a long time to execute in these cases.
For evNormOrdStats
: a numeric vector of length n
containing the
expected values of all the order statistics for a random sample of n
standard normal deviates.
For evNormOrdStatsScalar
: a numeric scalar containing the expected value
of the r
'th order statistic from a random sample of n
standard
normal deviates. When method="mc"
, the returned object also has a
cont.int
attribute that contains the 95
and a nmc
attribute indicating the number of Monte Carlo trials run.
Blom, G. (1958). Statistical Estimates and Transformed Beta Variables. John Wiley and Sons, New York.
Harter, H. L. (1961). Expected Values of Normal Order Statistics 48, 151–165.
Johnson, N. L., S. Kotz, and N. Balakrishnan. (1994). Continuous Univariate Distributions, Volume 1. Second Edition. John Wiley and Sons, New York, pp. 93–99.
Royston, J.P. (1982). Algorithm AS 177. Expected Normal Order Statistics (Exact and Approximate). Applied Statistics 31, 161–165.
The expected values of normal order statistics are used to construct normal
quantile-quantile (Q-Q) plots (see qqPlot
) and to compute
goodness-of-fit statistics (see gofTest
). Usually, however,
approximations are used instead of exact values. The functions
evNormOrdStats
and evNormOrdStatsScalar
have been included mainly
because evNormOrdStatsScalar
is called by elnorm3
and
predIntNparSimultaneousTestPower
.
# Compute the expected value of the minimum for a random sample of size 10
# from a standard normal distribution:
# Based on method="royston"
#--------------------------
evNormOrdStatsScalar(r = 1, n = 10)
#> [1] -1.538753
#[1] -1.538753
# Based on method="blom"
#-----------------------
evNormOrdStatsScalar(r = 1, n = 10, method = "blom")
#> [1] -1.546635
#[1] -1.546635
# Based on method="mc" with 10,000 Monte Carlo trials
#----------------------------------------------------
evNormOrdStatsScalar(r = 1, n = 10, method = "mc", nmc = 10000)
#> [1] -1.544318
#> attr(,"confint")
#> 95%LCL 95%UCL
#> -1.555838 -1.532797
#> attr(,"nmc")
#> [1] 10000
#[1] -1.544318
#attr(,"confint")
# 95%LCL 95%UCL
#-1.555838 -1.532797
#attr(,"nmc")
#[1] 10000
#====================
# Compute the expected values of all of the order statistics
# for a random sample of size 10 from a standard normal distribution
# based on Royston's (1982) method:
#--------------------------------------------------------------------
evNormOrdStats(10)
#> [1] -1.5387527 -1.0013570 -0.6560591 -0.3757647 -0.1226678 0.1226678
#> [7] 0.3757647 0.6560591 1.0013570 1.5387527
#[1] -1.5387527 -1.0013570 -0.6560591 -0.3757647 -0.1226678
#[6] 0.1226678 0.3757647 0.6560591 1.0013570 1.5387527
# Compare the above with Blom (1958) scores:
#-------------------------------------------
evNormOrdStats(10, method = "blom")
#> [1] -1.5466353 -1.0004905 -0.6554235 -0.3754618 -0.1225808 0.1225808
#> [7] 0.3754618 0.6554235 1.0004905 1.5466353
#[1] -1.5466353 -1.0004905 -0.6554235 -0.3754618 -0.1225808
#[6] 0.1225808 0.3754618 0.6554235 1.0004905 1.5466353