epois.Rd
Estimate the mean of a Poisson distribution, and optionally construct a confidence interval for the mean.
epois(x, method = "mle/mme/mvue", ci = FALSE, ci.type = "two-sided",
ci.method = "exact", conf.level = 0.95)
numeric vector of observations.
character string specifying the method of estimation. Currently the only possible
value is "mle/mme/mvue" (maximum likelihood/method of moments/minimum variance
unbiased; the default). See the DETAILS section for more information.
logical scalar indicating whether to compute a confidence interval for the
mean. The default value is FALSE.
character string indicating what kind of confidence interval to compute. The
possible values are "two-sided" (the default), "lower", and "upper". This argument
is ignored if ci=FALSE.
character string indicating what method to use to construct the confidence interval
for the mean. Possible values are "exact" (the default), "pearson.hartley.approx"
(Pearson-Hartley approximation), and "normal.approx" (normal approximation). See the
DETAILS section for more information. This argument is ignored if ci=FALSE.
a scalar between 0 and 1 indicating the confidence level of the confidence interval.
The default value is conf.level=0.95. This argument is ignored if ci=FALSE.
If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values,
they will be removed prior to performing the estimation.
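The removal step can be mimicked in base R. The following is only a sketch of the behavior described above, not the function's internal code; the data vector is hypothetical.

```r
# Hypothetical data containing missing, undefined, and infinite entries
x <- c(2, 0, NA, 3, NaN, 1, Inf, -Inf, 4)

# is.finite() is FALSE for NA, NaN, Inf, and -Inf, so this keeps only
# the usable observations
x.clean <- x[is.finite(x)]

x.clean           # remaining observations
length(x.clean)   # effective sample size n
mean(x.clean)     # estimate of lambda based on the remaining observations
```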
Let \(\underline{x} = (x_1, x_2, \ldots, x_n)\) be a vector of
\(n\) observations from a Poisson distribution with parameter lambda=\(\lambda\).
It can be shown (e.g., Forbes et al., 2011) that if \(y\) is defined as:
$$y = \sum_{i=1}^n x_i \;\;\;\; (1)$$
then \(y\) is an observation from a Poisson distribution with parameter
lambda=\(n \lambda\).
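As a quick numerical illustration of this result for the special case \(n = 2\), the sketch below convolves two Poisson probability mass functions and compares the result with the Poisson mass function with parameter \(2 \lambda\). The value of lambda and the range of sums checked are arbitrary choices made for illustration.

```r
lambda <- 2      # arbitrary illustrative value
s <- 0:15        # values of the sum y = x1 + x2 to check

# Pr[x1 + x2 = s], obtained by summing over all splits j + (s - j)
conv <- sapply(s, function(k) {
  j <- 0:k
  sum(dpois(j, lambda) * dpois(k - j, lambda))
})

# Largest discrepancy from the Poisson(2 * lambda) mass function
max(abs(conv - dpois(s, 2 * lambda)))
```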
Estimation
The maximum likelihood, method of moments, and minimum variance unbiased estimator
(mle/mme/mvue) of \(\lambda\) is given by:
$$\hat{\lambda} = \bar{x} \;\;\;\; (2)$$
where
$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i = \frac{y}{n} \;\;\;\; (3)$$
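Equations (1) to (3) translate directly into base R. The sketch below uses a hypothetical vector of counts and also checks numerically, via optimize(), that the sample mean maximizes the Poisson log-likelihood; the search interval is an arbitrary assumption.

```r
x <- c(3, 1, 0, 2, 4, 2, 1, 3)   # hypothetical counts

n <- length(x)
y <- sum(x)             # equation (1)
lambda.hat <- y / n     # equations (2) and (3)

# Poisson log-likelihood as a function of lambda
loglik <- function(lambda) sum(dpois(x, lambda, log = TRUE))

# Numerical maximizer; should agree with lambda.hat up to the tolerance
opt <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)

c(closed.form = lambda.hat, numerical = opt$maximum)
```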
Confidence Intervals
There are three possible ways to construct a confidence interval for
\(\lambda\): based on the exact distribution of the estimator of
\(\lambda\) (ci.method="exact"), based on an approximation of
Pearson and Hartley (ci.method="pearson.hartley.approx"), or based on the
normal approximation (ci.method="normal.approx").
Exact Confidence Interval (ci.method="exact")
If ci.type="two-sided", an exact \((1-\alpha)100\%\) confidence interval
for \(\lambda\) can be constructed as \([LCL, UCL]\), where the confidence
limits are computed such that:
$$Pr[Y \ge y \mid \lambda = LCL] = \frac{\alpha}{2} \;\;\;\; (4)$$
$$Pr[Y \le y \mid \lambda = UCL] = \frac{\alpha}{2} \;\;\;\; (5)$$
where \(y\) is defined in equation (1) and \(Y\) denotes a Poisson random
variable with parameter lambda=\(n \lambda\).
If ci.type="lower", \(\alpha/2\) is replaced with \(\alpha\) in
equation (4) and \(UCL\) is set to \(\infty\).
If ci.type="upper", \(\alpha/2\) is replaced with \(\alpha\) in
equation (5) and \(LCL\) is set to 0.
Note that an exact upper confidence bound can be computed even when all
observations are 0.
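Equations (4) and (5) can be solved numerically with uniroot(). The sketch below does so for the assumed values \(n = 20\), \(y = 36\) and a 90% confidence level; the helper names f.lower and f.upper and the search intervals are illustrative choices, and this is not necessarily how the function computes the limits internally. The last lines illustrate the all-zero case for an upper bound.

```r
# Assumed inputs: n = 20 observations summing to y = 36, 90% confidence
n <- 20
y <- 36
alpha <- 0.10
lambda.hat <- y / n

# Equation (4): Pr[Y >= y] = alpha/2 when Y ~ Poisson(n * LCL)
f.lower <- function(lambda) {
  ppois(y - 1, n * lambda, lower.tail = FALSE) - alpha / 2
}

# Equation (5): Pr[Y <= y] = alpha/2 when Y ~ Poisson(n * UCL)
f.upper <- function(lambda) {
  ppois(y, n * lambda) - alpha / 2
}

# The lower limit lies below lambda.hat and the upper limit above it
LCL <- uniroot(f.lower, interval = c(1e-8, lambda.hat), tol = 1e-10)$root
UCL <- uniroot(f.upper, interval = c(lambda.hat, 10 * (lambda.hat + 1)),
               tol = 1e-10)$root
c(LCL = LCL, UCL = UCL)

# All observations equal to 0 (y = 0) with ci.type="upper":
# equation (5) with alpha in place of alpha/2 reads exp(-n * UCL) = alpha
UCL.zero <- -log(alpha) / n
ppois(0, n * UCL.zero)   # equals alpha up to rounding
```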
Pearson-Hartley Approximation (ci.method="pearson.hartley.approx")
For a two-sided \((1-\alpha)100\%\) confidence interval for \(\lambda\), the
Pearson and Hartley approximation (Zar, 2010, p.587; Pearson and Hartley, 1970, p.81)
is given by:
$$[\frac{\chi^2_{2n\bar{x}, \alpha/2}}{2n}, \frac{\chi^2_{2n\bar{x} + 2, 1 - \alpha/2}}{2n}] \;\;\;\; (6)$$
where \(\chi^2_{\nu, p}\) denotes the \(p\)'th quantile of the
chi-square distribution with \(\nu\) degrees of freedom.
One-sided confidence intervals are computed in a similar fashion.
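Equation (6) can be evaluated directly with qchisq(). The sketch below assumes \(n = 20\), \(n\bar{x} = 36\) and a 90% confidence level. The one-sided limits shown are an assumption made by analogy with the exact method (\(\alpha/2\) replaced with \(\alpha\), and the other limit set to \(\infty\) or 0).

```r
# Assumed inputs: n = 20 observations summing to y = n * x.bar = 36
n <- 20
y <- 36
alpha <- 0.10

# Two-sided interval, equation (6)
ci.two.sided <- c(LCL = qchisq(alpha / 2, df = 2 * y) / (2 * n),
                  UCL = qchisq(1 - alpha / 2, df = 2 * y + 2) / (2 * n))
ci.two.sided

# One-sided intervals (assumed form, by analogy with the exact method)
ci.lower <- c(LCL = qchisq(alpha, df = 2 * y) / (2 * n), UCL = Inf)
ci.upper <- c(LCL = 0, UCL = qchisq(1 - alpha, df = 2 * y + 2) / (2 * n))

# Tail probabilities at the two-sided limits; both are close to alpha/2
ppois(y - 1, n * ci.two.sided[["LCL"]], lower.tail = FALSE)
ppois(y, n * ci.two.sided[["UCL"]])
```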
Normal Approximation (ci.method="normal.approx")
An approximate \((1-\alpha)100\%\) confidence interval for \(\lambda\) can be
constructed assuming the distribution of the estimator of \(\lambda\) is
approximately normally distributed. A two-sided confidence interval is constructed
as:
$$[\hat{\lambda} - z_{1-\alpha/2} \hat{\sigma}_{\hat{\lambda}}, \hat{\lambda} + z_{1-\alpha/2} \hat{\sigma}_{\hat{\lambda}}] \;\;\;\; (7)$$
where \(z_p\) is the \(p\)'th quantile of the standard normal distribution, and
the quantity
$$\hat{\sigma}_{\hat{\lambda}} = \sqrt{\hat{\lambda} / n} \;\;\;\; (8)$$
denotes the estimated asymptotic standard deviation of the estimator of
\(\lambda\).
One-sided confidence intervals are constructed in a similar manner.
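Equations (7) and (8) amount to a few lines of base R. The sketch below assumes \(\hat{\lambda} = 1.8\), \(n = 20\) and a 90% confidence level; the variable names are illustrative.

```r
# Assumed inputs: estimate 1.8 from n = 20 observations, 90% confidence
n <- 20
lambda.hat <- 1.8
alpha <- 0.10

# Equation (8): estimated asymptotic standard deviation of the estimator
se.lambda.hat <- sqrt(lambda.hat / n)

# Equation (7): two-sided normal-approximation interval
z <- qnorm(1 - alpha / 2)
ci.normal <- c(LCL = lambda.hat - z * se.lambda.hat,
               UCL = lambda.hat + z * se.lambda.hat)
ci.normal
```

Unlike the exact limits, this interval is symmetric about \(\hat{\lambda}\), and nothing in equation (7) prevents the lower limit from being negative when \(\hat{\lambda}\) is small.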
a list of class "estimate" containing the estimated parameters and other information.
See estimate.object for details.
Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.
Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572-580.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.
Pearson, E.S., and H.O. Hartley, eds. (1970). Biometrika Tables for Statisticians, Volume 1. Cambridge University Press, New York, p.81.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, pp. 585–586.
The Poisson distribution is named after Poisson, who
derived this distribution as the limiting distribution of the
binomial distribution with parameters size=\(N\) and prob=\(p\),
where \(N\) tends to infinity, \(p\) tends to 0, and
\(Np\) stays constant.
In this context, the Poisson distribution was used by Bortkiewicz (1898) to model the number of deaths (per annum) from kicks by horses in Prussian Army Corps. In this case, \(p\), the probability of death from this cause, was small, but the number of soldiers exposed to this risk, \(N\), was large.
The Poisson distribution has been applied in a variety of fields, including quality control (modeling number of defects produced in a process), ecology (number of organisms per unit area), and queueing theory. Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb).
# Generate 20 observations from a Poisson distribution with parameter
# lambda=2, then estimate the parameter and construct a 90% confidence
# interval.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rpois(20, lambda = 2)
epois(dat, ci = TRUE, conf.level = 0.9)
#>
#> Results of Distribution Parameter Estimation
#> --------------------------------------------
#>
#> Assumed Distribution: Poisson
#>
#> Estimated Parameter(s): lambda = 1.8
#>
#> Estimation Method: mle/mme/mvue
#>
#> Data: dat
#>
#> Sample Size: 20
#>
#> Confidence Interval for: lambda
#>
#> Confidence Interval Method: exact
#>
#> Confidence Interval Type: two-sided
#>
#> Confidence Level: 90%
#>
#> Confidence Interval: LCL = 1.336558
#> UCL = 2.377037
#>
#----------
# Compare the different ways of constructing confidence intervals for
# lambda using the same data as in the previous example:
epois(dat, ci = TRUE, ci.method = "pearson.hartley.approx",
conf.level = 0.9)$interval$limits
#> LCL UCL
#> 1.336558 2.377037
epois(dat, ci = TRUE, ci.method = "normal.approx",
conf.level = 0.9)$interval$limits
#> LCL UCL
#> 1.306544 2.293456
#----------
# Clean up
#---------
rm(dat)