Title: | Classical Goodness-of-Fit Tests for Univariate Distributions |
---|---|
Description: | Cramer-von Mises and Anderson-Darling tests of goodness-of-fit for continuous univariate distributions, using efficient algorithms. |
Authors: | Julian Faraway [aut], George Marsaglia [aut], John Marsaglia [aut], Adrian Baddeley [aut, cre] |
Maintainer: | Adrian Baddeley <[email protected]> |
License: | GPL (>=2) |
Version: | 1.2-2 |
Built: | 2024-11-03 05:20:36 UTC |
Source: | https://github.com/baddstats/goftest |
Cramer-von Mises and Anderson-Darling tests of goodness-of-fit for continuous univariate distributions, using modern algorithms to compute the null distributions.
The goftest package contains implementations of the classical Cramer-von Mises and Anderson-Darling tests of goodness-of-fit for continuous univariate distributions.
The Cramer-von Mises test is performed by cvm.test. The cumulative distribution function of the null distribution of the test statistic is computed by pCvM using the algorithm of Csorgo and Faraway (1996). The quantiles are computed by qCvM by root-finding.
The Anderson-Darling test is performed by ad.test. The cumulative distribution function of the null distribution of the test statistic is computed by pAD using the algorithm of Marsaglia and Marsaglia (2004). The quantiles are computed by qAD by root-finding.
By default, each test assumes that the parameters of the null distribution are known (a simple null hypothesis). If the parameters were estimated (calculated from the data), then the user should set estimated=TRUE, which uses the method of Braun (1980) to adjust for the effect of estimating the parameters from the data.
Adrian Baddeley, Julian Faraway, John Marsaglia, George Marsaglia.
Maintainer: Adrian Baddeley <[email protected]>
Braun, H. (1980) A simple method for testing goodness-of-fit in the presence of nuisance parameters. Journal of the Royal Statistical Society, Series B 42, 53–63.
Csorgo, S. and Faraway, J.J. (1996) The exact and asymptotic distributions of Cramer-von Mises statistics. Journal of the Royal Statistical Society, Series B 58, 221–234.
Marsaglia, G. and Marsaglia, J. (2004) Evaluating the Anderson-Darling Distribution. Journal of Statistical Software 9 (2), 1–5. February 2004. http://www.jstatsoft.org/v09/i02
    x <- rnorm(30, mean=2, sd=1)
    # default behaviour: parameters fixed: simple null hypothesis
    cvm.test(x, "pnorm", mean=2, sd=1)
    ad.test(x, "pnorm", mean=2, sd=1)
    # parameters estimated: composite null hypothesis
    mu <- mean(x)
    sigma <- sd(x)
    cvm.test(x, "pnorm", mean=mu, sd=sigma, estimated=TRUE)
    ad.test(x, "pnorm", mean=mu, sd=sigma, estimated=TRUE)
Performs the Anderson-Darling test of goodness-of-fit to a specified continuous univariate probability distribution.
ad.test(x, null = "punif", ..., estimated=FALSE, nullname)
x | Numeric vector of data values. |
null | A function, or a character string giving the name of a function, to compute the cumulative distribution function for the null distribution. |
... | Additional arguments for the cumulative distribution function. |
estimated | Logical value indicating whether the parameters of the distribution were estimated using the data. |
nullname | Optional character string describing the null distribution. The default is |
This command performs the Anderson-Darling test of goodness-of-fit to the distribution specified by the argument null. It is assumed that the values in x are independent and identically distributed random values, with some cumulative distribution function F. The null hypothesis is that F is the function specified by the argument null, while the alternative hypothesis is that F is some other function.
By default, the test assumes that all the parameters of the null distribution are known in advance (a simple null hypothesis). This test does not account for the effect of estimating the parameters.
If the parameters of the distribution were estimated (that is, if they were calculated from the same data x), then this should be indicated by setting the argument estimated=TRUE. The test will then use the method of Braun (1980) to adjust for the effect of parameter estimation.
Note that Braun's method involves randomly dividing the data into two equally-sized subsets, so the p-value is not exactly the same if the test is repeated. This technique is expected to work well when the number of observations in x is large.
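To make the test statistic concrete, the sketch below computes the textbook Anderson-Darling statistic for a fully specified null distribution in base R. The helper name ad_statistic is hypothetical and is not part of goftest; the package's own computation may differ in detail, and the sketch does not compute a p-value.

```r
# Textbook Anderson-Darling statistic for a simple null hypothesis.
# ad_statistic is a hypothetical helper, not a goftest function.
ad_statistic <- function(x, cdf = punif, ...) {
  n <- length(x)
  u <- sort(cdf(x, ...))   # probability integral transform, in increasing order
  i <- seq_len(n)
  # A^2 = -n - (1/n) * sum_i (2i - 1) * [log(u_(i)) + log(1 - u_(n+1-i))]
  -n - mean((2 * i - 1) * (log(u) + log(1 - rev(u))))
}

set.seed(1)
x <- rnorm(30, mean = 2, sd = 1)
ad_statistic(x, pnorm, mean = 2, sd = 1)
```

Large values of the statistic indicate a poor fit, with extra weight given to discrepancies in the tails through the logarithmic terms.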
An object of class "htest" representing the result of the hypothesis test.
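As a brief illustration of working with the returned object, the following sketch extracts the usual components of an "htest" object. It assumes that the goftest package is installed and that the result carries the standard statistic, p.value and method components; the 5% threshold is an arbitrary choice for the example.

```r
library(goftest)

set.seed(42)
x <- rnorm(50, mean = 2, sd = 1)

res <- ad.test(x, "pnorm", mean = 2, sd = 1)

res$statistic   # value of the test statistic
res$p.value     # p-value under the null distribution
res$method      # description of the test performed

# Example decision rule at the 5% level (threshold chosen for illustration)
reject <- res$p.value < 0.05
reject
```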
Original C code by George Marsaglia and John Marsaglia. R interface by Adrian Baddeley.
Anderson, T.W. and Darling, D.A. (1952) Asymptotic theory of certain 'goodness-of-fit' criteria based on stochastic processes. Annals of Mathematical Statistics 23, 193–212.
Anderson, T.W. and Darling, D.A. (1954) A test of goodness of fit. Journal of the American Statistical Association 49, 765–769.
Braun, H. (1980) A simple method for testing goodness-of-fit in the presence of nuisance parameters. Journal of the Royal Statistical Society, Series B 42, 53–63.
Marsaglia, G. and Marsaglia, J. (2004) Evaluating the Anderson-Darling Distribution. Journal of Statistical Software 9 (2), 1–5. February 2004. http://www.jstatsoft.org/v09/i02
pAD for the null distribution of the test statistic.
    x <- rnorm(10, mean=2, sd=1)
    ad.test(x, "pnorm", mean=2, sd=1)
    ad.test(x, "pnorm", mean=mean(x), sd=sd(x), estimated=TRUE)
Performs the Cramer-von Mises test of goodness-of-fit to a specified continuous univariate probability distribution.
cvm.test(x, null = "punif", ..., estimated=FALSE, nullname)
x | Numeric vector of data values. |
null | A function, or a character string giving the name of a function, to compute the cumulative distribution function for the null distribution. |
... | Additional arguments for the cumulative distribution function. |
estimated | Logical value indicating whether the parameters of the distribution were estimated using the data. |
nullname | Optional character string describing the null distribution. The default is |
This command performs the Cramer-von Mises test of goodness-of-fit to the distribution specified by the argument null. It is assumed that the values in x are independent and identically distributed random values, with some cumulative distribution function F. The null hypothesis is that F is the function specified by the argument null, while the alternative hypothesis is that F is some other function.
By default, the test assumes that all the parameters of the null distribution are known in advance (a simple null hypothesis). This test does not account for the effect of estimating the parameters.
If the parameters of the distribution were estimated (that is, if they were calculated from the same data x), then this should be indicated by setting the argument estimated=TRUE. The test will then use the method of Braun (1980) to adjust for the effect of parameter estimation.
Note that Braun's method involves randomly dividing the data into two equally-sized subsets, so the p-value is not exactly the same if the test is repeated. This technique is expected to work well when the number of observations in x is large.
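For orientation, the sketch below computes the textbook Cramer-von Mises statistic for a fully specified null distribution in base R. The helper name cvm_statistic is hypothetical and is not part of goftest; the package's own computation may differ in detail, and no p-value is computed here.

```r
# Textbook Cramer-von Mises statistic for a simple null hypothesis.
# cvm_statistic is a hypothetical helper, not a goftest function.
cvm_statistic <- function(x, cdf = punif, ...) {
  n <- length(x)
  u <- sort(cdf(x, ...))   # probability integral transform, in increasing order
  i <- seq_len(n)
  # omega^2 = 1/(12n) + sum_i (u_(i) - (2i - 1)/(2n))^2
  1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
}

set.seed(1)
x <- rnorm(30, mean = 2, sd = 1)
cvm_statistic(x, pnorm, mean = 2, sd = 1)
```

The statistic attains its minimum value 1/(12n) when the transformed values fall exactly on the grid (2i - 1)/(2n), and grows as the empirical distribution moves away from the null.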
An object of class "htest" representing the result of the hypothesis test.
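Because Braun's method involves a random split of the data, repeated calls with estimated=TRUE can return different p-values. The sketch below shows one way to make a result repeatable by fixing the seed immediately before each call. It assumes that goftest is installed and that the random split draws from R's standard random number generator; the seed values are arbitrary.

```r
library(goftest)

set.seed(1)
x <- rnorm(100, mean = 2, sd = 1)

# Fix the seed immediately before each call so the same random split is used
set.seed(2024)
p1 <- cvm.test(x, "pnorm", mean = mean(x), sd = sd(x), estimated = TRUE)$p.value
set.seed(2024)
p2 <- cvm.test(x, "pnorm", mean = mean(x), sd = sd(x), estimated = TRUE)$p.value

identical(p1, p2)
```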
Adrian Baddeley.
Braun, H. (1980) A simple method for testing goodness-of-fit in the presence of nuisance parameters. Journal of the Royal Statistical Society, Series B 42, 53–63.
Csorgo, S. and Faraway, J.J. (1996) The exact and asymptotic distributions of Cramer-von Mises statistics. Journal of the Royal Statistical Society, Series B 58, 221–234.
pCvM for the null distribution of the test statistic.
    x <- rnorm(10, mean=2, sd=1)
    cvm.test(x, "pnorm", mean=2, sd=1)
    cvm.test(x, "pnorm", mean=mean(x), sd=sd(x), estimated=TRUE)
pAD computes the cumulative distribution function, and qAD computes the quantile function, of the null distribution of the Anderson-Darling test statistic.
    pAD(q, n = Inf, lower.tail = TRUE, fast=TRUE)
    qAD(p, n = Inf, lower.tail = TRUE, fast=TRUE)
q | Numeric vector of quantiles (values for which the cumulative probability is required). |
p | Numeric vector of probabilities. |
n | Integer. Sample size for the Anderson-Darling test. |
lower.tail | Logical. If |
fast | Logical value indicating whether to use a fast algorithm or a slower, more accurate algorithm, in the case n=Inf. |
pAD uses the algorithms and C code described in Marsaglia and Marsaglia (2004). qAD uses uniroot to find the quantiles.
The argument fast applies only when n=Inf and determines whether the asymptotic distribution is approximated using the faster algorithm adinf (accurate to 4-5 places) or the slower algorithm ADinf (accurate to 11 places) described in Marsaglia and Marsaglia (2004).
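The root-finding approach to quantiles can be illustrated in base R. The sketch below inverts a cumulative distribution function with uniroot, using the exponential distribution as a stand-in so that the answer can be compared with the exact quantile function. The helper quantile_by_root, the search interval and the tolerance are all assumptions made for this illustration and do not reflect the internals of qAD.

```r
# Invert a CDF numerically: find q such that cdf(q) = p.
# quantile_by_root is a hypothetical helper, not a goftest function.
quantile_by_root <- function(p, cdf, lower, upper, ...) {
  vapply(p, function(prob) {
    uniroot(function(q) cdf(q, ...) - prob,
            lower = lower, upper = upper, tol = 1e-10)$root
  }, numeric(1))
}

p <- c(0.1, 0.5, 0.9)
approx_q <- quantile_by_root(p, pexp, lower = 0, upper = 100, rate = 1)
exact_q  <- qexp(p, rate = 1)

# Largest discrepancy between the numerical and exact quantiles
max(abs(approx_q - exact_q))
```

The search interval must bracket the root, which is why the sketch requires explicit lower and upper bounds.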
A numeric vector of the same length as p or q.
Original C code by G. and J. Marsaglia. R interface by Adrian Baddeley.
Anderson, T.W. and Darling, D.A. (1952) Asymptotic theory of certain 'goodness-of-fit' criteria based on stochastic processes. Annals of Mathematical Statistics 23, 193–212.
Anderson, T.W. and Darling, D.A. (1954) A test of goodness of fit. Journal of the American Statistical Association 49, 765–769.
Marsaglia, G. and Marsaglia, J. (2004) Evaluating the Anderson-Darling Distribution. Journal of Statistical Software 9 (2), 1–5. February 2004. http://www.jstatsoft.org/v09/i02
    pAD(1.1, n=5)
    pAD(1.1)
    pAD(1.1, fast=FALSE)
    qAD(0.5, n=5)
    qAD(0.5)
pCvM computes the cumulative distribution function, and qCvM computes the quantile function, of the null distribution of the Cramer-von Mises test statistic.
    pCvM(q, n = Inf, lower.tail = TRUE)
    qCvM(p, n = Inf, lower.tail = TRUE)
q | Numeric vector of quantiles (values for which the cumulative probability is required). |
p | Numeric vector of probabilities. |
n | Integer. Sample size for the Cramer-von Mises test. |
lower.tail | Logical. If |
For finite n the cumulative distribution function is approximated by the first order expansion, equation (1.8) of Csorgo and Faraway (1996).
qCvM uses uniroot to find the quantiles.
A numeric vector of the same length as p or q.
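Since qCvM is obtained from pCvM by root-finding, a simple consistency check is to pass the computed quantiles back through the cumulative distribution function. The sketch below assumes goftest is installed; the probabilities and the sample size n = 20 are arbitrary choices, and the size of the residuals depends on the tolerance of the root-finder.

```r
library(goftest)

p <- c(0.1, 0.5, 0.9)

q_inf <- qCvM(p)           # asymptotic null distribution (n = Inf)
q_20  <- qCvM(p, n = 20)   # finite-sample approximation

# Feeding the quantiles back into the CDF should approximately recover p
err_inf <- pCvM(q_inf) - p
err_20  <- pCvM(q_20, n = 20) - p

err_inf
err_20
```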
Original Matlab code by Julian Faraway, translated to R by Adrian Baddeley.
Csorgo, S. and Faraway, J.J. (1996) The exact and asymptotic distributions of Cramer-von Mises statistics. Journal of the Royal Statistical Society, Series B 58, 221–234.
    pCvM(1.1, n=5)
    pCvM(1.1)
    qCvM(0.5, n=5)
    qCvM(0.5)
Recognises many standard cumulative distribution functions and returns a string describing the distribution.
recogniseCdf(s="punif")
s | A single character string giving the name of an R function that calculates cumulative probabilities. |
The list of recognised distribution functions includes all those available in the stats package and in goftest.
By convention, the name of a cumulative distribution function begins with the letter p. For example, punif is the cumulative distribution function of the uniform distribution. The initial letter p can be omitted in the function recogniseCdf.
Character string, or NULL if the name is not recognised.
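Because the result may be NULL, calling code usually needs a fallback. The sketch below wraps recogniseCdf in a hypothetical helper, describe_null, which is not part of goftest; the fallback wording and the function name "my_custom_cdf" are assumptions chosen for the example.

```r
library(goftest)

# describe_null is a hypothetical helper, not a goftest function.
# It returns the recognised description, or a generic label otherwise.
describe_null <- function(s) {
  d <- recogniseCdf(s)
  if (is.null(d)) paste("distribution", sQuote(s)) else d
}

describe_null("pnorm")           # a recognised name from the stats package
describe_null("my_custom_cdf")   # an unrecognised name falls back to the label
```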
Adrian Baddeley.
    recogniseCdf("punif")
    recogniseCdf("unif")
    recogniseCdf("pt")