Heavy-tailed distribution

Source: Wikipedia, the free encyclopedia.

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:[1] that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.

There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions, and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class, introduced by Jozef Teugels.[2]

There is still some discrepancy over the use of the term heavy-tailed. There are two other definitions in use. Some authors use the term to refer to those distributions which do not have all their power moments finite, and some others to those distributions that do not have a finite variance. The definition given in this article is the most general in use, and includes all distributions encompassed by the alternative definitions, as well as those distributions such as the log-normal that possess all their power moments, yet which are generally considered to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)

Definitions

Definition of heavy-tailed distribution

The distribution of a random variable X with distribution function F is said to have a heavy (right) tail if the moment generating function of X, M_X(t), is infinite for all t > 0.[3]

That means

\[ \int_{-\infty}^{\infty} e^{tx}\, dF(x) = \infty \quad \text{for all } t > 0. \][4]

This is also written in terms of the tail distribution function

\[ \overline{F}(x) \equiv \Pr[X > x] \]

as

\[ \lim_{x \to \infty} e^{tx}\, \overline{F}(x) = \infty \quad \text{for all } t > 0. \]
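As a quick numerical illustration of this definition (a minimal sketch, not drawn from the cited sources; the parameter choices are assumptions of the example), one can evaluate e^{tx} F̄(x) for a light-tailed and a heavy-tailed law: for the exponential distribution with rate 1 the product tends to 0 whenever t < 1, so its moment generating function is finite there, whereas for the Pareto distribution it diverges for every t > 0.

```python
# Minimal sketch (assumes SciPy) comparing e^{t x} * F_bar(x)
# for a light-tailed exponential and a heavy-tailed Pareto distribution.
import numpy as np
from scipy.stats import expon, pareto

t = 0.5
x = np.array([5.0, 20.0, 50.0, 100.0])

# Exponential with rate 1: e^{t x} * exp(-x) -> 0 for t < 1.
print(np.exp(t * x) * expon.sf(x))

# Pareto with tail index 2: e^{t x} * x^{-2} -> infinity for every t > 0.
print(np.exp(t * x) * pareto.sf(x, b=2))
```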

Definition of long-tailed distribution

The distribution of a random variable X with distribution function F is said to have a long right tail[1] if for all t > 0,

\[ \lim_{x \to \infty} \Pr[X > x + t \mid X > x] = 1, \]

or equivalently

\[ \overline{F}(x + t) \sim \overline{F}(x) \quad \text{as } x \to \infty. \]
This has the intuitive interpretation, for a right-tailed long-tailed quantity, that if the quantity exceeds some high level, the probability approaches 1 that it will exceed any other higher level.

All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
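A quick numerical check of the defining limit (a sketch with assumed parameters, not taken from the cited references): the conditional exceedance probability Pr[X > x + t | X > x] approaches 1 as x grows for a Pareto law, but stays fixed at e^{−t} < 1 for the memoryless (and hence not long-tailed) exponential distribution.

```python
# Sketch: conditional exceedance probability Pr[X > x + t | X > x] for large x.
import numpy as np
from scipy.stats import expon, pareto

t = 10.0
for x in [10.0, 100.0, 1000.0]:
    p_pareto = pareto.sf(x + t, b=2) / pareto.sf(x, b=2)     # -> 1 as x grows
    p_expon = np.exp(expon.logsf(x + t) - expon.logsf(x))    # = exp(-t) for every x
    print(x, round(p_pareto, 3), round(p_expon, 6))
```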

Subexponential distributions

Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables X_1, X_2 with a common distribution function F, the convolution of F with itself, written F^{*2} and called the convolution square, is defined using Lebesgue–Stieltjes integration by:

\[ \Pr[X_1 + X_2 \le x] = F^{*2}(x) = \int_0^x F(x - y)\, dF(y), \]

and the n-fold convolution F^{*n} is defined inductively by the rule:

\[ F^{*n}(x) = \int_0^x F^{*(n-1)}(x - y)\, dF(y). \]

The tail distribution function is defined as F̄(x) = 1 − F(x).

A distribution F on the positive half-line is subexponential[1][5][2] if

\[ \overline{F^{*2}}(x) \sim 2\, \overline{F}(x) \quad \text{as } x \to \infty. \]

This implies[6] that, for any n ≥ 1,

\[ \overline{F^{*n}}(x) \sim n\, \overline{F}(x) \quad \text{as } x \to \infty. \]

The probabilistic interpretation[6] of this is that, for a sum of n independent random variables X_1, …, X_n with common distribution F,

\[ \Pr[X_1 + \cdots + X_n > x] \sim \Pr[\max(X_1, \ldots, X_n) > x] \quad \text{as } x \to \infty. \]
This is often known as the principle of the single big jump[7] or catastrophe principle.[8]

A distribution F on the whole real line is subexponential if the distribution F I([0, ∞)) is.[9] Here I([0, ∞)) is the indicator function of the positive half-line. Alternatively, a random variable X supported on the real line is subexponential if and only if X^+ = max(0, X) is subexponential.

All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
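The principle of the single big jump mentioned above can be checked by simulation. The following sketch (an illustration with assumed parameter values, not code from the cited references) compares Pr[X_1 + X_2 > x], Pr[max(X_1, X_2) > x] and 2 F̄(x) for a standard Pareto distribution:

```python
# Monte Carlo sketch of the single-big-jump principle for a standard Pareto law.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
alpha = 1.5
x1 = rng.pareto(alpha, n) + 1.0        # standard Pareto samples, tail index alpha
x2 = rng.pareto(alpha, n) + 1.0

threshold = 50.0
p_sum = np.mean(x1 + x2 > threshold)               # Pr[X1 + X2 > x]
p_max = np.mean(np.maximum(x1, x2) > threshold)    # Pr[max(X1, X2) > x]
p_tail = 2.0 * threshold ** (-alpha)               # 2 * F_bar(x) for the standard Pareto

print(p_sum, p_max, p_tail)                        # the three values should be close
```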

Common heavy-tailed distributions

All commonly used heavy-tailed distributions are subexponential.[6]

Those that are one-tailed include:

- the Pareto distribution;
- the log-normal distribution;
- the Lévy distribution;
- the Weibull distribution with shape parameter greater than 0 but less than 1;
- the Burr distribution;
- the log-logistic distribution;
- the log-gamma distribution;
- the Fréchet distribution;
- the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay, producing a heavier tail than the Pareto distribution.[10][11]

Those that are two-tailed include:

- the Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
- the family of stable distributions,[12] excepting the special case of the normal distribution within that family; some stable distributions are one-sided (or supported by a half-line), see e.g. the Lévy distribution;
- the t-distribution;
- the skew lognormal cascade distribution.[13]

Relationship to fat-tailed distributions

A fat-tailed distribution is a distribution for which the probability density function, for large x, goes to zero as a power x^{−a}. Since such a power is always bounded below by the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero slower than an exponential function (meaning they are heavy-tailed), but faster than a power (meaning they are not fat-tailed). An example is the log-normal distribution. Many other heavy-tailed distributions, such as the log-logistic and Pareto distributions, are, however, also fat-tailed.
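The distinction can be seen numerically. In the sketch below (the parameter values are assumptions of this example), the logarithms of the survival functions show the exponential decaying linearly in x, the log-normal decaying roughly like −(ln x)²/2 (heavy but not fat), and the Pareto decaying like −2 ln x (fat):

```python
# Sketch: log survival functions of a light, a heavy-but-not-fat and a fat tail.
import numpy as np
from scipy.stats import expon, lognorm, pareto

x = np.array([10.0, 100.0, 1000.0])
print(expon.logsf(x))           # ~ -x              : exponential (light) tail
print(lognorm.logsf(x, s=1.0))  # ~ -(ln x)^2 / 2   : log-normal, heavy but not fat
print(pareto.logsf(x, b=2.0))   # = -2 ln x         : Pareto, fat (power-law) tail
```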

Estimating the tail-index

There are parametric[6] and non-parametric[14] approaches to the problem of tail-index estimation.

To estimate the tail-index using the parametric approach, some authors employ the GEV distribution or the Pareto distribution; they may apply the maximum-likelihood estimator (MLE).
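For example, with SciPy (an illustrative sketch; fixing the location and scale of the Pareto family is an assumption of this example) the shape parameter can be fitted by maximum likelihood:

```python
# Sketch: maximum-likelihood fit of a Pareto shape parameter with SciPy.
import numpy as np
from scipy.stats import pareto

rng = np.random.default_rng(0)
sample = rng.pareto(3.0, size=10_000) + 1.0        # standard Pareto data, alpha = 3

# Fix location and scale so that only the shape (the tail index alpha) is estimated.
alpha_hat, loc, scale = pareto.fit(sample, floc=0.0, fscale=1.0)
print(alpha_hat)                                   # typically close to 3
```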

Pickands' tail-index estimator

With (X_n, n ≥ 1) a random sequence of independent and identically distributed random variables with common distribution function F ∈ D(H(ξ)), where D(H(ξ)) denotes the maximum domain of attraction[15] of the generalized extreme value distribution H, ξ ∈ ℝ. If k(n) → ∞ and k(n)/n → 0 as n → ∞, then the Pickands tail-index estimator is[6][15]

\[ \hat{\xi}^{\text{Pickands}}_{(k(n),n)} = \frac{1}{\ln 2}\, \ln\!\left( \frac{X_{(n-k(n)+1,\,n)} - X_{(n-2k(n)+1,\,n)}}{X_{(n-2k(n)+1,\,n)} - X_{(n-4k(n)+1,\,n)}} \right), \]

where X_{(i,n)} denotes the i-th order statistic of X_1, …, X_n. This estimator converges in probability to ξ.
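A minimal Python sketch of this estimator (the function name and the simulated Pareto example are illustrative assumptions, not from the cited sources):

```python
# Minimal sketch of the Pickands tail-index estimator (assumes NumPy).
import numpy as np

def pickands_estimator(x, k):
    """Pickands estimate of xi from the k-th, 2k-th and 4k-th largest
    observations; requires 4*k <= len(x)."""
    x = np.sort(np.asarray(x))     # ascending order statistics X_{(1,n)} <= ... <= X_{(n,n)}
    n = x.size
    if 4 * k > n:
        raise ValueError("need 4*k <= n")
    a = x[n - k]                   # X_{(n-k+1,n)}, the k-th largest value
    b = x[n - 2 * k]               # X_{(n-2k+1,n)}
    c = x[n - 4 * k]               # X_{(n-4k+1,n)}
    return np.log((a - b) / (b - c)) / np.log(2.0)

# Illustrative check on simulated standard Pareto data with alpha = 2 (xi = 0.5).
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0
print(pickands_estimator(sample, k=500))   # typically close to 0.5
```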

Hill's tail-index estimator

Let (X_t, t ≥ 1) be a sequence of independent and identically distributed random variables with distribution function F ∈ D(H(ξ)), the maximum domain of attraction of the generalized extreme value distribution H, where ξ ∈ ℝ. The sample path is {X_t : 1 ≤ t ≤ n}, where n is the sample size. If {k(n)} is an intermediate order sequence, i.e. k(n) ∈ {1, …, n − 1}, k(n) → ∞ and k(n)/n → 0 as n → ∞, then the Hill tail-index estimator is[16]

\[ \hat{\xi}^{\text{Hill}}_{(k(n),n)} = \frac{1}{k(n)} \sum_{i=n-k(n)+1}^{n} \left( \ln X_{(i,n)} - \ln X_{(n-k(n),\,n)} \right), \]

where X_{(i,n)} is the i-th order statistic of X_1, …, X_n. This estimator converges in probability to ξ, and is asymptotically normal provided k(n) is restricted based on a higher-order regular variation property.[17][18] Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences,[19][20] irrespective of whether X_t is observed, or is a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with errors that are dependent.[21][22][23] Note that both Pickands' and Hill's tail-index estimators commonly make use of logarithms of the order statistics.[24]
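A minimal Python sketch of the Hill estimator (illustrative only; the function name and the simulated data are assumptions of this example, and positive observations are required):

```python
# Minimal sketch of the Hill tail-index estimator (assumes NumPy and positive data).
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of xi: mean log-excess of the k largest observations
    over the (k+1)-th largest one."""
    x = np.sort(np.asarray(x))     # ascending order statistics
    n = x.size
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    top = x[n - k:]                # X_{(n-k+1,n)}, ..., X_{(n,n)}
    threshold = x[n - k - 1]       # X_{(n-k,n)}
    return float(np.mean(np.log(top) - np.log(threshold)))

# Illustrative check on simulated standard Pareto data with alpha = 2 (xi = 0.5).
rng = np.random.default_rng(1)
sample = rng.pareto(2.0, size=100_000) + 1.0
print(hill_estimator(sample, k=1000))      # typically close to 0.5
```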

Ratio estimator of the tail-index

The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith.[25] It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter".

A comparison of Hill-type and RE-type estimators can be found in Novak.[14]

Software

Estimation of heavy-tailed density

Nonparametric approaches to estimating heavy- and superheavy-tailed probability density functions were given in Markovich.[27] These are approaches based on variable bandwidth and long-tailed kernel estimators; on a preliminary transform of the data to a new random variable at finite or infinite intervals, which is more convenient for the estimation, followed by an inverse transform of the obtained density estimate; and on a "piecing-together approach", which provides a parametric model for the tail of the density and a nonparametric model to approximate the mode of the density.

Nonparametric estimators require an appropriate selection of tuning (smoothing) parameters, such as the bandwidth of kernel estimators and the bin width of the histogram. Well-known data-driven methods for such selection are cross-validation and its modifications, and methods based on the minimization of the mean squared error (MSE), its asymptotics and their upper bounds.[28] A discrepancy method, which uses well-known nonparametric statistics such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics as a metric in the space of distribution functions (dfs), and quantiles of the latter statistics as a known uncertainty or discrepancy value, can be found in Markovich.[27] Bootstrap is another tool to find smoothing parameters, using approximations of the unknown MSE by different schemes of re-sample selection; see e.g. [29].
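As a rough illustration of the transform-based idea (a sketch only, not Markovich's specific estimators; the log transform and the Gaussian kernel are assumptions of this example), one can estimate the density of log-transformed data with a standard kernel estimator and map the estimate back with the Jacobian of the transform:

```python
# Sketch: kernel density estimation of a heavy-tailed density via a log transform.
import numpy as np
from scipy.stats import gaussian_kde

def transformed_kde(data, grid):
    """Estimate a density on (0, inf) by applying a Gaussian KDE to log(data)
    and transforming back: f_X(x) = f_Y(log x) / x."""
    kde = gaussian_kde(np.log(data))       # bandwidth chosen by Scott's rule
    return kde(np.log(grid)) / grid

# Illustrative use on simulated standard Pareto data.
rng = np.random.default_rng(2)
sample = rng.pareto(1.5, size=50_000) + 1.0
xs = np.logspace(0.01, 3, 200)             # evaluation grid on roughly (1, 1000]
density = transformed_kde(sample, xs)
```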

See also

References

  1. ^ .
  2. ^ . Retrieved April 7, 2019.
  3. ^ Rolski, Schmidli, Schmidt, Teugels, Stochastic Processes for Insurance and Finance, 1999.
  4. ^ S. Foss, D. Korshunov, S. Zachary, An Introduction to Heavy-Tailed and Subexponential Distributions, Springer Science & Business Media, 21 May 2013
  5. ^ Chistyakov, V. P. (1964). "A Theorem on Sums of Independent Positive Random Variables and Its Applications to Branching Random Processes". ResearchGate. Retrieved April 7, 2019.
  6. ^ .
  7. .
  8. ^ Wierman, Adam (January 9, 2014). "Catastrophes, Conspiracies, and Subexponential Distributions (Part III)". Rigor + Relevance blog. RSRG, Caltech. Retrieved January 9, 2014.
  9. ^ Willekens, E. (1986). "Subexponentiality on the real line". Technical Report. K.U. Leuven.
  10. ^ ISBN 978-3-0348-0008-2.
  11. ^ Alves, M.I.F., de Haan, L. & Neves, C. (March 10, 2006). "Statistical inference for heavy and super-heavy tailed distributions" (PDF). Archived from the original (PDF) on June 23, 2007. Retrieved November 1, 2011.
  12. ^ John P. Nolan (2009). "Stable Distributions: Models for Heavy Tailed Data" (PDF). Archived from the original (PDF) on 2011-07-17. Retrieved 2009-02-21.
  13. ^ Stephen Lihn (2009). "Skew Lognormal Cascade Distribution". Archived from the original on 2014-04-07. Retrieved 2009-06-12.
  14. ^ .
  15. ^ .
  16. ^ Hill B.M. (1975) A simple general approach to inference about the tail of a distribution. Ann. Stat., v. 3, 1163–1174.
  17. ^ Hall, P.(1982) On some estimates of an exponent of regular variation. J. R. Stat. Soc. Ser. B., v. 44, 37–42.
  18. ^ Haeusler, E. and J. L. Teugels (1985) On asymptotic normality of Hill's estimator for the exponent of regular variation. Ann. Stat., v. 13, 743–756.
  19. ^ Hsing, T. (1991) On tail index estimation using dependent data. Ann. Stat., v. 19, 1547–1569.
  20. ^ Hill, J. (2010) On tail index estimation for dependent, heterogeneous data. Econometric Th., v. 26, 1398–1436.
  21. ^ Resnick, S. and Starica, C. (1997). Asymptotic behavior of Hill’s estimator for autoregressive data. Comm. Statist. Stochastic Models 13, 703–721.
  22. ^ Ling, S. and Peng, L. (2004). Hill’s estimator for the tail index of an ARMA model. J. Statist. Plann. Inference 123, 279–293.
  23. ^ Hill, J. B. (2015). Tail index estimation for a filtered dependent time series. Stat. Sin. 25, 609–630.
  24. ^ S2CID 88514574.
  25. ^ Goldie C.M., Smith R.L. (1987) Slow variation with remainder: theory and applications. Quart. J. Math. Oxford, v. 38, 45–71.
  26. ^ S2CID 8917289.
  27. ^ .
  28. .
  29. .