Heavy-tailed distribution

Source: Wikipedia, the free encyclopedia.

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:[1] that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.

There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions, and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class, introduced by Jozef Teugels.[2]

There is still some discrepancy over the use of the term heavy-tailed. There are two other definitions in use. Some authors use the term to refer to those distributions which do not have all their power moments finite, and some others to those distributions that do not have a finite variance. The definition given in this article is the most general in use, and includes all distributions encompassed by the alternative definitions, as well as those distributions such as the log-normal that possess all their power moments, yet which are generally considered to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)

Definitions

Definition of heavy-tailed distribution

The distribution of a random variable X with distribution function F is said to have a heavy (right) tail if the moment generating function of X, M_X(t), is infinite for all t > 0.[3]

That means

\[ \int_{-\infty}^{\infty} e^{tx}\, dF(x) = \infty \quad \text{for all } t > 0. \][4]

This is also written in terms of the tail distribution function

\[ \overline{F}(x) \equiv \Pr[X > x] \]

as

\[ \lim_{x \to \infty} e^{tx}\, \overline{F}(x) = \infty \quad \text{for all } t > 0. \]
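As a quick numerical illustration of this definition (a minimal sketch, not drawn from the cited sources; the parameter choices are assumptions of the example), one can evaluate e^{tx} F̄(x) for a light-tailed and a heavy-tailed law: for the exponential distribution with rate 1 the product tends to 0 whenever t < 1, so its moment generating function is finite there, whereas for the Pareto distribution it diverges for every t > 0.

```python
# Minimal sketch (assumes SciPy) comparing e^{t x} * F_bar(x)
# for a light-tailed exponential and a heavy-tailed Pareto distribution.
import numpy as np
from scipy.stats import expon, pareto

t = 0.5
x = np.array([5.0, 20.0, 50.0, 100.0])

# Exponential with rate 1: e^{t x} * exp(-x) -> 0 for t < 1.
print(np.exp(t * x) * expon.sf(x))

# Pareto with tail index 2: e^{t x} * x^{-2} -> infinity for every t > 0.
print(np.exp(t * x) * pareto.sf(x, b=2))
```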

Definition of long-tailed distribution

The distribution of a random variable X with distribution function F is said to have a long right tail[1] if for all t > 0,

\[ \lim_{x \to \infty} \Pr[X > x + t \mid X > x] = 1, \]

or equivalently

\[ \overline{F}(x + t) \sim \overline{F}(x) \quad \text{as } x \to \infty. \]
This has the intuitive interpretation, for a right-tailed long-tailed quantity, that if the quantity exceeds some high level, the probability approaches 1 that it will exceed any other higher level.

All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
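A quick numerical check of the defining limit (a sketch with assumed parameters, not taken from the cited references): the conditional exceedance probability Pr[X > x + t | X > x] approaches 1 as x grows for a Pareto law, but stays fixed at e^{−t} < 1 for the memoryless (and hence not long-tailed) exponential distribution.

```python
# Sketch: conditional exceedance probability Pr[X > x + t | X > x] for large x.
import numpy as np
from scipy.stats import expon, pareto

t = 10.0
for x in [10.0, 100.0, 1000.0]:
    p_pareto = pareto.sf(x + t, b=2) / pareto.sf(x, b=2)     # -> 1 as x grows
    p_expon = np.exp(expon.logsf(x + t) - expon.logsf(x))    # = exp(-t) for every x
    print(x, round(p_pareto, 3), round(p_expon, 6))
```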

Subexponential distributions

Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables X_1, X_2 with a common distribution function F, the convolution of F with itself, written F^{*2} and called the convolution square, is defined using Lebesgue–Stieltjes integration by:

\[ \Pr[X_1 + X_2 \le x] = F^{*2}(x) = \int_0^x F(x - y)\, dF(y), \]

and the n-fold convolution F^{*n} is defined inductively by the rule:

\[ F^{*n}(x) = \int_0^x F^{*(n-1)}(x - y)\, dF(y). \]

The tail distribution function is defined as F̄(x) = 1 − F(x).

A distribution F on the positive half-line is subexponential[1][5][2] if

\[ \overline{F^{*2}}(x) \sim 2\, \overline{F}(x) \quad \text{as } x \to \infty. \]

This implies[6] that, for any n ≥ 1,

\[ \overline{F^{*n}}(x) \sim n\, \overline{F}(x) \quad \text{as } x \to \infty. \]

The probabilistic interpretation[6] of this is that, for a sum of n independent random variables X_1, …, X_n with common distribution F,

\[ \Pr[X_1 + \cdots + X_n > x] \sim \Pr[\max(X_1, \ldots, X_n) > x] \quad \text{as } x \to \infty. \]
This is often known as the principle of the single big jump[7] or catastrophe principle.[8]

A distribution F on the whole real line is subexponential if the distribution F I([0, ∞)) is.[9] Here I([0, ∞)) is the indicator function of the positive half-line. Alternatively, a random variable X supported on the real line is subexponential if and only if X^+ = max(0, X) is subexponential.

All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
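The principle of the single big jump mentioned above can be checked by simulation. The following sketch (an illustration with assumed parameter values, not code from the cited references) compares Pr[X_1 + X_2 > x], Pr[max(X_1, X_2) > x] and 2 F̄(x) for a standard Pareto distribution:

```python
# Monte Carlo sketch of the single-big-jump principle for a standard Pareto law.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
alpha = 1.5
x1 = rng.pareto(alpha, n) + 1.0        # standard Pareto samples, tail index alpha
x2 = rng.pareto(alpha, n) + 1.0

threshold = 50.0
p_sum = np.mean(x1 + x2 > threshold)               # Pr[X1 + X2 > x]
p_max = np.mean(np.maximum(x1, x2) > threshold)    # Pr[max(X1, X2) > x]
p_tail = 2.0 * threshold ** (-alpha)               # 2 * F_bar(x) for the standard Pareto

print(p_sum, p_max, p_tail)                        # the three values should be close
```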

Common heavy-tailed distributions

All commonly used heavy-tailed distributions are subexponential.[6]

Those that are one-tailed include:

- the Pareto distribution;
- the log-normal distribution;
- the Lévy distribution;
- the Weibull distribution with shape parameter greater than 0 but less than 1;
- the Burr distribution;
- the log-logistic distribution;
- the log-gamma distribution;
- the Fréchet distribution;
- the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay, producing a heavier tail than the Pareto distribution.[10][11]

Those that are two-tailed include:

- the Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
- the family of stable distributions,[12] excepting the special case of the normal distribution within that family; some stable distributions are one-sided (or supported by a half-line), see e.g. the Lévy distribution;
- the t-distribution;
- the skew lognormal cascade distribution.[13]

Relationship to fat-tailed distributions

A fat-tailed distribution is a distribution for which the probability density function, for large x, goes to zero as a power x^{−a}. Since such a power is always bounded below by the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero slower than an exponential function (meaning they are heavy-tailed), but faster than a power (meaning they are not fat-tailed). An example is the log-normal distribution. Many other heavy-tailed distributions, such as the log-logistic and Pareto distributions, are, however, also fat-tailed.
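The distinction can be seen numerically. In the sketch below (the parameter values are assumptions of this example), the logarithms of the survival functions show the exponential decaying linearly in x, the log-normal decaying roughly like −(ln x)²/2 (heavy but not fat), and the Pareto decaying like −2 ln x (fat):

```python
# Sketch: log survival functions of a light, a heavy-but-not-fat and a fat tail.
import numpy as np
from scipy.stats import expon, lognorm, pareto

x = np.array([10.0, 100.0, 1000.0])
print(expon.logsf(x))           # ~ -x              : exponential (light) tail
print(lognorm.logsf(x, s=1.0))  # ~ -(ln x)^2 / 2   : log-normal, heavy but not fat
print(pareto.logsf(x, b=2.0))   # = -2 ln x         : Pareto, fat (power-law) tail
```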

Estimating the tail-index

There are parametric[6] and non-parametric[14] approaches to the problem of tail-index estimation.

To estimate the tail-index using the parametric approach, some authors employ the GEV distribution or the Pareto distribution; they may apply the maximum-likelihood estimator (MLE).
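For example, with SciPy (an illustrative sketch; fixing the location and scale of the Pareto family is an assumption of this example) the shape parameter can be fitted by maximum likelihood:

```python
# Sketch: maximum-likelihood fit of a Pareto shape parameter with SciPy.
import numpy as np
from scipy.stats import pareto

rng = np.random.default_rng(0)
sample = rng.pareto(3.0, size=10_000) + 1.0        # standard Pareto data, alpha = 3

# Fix location and scale so that only the shape (the tail index alpha) is estimated.
alpha_hat, loc, scale = pareto.fit(sample, floc=0.0, fscale=1.0)
print(alpha_hat)                                   # typically close to 3
```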

Pickands' tail-index estimator

With (X_n, n ≥ 1) a random sequence of independent and identically distributed random variables with common distribution function F ∈ D(H(ξ)), where D(H(ξ)) denotes the maximum domain of attraction[15] of the generalized extreme value distribution H, ξ ∈ ℝ. If k(n) → ∞ and k(n)/n → 0 as n → ∞, then the Pickands tail-index estimator is[6][15]

\[ \hat{\xi}^{\text{Pickands}}_{(k(n),n)} = \frac{1}{\ln 2}\, \ln\!\left( \frac{X_{(n-k(n)+1,\,n)} - X_{(n-2k(n)+1,\,n)}}{X_{(n-2k(n)+1,\,n)} - X_{(n-4k(n)+1,\,n)}} \right), \]

where X_{(i,n)} denotes the i-th order statistic of X_1, …, X_n. This estimator converges in probability to ξ.
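A minimal Python sketch of this estimator (the function name and the simulated Pareto example are illustrative assumptions, not from the cited sources):

```python
# Minimal sketch of the Pickands tail-index estimator (assumes NumPy).
import numpy as np

def pickands_estimator(x, k):
    """Pickands estimate of xi from the k-th, 2k-th and 4k-th largest
    observations; requires 4*k <= len(x)."""
    x = np.sort(np.asarray(x))     # ascending order statistics X_{(1,n)} <= ... <= X_{(n,n)}
    n = x.size
    if 4 * k > n:
        raise ValueError("need 4*k <= n")
    a = x[n - k]                   # X_{(n-k+1,n)}, the k-th largest value
    b = x[n - 2 * k]               # X_{(n-2k+1,n)}
    c = x[n - 4 * k]               # X_{(n-4k+1,n)}
    return np.log((a - b) / (b - c)) / np.log(2.0)

# Illustrative check on simulated standard Pareto data with alpha = 2 (xi = 0.5).
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0
print(pickands_estimator(sample, k=500))   # typically close to 0.5
```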

Hill's tail-index estimator

Let (X_t, t ≥ 1) be a sequence of independent and identically distributed random variables with distribution function F ∈ D(H(ξ)), the maximum domain of attraction of the generalized extreme value distribution H, where ξ ∈ ℝ. The sample path is {X_t : 1 ≤ t ≤ n}, where n is the sample size. If {k(n)} is an intermediate order sequence, i.e. k(n) ∈ {1, …, n − 1}, k(n) → ∞ and k(n)/n → 0 as n → ∞, then the Hill tail-index estimator is[16]

\[ \hat{\xi}^{\text{Hill}}_{(k(n),n)} = \frac{1}{k(n)} \sum_{i=n-k(n)+1}^{n} \left( \ln X_{(i,n)} - \ln X_{(n-k(n),\,n)} \right), \]

where X_{(i,n)} is the i-th order statistic of X_1, …, X_n. This estimator converges in probability to ξ, and is asymptotically normal provided k(n) is restricted based on a higher-order regular variation property.[17][18] Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences,[19][20] irrespective of whether X_t is observed, or is a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with errors that are dependent.[21][22][23] Note that both Pickands' and Hill's tail-index estimators commonly make use of logarithms of the order statistics.[24]
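A minimal Python sketch of the Hill estimator (illustrative only; the function name and the simulated data are assumptions of this example, and positive observations are required):

```python
# Minimal sketch of the Hill tail-index estimator (assumes NumPy and positive data).
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of xi: mean log-excess of the k largest observations
    over the (k+1)-th largest one."""
    x = np.sort(np.asarray(x))     # ascending order statistics
    n = x.size
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    top = x[n - k:]                # X_{(n-k+1,n)}, ..., X_{(n,n)}
    threshold = x[n - k - 1]       # X_{(n-k,n)}
    return float(np.mean(np.log(top) - np.log(threshold)))

# Illustrative check on simulated standard Pareto data with alpha = 2 (xi = 0.5).
rng = np.random.default_rng(1)
sample = rng.pareto(2.0, size=100_000) + 1.0
print(hill_estimator(sample, k=1000))      # typically close to 0.5
```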

Ratio estimator of the tail-index

The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith.[25] It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter".

A comparison of Hill-type and RE-type estimators can be found in Novak.[14]

Software

Estimation of heavy-tailed density

Nonparametric approaches to estimating heavy- and superheavy-tailed probability density functions were given in Markovich.[27] These are approaches based on variable bandwidth and long-tailed kernel estimators; on a preliminary transform of the data to a new random variable at finite or infinite intervals, which is more convenient for the estimation, followed by an inverse transform of the obtained density estimate; and on a "piecing-together approach", which provides a parametric model for the tail of the density and a nonparametric model to approximate the mode of the density.

Nonparametric estimators require an appropriate selection of tuning (smoothing) parameters, such as the bandwidth of kernel estimators and the bin width of the histogram. Well-known data-driven methods for such selection are cross-validation and its modifications, and methods based on the minimization of the mean squared error (MSE), its asymptotics and their upper bounds.[28] A discrepancy method, which uses well-known nonparametric statistics such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics as a metric in the space of distribution functions (dfs), and quantiles of the latter statistics as a known uncertainty or discrepancy value, can be found in Markovich.[27] Bootstrap is another tool to find smoothing parameters, using approximations of the unknown MSE by different schemes of re-sample selection; see e.g. [29].
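As a rough illustration of the transform-based idea (a sketch only, not Markovich's specific estimators; the log transform and the Gaussian kernel are assumptions of this example), one can estimate the density of log-transformed data with a standard kernel estimator and map the estimate back with the Jacobian of the transform:

```python
# Sketch: kernel density estimation of a heavy-tailed density via a log transform.
import numpy as np
from scipy.stats import gaussian_kde

def transformed_kde(data, grid):
    """Estimate a density on (0, inf) by applying a Gaussian KDE to log(data)
    and transforming back: f_X(x) = f_Y(log x) / x."""
    kde = gaussian_kde(np.log(data))       # bandwidth chosen by Scott's rule
    return kde(np.log(grid)) / grid

# Illustrative use on simulated standard Pareto data.
rng = np.random.default_rng(2)
sample = rng.pareto(1.5, size=50_000) + 1.0
xs = np.logspace(0.01, 3, 200)             # evaluation grid on roughly (1, 1000]
density = transformed_kde(sample, xs)
```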

See also

References

  1. ^ .
  2. ^ . Retrieved April 7, 2019.
  3. ^ Rolski, Schmidli, Schmidt, Teugels, Stochastic Processes for Insurance and Finance, 1999.
  4. ^ S. Foss, D. Korshunov, S. Zachary, An Introduction to Heavy-Tailed and Subexponential Distributions, Springer Science & Business Media, 21 May 2013
  5. ^ Chistyakov, V. P. (1964). "A Theorem on Sums of Independent Positive Random Variables and Its Applications to Branching Random Processes". ResearchGate. Retrieved April 7, 2019.
  6. ^ .
  7. .
  8. ^ Wierman, Adam (January 9, 2014). "Catastrophes, Conspiracies, and Subexponential Distributions (Part III)". Rigor + Relevance blog. RSRG, Caltech. Retrieved January 9, 2014.
  9. ^ Willekens, E. (1986). "Subexponentiality on the real line". Technical Report. K.U. Leuven.
  10. ^ ISBN 978-3-0348-0008-2.
  11. ^ Alves, M.I.F., de Haan, L. & Neves, C. (March 10, 2006). "Statistical inference for heavy and super-heavy tailed distributions" (PDF). Archived from the original (PDF) on June 23, 2007. Retrieved November 1, 2011.
  12. ^ John P. Nolan (2009). "Stable Distributions: Models for Heavy Tailed Data" (PDF). Archived from the original (PDF) on 2011-07-17. Retrieved 2009-02-21.
  13. ^ Stephen Lihn (2009). "Skew Lognormal Cascade Distribution". Archived from the original on 2014-04-07. Retrieved 2009-06-12.
  14. ^ .
  15. ^ .
  16. ^ Hill B.M. (1975) A simple general approach to inference about the tail of a distribution. Ann. Stat., v. 3, 1163–1174.
  17. ^ Hall, P.(1982) On some estimates of an exponent of regular variation. J. R. Stat. Soc. Ser. B., v. 44, 37–42.
  18. ^ Haeusler, E. and J. L. Teugels (1985) On asymptotic normality of Hill's estimator for the exponent of regular variation. Ann. Stat., v. 13, 743–756.
  19. ^ Hsing, T. (1991) On tail index estimation using dependent data. Ann. Stat., v. 19, 1547–1569.
  20. ^ Hill, J. (2010) On tail index estimation for dependent, heterogeneous data. Econometric Th., v. 26, 1398–1436.
  21. ^ Resnick, S. and Starica, C. (1997). Asymptotic behavior of Hill’s estimator for autoregressive data. Comm. Statist. Stochastic Models 13, 703–721.
  22. ^ Ling, S. and Peng, L. (2004). Hill’s estimator for the tail index of an ARMA model. J. Statist. Plann. Inference 123, 279–293.
  23. ^ Hill, J. B. (2015). Tail index estimation for a filtered dependent time series. Stat. Sin. 25, 609–630.
  24. ^ S2CID 88514574.
  25. ^ Goldie C.M., Smith R.L. (1987) Slow variation with remainder: theory and applications. Quart. J. Math. Oxford, v. 38, 45–71.
  26. ^ S2CID 8917289.
  27. ^ .
  28. .
  29. .