Extreme value theory

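Extreme value theory or extreme value analysis (EVA) is a branch of statistics dealing with extreme deviations from the median of probability distributions. For example, in the design of a breakwater, a coastal engineer would seek to estimate the 50-year wave and design the structure accordingly.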

Data analysis

Two main approaches exist for practical extreme value analysis.

The first method relies on deriving block maxima (minima) series as a preliminary step. In many situations it is customary and convenient to extract the annual maxima (minima), generating an annual maxima series (AMS).
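As a minimal sketch of the block-maxima step, the following Python snippet reduces a record of (year, value) pairs to an annual maxima series. The function name `annual_maxima` and the sample wave heights are illustrative assumptions, not data from any cited source.

```python
from collections import defaultdict


def annual_maxima(records):
    """Return {year: largest value seen in that year} for (year, value) pairs."""
    maxima = defaultdict(lambda: float("-inf"))
    for year, value in records:
        if value > maxima[year]:
            maxima[year] = value
    # Sort by year so the series reads chronologically.
    return dict(sorted(maxima.items()))


# Hypothetical wave heights in metres, tagged by calendar year.
records = [
    (2019, 2.1), (2019, 4.7), (2019, 3.3),
    (2020, 5.2), (2020, 1.9),
    (2021, 3.8), (2021, 4.0), (2021, 2.5),
]
ams = annual_maxima(records)
print(ams)
```

An annual minima series follows from the same loop by reversing the comparison and starting from positive infinity.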

The second method relies on extracting, from a continuous record, the peak values reached during any period in which values exceed a certain threshold (or fall below a certain threshold). This method is generally referred to as the peak over threshold (POT) method.^[1]
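The extraction step can be sketched as follows, under the assumption that each run of consecutive values above the threshold counts as one period and contributes only its peak. The function name `peaks_over_threshold` and the sample series are hypothetical.

```python
def peaks_over_threshold(series, threshold):
    """Return the peak of each run of consecutive values above `threshold`."""
    peaks = []
    current_peak = None  # peak of the excursion in progress, if any
    for value in series:
        if value > threshold:
            if current_peak is None or value > current_peak:
                current_peak = value
        elif current_peak is not None:
            peaks.append(current_peak)  # excursion ended, record its peak
            current_peak = None
    if current_peak is not None:  # the record ended during an excursion
        peaks.append(current_peak)
    return peaks


series = [1.0, 3.2, 4.1, 2.0, 0.5, 3.5, 1.2, 5.0, 6.3, 5.9]
peaks = peaks_over_threshold(series, threshold=3.0)
print(peaks)
```

Real records usually need an additional declustering rule (for example a minimum separation between excursions) so that the peaks can be treated as approximately independent; that rule is not shown here.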

For AMS data, the analysis may partly rely on the results of the Fisher–Tippett–Gnedenko theorem,^[2]^[3] which concerns the limiting distribution of the maximum (or minimum) of a very large collection of independent random variables from the same distribution. Given that the number of relevant random events within a year may be rather limited, it is unsurprising that analyses of observed AMS data often lead to distributions other than the generalized extreme value distribution (GEVD) being selected.^[4]

For POT data, the analysis may involve fitting two distributions: one for the number of events in the time period considered and a second for the size of the exceedances.

A common assumption for the first is the Poisson distribution, with the generalized Pareto distribution being used for the exceedances. A tail fit can be based on the Pickands–Balkema–de Haan theorem.^[5]^[6]
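A rough sketch of the two-distribution POT fit is given below. It assumes a Poisson rate estimated as events per year and a generalized Pareto distribution with shape `shape` and scale `scale` fitted by the method of moments, which is only one of several possible estimators and requires a shape below 1/2. The function names `fit_pot` and `return_level` are illustrative, not part of any cited method.

```python
import math
import statistics


def fit_pot(exceedances, years):
    """Estimate the Poisson rate and generalized Pareto (shape, scale).

    Uses the method of moments: for a generalized Pareto distribution,
    mean**2 / variance equals 1 - 2 * shape.
    """
    rate = len(exceedances) / years  # mean number of events per year
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return rate, shape, scale


def return_level(threshold, rate, shape, scale, period_years):
    """Level exceeded on average once every `period_years` years."""
    n_events = rate * period_years  # expected threshold crossings in the period
    if abs(shape) < 1e-9:  # exponential-tail limit of the generalized Pareto
        return threshold + scale * math.log(n_events)
    return threshold + scale / shape * (n_events ** shape - 1.0)
```

For instance, `return_level(10.0, 2.0, 0.0, 1.5, 50)` evaluates `10 + 1.5 * ln(100)`, the 50-year level for a threshold of 10 crossed twice a year on average with exponentially distributed exceedances.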

Novak (2011) reserves the term "POT method" for the case where the threshold is non-random, and distinguishes it from the case where one deals with exceedances of a random threshold.^[7]

Applications

Applications of extreme value theory include predicting the probability distribution of:

Extreme floods; the size of freak waves
Tornado outbreaks^[8]
Maximum sizes of ecological populations^[9]
Side effects of drugs (e.g., ximelagatran)
The magnitudes of large insurance losses
Equity risks; day-to-day market risk
Mutation events during evolution
Large wildfires^[10]
Environmental loads on structures^[11]
The fastest time in which humans could ever run the 100 metres sprint^[12] and performances in other athletic disciplines^[13]^[14]^[15]
Pipeline failures due to pitting corrosion
Anomalous IT network traffic, to prevent attackers from reaching important data
Road safety analysis^[16]^[17]
Wireless communications^[18]
Epidemics^[19]
Neurobiology^[20]
Solar energy^[21]

History

The field of extreme value theory was pioneered by L.H.C. Tippett, who was employed by the British Cotton Industry Research Association, where he worked to make cotton thread stronger. In his studies, he realized that the strength of a thread was controlled by the strength of its weakest fibres. With the help of R.A. Fisher, Tippett obtained three asymptotic limits describing the distributions of extremes assuming independent variables. E.J. Gumbel (1958)^[22] codified this theory. These results can be extended to allow for slight correlations between variables, but the classical theory does not extend to strong correlations of the order of the variance. One universality class of particular interest is that of log-correlated fields, where the correlations decay logarithmically with the distance.

Univariate theory

The theory for extreme values of a single variable is governed by the extreme value theorem, also called the Fisher–Tippett–Gnedenko theorem, which determines which of the three possible limiting distributions for extreme values applies to a particular statistical variable $X$. The theorem is summarized in this section.

Let $X_{1},\dots ,X_{n}$ be a sample of independent and identically distributed random variables with cumulative distribution function $F$, and let $M_{n}=\max\{X_{1},\dots ,X_{n}\}$ denote the sample maximum.

In theory, the exact distribution of the maximum can be derived:

$$
\begin{aligned}
\mathcal{P}\{M_{n}\leq z\} &= \mathcal{P}\{X_{1}\leq z,\ \dots ,\ X_{n}\leq z\}\\
&= \mathcal{P}\{X_{1}\leq z\}\times \cdots \times \mathcal{P}\{X_{n}\leq z\} = \bigl(F(z)\bigr)^{n}.
\end{aligned}
$$
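The product formula can be checked numerically. The sketch below assumes unit-rate exponential variables, chosen purely for illustration, and compares the closed form with a seeded Monte Carlo estimate; the function names are hypothetical.

```python
import math
import random


def cdf_of_maximum(cdf, z, n):
    """P(M_n <= z) for n independent draws sharing the CDF `cdf`."""
    return cdf(z) ** n


def exponential_cdf(x):
    """CDF of a unit-rate exponential variable."""
    return 1.0 - math.exp(-x) if x > 0.0 else 0.0


def simulate(n, z, trials=20000, seed=1):
    """Monte Carlo estimate of P(M_n <= z) for unit-rate exponential draws."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.expovariate(1.0) for _ in range(n)) <= z
        for _ in range(trials)
    )
    return hits / trials


exact = cdf_of_maximum(exponential_cdf, z=3.0, n=10)
approx = simulate(n=10, z=3.0)
print(exact, approx)
```

The two values should agree up to sampling error, which shrinks as `trials` grows.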

The associated indicator function $I_{n}=\mathcal{I}[M_{n}>z]$ defines a Bernoulli process with success probability $p(z)=1-\bigl(F(z)\bigr)^{n}$, which depends on the magnitude $z$ of the extreme event. The number of extreme events within a given number of trials thus follows a binomial distribution, and the number of trials until an event occurs follows a geometric distribution with expected value and standard deviation of the same order $\mathcal{O}\bigl(1/p(z)\bigr)$.
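As a small worked illustration of the geometric waiting time, the sketch below takes an event with a per-trial exceedance probability of 1/50 (a "50-year" event when trials are years) and computes the mean and standard deviation of the waiting time, together with the chance of at least one exceedance in 50 trials. The function names are illustrative assumptions.

```python
import math


def waiting_time_mean(p):
    """Expected number of trials until the first exceedance: 1 / p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 1.0 / p


def waiting_time_std(p):
    """Standard deviation of the geometric waiting time: sqrt(1 - p) / p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.sqrt(1.0 - p) / p


def prob_at_least_one(p, trials):
    """Probability of one or more exceedances in `trials` independent trials."""
    return 1.0 - (1.0 - p) ** trials


p = 1.0 / 50.0
print(waiting_time_mean(p))      # 50 trials on average
print(waiting_time_std(p))       # about 49.5, the same order as the mean
print(prob_at_least_one(p, 50))  # about 0.636
```

The last figure shows that an event with a 50-trial return period is more likely than not to occur at least once within 50 trials.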

In practice, we might not have the distribution function $F$, but the Fisher–Tippett–Gnedenko theorem provides an asymptotic result. If there exist sequences of constants $a_{n}>0$ and $b_{n}\in \mathbb{R}$ such that

$$\mathcal{P}\left\{\frac{M_{n}-b_{n}}{a_{n}}\leq z\right\}\rightarrow G(z)$$

as $n\rightarrow \infty$, then

$$G(z)\propto \exp\left[-\bigl(1+\gamma z\bigr)^{-1/\gamma}\right]$$

where the parameter $\gamma$ depends on how steeply the tail of the distribution diminishes: exponentially ("ordinary" tails), polynomially ("fat" tails), or up to a finite upper bound ("thin" tails). When normalized, $G$ belongs to one of the following non-degenerate distribution families:

Type 1: Gumbel distribution, for $\gamma =0$,

$$G(z)=\exp\left[-\exp\left(-\tfrac{z-b}{a}\right)\right]\quad \text{for } z\in \mathbb{R},$$

when the distribution of $M_{n}$ has an "ordinary" exponentially diminishing tail.

Type 2: Fréchet distribution, for $\gamma >0$,

$$G(z)=\begin{cases}0 & z\leq b\\ \exp\left[-\left(\tfrac{z-b}{a}\right)^{-1/\gamma}\right] & z>b\end{cases}$$

when the distribution of $M_{n}$ has a heavy tail (including polynomial decay).

Type 3: Weibull distribution, for $\gamma <0$,

$$G(z)=\begin{cases}\exp\left[-\left(-\tfrac{z-b}{a}\right)^{1/|\gamma|}\right] & z<b\\ 1 & z\geq b\end{cases}$$

when the distribution of $M_{n}$ has a thin tail with finite upper bound.
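The unified limiting form $\exp[-(1+\gamma z)^{-1/\gamma}]$ can be evaluated directly. The sketch below is a minimal implementation under the assumption of the standardized form (location 0, scale 1), with $\gamma =0$ treated as the limiting case $\exp[-\exp(-z)]$. It also checks the convergence for unit-rate exponential variables, for which $a_{n}=1$ and $b_{n}=\ln n$ are suitable normalizing constants. The function names are hypothetical.

```python
import math


def gev_cdf(z, gamma):
    """Standardized limit law exp(-(1 + gamma*z)**(-1/gamma)).

    gamma == 0 is handled as the limiting case exp(-exp(-z)).
    """
    if gamma == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint when gamma > 0,
        # above the upper endpoint when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def exponential_max_cdf(z, n):
    """P(M_n - ln(n) <= z) for n unit-rate exponential variables."""
    x = z + math.log(n)
    return (1.0 - math.exp(-x)) ** n if x > 0.0 else 0.0


for n in (10, 1000, 1000000):
    # The gap to the gamma = 0 limit shrinks as n grows.
    print(n, abs(exponential_max_cdf(1.0, n) - gev_cdf(1.0, 0.0)))
```

The support is bounded below when $\gamma >0$ and bounded above when $\gamma <0$, which is why the function returns 0 or 1 outside it.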

Multivariate theory

Extreme value theory in more than one variable introduces additional issues that have to be addressed. One problem that arises is that one must specify what constitutes an extreme event.^[23] Although this is straightforward in the univariate case, there is no unambiguous way to do this in the multivariate case. The fundamental problem is that although it is possible to order a set of real-valued numbers, there is no natural way to order a set of vectors.

As an example, in the univariate case, given a set of observations $x_{i}$, it is straightforward to find the most extreme event simply by taking the maximum (or minimum) of the observations. However, in the bivariate case, given a set of observations $(x_{i},y_{i})$, it is not immediately clear how to find the most extreme event. Suppose that one has measured the values $(3,4)$ at a specific time and the values $(5,2)$ at a later time. Which of these events would be considered more extreme? There is no universal answer to this question.
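The ambiguity can be made concrete with the two observations above. The sketch below ranks $(3,4)$ against $(5,2)$ under four candidate criteria, which are illustrative choices and not a list drawn from the cited literature.

```python
import math

a, b = (3, 4), (5, 2)

# Each criterion maps an observation to a single score to be compared.
criteria = {
    "largest component": max,
    "smallest component": min,
    "sum of components": sum,
    "Euclidean norm": lambda v: math.hypot(*v),
}


def more_extreme(u, v, score):
    """Return the observation with the larger score, or None on a tie."""
    su, sv = score(u), score(v)
    if su == sv:
        return None
    return u if su > sv else v


ranking = {name: more_extreme(a, b, score) for name, score in criteria.items()}
for name, winner in ranking.items():
    print(name, "->", winner)
```

The criteria disagree: the largest component and the Euclidean norm favour $(5,2)$, the smallest component favours $(3,4)$, and the sum gives a tie.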

Another issue in the multivariate case is that the limiting model is not as fully prescribed as in the univariate case. In the univariate case, the model (GEV distribution) contains three parameters whose values are not predicted by the theory and must be obtained by fitting the distribution to the data. In the multivariate case, the model contains not only unknown parameters but also a function whose exact form is not prescribed by the theory. However, this function must obey certain constraints.^[24]^[25] It is not straightforward to devise estimators that obey such constraints, though some have recently been constructed.^[26]^[27]^[28]

As an example of an application, bivariate extreme value theory has been applied to ocean research.^[23]^[29]

Non-stationary extremes

Statistical modeling for nonstationary time series was developed in the 1990s.^[30] Methods for nonstationary multivariate extremes have been introduced more recently.^[31] The latter can be used for tracking how the dependence between extreme values changes over time, or over another covariate.^[32]^[33]^[34]
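One simple way to picture a nonstationary model is to let a parameter of the extreme value distribution vary with time. The sketch below assumes a Gumbel model whose location parameter follows a hypothetical linear trend `mu0 + mu1 * t`; it is an illustration only and not the method of the cited works. All names and numbers are assumptions.

```python
import math


def location_at(t, mu0, mu1):
    """Hypothetical linear trend in the Gumbel location parameter."""
    return mu0 + mu1 * t


def gumbel_return_level(location, scale, p):
    """Level exceeded with probability p in one block under a Gumbel model.

    Solves exp(-exp(-(z - location) / scale)) = 1 - p for z.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return location - scale * math.log(-math.log(1.0 - p))


mu0, mu1, scale = 10.0, 0.05, 2.0  # assumed values for illustration
for t in (0, 10, 20):
    level = gumbel_return_level(location_at(t, mu0, mu1), scale, p=0.01)
    print(t, round(level, 3))
```

Under this assumption the level with a fixed exceedance probability drifts by `mu1` per unit of time, so a single "100-year" value no longer describes the whole record.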

References

[1] Leadbetter (1991). doi:10.1016/0167-7152(91)90107-3.

[2] Fisher & Tippett (1928)

[3] Gnedenko (1943)

[4] Embrechts, Klüppelberg & Mikosch (1997)

[5] Pickands (1975)

[6] Balkema & de Haan (1974)

[7] Novak (2011)

[8] Tippett, Lepore & Cohen (2016)

[9] Batt, Ryan D.; Carpenter, Stephen R.; Ives, Anthony R. (March 2017). "Extreme events in lake ecosystem time series". Limnology and Oceanography Letters. 2 (3): 63. doi:10.1002/lol2.10037.

[10] Alvarado, Sandberg & Pickford (1998), p. 68

[11] Makkonen (2008)

[12] Einmahl, J.H.J.; Smeets, S.G.W.R. (2009). Ultimate 100m world records through extreme-value theory (PDF) (Report). CentER Discussion Paper. Vol. 57. Tilburg University. Archived from the original (PDF) on 2016-03-12. Retrieved 2009-08-12.

[13] Gembris, D.; Taylor, J.; Suter, D. (2002). "Trends and random fluctuations in athletics". Nature. 417 (6888): 506. S2CID 13469470.

[14] Gembris, D.; Taylor, J.; Suter, D. (2007). "Evolution of athletic records: Statistical effects versus real improvements". S2CID 55378036.

[15] Spearing, H.; Tawn, J.; Irons, D.; Paulden, T.; Bennett, G. (2021). "Ranking, and other properties, of elite swimmers using extreme value theory". Journal of the Royal Statistical Society. Series A (Statistics in Society). 184 (1): 368–395. S2CID 204823947.

[16] Songchitruksa, P.; Tarko, A.P. (2006). "The extreme value theory approach to safety estimation". Accident Analysis and Prevention. 38 (4): 811–822. PMID 16546103.

[17] Orsini, F.; Gecchele, G.; Gastaldi, M.; Rossi, R. (2019). "Collision prediction in roundabouts: A comparative study of extreme value theory approaches". Transportmetrica. Series A: Transport Science. 15 (2): 556–572. S2CID 158343873.

[18] Tsinos, C.G.; Foukalas, F.; Khattab, T.; Lai, L. (February 2018). "On channel selection for carrier aggregation systems". S2CID 3405114.

[19] Wong, Felix; Collins, James J. (2 November 2020). "Evidence that coronavirus superspreading is fat-tailed". Proceedings of the National Academy of Sciences of the U.S. 117 (47): 29416–29418. PMID 33139561.

[20] Basnayake, Kanishka; Mazaud, David; Bemelmans, Alexis; Rouach, Nathalie; Korkotian, Eduard; Holcman, David (4 June 2019). "Fast calcium transients in dendritic spines driven by extreme statistics". PMID 31163024.

[21] Younis, Abubaker; Abdeljalil, Anwar; Omer, Ali (1 January 2023). "Determination of panel generation factor using peaks over threshold method and short-term data for an off-grid photovoltaic system in Sudan: A case of Khartoum city". Solar Energy. 249: 242–249. S2CID 254207549.

[22] Gumbel (2004)

[23] Morton, I.D.; Bowers, J. (December 1996). "Extreme value analysis in a multivariate offshore environment". Applied Ocean Research. 18 (6): 303–317. ISSN 0141-1187.

[24] Beirlant, Jan; Goegebeur, Yuri; Teugels, Jozef; Segers, Johan (27 August 2004). Statistics of Extremes: Theory and applications. Wiley Series in Probability and Statistics. Chichester, UK: John Wiley & Sons, Ltd. ISBN 978-0-470-01238-3.

[25] Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. ISSN 0172-7397.

[26] de Carvalho, M.; Davison, A.C. (2014). "Spectral density ratio models for multivariate extremes" (PDF). Journal of the American Statistical Association. 109: 764–776. S2CID 53338058.

[27] Hanson, T.; de Carvalho, M.; Chen, Yuhui (2017). "Bernstein polynomial angular densities of multivariate extreme value distributions" (PDF). Statistics and Probability Letters. 128: 60–66. S2CID 53338058.

[28] de Carvalho, M. (2013). "A Euclidean likelihood estimator for bivariate tail dependence" (PDF). Communications in Statistics – Theory and Methods. 42 (7): 1176–1192. S2CID 42652601.

[29] Zachary, S.; Feld, G.; Ward, G.; Wolfram, J. (October 1998). "Multivariate extrapolation in the offshore environment". ISSN 0141-1187.

[30] Davison, A.C.; Smith, Richard (1990). "Models for exceedances over high thresholds". Journal of the Royal Statistical Society. Series B (Methodological). 52 (3): 393–425. doi:10.1111/j.2517-6161.1990.tb01796.x.

[31] de Carvalho, M. (2016). "Statistics of extremes: Challenges and opportunities". Handbook of EVT and its Applications to Finance and Insurance (PDF). Hoboken, NJ: John Wiley & Sons. pp. 195–214. ISBN 978-1-118-65019-6.

[32] Castro, D.; de Carvalho, M.; Wadsworth, J. (2018). "Time-Varying Extreme Value Dependence with Application to Leading European Stock Markets" (PDF). Annals of Applied Statistics. 12: 283–309. S2CID 33350408.

[33] Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2019). "Regression type models for extremal dependence" (PDF). Scandinavian Journal of Statistics. 46 (4): 1141–1167. S2CID 53570822.

[34] Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2018). "Local robust estimation of the Pickands dependence function". S2CID 59467614.

Sources

Abarbanel, H.; Koonin, S.; Levine, H.; MacDonald, G.; Rothaus, O. (January 1992). "Statistics of extreme events with application to climate" (PDF). JASON. JSR-90-30S. Retrieved 2015-03-03.

Alvarado, Ernesto; Sandberg, David V.; Pickford, Stewart G. (1998). "Modeling Large Forest Fires as Extreme Events" (PDF). Northwest Science. 72: 66–75. Archived from the original (PDF) on 2009-02-26. Retrieved 2009-02-06.

Balkema, A.; de Haan, L. (1974). JSTOR 2959306.

Burry, K.V. (1975). Statistical Methods in Applied Science. Hoboken, NJ: John Wiley & Sons.

Castillo, E. (1988). Extreme Value Theory in Engineering. New York, NY: Academic Press. ISBN 0-12-163475-2.

Castillo, E.; Hadi, A.S.; Balakrishnan, N.; Sarabia, J.M. (2005). Extreme Value and Related Models with Applications in Engineering and Science. Wiley Series in Probability and Statistics. Hoboken, NJ: John Wiley & Sons. ISBN 0-471-67172-X.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London, UK: Springer.

Embrechts, P.; Klüppelberg, C.; Mikosch, T. (1997). Modelling extremal events for insurance and finance. Berlin, DE: Springer Verlag.

Fisher, R.A.; Tippett, L.H.C. (1928). S2CID 123125823.

Gnedenko, B.V. (1943). "Sur la distribution limite du terme maximum d'une serie aleatoire" [On the limiting distribution(s) of the maximum value of a series ...]. JSTOR 1968974.

Gumbel, E.J., ed. (1935) [1933–1934]. "Les valeurs extrêmes des distributions statistiques" [The statistical distributions of extreme values] (pdf). Annales de l'institut Henri Poincaré (conference papers) (in French). 5 (2). France: 115–158. Retrieved 2009-04-01 – via numdam.org.

Gumbel, E.J. (2004) [1958]. ISBN 978-0-486-43604-3.

Makkonen, L. (2008). "Problems in the extreme value analysis". Structural Safety. 30 (5): 405–419. doi:10.1016/j.strusafe.2006.12.001.

Leadbetter, M.R. (1991). "On a basis for 'peaks over threshold' modeling". Statistics & Probability Letters. 12 (4): 357–362. doi:10.1016/0167-7152(91)90107-3.

Leadbetter, M.R.; Lindgren, G.; Rootzen, H. (1982). Extremes and Related Properties of Random Sequences and Processes. New York, NY: Springer-Verlag.

Lindgren, G.; Rootzen, H. (1987). "Extreme values: Theory and technical applications". Scandinavian Journal of Statistics, Theory and Applications. 14: 241–279.

Novak, S.Y. (2011). Extreme Value Methods with Applications to Finance. London, UK / Boca Raton, FL: Chapman & Hall / CRC Press. ISBN 978-1-4398-3574-6.

Pickands, J. (1975). "Statistical inference using extreme order statistics". Annals of Statistics. 3: 119–131. doi:10.1214/aos/1176343003.

Tippett, Michael K.; Lepore, Chiara; Cohen, Joel E. (16 December 2016). "More tornadoes in the most extreme U.S. tornado outbreaks". Science. 354 (6318): 1419–1423. PMID 27934705.

Software

"Extreme Value Statistics in R". cran.r-project.org (software). 4 November 2023. — Package for extreme value statistics in R.

"Extremes.jl". github.com (software). — Package for extreme value statistics in Julia.

"Source code for stationary and non-stationary extreme value analysis". amir.eng.uci.edu (software). Irvine, CA: University of California, Irvine.

External links

Chavez-Demoulin, Valérie; Roehrl, Armin (8 January 2004). Extreme value theory can save your neck (PDF). risknet.de (Report). Germany. — Easy non-mathematical introduction.

Steps in applying extreme value theory to finance: A review (PDF). bankofcanada.ca (Report). Bank of Canada (published January 2010). c. 2010.

Gumbel, E.J., ed. (1935) [1933–1934]. "Les valeurs extrêmes des distributions statistiques" [The statistical distributions of extreme values] (pdf). Annales de l'institut Henri Poincaré (conference papers) (in French). 5 (2). France: 115–158. Retrieved 2009-04-01 – via numdam.org. — Full-text access to conferences held by E.J. Gumbel in 1933–1934.


