Algebraic statistics

Source: Wikipedia, the free encyclopedia.

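Algebraic statistics is the use of algebra to advance statistics. Algebra has been useful for experimental design, parameter estimation, and hypothesis testing.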

Traditionally, algebraic statistics has been associated with the design of experiments and multivariate analysis (especially time series). In recent years, the term "algebraic statistics" has sometimes been restricted to the use of algebraic geometry and commutative algebra in statistics.

The tradition of algebraic statistics

Statisticians have long used algebra to advance research in statistics. Some of this work led to the development of new topics in algebra and combinatorics, such as association schemes.

Design of experiments

For example, association schemes were introduced for the study of experimental designs by R. C. Bose. Orthogonal arrays were introduced by C. R. Rao, also for experimental designs.

Algebraic analysis and abstract statistical inference

stationary stochastic processes, which is important in time series statistics.

Encompassing previous results on probability theory on algebraic structures,

lattice theory.

Partially ordered sets and lattices

and colleagues.

Recent work using commutative algebra and algebraic geometry

In recent years, the term "algebraic statistics" has been used more restrictively, to label the use of algebraic geometry and commutative algebra to study problems related to discrete random variables with finite state spaces. Commutative algebra and algebraic geometry have applications in statistics because many commonly used classes of discrete random variables can be viewed as algebraic varieties.

Introductory example

Consider a random variable X which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities

$$p_i = \mathrm{Pr}(X = i), \quad i = 0, 1, 2,$$

and these numbers satisfy

$$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1.$$

Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable X with the tuple $(p_0, p_1, p_2) \in \mathbb{R}^3$.
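The identification of X with a tuple of three probabilities can be made concrete with a small check. The following is a minimal sketch, not part of the original example: the function name `is_probability_vector` and the tolerance `tol` are illustrative assumptions, introduced only to absorb floating-point rounding.

```python
import math

def is_probability_vector(p, tol=1e-9):
    """Return True if p = (p0, p1, p2) is a valid distribution on {0, 1, 2}.

    Hypothetical helper: checks that each entry lies in [0, 1] and that
    the entries sum to 1, up to the assumed tolerance tol.
    """
    return (
        len(p) == 3
        and all(-tol <= x <= 1 + tol for x in p)
        and math.isclose(sum(p), 1.0, abs_tol=tol)
    )

print(is_probability_vector((0.2, 0.5, 0.3)))   # entries in [0, 1], sum is 1
print(is_probability_vector((0.5, 0.6, -0.1)))  # sums to 1 but has a negative entry
```

Any tuple that passes this check specifies one random variable on {0, 1, 2}, and every such variable yields a tuple that passes it.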

Now suppose X is a binomial random variable with parameter q and n = 2, i.e. X represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of q. Then

$$p_0 = (1-q)^2, \quad p_1 = 2q(1-q), \quad p_2 = q^2,$$

and it is not hard to show that the tuples $(p_0, p_1, p_2)$ which arise in this way are precisely the ones satisfying

$$4 p_0 p_2 - p_1^2 = 0.$$

The latter is a polynomial equation defining an algebraic variety (here a surface) in $\mathbb{R}^3$, and this variety, when intersected with the simplex given by

$$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1,$$

yields a piece of an algebraic curve which may be identified with the set of all binomial variables with n = 2. Determining the parameter q amounts to locating one point on this curve; testing the hypothesis that a given variable X is binomial amounts to testing whether a certain point lies on that curve or not.
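The geometric picture can be explored numerically. The sketch below assumes the standard binomial parametrization with n = 2, namely p0 = (1 - q)^2, p1 = 2q(1 - q), p2 = q^2, and the resulting polynomial relation 4 p0 p2 - p1^2 = 0. The function names and the tolerance `tol` are illustrative assumptions. On the curve, p2 + p1/2 = q^2 + q(1 - q) = q, which is how the parameter is recovered here.

```python
def binomial_point(q):
    """Map q in [0, 1] to (p0, p1, p2) for a binomial variable with n = 2."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)

def on_binomial_curve(p, tol=1e-9):
    """Check that p lies in the simplex and satisfies 4*p0*p2 - p1**2 = 0.

    tol is an assumed floating-point tolerance, not a statistical threshold.
    """
    p0, p1, p2 = p
    in_simplex = min(p) >= -tol and abs(p0 + p1 + p2 - 1) <= tol
    return in_simplex and abs(4 * p0 * p2 - p1 ** 2) <= tol

def recover_q(p):
    """Recover q from a point on the curve via q = p2 + p1/2."""
    _, p1, p2 = p
    return p2 + p1 / 2

p = binomial_point(0.3)
print(on_binomial_curve(p))                # point constructed from q = 0.3
print(recover_q(p))                        # close to 0.3, up to rounding
print(on_binomial_curve((0.2, 0.5, 0.3)))  # valid distribution, but off the curve
```

This is an exact-arithmetic style membership check on known probabilities. Testing a hypothesis from observed data would additionally have to account for sampling error in the estimated frequencies, which this sketch does not attempt.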

Application of algebraic geometry to statistical learning theory

Algebraic geometry has also recently found applications to statistical learning theory, including a generalization of the Akaike information criterion to singular statistical models.[2]

References

  1. ^ A gap in Garrett Birkhoff's original proof was filled by Alexander Ostrowski.
  2. ^ Watanabe, Sumio. "Why algebraic geometry?".
