Ergodicity
In mathematics, ergodicity expresses the idea that a point of a moving system, either a dynamical system or a stochastic process, will eventually visit all parts of the space that the system moves in, in a uniform and random sense. This implies that the average behavior of the system can be deduced from the trajectory of a "typical" point. Equivalently, a sufficiently large collection of random samples from a process can represent the average statistical properties of the entire process. Ergodicity is a property of the system; it is a statement that the system cannot be reduced or factored into smaller components. Ergodic theory is the study of systems possessing ergodicity.
Ergodic systems occur in a broad range of systems in
Ergodic systems capture the common-sense, every-day notions of randomness, such as the ideas that smoke might come to fill all of a smoke-filled room, that a block of metal might eventually come to have the same temperature throughout, or that flips of a fair coin may come up heads and tails half the time each. A stronger concept than ergodicity is that of mixing, which aims to mathematically describe the common-sense notions of mixing, such as mixing drinks or mixing cooking ingredients.
The proper mathematical formulation of ergodicity is founded on the formal definitions of
Informal explanation
Ergodicity occurs in broad settings in physics and mathematics. All of these settings are unified by a common mathematical description, that of the measure-preserving dynamical system. Equivalently, ergodicity can be understood in terms of stochastic processes. They are one and the same, despite using dramatically different notation and language.
Measure-preserving dynamical systems
The mathematical definition of ergodicity aims to capture ordinary every-day ideas about
The set $X$ is understood to be the total space to be filled: the mixing bowl, the smoke-filled room, etc. The measure $\mu$ is understood to define the natural volume of the space $X$ and of its subspaces. The collection of subspaces is denoted by $\mathcal{A}$, and the size of any given subset $A \subset X$ is $\mu(A)$; the size is its volume. Naively, one could imagine $\mathcal{A}$ to be the power set of $X$; this doesn't quite work, as not all subsets of a space have a volume (famously, the
The time evolution of the system is described by a map $T: X \to X$. Given some subset $A \subset X$, its image $T(A)$ will in general be a deformed version of $A$: it is squashed or stretched, folded or cut into pieces. Mathematical examples include the baker's map and the horseshoe map, both inspired by bread-making. The set $T(A)$ must have the same volume as $A$; the squashing/stretching does not alter the volume of the space, only its distribution. Such a system is "measure-preserving" (area-preserving, volume-preserving).
A formal difficulty arises when one tries to reconcile the volume of sets with the need to preserve their size under a map. The problem arises because, in general, several different points in the domain of a function can map to the same point in its range; that is, there may be $x \ne y$ with $T(x) = T(y)$. Worse, a single point $x \in X$ has no size. These difficulties can be avoided by working with the inverse map $T^{-1}: \mathcal{A} \to \mathcal{A}$; it will map any given subset $A \subset X$ to the parts that were assembled to make it: these parts are $T^{-1}(A) \in \mathcal{A}$. It has the important property of not losing track of where things came from. More strongly, it has the important property that any (measure-preserving) map $\mathcal{A} \to \mathcal{A}$ is the inverse of some map $X \to X$. The proper definition of a volume-preserving map is one for which $\mu(A) = \mu(T^{-1}(A))$, because $T^{-1}(A)$ describes all the pieces that $A$ came from.
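The role of the inverse map can be illustrated with a small sketch. The doubling map $T(x) = 2x \bmod 1$ on the unit interval is used here as an assumed example (it is not one of the maps named above), and the helper names are hypothetical. Two points map to each point, so the preimage of an interval consists of two pieces whose lengths add up to the length of the original, while the forward image does not have the same length.

```python
from fractions import Fraction

def preimage_doubling(a, b):
    """Preimage of the interval [a, b) under T(x) = 2x mod 1 on [0, 1).

    Returns the two half-length intervals that T maps onto [a, b).
    """
    a, b = Fraction(a), Fraction(b)
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]

def total_length(intervals):
    """Sum of the lengths of disjoint intervals: the 'volume' mu."""
    return sum(hi - lo for lo, hi in intervals)

A = (Fraction(1, 4), Fraction(1, 2))      # A = [1/4, 1/2), mu(A) = 1/4
pieces = preimage_doubling(*A)            # [1/8, 1/4) and [5/8, 3/4)
print(total_length(pieces) == total_length([A]))   # mu(T^-1(A)) == mu(A)

# The forward image T(A) = [1/2, 1) has twice the length of A, so
# mu(T(A)) = mu(A) fails even though T preserves the measure.
image = (2 * A[0], 2 * A[1])
print(total_length([image]))
```

Exact rational arithmetic is used so that the equality of lengths is checked exactly rather than up to rounding.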
One is now interested in studying the time evolution of the system. If a set $A \in \mathcal{A}$ eventually comes to fill all of $X$ over a long period of time (that is, if $T^n(A)$ approaches all of $X$ for large $n$), the system is said to be
Mixing is a stronger statement than ergodicity. Mixing asks for this ergodic property to hold between any two sets $A, B$, and not just between some set $A$ and $X$. That is, given any two sets $A, B \in \mathcal{A}$, a system is said to be (topologically) mixing if there is an integer $N$ such that, for all $A, B$ and $n > N$, one has that $T^n(A) \cap B \ne \varnothing$. Here, $\cap$ denotes
Ergodic processes
The above discussion appeals to a physical sense of a volume. The volume does not have to literally be some portion of
The idea of a volume can be very abstract. Consider, for example, the set of all possible coin-flips: the set of infinite sequences of heads and tails. Assigning the volume of 1 to this space, it is clear that half of all such sequences start with heads, and half start with tails. One can slice up this volume in other ways: one can say "I don't care about the first $n - 1$ coin-flips; but I want the $n$'th of them to be heads, and then I don't care about what comes after that". This can be written as the set $(*, \cdots, *, h, *, \cdots)$ where $*$ is "don't care" and $h$ is "heads". The volume of this space is again one-half.
The above is enough to build up a measure-preserving dynamical system, in its entirety. The sets of $h$ or $t$ occurring in the $n$'th place are called cylinder sets. The set of all possible intersections, unions and complements of the cylinder sets then form the Borel set $\mathcal{A}$ defined above. In formal terms, the cylinder sets form the
For the coin-flip process, the time-evolution operator $T$ is the shift operator that says "throw away the first coin-flip, and keep the rest". Formally, if $(x_1, x_2, \cdots)$ is a sequence of coin-flips, then $T(x_1, x_2, \cdots) = (x_2, x_3, \cdots)$. The measure is obviously shift-invariant: as long as we are talking about some set $A \in \mathcal{A}$ where the first coin-flip $x_1 = *$ is the "don't care" value, then the volume $\mu(A)$ does not change: $\mu(A) = \mu(T(A))$. In order to avoid talking about the first coin-flip, it is easier to define $T^{-1}$ as inserting a "don't care" value into the first position: $T^{-1}(x_1, x_2, \cdots) = (*, x_1, x_2, \cdots)$. With this definition, one obviously has that $\mu(T^{-1}(A)) = \mu(A)$ with no constraints on $A$. This is again an example of why $T^{-1}$ is used in the formal definitions.
The above development takes a random process, the Bernoulli process, and converts it to a measure-preserving dynamical system $(X, \mathcal{A}, \mu, T)$. The same conversion (equivalence, isomorphism) can be applied to any stochastic process. Thus, an informal definition of ergodicity is that a sequence is ergodic if it visits all of $X$; such sequences are "typical" for the process. Another is that its statistical properties can be deduced from a single, sufficiently long, random sample of the process (thus uniformly sampling all of $X$), or that any collection of random samples from a process must represent the average statistical properties of the entire process (that is, samples drawn uniformly from $X$ are representative of $X$ as a whole). In the present example, a sequence of coin flips, where half are heads, and half are tails, is a "typical" sequence.
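The informal statement that a single long sample reveals the statistics of the whole process can be tried out numerically. The following is only a sketch under stated assumptions: a finite list of 100,000 pseudo-random flips stands in for an infinite sequence, and the function names are hypothetical.

```python
import random

def shift(seq):
    """Shift operator T: throw away the first coin-flip, keep the rest."""
    return seq[1:]

def heads_frequency(seq):
    """Time average of the observable 'first flip is heads' along the orbit
    seq, T(seq), T(T(seq)), ...; this is just the fraction of heads."""
    return sum(1 for x in seq if x == "h") / len(seq)

rng = random.Random(0)                                # fixed seed, reproducible
flips = [rng.choice("ht") for _ in range(100_000)]    # finite stand-in sample

print(heads_frequency(flips))   # close to 0.5 for a typical sample
```

The space average of the same observable is the volume of the set of sequences starting with heads, which is one-half; the time average along one typical sequence approximates it.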
There are several important points to be made about the Bernoulli process. If one writes 0 for tails and 1 for heads, one gets the set of all infinite strings of binary digits. These correspond to the base-two expansion of real numbers. Explicitly, given a sequence $(x_1, x_2, \cdots)$, the corresponding real number is
$$y = \sum_{n=1}^{\infty} \frac{x_n}{2^n}$$
The statement that the Bernoulli process is ergodic is equivalent to the statement that the real numbers are uniformly distributed. The set of all such strings can be written in a variety of ways: $\{h,t\}^\infty = \{h,t\}^\omega = \{0,1\}^\omega = 2^\omega = 2^{\mathbb{N}}$. This set is the Cantor set, sometimes called the Cantor space to avoid confusion with the Cantor function
In the end, these are all "the same thing".
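The correspondence between coin-flip sequences and base-two expansions can be sketched for finite prefixes. The code below assumes 0 for tails and 1 for heads as above; the observation that dropping the first digit corresponds to doubling the number modulo 1 is added here as an illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def to_real(bits):
    """Map a finite prefix (x_1, ..., x_n) of 0/1 digits to sum of x_k / 2^k."""
    return sum(Fraction(b, 2 ** k) for k, b in enumerate(bits, start=1))

def shift(bits):
    """Shift operator: drop the first digit."""
    return bits[1:]

bits = [1, 0, 1, 1]        # 0.1011 in base two
y = to_real(bits)          # 1/2 + 1/8 + 1/16 = 11/16

# Dropping the first digit doubles the number modulo 1.
print(to_real(shift(bits)) == (2 * y) % 1)
```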
The Cantor set plays key roles in many branches of mathematics. In recreational mathematics, it underpins the
The Ornstein isomorphism theorem states that two Bernoulli schemes (a Bernoulli scheme being a Bernoulli process with an N-sided (and possibly unfair) gaming die) with the same entropy are isomorphic; as a consequence, many stationary stochastic processes turn out to be equivalent to a Bernoulli scheme. Other results include that every non-dissipative ergodic system is equivalent to the Markov odometer, sometimes called an "adding machine" because it looks like elementary-school addition, that is, taking a base-N digit sequence, adding one, and propagating the carry bits. The proof of equivalence is very abstract; understanding the result is not: by adding one at each time step, every possible state of the odometer is visited, until it rolls over, and starts again. Likewise, ergodic systems visit each state, uniformly, moving on to the next, until they have all been visited.
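The "adding machine" picture can be made concrete with a finite odometer. This is a minimal sketch, assuming a fixed number of base-N digits stored least-significant first; the actual Markov odometer acts on infinite digit sequences, and the function names are hypothetical.

```python
def odometer_step(digits, base):
    """Add one to a little-endian digit list, propagating carries.

    When every digit equals base - 1 the odometer rolls over to all zeros.
    """
    out = list(digits)
    for i in range(len(out)):
        if out[i] + 1 < base:
            out[i] += 1
            return out
        out[i] = 0          # carry into the next position
    return out              # rolled over

def orbit(start, base):
    """States visited from `start` until the odometer returns to it."""
    seen, state = [], list(start)
    while True:
        seen.append(tuple(state))
        state = odometer_step(state, base)
        if state == list(start):
            return seen

states = orbit([0, 0, 0], base=3)
print(len(states), len(set(states)))   # 27 27
```

With three base-3 digits there are 27 states, and the single orbit visits each of them exactly once before rolling over.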
Systems that generate (infinite) sequences of N letters are studied by means of
History and etymology
The term ergodic is commonly thought to derive from the Greek words ἔργον (ergon: "work") and ὁδός (hodos: "path", "way"), as chosen by Ludwig Boltzmann while he was working on a problem in statistical mechanics.[2] At the same time it is also claimed to be a derivation of ergomonode, coined by Boltzmann in a relatively obscure paper from 1884. The etymology appears to be contested in other ways as well.[3]
The idea of ergodicity was born in the field of
For example, in
In
Ergodicity in physics and geometry
A review of ergodicity in physics and in geometry follows. In all cases, the notion of ergodicity is exactly the same as that for dynamical systems; there is no difference, except for outlook, notation, style of thinking and the journals where results are published.
Physical systems can be split into three categories: classical mechanics, which describes machines with a finite number of moving parts; quantum mechanics, which describes the structure of atoms; and statistical mechanics, which describes gases, liquids and solids, and includes condensed matter physics. These are presented below.
In statistical mechanics
This section reviews ergodicity in statistical mechanics. The above abstract definition of a volume is required as the appropriate setting for definitions of ergodicity in physics. Consider a container of liquid, or gas, or plasma, or other collection of atoms or particles. Each and every particle has a 3D position, and a 3D velocity, and is thus described by six numbers: a point in six-dimensional space $\mathbb{R}^6$. If there are $N$ of these particles in the system, a complete description requires $6N$ numbers. Any one system is just a single point in $\mathbb{R}^{6N}$. The physical system is not all of $\mathbb{R}^{6N}$, of course; if it's a box of width, height and length $W \times H \times L$ then a point is in $(W \times H \times L \times \mathbb{R}^3)^N$. Nor can velocities be infinite: they are scaled by some probability measure, for example the Boltzmann–Gibbs measure for a gas. Nonetheless, for $N$ close to the
A physical system is said to be ergodic if any representative point of the system eventually comes to visit the entire volume of the system. For the above example, this implies that any given atom not only visits every part of the box with uniform probability, but it does so with every possible velocity, with probability given by the Boltzmann distribution for that velocity (so, uniform with respect to that measure). The
Formal mathematical proofs of ergodicity in statistical physics are hard to come by; most high-dimensional many-body systems are assumed to be ergodic, without mathematical proof. Exceptions include the
Simple dynamical systems
The formal study of ergodicity can be approached by examining fairly simple dynamical systems. Some of the primary ones are listed here.
The
In classical mechanics and geometry
Ergodicity is a widespread phenomenon in the study of
The
For non-flat surfaces, one has that the
These results extend to higher dimensions. The geodesic flow for negatively curved compact
with $g^{ij}$ the (inverse of the) metric tensor and $p_i$ the momentum. The resemblance to the kinetic energy of a point particle is hardly accidental; this is the whole point of calling such things "energy". In this sense, chaotic behavior with ergodic orbits is a more-or-less generic phenomenon in large tracts of geometry.
Ergodicity results have been provided in translation surfaces, hyperbolic groups and systolic geometry. Techniques include the study of ergodic flows, the Hopf decomposition, and the Ambrose–Kakutani–Krengel–Kubo theorem. An important class of systems are the Axiom A systems.
A number of both classification and "anti-classification" results have been obtained. The
In wave mechanics
All of the previous sections considered ergodicity either from the point of view of a measurable dynamical system, or from the dual notion of tracking the motion of individual particle trajectories. A closely related concept occurs in (non-linear)
A resonant interaction is possible whenever the dispersion relations for the wave media allow three or more normal modes to sum in such a way as to conserve both the total momentum and the total energy. This allows energy concentrated in one mode to bleed into other modes, eventually distributing that energy uniformly across all interacting modes.
Resonant interactions between waves help provide insight into the distinction between high-dimensional chaos (that is, turbulence) and thermalization. When normal modes can be combined so that energy and momentum are exactly conserved, then the theory of resonant interactions applies, and energy spreads into all of the interacting modes. When the dispersion relations only allow an approximate balance, turbulence or chaotic motion results. The turbulent modes can then transfer energy into modes that do mix, eventually leading to thermalization, but not before a preceding interval of chaotic motion.
In quantum mechanics
As to quantum mechanics, there is no universal quantum definition of ergodicity or even chaos (see quantum chaos).[6] However, there is a quantum ergodicity theorem stating that the expectation value of an operator converges to the corresponding microcanonical classical average in the semiclassical limit $\hbar \to 0$. Nevertheless, the theorem does not imply that all eigenstates of a Hamiltonian whose classical counterpart is chaotic are featureless and random. For example, the quantum ergodicity theorem does not exclude the existence of non-ergodic states such as quantum scars. In addition to the conventional scarring,[7][8][9][10] there are two other types of quantum scarring, which further illustrate the weak-ergodicity breaking in quantum chaotic systems: perturbation-induced[11][12][13][14][15] and many-body quantum scars.[16]
Definition for discrete-time systems
Ergodic measures provide one of the cornerstones with which ergodicity is generally discussed. A formal definition follows.
Invariant measure
Let $(X, \mathcal{B})$ be a measurable space. If $T$ is a measurable function from $X$ to itself and $\mu$ a probability measure on $(X, \mathcal{B})$, then a measure-preserving dynamical system is defined as a dynamical system for which $\mu(T^{-1}(A)) = \mu(A)$ for all $A \in \mathcal{B}$. Such a $T$ is said to preserve $\mu$; equivalently, $\mu$ is said to be $T$-invariant.
Ergodic measure
A measurable function $T$ is said to be $\mu$-ergodic, or $\mu$ is said to be an ergodic measure for $T$, if $T$ preserves $\mu$ and the following condition holds:
- For any $A \in \mathcal{B}$ such that $T^{-1}(A) = A$, either $\mu(A) = 0$ or $\mu(A) = 1$.
In other words, there are no $T$-invariant subsets up to measure 0 (with respect to $\mu$).
Some authors[17] relax the requirement that $T$ preserves $\mu$ to the requirement that $T$ is a non-singular transformation with respect to $\mu$, meaning that if $N$ is a subset of zero measure then so is $T(N)$.
Examples
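The simplest example is when $X$ is a finite set and $\mu$ the counting measure. Then a self-map $T$ of $X$ preserves $\mu$ if and only if it is a bijection, and it is ergodic if and only if $T$ has only one orbit (that is, for every $x, y \in X$ there exists $k \in \mathbb{N}$ such that $y = T^k(x)$). For example, if $X = \{1, 2, \ldots, n\}$ then the cycle $(1\, 2\, \cdots\, n)$ is ergodic, but the permutation $(1\, 2)(3\, 4\, \cdots\, n)$ is not (it has the two invariant subsets $\{1, 2\}$ and $\{3, 4, \ldots, n\}$).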
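The finite example lends itself to a direct check. The sketch below represents a self-map of a finite set as a Python dictionary (a representation chosen here for illustration, with hypothetical function names) and tests the two conditions stated above: being a bijection, and having a single orbit.

```python
def is_bijection(T):
    """A self-map of a finite set is a bijection iff it is onto."""
    return set(T.values()) == set(T.keys())

def cycles(T):
    """Cycle decomposition of a bijection T given as a dict x -> T(x)."""
    seen, result = set(), []
    for x in T:
        if x in seen:
            continue
        cycle, y = [], x
        while y not in seen:
            seen.add(y)
            cycle.append(y)
            y = T[y]
        result.append(cycle)
    return result

def is_ergodic(T):
    """Ergodic for the counting measure: a bijection with a single orbit."""
    return is_bijection(T) and len(cycles(T)) == 1

n = 5
full_cycle = {i: i % n + 1 for i in range(1, n + 1)}   # the cycle (1 2 3 4 5)
split = {1: 2, 2: 1, 3: 4, 4: 5, 5: 3}                 # (1 2)(3 4 5)
print(is_ergodic(full_cycle), is_ergodic(split))       # True False
```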
Equivalent formulations
The definition given above admits the following immediate reformulations:
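- for every $A \in \mathcal{B}$ with $\mu(T^{-1}(A) \bigtriangleup A) = 0$ we have $\mu(A) = 0$ or $\mu(A) = 1$ (where $\bigtriangleup$ denotes the symmetric difference);
- for every $A \in \mathcal{B}$ with positive measure we have $\mu\left(\bigcup_{n=1}^{\infty} T^{-n}(A)\right) = 1$;
- for every two sets $A, B \in \mathcal{B}$ of positive measure, there exists $n > 0$ such that $\mu\left(T^{-n}(A) \cap B\right) > 0$;
- every measurable function $f: X \to \mathbb{R}$ with $f \circ T = f$ is constant on a subset of full measure.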
Importantly for applications, the condition in the last characterisation can be restricted to square-integrable functions only:
- If $f \in L^2(X, \mu)$ and $f \circ T = f$ then $f$ is constant almost everywhere.
Further examples
Bernoulli shifts and subshifts
Let $S$ be a finite set and $X = S^{\mathbb{Z}}$ with $\mu$ the product measure (each factor $S$ being endowed with its counting measure). Then the shift operator $T$ defined by $T\left((s_k)_{k \in \mathbb{Z}}\right) = (s_{k+1})_{k \in \mathbb{Z}}$ is $\mu$-ergodic.[18]
There are many more ergodic measures for the shift map $T$ on $X$. Periodic sequences give finitely supported measures. More interestingly, there are infinitely-supported ones which are subshifts of finite type.
Irrational rotations
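Let $X$ be the unit circle $\{z \in \mathbb{C} : |z| = 1\}$, with its Lebesgue measure $\mu$. For any $\theta \in \mathbb{R}$ the rotation of $X$ of angle $\theta$ is given by $T_\theta(z) = e^{2i\pi\theta} z$. If $\theta \in \mathbb{Q}$ then $T_\theta$ is not ergodic for the Lebesgue measure as it has infinitely many finite orbits. On the other hand, if $\theta$ is irrational then $T_\theta$ is ergodic.[19]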
Arnold's cat map
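Let $X = \mathbb{R}^2 / \mathbb{Z}^2$ be the 2-torus. Then any element $g \in \mathrm{SL}_2(\mathbb{Z})$ defines a self-map of $X$ since $g(\mathbb{Z}^2) = \mathbb{Z}^2$. When $g = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ one obtains the so-called Arnold's cat map, which is ergodic for the Lebesgue measure on the torus.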
Ergodic theorems
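If $\mu$ is a probability measure on a space $X$ which is ergodic for a transformation $T$, the pointwise ergodic theorem of G. Birkhoff states that for every integrable function $f: X \to \mathbb{R}$ and for $\mu$-almost every point $x \in X$ the time average on the orbit of $x$ converges to the space average of $f$. Formally this means that
$$\lim_{k \to +\infty} \frac{1}{k+1} \sum_{i=0}^{k} f\left(T^i(x)\right) = \int_X f \, d\mu.$$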
The mean ergodic theorem of J. von Neumann is a similar, weaker statement about averaged translates of square-integrable functions.
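The convergence of time averages to the space average can be observed numerically for circle rotations. The following is a sketch under stated assumptions: the circle is parametrised as $[0, 1)$ with addition modulo 1, the observable $f(x) = \cos^2(2\pi x)$ (space average 1/2) and the golden-ratio angle are choices made here for illustration, and the function names are hypothetical.

```python
import math

def time_average(f, theta, x0, n):
    """Average of f over the first n points of the orbit x0, x0 + theta, ... (mod 1)."""
    return sum(f((x0 + k * theta) % 1.0) for k in range(n)) / n

def f(x):
    """Observable on the circle R/Z; its space average over [0, 1) is 1/2."""
    return math.cos(2 * math.pi * x) ** 2

golden = (math.sqrt(5) - 1) / 2                       # an irrational angle
irrational_avg = time_average(f, golden, 0.0, 100_000)
rational_avg = time_average(f, 0.5, 0.0, 100_000)     # orbit is {0, 1/2} only

print(irrational_avg)   # close to 0.5, the space average
print(rational_avg)     # close to 1.0, not the space average
```

For the rational angle the orbit of the starting point is finite, so the time average only sees two points of the circle and need not agree with the integral of $f$.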
Related properties
Dense orbits
An immediate consequence of the definition of ergodicity is that on a topological space $X$, if $\mathcal{B}$ is the σ-algebra of Borel sets and $T$ is $\mu$-ergodic, then $\mu$-almost every orbit of $T$ is dense in the support of $\mu$.
This is not an equivalence since for a transformation which is not uniquely ergodic, but for which there is an ergodic measure with full support $\mu_0$, for any other ergodic measure $\mu_1$ the measure $\frac{1}{2}(\mu_0 + \mu_1)$ is not ergodic for $T$ but its orbits are dense in the support. Explicit examples can be constructed with shift-invariant measures.[20]
Mixing
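A transformation $T$ of a probability measure space $(X, \mu)$ is said to be mixing for the measure $\mu$ if for any measurable sets $A, B \subset X$ the following holds:
$$\lim_{n \to +\infty} \mu\left(T^{-n}A \cap B\right) = \mu(A)\,\mu(B)$$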
It is immediate that a mixing transformation is also ergodic (taking $A$ to be a $T$-stable subset and $B$ its complement). The converse is not true: for example, a rotation with irrational angle on the circle (which is ergodic per the examples above) is not mixing (for a sufficiently small interval, its successive images will not intersect the interval itself most of the time). Bernoulli shifts are mixing, and so is Arnold's cat map.
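The failure of mixing for an irrational rotation can be checked by computing the overlap between a short arc and its rotated copies exactly. This is a minimal sketch, assuming the circle is parametrised as $[0, 1)$, the arc is $A = [0, 0.1)$ and the angle is the golden ratio conjugate; these choices and the function name are illustrative, not part of the text above.

```python
import math

def overlap(theta, length, n):
    """Lebesgue measure of T^{-n}(A) ∩ A for the arc A = [0, length) on R/Z,
    where T is rotation by theta and length < 1/2."""
    shift = (n * theta) % 1.0
    dist = min(shift, 1.0 - shift)      # circular distance between the two arcs
    return max(0.0, length - dist)

golden = (math.sqrt(5) - 1) / 2
length = 0.1
values = [overlap(golden, length, n) for n in range(1, 1001)]
disjoint = sum(1 for v in values if v == 0.0) / len(values)

print(disjoint)           # roughly 0.8: most rotated copies miss A entirely
print(length * length)    # mixing would require convergence to mu(A)^2 = 0.01
```

The overlaps keep returning to values near 0.1 (whenever the rotated arc comes back close to its starting position) and are zero most of the time, so they cannot converge to 0.01.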
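This notion of mixing is sometimes called strong mixing, as opposed to weak mixing which means that
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left| \mu\left(T^{-k}A \cap B\right) - \mu(A)\,\mu(B) \right| = 0$$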
Proper ergodicity
The transformation $T$ is said to be properly ergodic if it does not have an orbit of full measure. In the discrete case this means that the measure $\mu$ is not supported on a finite orbit of $T$.
Definition for continuous-time dynamical systems
The definition is essentially the same for
- For any $A \in \mathcal{B}$, if for all $t \in \mathbb{R}_+$ we have $T_t^{-1}(A) \subset A$ then either $\mu(A) = 0$ or $\mu(A) = 1$.
Examples
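As in the discrete case the simplest example is that of a transitive action, for instance the action on the circle given by $T_t(z) = e^{2i\pi t} z$ is ergodic for Lebesgue measure.
An example with infinitely many orbits is given by the flow along an irrational slope on the torus: let $X = \mathbb{S}^1 \times \mathbb{S}^1$ and $\alpha \in \mathbb{R}$. Let $T_t(z_1, z_2) = \left(e^{2i\pi t} z_1, e^{2\alpha i\pi t} z_2\right)$; then if $\alpha \notin \mathbb{Q}$ this is ergodic for the Lebesgue measure.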
Ergodic flows
Further examples of ergodic flows are:
- billiards in convex Euclidean domains;
- the geodesic flow of a negatively curved Riemannian manifold of finite volume (ergodic for the normalised volume measure);
- the horocycle flow on a hyperbolic manifold of finite volume (ergodic for the normalised volume measure).
Ergodicity in compact metric spaces
If $X$ is a compact metric space it is naturally endowed with the σ-algebra of Borel sets. The additional structure coming from the topology then allows a much more detailed theory for ergodic transformations and measures on $X$.
Functional analysis interpretation
A very powerful alternate definition of ergodic measures can be given using the theory of Banach spaces. Radon measures on $X$ form a Banach space of which the set $\mathcal{P}(X)$ of probability measures on $X$ is a convex subset. Given a continuous transformation $T$ of $X$, the subset $\mathcal{P}(X)^T$ of $T$-invariant measures is a closed convex subset, and a measure is ergodic for $T$ if and only if it is an extreme point of this convex set.[21]
Existence of ergodic measures
In the setting above it follows from the
Ergodic decomposition
In general an invariant measure need not be ergodic, but as a consequence of
Example
In the case of $X = \{1, \ldots, n\}$ and $T = (1\, 2)(3\, 4\, \cdots\, n)$ the counting measure is not ergodic. The ergodic measures for $T$ are the
Continuous systems
Everything in this section transfers verbatim to continuous actions of $\mathbb{R}$ or $\mathbb{R}_+$ on compact metric spaces.
Unique ergodicity
The transformation $T$ is said to be uniquely ergodic if there is a unique Borel probability measure $\mu$ on $X$ which is ergodic for $T$.
In the examples considered above, irrational rotations of the circle are uniquely ergodic;[23] shift maps are not.
Probabilistic interpretation: ergodic processes
If $(X_n)_{n \ge 1}$ is a discrete-time stochastic process on a space $\Omega$, it is said to be ergodic if the
The simplest case is that of an
A similar interpretation holds for continuous-time stochastic processes though the construction of the measurable structure of the action is more complicated.
Ergodicity of Markov chains
The dynamical system associated with a Markov chain
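Let $S$ be a finite set. A Markov chain on $S$ is defined by a matrix $P \in [0, 1]^{S \times S}$, where $P(s_1, s_2)$ is the transition probability from $s_1$ to $s_2$, so for every $s \in S$ we have $\sum_{s' \in S} P(s, s') = 1$. A stationary measure for $P$ is a probability measure $\nu$ on $S$ such that $\nu P = \nu$; that is, $\sum_{s' \in S} \nu(s') P(s', s) = \nu(s)$ for all $s \in S$.
Using this data we can define a probability measure $\mu_\nu$ on the set $X = S^{\mathbb{Z}}$ with its product σ-algebra by giving the measures of the cylinders as follows:
$$\mu_\nu\left(\cdots \times S \times \{(s_n, \ldots, s_m)\} \times S \times \cdots\right) = \nu(s_n)\, P(s_n, s_{n+1}) \cdots P(s_{m-1}, s_m)$$
Stationarity of $\nu$ then means that the measure $\mu_\nu$ is invariant under the shift map $T\left((s_k)_{k \in \mathbb{Z}}\right) = (s_{k+1})_{k \in \mathbb{Z}}$.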
Criterion for ergodicity
The measure $\mu_\nu$ is always ergodic for the shift map if the associated Markov chain is irreducible (any state can be reached with positive probability from any other state in a finite number of steps).[24]
The hypotheses above imply that there is a unique stationary measure for the Markov chain. In terms of the matrix $P$, a sufficient condition for this is that 1 be a simple eigenvalue of the matrix $P$ and all other eigenvalues of $P$ (in $\mathbb{C}$) are of modulus $< 1$.
Note that in probability theory the Markov chain is called ergodic if in addition each state is aperiodic (the times where the return probability is positive are not multiples of a single integer >1). This is not necessary for the invariant measure to be ergodic; hence the notions of "ergodicity" for a Markov chain and the associated shift-invariant measure are different (the one for the chain is strictly stronger).[25]
Moreover the criterion is an "if and only if" if all communicating classes in the chain are recurrent and we consider all stationary measures.
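The irreducibility criterion and the notion of stationary measure can be explored on small transition matrices. The sketch below makes several assumptions for illustration: states are indexed $0, \ldots, n-1$, the matrix is a list of rows, and a stationary measure is approximated by averaging the distributions obtained from a chain started in state 0 (an averaging that also settles down for periodic chains). The function names are hypothetical.

```python
def is_irreducible(P):
    """True if every state can reach every other state with positive probability."""
    n = len(P)
    for start in range(n):
        reached, frontier = {start}, [start]
        while frontier:
            s = frontier.pop()
            for t in range(n):
                if P[s][t] > 0 and t not in reached:
                    reached.add(t)
                    frontier.append(t)
        if len(reached) < n:
            return False
    return True

def stationary(P, steps=10_000):
    """Approximate a stationary measure nu (nu P = nu) by averaging the
    distributions delta_0, delta_0 P, delta_0 P^2, ... over `steps` steps."""
    n = len(P)
    current = [1.0] + [0.0] * (n - 1)          # chain started in state 0
    average = [0.0] * n
    for _ in range(steps):
        average = [a + c / steps for a, c in zip(average, current)]
        current = [sum(current[s] * P[s][t] for s in range(n)) for t in range(n)]
    return average

periodic = [[0.0, 1.0], [1.0, 0.0]]     # irreducible but periodic
identity = [[1.0, 0.0], [0.0, 1.0]]     # two recurrent classes, not irreducible
print(is_irreducible(periodic), is_irreducible(identity))   # True False
print(stationary(periodic))                                  # close to [0.5, 0.5]
```

For the identity matrix the same averaging simply returns the starting point mass, one of the many stationary measures of a chain that is not irreducible.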
Examples
Counting measure
If $P(s, s') = 1/|S|$ for all $s, s' \in S$ then the stationary measure is the counting measure, and the measure $\mu_\nu$ is the product of counting measures. The Markov chain is ergodic, so the shift example from above is a special case of the criterion.
Non-ergodic Markov chains
Markov chains with recurrent communicating classes which are not irreducible are not ergodic, and this can be seen immediately as follows. If $S_1, S_2 \subsetneq S$ are two distinct recurrent communicating classes there are nonzero stationary measures $\nu_1, \nu_2$ supported on $S_1, S_2$ respectively, and the subsets $S_1^{\mathbb{Z}}$ and $S_2^{\mathbb{Z}}$ are both shift-invariant and of measure 1/2 for the invariant probability measure $\frac{1}{2}(\nu_1 + \nu_2)$. A very simple example of that is the chain on $S = \{1, 2\}$ given by the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ (both states are stationary).
A periodic chain
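The Markov chain on $S = \{1, 2\}$ given by the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is irreducible but periodic. Thus it is not ergodic in the sense of Markov chains, though the associated measure $\mu$ on $\{1, 2\}^{\mathbb{Z}}$ is ergodic for the shift map. However the shift is not mixing for this measure, as for the sets
$$A = \cdots \times \{1, 2\} \times 1 \times \{1, 2\} \times 1 \times \{1, 2\} \times \cdots$$
and
$$B = \cdots \times \{1, 2\} \times 2 \times \{1, 2\} \times 2 \times \{1, 2\} \times \cdots$$
we have $\mu(A) = \frac{1}{2} = \mu(B)$ but
$$\mu\left(T^{-n}A \cap B\right) = \begin{cases} \frac{1}{2} & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even.} \end{cases}$$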
Generalisations
The definition of ergodicity also makes sense for group actions. The classical theory (for invertible transformations) corresponds to actions of $\mathbb{Z}$ or $\mathbb{R}$.
For non-abelian groups there might not be invariant measures even on compact metric spaces. However the definition of ergodicity carries over unchanged if one replaces invariant measures by quasi-invariant measures.
Important examples are the action of a
A measurable equivalence relation is said to be ergodic if all saturated subsets are either null or conull.
Notes
- ^ Achim Klenke, "Probability Theory: A Comprehensive Course" (2013) Springer Universitext ISBN 978-1-4471-5360-3 DOI 10.1007/978-1-4471-5361-0 (See Chapter One)
- ^ Walters 1982, §0.1, p. 2
- S2CID 17605281.
- ISBN 978-81-265-1806-7.
- .
- ISBN 978-0-521-02715-1.
- .
- S2CID 250793219.
- S2CID 120635994.
- OCLC 1034625177.
- S2CID 208248295.
- PMID 27892510.
- S2CID 119083672.
- S2CID 51693305.
- ISBN 978-952-03-1699-0.
- S2CID 256706206.
- )
- ^ Walters 1982, p. 32.
- ^ Walters 1982, p. 29.
- ^ "Example of a measure-preserving system with dense orbits that is not ergodic". MathOverflow. September 1, 2011. Retrieved May 16, 2020.
- ^ Walters 1982, p. 152.
- ^ Walters 1982, p. 153.
- ^ Walters 1982, p. 159.
- ^ Walters 1982, p. 42.
- ^ "Different uses of the word "ergodic"". MathOverflow. September 4, 2011. Retrieved May 16, 2020.
References
- Walters, Peter (1982). An Introduction to Ergodic Theory. ISBN 0-387-95152-0.
- Brin, Michael; Stuck, Garrett (2002). Introduction to Dynamical Systems. Cambridge University Press. ISBN 0-521-80841-3.
External links
- Karma Dajani and Sjoerd Dirksin, "A Simple Introduction to Ergodic Theory"