Stochastic game

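In game theory, a stochastic game is a dynamic game with probabilistic state transitions, played by one or more players over a sequence of stages. The total payoff to a player is often taken to be the discounted sum of the stage payoffs or the limit inferior of the averages of the stage payoffs.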

Stochastic games generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic situations in which the environment changes in response to the players’ choices.^[2]

Two-player games

Stochastic two-player games on directed graphs are widely used for modeling and analysis of discrete systems operating in an unknown (adversarial) environment. Possible configurations of a system and its environment are represented as vertices, and the transitions correspond to actions of the system, its environment, or "nature". A run of the system then corresponds to an infinite path in the graph. Thus, a system and its environment can be seen as two players with antagonistic objectives, where one player (the system) aims at maximizing the probability of "good" runs, while the other player (the environment) aims at the opposite.

In many cases, there exists an equilibrium value of this probability, but optimal strategies for both players may not exist.

We introduce basic concepts and algorithmic questions studied in this area, and we mention some long-standing open problems. Then, we mention selected recent results.

Theory

The ingredients of a stochastic game are: a finite set of players $I$; a state space $S$ (either a finite set or a measurable space $(S,{\mathcal {S}})$); for each player $i\in I$, an action set $A^{i}$ (either a finite set or a measurable space $(A^{i},{\mathcal {A}}^{i})$); a transition probability $P$ from $S\times A$, where $A=\times _{i\in I}A^{i}$ is the set of action profiles, to $S$, where $P(B\mid s,a)$ is the probability that the next state is in the (measurable) set $B\subseteq S$ given the current state $s$ and the current action profile $a$; and a payoff function $g$ from $S\times A$ to $\mathbb {R} ^{I}$, where the $i$-th coordinate of $g$, $g^{i}$, is the payoff to player $i$ as a function of the state $s$ and the action profile $a$.
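For the finite case, these ingredients can be held in a small data structure. The following is a minimal sketch, not a standard API: the class name `StochasticGame`, its field names, and the prisoner's dilemma payoff numbers are all illustrative assumptions. The example encodes a repeated game as a stochastic game with a single state.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class StochasticGame:
    """Finite stochastic game; all names here are illustrative assumptions."""
    players: list      # the set I
    states: list       # the set S
    actions: dict      # player i -> list of actions A^i
    transition: dict   # (state, profile) -> {next_state: probability}
    payoff: dict       # (state, profile) -> tuple with one payoff per player

    def action_profiles(self):
        """All elements of A, the product of the action sets A^i."""
        return list(product(*(self.actions[i] for i in self.players)))

    def validate(self, tol=1e-9):
        """Check that P(. | s, a) is a distribution on S and g(s, a) is in R^I."""
        for s in self.states:
            for a in self.action_profiles():
                dist = self.transition[(s, a)]
                assert set(dist) <= set(self.states)
                assert all(p >= 0 for p in dist.values())
                assert abs(sum(dist.values()) - 1.0) <= tol
                assert len(self.payoff[(s, a)]) == len(self.players)
        return True


# Hypothetical stage payoffs of a prisoner's dilemma (C = cooperate, D = defect).
pd = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# A repeated game is the special case with one state that is never left.
game = StochasticGame(
    players=[1, 2],
    states=["s"],
    actions={1: ["C", "D"], 2: ["C", "D"]},
    transition={("s", a): {"s": 1.0} for a in pd},
    payoff={("s", a): g for a, g in pd.items()},
)
```

Storing each transition as a dictionary of next-state probabilities keeps sparse transition laws compact; a dense array indexed by state and action profile would serve equally well.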

The game starts at some initial state $s_{1}$ . At stage $t$ , players first observe $s_{t}$ , then simultaneously choose actions $a_{t}^{i}\in A^{i}$ , then observe the action profile $a_{t}=(a_{t}^{i})_{i}$ , and then nature selects $s_{t+1}$ according to the probability $P(\cdot \mid s_{t},a_{t})$ . A play of the stochastic game, $s_{1},a_{1},\ldots ,s_{t},a_{t},\ldots$ , defines a stream of payoffs $g_{1},g_{2},\ldots$ , where $g_{t}=g(s_{t},a_{t})$ .
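The stage-by-stage procedure above can be sketched as a short simulation. This is an illustration under stated assumptions rather than a reference implementation: the function `simulate_play`, the toy two-state game, and both strategies are hypothetical, and strategies are modelled as functions of the past history and the current state.

```python
import random


def simulate_play(initial_state, strategies, transition, payoff, stages, rng):
    """Return the list of (s_t, a_t, g_t) for t = 1, ..., stages."""
    history = []
    state = initial_state
    for _ in range(stages):
        # Players observe s_t and choose simultaneously: each strategy sees
        # only the past history and the current state, not the others' choices.
        profile = tuple(strategy(history, state) for strategy in strategies)
        history.append((state, profile, payoff[(state, profile)]))
        # Nature selects s_{t+1} according to P(. | s_t, a_t).
        dist = transition[(state, profile)]
        next_states = list(dist)
        weights = [dist[s] for s in next_states]
        state = rng.choices(next_states, weights=weights)[0]
    return history


# Toy zero-sum game (assumption): two states, two actions per player.
states, acts = ["x", "y"], ["a", "b"]
transition, payoff = {}, {}
for s in states:
    other = "y" if s == "x" else "x"
    for a1 in acts:
        for a2 in acts:
            stay = 0.9 if a1 == a2 else 0.2      # matching actions tend to keep the state
            transition[(s, (a1, a2))] = {s: stay, other: 1 - stay}
            win = 1.0 if a1 == a2 else -1.0      # player 1 wants to match
            payoff[(s, (a1, a2))] = (win, -win)

rng = random.Random(0)
uniform = lambda history, state: rng.choice(acts)   # player 1 mixes uniformly
always_a = lambda history, state: "a"               # player 2 always plays "a"
play = simulate_play("x", [uniform, always_a], transition, payoff, 10, rng)
stream = [g for (_, _, g) in play]                  # the payoff stream g_1, g_2, ...
```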

The discounted game $\Gamma _{\lambda }$ with discount factor $\lambda$ ($0<\lambda \leq 1$) is the game where the payoff to player $i$ is $\lambda \sum _{t=1}^{\infty }(1-\lambda )^{t-1}g_{t}^{i}$. The $n$-stage game $\Gamma _{n}$ is the game where the payoff to player $i$ is ${\bar {g}}_{n}^{i}:={\frac {1}{n}}\sum _{t=1}^{n}g_{t}^{i}$.
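Both payoff criteria can be evaluated directly on a finite prefix of one player's payoff stream. The sketch below uses hypothetical function names and truncates the infinite discounted sum at the length of the supplied list, which is an approximation. Because of the leading factor $\lambda$, a constant stream of payoff $c$ has discounted payoff close to $c$, so discounted and average payoffs are on the same scale.

```python
def discounted_payoff(stream, lam):
    """lam * sum_{t>=1} (1 - lam)^(t-1) * g_t, truncated to the given prefix."""
    # enumerate starts at 0, which plays the role of the exponent t - 1.
    return lam * sum((1 - lam) ** k * g for k, g in enumerate(stream))


def average_payoff(stream, n):
    """(1/n) * sum_{t=1}^{n} g_t, the payoff of the n-stage game."""
    return sum(stream[:n]) / n


example = [1, 0, 0, 1]
d = discounted_payoff(example, 0.5)    # 0.5 * (1 + 0.125) = 0.5625
a = average_payoff(example, 4)         # 2 / 4 = 0.5

# With lam = 1 only the first stage counts.
first_only = discounted_payoff([3, 7], 1.0)   # 3.0
```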

The value $v_{n}(s_{1})$ , respectively $v_{\lambda }(s_{1})$ , of a two-person zero-sum stochastic game $\Gamma _{n}$ , respectively $\Gamma _{\lambda }$ , with finitely many states and actions exists, and Truman Bewley and Elon Kohlberg (1976) proved that $v_{n}(s_{1})$ converges to a limit as $n$ goes to infinity and that $v_{\lambda }(s_{1})$ converges to the same limit as $\lambda$ goes to $0$ .
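For a finite two-person zero-sum game, the discounted value can be approximated numerically by repeatedly applying the recursion $v(s)\leftarrow \operatorname {val} [\lambda g(s,a,b)+(1-\lambda )\sum _{s'}P(s'\mid s,a,b)v(s')]$, where $\operatorname {val}$ is the value of a matrix game. The sketch below is illustrative only: it is restricted to two actions per player so that the matrix-game value has a closed form, the function names are hypothetical, and the example is an absorbing game in the style of the "Big Match" chosen here as an assumption. In this example the computed value of the non-absorbing state is $1/2$ for each discount factor tried.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    maxmin = max(min(a, b), min(c, d))
    minmax = min(max(a, c), max(b, d))
    if maxmin >= minmax - 1e-12:                 # pure saddle point
        return maxmin
    return (a * d - b * c) / (a + d - b - c)     # otherwise both players mix


def discounted_values(payoff, transition, lam, iterations=500):
    """Iterate v(s) <- val[lam * g + (1 - lam) * E v(next state)] from v = 0."""
    v = {s: 0.0 for s in payoff}
    for _ in range(iterations):
        v = {
            s: value_2x2([
                [lam * payoff[s][a][b]
                 + (1 - lam) * sum(p * v[t] for t, p in transition[s][a][b].items())
                 for b in range(2)]
                for a in range(2)
            ])
            for s in payoff
        }
    return v


# payoff[s][a][b] is player 1's stage payoff; transition[s][a][b] maps next states
# to probabilities. Row 0 in state "play" absorbs; row 1 continues the game.
payoff = {
    "play": [[1, 0], [0, 1]],
    "win": [[1, 1], [1, 1]],     # absorbing, payoff 1 forever
    "lose": [[0, 0], [0, 0]],    # absorbing, payoff 0 forever
}
transition = {
    "play": [[{"win": 1.0}, {"lose": 1.0}], [{"play": 1.0}, {"play": 1.0}]],
    "win": [[{"win": 1.0}] * 2] * 2,
    "lose": [[{"lose": 1.0}] * 2] * 2,
}

v_half = discounted_values(payoff, transition, lam=0.5)
v_tenth = discounted_values(payoff, transition, lam=0.1)
```

The iteration contracts at rate $1-\lambda$, so smaller discount factors need more iterations for the same accuracy. For larger action sets the closed-form `value_2x2` would have to be replaced by a linear-programming solver.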

The "undiscounted" game $\Gamma _{\infty }$ is the game where the payoff to player $i$ is the "limit" of the averages of the stage payoffs. Some precautions are needed in defining the value of a two-person zero-sum $\Gamma _{\infty }$ and in defining equilibrium payoffs of a non-zero-sum $\Gamma _{\infty }$. The uniform value $v_{\infty }$ of a two-person zero-sum stochastic game $\Gamma _{\infty }$ exists if for every $\varepsilon >0$ there is a positive integer $N$ and a strategy pair $\sigma _{\varepsilon }$ of player 1 and $\tau _{\varepsilon }$ of player 2 such that for every $\sigma$ and $\tau$ and every $n\geq N$ the expectation of ${\bar {g}}_{n}^{1}$ (the average payoff to player 1) with respect to the probability on plays defined by $\sigma _{\varepsilon }$ and $\tau$ is at least $v_{\infty }-\varepsilon$, and the expectation of ${\bar {g}}_{n}^{1}$ with respect to the probability on plays defined by $\sigma$ and $\tau _{\varepsilon }$ is at most $v_{\infty }+\varepsilon$. Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a uniform value.^[3]

If there is a finite number of players and the action sets and the set of states are finite, then a stochastic game with a finite number of stages always has a Nash equilibrium. The same is true for a game with infinitely many stages if the total payoff is the discounted sum.

The non-zero-sum stochastic game $\Gamma _{\infty }$ has a uniform equilibrium payoff $v_{\infty }$ if for every $\varepsilon >0$ there is a positive integer $N$ and a strategy profile $\sigma$ such that for every unilateral deviation by a player $i$ , i.e., a strategy profile $\tau$ with $\sigma ^{j}=\tau ^{j}$ for all $j\neq i$ , and every $n\geq N$ the expectation of ${\bar {g}}_{n}^{i}$ with respect to the probability on plays defined by $\sigma$ is at least $v_{\infty }^{i}-\varepsilon$ , and the expectation of ${\bar {g}}_{n}^{i}$ with respect to the probability on plays defined by $\tau$ is at most $v_{\infty }^{i}+\varepsilon$ . Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a uniform equilibrium payoff.^[4]

The non-zero-sum stochastic game $\Gamma _{\infty }$ has a limiting-average equilibrium payoff $v_{\infty }$ if for every $\varepsilon >0$ there is a strategy profile $\sigma$ such that for every unilateral deviation by a player $i$, i.e., a strategy profile $\tau$ with $\sigma ^{j}=\tau ^{j}$ for all $j\neq i$, the expectation of the limit inferior of the averages of the stage payoffs with respect to the probability on plays defined by $\sigma$ is at least $v_{\infty }^{i}-\varepsilon$, and the expectation of the limit superior of the averages of the stage payoffs with respect to the probability on plays defined by $\tau$ is at most $v_{\infty }^{i}+\varepsilon$. Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a limiting-average value,^[3] and Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a limiting-average equilibrium payoff.^[4] In particular, these results imply that these games have a value and an approximate equilibrium payoff, called the liminf-average (respectively, the limsup-average) equilibrium payoff, when the total payoff is the limit inferior (respectively, the limit superior) of the averages of the stage payoffs.
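The distinction between limit inferior and limit superior matters because the averages of a bounded payoff stream need not converge. As a hedged illustration, the hypothetical stream below alternates blocks of payoffs 1 and 0 whose lengths double each time; its running averages keep oscillating between roughly $1/3$ and $2/3$, so the liminf-average payoff is $1/3$ while the limsup-average payoff is $2/3$.

```python
def running_averages(stream):
    """Return the averages (g_1 + ... + g_n) / n for n = 1, 2, ..."""
    total, averages = 0, []
    for n, g in enumerate(stream, start=1):
        total += g
        averages.append(total / n)
    return averages


# Block k has length 2**k and payoff 1 if k is even, 0 if k is odd:
# 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, ...
stream = []
for k in range(17):
    stream.extend([1 if k % 2 == 0 else 0] * 2 ** k)

averages = running_averages(stream)
tail = averages[len(averages) // 4:]    # drop the early transient
low, high = min(tail), max(tail)        # close to 1/3 and 2/3 respectively
```

On a finite prefix the minimum and maximum over a late tail only approximate the limit inferior and limit superior; the exact values follow from summing the geometric block lengths.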

Whether every stochastic game with finitely many players, states, and actions, has a uniform equilibrium payoff, or a limiting-average equilibrium payoff, or even a liminf-average equilibrium payoff, is a challenging open question.

A Markov perfect equilibrium is a refinement of the concept of sub-game perfect Nash equilibrium to stochastic games.

Stochastic games have been combined with Bayesian games to model uncertainty over player strategies.^[5] The resulting stochastic Bayesian game model is solved via a recursive combination of the Bayesian Nash equilibrium equation and the Bellman optimality equation.

Applications

Stochastic games have applications in economics, evolutionary biology and computer networks.^[6]^[7] They generalize repeated games, which correspond to the special case where there is only one state.

Notes

[1] PMID 16589380.

[2] PMID 26556883.

[3] S2CID 189830419.

[4] ISBN 0-444-88098-4.

[5] S2CID 2599762.

[6] Constrained Stochastic Games in Wireless Networks by E. Altman, K. Avratchenkov, N. Bonneau, M. Debbah, R. El-Azouzi, D. S. Menasche.

[7] S2CID 16055840.

Further reading

Filar, J. & Vrieze, K. (1997). Competitive Markov Decision Processes. Springer-Verlag. ISBN 0-387-94805-8.

Neyman, A. & Sorin, S. (2003). Stochastic Games and Applications. Dordrecht: Kluwer Academic Press. ISBN 1-4020-1492-9.

Shoham, Y. & Leyton-Brown, K. (2009). Multiagent systems: algorithmic, game-theoretic, and logical foundations. Cambridge University Press. pp. 153–156. ISBN 978-0-521-89943-7. (suitable for undergraduates; main results, no proofs)

External links

Lecture on Stochastic Two-Player Games by Antonin Kucera

