AlphaZero

AlphaZero is a computer program developed by DeepMind to master the games of chess, shogi and Go. The algorithm uses an approach similar to AlphaGo Zero.

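On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero. After four hours of training, DeepMind estimated that AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws).^[1]^[2]^[3]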

The trained algorithm played on a single machine with four TPUs.

DeepMind's paper on AlphaZero was published in the journal Science on 7 December 2018;^[4] however, the AlphaZero program itself has not been made available to the public.^[5] In 2019, DeepMind published a new paper detailing MuZero, a new algorithm able to generalise AlphaZero's work, playing both Atari and board games without knowledge of the rules or representations of the game.^[6]

Relation to AlphaGo Zero

AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm, and is able to play shogi and chess as well as Go. Differences between AZ and AGZ include:^[1]

AZ has hard-coded rules for setting search hyperparameters.
The neural network is now updated continually.
AZ doesn't use symmetries, unlike AGZ.
Chess and shogi can end in a draw, unlike Go; therefore, AlphaZero takes into account the possibility of a drawn game.
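
The symmetry item in the list above can be made concrete. A Go position is equivalent under the eight rotations and reflections of the square board, which is the kind of symmetry AGZ could exploit; chess and shogi positions are not, because pawns move in one direction only. The following is a minimal Python sketch, not DeepMind's code, that enumerates those eight transforms for a board stored as a list of rows. The function names are hypothetical and chosen only for illustration.

```python
def rotate90(board):
    """Rotate a square board (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def mirror(board):
    """Reflect a board left to right."""
    return [row[::-1] for row in board]


def dihedral_symmetries(board):
    """Return the 8 rotations and reflections of a square board."""
    transforms = []
    current = [list(row) for row in board]
    for _ in range(4):
        transforms.append(current)          # rotation by 0, 90, 180, 270 degrees
        transforms.append(mirror(current))  # the same rotation, then mirrored
        current = rotate90(current)
    return transforms


if __name__ == "__main__":
    for variant in dihedral_symmetries([[1, 2], [3, 4]]):
        print(variant)
```

An algorithm that assumes these symmetries can treat each position as eight equivalent training examples; one that does not, like AZ, uses each position only as played.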

Stockfish and Elmo

In its Monte Carlo tree search, AlphaZero examines just 80,000 positions per second in chess and 40,000 in shogi, compared to 70 million for Stockfish and 35 million for Elmo. AlphaZero compensates for the lower number of evaluations by using its deep neural network to focus much more selectively on the most promising variations.^[1]
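
As a quick check on the size of that gap, the short Python sketch below divides the quoted search rates. It uses only the figures given in the paragraph above; the dictionary layout and function name are illustrative assumptions.

```python
# Positions searched per second, as quoted in the text above.
POSITIONS_PER_SECOND = {
    "chess": {"AlphaZero": 80_000, "Stockfish": 70_000_000},
    "shogi": {"AlphaZero": 40_000, "Elmo": 35_000_000},
}


def search_ratio(game):
    """How many times more positions per second the conventional engine searches."""
    rates = POSITIONS_PER_SECOND[game]
    alphazero = rates["AlphaZero"]
    opponent = max(rate for name, rate in rates.items() if name != "AlphaZero")
    return opponent / alphazero


if __name__ == "__main__":
    for game in POSITIONS_PER_SECOND:
        print(f"{game}: {search_ratio(game):.0f}x")
```

Both ratios come to 875, which is on the order of the "thousand times fewer positions" that DeepMind cites.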

Training

AlphaZero was trained solely via self-play, using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks. In parallel, the in-training AlphaZero was periodically matched against its benchmark (Stockfish, Elmo, or AlphaGo Zero) in brief one-second-per-move games to determine how well the training was progressing. DeepMind judged that AlphaZero's performance exceeded the benchmark after around four hours of training for Stockfish, two hours for Elmo, and eight hours for AlphaGo Zero.^[1]

Preliminary results

Outcome

Chess

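In AlphaZero's chess match against Stockfish 8 (2016 TCEC world champion), each program was given one minute per move. AlphaZero had been trained on chess for a total of nine hours before the match, and during the match it ran on a single machine with four TPUs. In 100 games from the normal starting position, AlphaZero won 25 games as White, won 3 as Black, and drew the remaining 72.^[9] In a series of twelve 100-game matches (of unspecified time or resource constraints) against Stockfish starting from the 12 most popular human openings, AlphaZero won 290, drew 886 and lost 24.^[1]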

Shogi

AlphaZero was trained on shogi for a total of two hours before the tournament. In 100 shogi games against Elmo (World Computer Shogi Championship 27 summer 2017 tournament version with YaneuraOu 4.73 search), AlphaZero won 90 times, lost 8 times and drew twice.^[9] As in the chess games, each program got one minute per move, and Elmo was given 64 threads and a hash size of 1 GB.^[1]

Go

AlphaZero was trained on Go for a total of 34 hours of self-play. Against AlphaGo Zero, it then won 60 games and lost 40.^[1]^[9]

Analysis

DeepMind stated in its preprint, "The game of chess represented the pinnacle of AI research over several decades. State-of-the-art programs are based on powerful engines that search many millions of positions, leveraging handcrafted domain expertise and sophisticated domain adaptations. AlphaZero is a generic reinforcement learning algorithm – originally devised for the game of go – that achieved superior results within a few hours, searching a thousand times fewer positions, given no domain knowledge except the rules."^[1] DeepMind's Demis Hassabis, a chess player himself, called AlphaZero's play style "alien": It sometimes wins by offering counterintuitive sacrifices, like offering up a queen and bishop to exploit a positional advantage. "It's like chess from another dimension."^[10]

Given the difficulty in chess of forcing a win against a strong opponent, the +28 –0 =72 result is a significant margin of victory. However, some grandmasters, such as Hikaru Nakamura and Komodo developer Larry Kaufman, downplayed AlphaZero's victory, arguing that the match would have been closer if the programs had access to an opening database (since Stockfish was optimized for that scenario).^[11] Stockfish developer Tord Romstad additionally pointed out that Stockfish is not optimized for rigidly fixed-time moves and that the version used was a year old.^[8]^[12]
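
To put the +28 –0 =72 margin in rating terms, the Python sketch below converts a match score into the rating gap implied by the standard logistic Elo model. This is a generic illustration rather than a calculation taken from DeepMind's paper, and the function names are hypothetical; a single 100-game match also leaves considerable statistical uncertainty that the sketch does not model.

```python
import math


def score_fraction(wins, draws, losses):
    """Match score as a fraction: a win counts 1, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Rating gap implied by an expected score under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    score = score_fraction(wins=28, draws=72, losses=0)
    print(f"score = {score:.2f}, implied Elo gap = {elo_difference(score):.0f}")
```

Under these assumptions the 100-game result corresponds to a 64% score, or a gap of roughly 100 Elo points under the match conditions used.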

Similarly, some shogi observers argued that the Elmo hash size was too low, that the resignation settings and the "EnteringKingRule" settings (cf. shogi § Entering King) may have been inappropriate, and that Elmo is already obsolete compared with newer programs.^[13]^[14]

Reaction and criticism

Newspapers headlined that the chess training took only four hours: "It was managed in little more than the time between breakfast and lunch."^[2]^[15] Wired described AlphaZero as "the first multi-skilled AI board-game champ".^[16] AI expert Joanna Bryson noted that Google's "knack for good publicity" was putting it in a strong position against challengers. "It's not only about hiring the best programmers. It's also very political, as it helps make Google as strong as possible when negotiating with governments and regulators looking at the AI sector."^[9]

Human chess grandmasters generally expressed excitement about AlphaZero. Former world champion Garry Kasparov said, "It's a remarkable achievement, even if we should have expected it after AlphaGo."^[11]^[17]

Grandmaster Hikaru Nakamura was less impressed, stating: "I don't necessarily put a lot of credibility in the results simply because my understanding is that AlphaZero is basically using the Google supercomputer and Stockfish doesn't run on that hardware; Stockfish was basically running on what would be my laptop. If you wanna have a match that's comparable you have to have Stockfish running on a supercomputer as well."^[8]

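Top US correspondence chess player Wolff Morrow was also unimpressed, claiming that AlphaZero would probably not make the semifinals of a fair competition such as TCEC, where all engines play on equal hardware. Morrow further stated that although he might not be able to beat AlphaZero if it played drawish openings such as the Petroff Defence, AlphaZero would not be able to beat him in a correspondence chess game either.^[18]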

Motohiro Isozaki, the author of YaneuraOu, noted that although AlphaZero did comprehensively beat Elmo, AlphaZero's shogi rating stopped growing at a point at most 100–200 points higher than Elmo's. This gap is not large, and Elmo and other shogi software should be able to catch up in 1–2 years.^[19]

Final results

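DeepMind addressed many of the criticisms in their final version of the paper, published in December 2018 in Science. They further clarified that AlphaZero was not running on a supercomputer; it was trained using 5,000 tensor processing units (TPUs), but only ran on four TPUs and a 44-core CPU in its matches.^[20]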

Chess

In the final results, Stockfish 9 dev ran under the same conditions as in the TCEC superfinal: 44 CPU cores, Syzygy endgame tablebases, and a 32 GB hash size. Instead of a fixed time control of one minute per move, both engines were given 3 hours plus 15 seconds per move to finish the game. In a 1000-game match, AlphaZero won with a score of 155 wins, 6 losses, and 839 draws. DeepMind also played a series of games using the TCEC opening positions, which AlphaZero also won convincingly. Stockfish needed 10-to-1 time odds to match AlphaZero.^[21]

Shogi

Similar to Stockfish, Elmo ran under the same conditions as in the 2017 CSA championship. The version of Elmo used was WCSC27 in combination with YaneuraOu 2017 Early KPPT 4.79 64AVX2 TOURNAMENT. Elmo operated on the same hardware as Stockfish: 44 CPU cores and a 32GB hash size. AlphaZero won 98.2% of games when playing sente (i.e. having the first move) and 91.2% overall.

Reactions and criticisms

Human grandmasters were generally impressed with AlphaZero's games against Stockfish.^[21] Former world champion Garry Kasparov said it was a pleasure to watch AlphaZero play, especially since its style was open and dynamic like his own.^[22]^[23]
In the computer chess community,
alpha–beta search.^[24]

AlphaZero inspired the computer chess community to develop Leela Chess Zero, using the same techniques as AlphaZero. Leela contested several championships against Stockfish, where it showed roughly similar strength to Stockfish, although Stockfish has since pulled away.^[25]
In 2019 DeepMind published MuZero, a unified system that played excellent chess, shogi, and go, as well as games in the Atari Learning Environment, without being pre-programmed with their rules.^[26]^[27]

See also

AlphaGo

AlphaDev

AlphaFold

AlphaGeometry

General game playing
MuZero

Leela Chess Zero

Pluribus (poker bot)

Notes

^ Stockfish developer Tord Romstad responded with
The match results by themselves are not particularly meaningful because of the rather strange choice of time controls and Stockfish parameter settings: The games were played at a fixed time of 1 minute/move, which means that Stockfish has no use of its time management heuristics (lot of effort has been put into making Stockfish identify critical points in the game and decide when to spend some extra time on a move; at a fixed time per move, the strength will suffer significantly). The version of Stockfish used is one year old, was playing with far more search threads than has ever received any significant amount of testing, and had way too small hash tables for the number of threads. I believe the percentage of draws would have been much higher in a match with more normal conditions.^[8]

References

^ arXiv:1712.01815 [cs.AI].

^ Telegraph.co.uk. Retrieved December 6, 2017.

^ Vincent, James (December 6, 2017). "DeepMind's AI became a superhuman chess player in a few hours, just for fun". The Verge. Retrieved December 6, 2017.

^ PMID 30523106.

^ "Chess Terms: AlphaZero". Chess.com. Retrieved July 30, 2022.

^ S2CID 208158225.

^ "AlphaZero vs. Stockfish 2017".

^ ^a ^b ^c ^d "AlphaZero: Reactions From Top GMs, Stockfish Author". chess.com. December 8, 2017. Retrieved December 9, 2017.

^ ^a ^b ^c ^d ^e "'Superhuman' Google AI claims chess crown". BBC News. December 6, 2017. Retrieved December 7, 2017.

^ Knight, Will (December 8, 2017). "Alpha Zero's "Alien" Chess Shows the Power, and the Peculiarity, of AI". MIT Technology Review. Retrieved December 11, 2017.

^ ^a ^b "Google's AlphaZero Destroys Stockfish In 100-Game Match". Chess.com. Retrieved December 7, 2017.

^ Katyanna Quach. "DeepMind's AlphaZero AI clobbered rival chess app on non-level playing...board". The Register (December 14, 2017).

^ "Some concerns on the matching conditions between AlphaZero and Shogi engine". コンピュータ将棋レーティング. "uuunuuun" (a blogger who rates free shogi engines). Retrieved December 9, 2017. (via "瀧澤誠@elmo (@mktakizawa) | Twitter". mktakizawa (elmo developer). December 9, 2017. Retrieved December 11, 2017.)

^ "DeepMind社がやねうら王に注目し始めたようです". The developer of YaneuraOu, a search component used by elmo. December 7, 2017. Retrieved December 9, 2017.

^ The Times of London. Retrieved December 7, 2017.

^ "Alphabet's Latest AI Show Pony Has More Than One Trick". WIRED. December 6, 2017. Retrieved December 7, 2017.

^ Gibbs, Samuel (December 7, 2017). "AlphaZero AI beats champion chess program after teaching itself in four hours". The Guardian. Retrieved December 8, 2017.

^ "Talking modern correspondence chess". Chessbase. June 26, 2018. Retrieved July 11, 2018.

^ DeepMind社がやねうら王に注目し始めたようです | やねうら王公式サイト, 2017年12月7日

^ As given in the Science paper, a TPU is "roughly similar in inference speed to a Titan V GPU, although the architectures are not directly comparable" (Ref. 24).

^ ^a ^b "AlphaZero Crushes Stockfish In New 1,000-Game Match". December 6, 2018.

^ Sean Ingle (December 11, 2018). "'Creative' AlphaZero leads way for chess computers and, maybe, science". The Guardian.

^ Albert Silver (December 7, 2018). "Inside the (deep) mind of AlphaZero". Chessbase.

^ "Komodo MCTS (Monte Carlo Tree Search) is the new star of TCEC". Chessdom. December 18, 2018.

^ TCEC and Leela Chess Zero.

^ "Could Artificial Intelligence Save Us From Itself?". Fortune. 2019. Retrieved February 29, 2020.

^ "DeepMind's MuZero teaches itself how to win at Atari, chess, shogi, and Go". VentureBeat. November 20, 2019. Retrieved February 29, 2020.

External links

Chessprogramming wiki on AlphaZero

Chess.com Youtube playlist for AlphaZero vs. Stockfish


