Other game engines rely on handcrafted rules written by professional players; AlphaZero was given only the basic rules of each game. By playing millions of games against itself through trial and error, AlphaZero was able to learn chess, shogi, and go without opening books or endgame tables.
From the DeepMind website: “At first, it plays completely randomly, but over time the system learns from wins, losses, and draws to adjust the parameters of the neural network, making it more likely to choose advantageous moves in the future.”
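The learning loop DeepMind describes can be illustrated on a much smaller scale. The sketch below is not AlphaZero: it swaps the neural network and search for a plain lookup table and uses a toy stone-taking game (players alternately remove one to three stones, and whoever takes the last stone wins). The game, the names `legal_moves` and `train`, and every parameter value are assumptions chosen for illustration. What it does share with the description above is the outline: play starts out random, and each win or loss nudges the stored value of the moves that led to it.

```python
import random
from collections import defaultdict


def legal_moves(pile):
    """A player may take 1, 2, or 3 stones, but never more than remain."""
    return [n for n in (1, 2, 3) if n <= pile]


def train(games=5000, start_pile=10, explore=0.2, seed=0):
    """Learn move values purely from self-play outcomes (toy illustration)."""
    rng = random.Random(seed)
    value = defaultdict(float)  # (pile, move) -> average outcome for the mover
    visits = defaultdict(int)   # (pile, move) -> how often it has been played

    for _ in range(games):
        pile, player = start_pile, 0
        history = {0: [], 1: []}
        winner = None
        while pile > 0:
            moves = legal_moves(pile)
            rng.shuffle(moves)  # ties are broken at random, so early play is random
            if rng.random() < explore:
                move = moves[0]  # occasional exploratory move
            else:
                move = max(moves, key=lambda m: value[(pile, m)])
            history[player].append((pile, move))
            pile -= move
            winner = player  # whoever moved last took the final stone
            player = 1 - player

        # Adjust the table from the result: +1 for the winner's moves,
        # -1 for the loser's, kept as a running average per (pile, move).
        for p in (0, 1):
            outcome = 1.0 if p == winner else -1.0
            for key in history[p]:
                visits[key] += 1
                value[key] += (outcome - value[key]) / visits[key]
    return value


if __name__ == "__main__":
    table = train()
    for pile in range(1, 11):
        best = max(legal_moves(pile), key=lambda m: table[(pile, m)])
        print(f"pile {pile}: preferred move is to take {best}")
```

In this toy game the known winning idea is to leave the opponent a multiple of four stones, which gives a yardstick for judging the table the program arrives at on its own.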
It took AlphaZero nine hours to learn chess, twelve hours to learn shogi, and thirteen days to learn go, highlighting the complexity of the ancient Chinese game.
AlphaZero is the successor to AlphaGo, the artificial intelligence developed by DeepMind that became the first computer program to beat a human professional go player of the highest rank. In the second game of a five-game match, AlphaGo played a highly unorthodox move that went against centuries of conventional wisdom. That move, move 37, set the program up to win the game later on.
Because AlphaZero learns for itself, free of human input, it is able to show modern human players what is possible.