Have you heard that the newest champion of the board game Go is an AI?
In 2015, the neural network known as AlphaGo defeated the reigning European champion 5-0 in a five-game match. In 2016, it went on to defeat Lee Sedol, the decade’s greatest player, 4-1, earning the network an honorary 9 dan rating, the highest a player can receive. How did it do this?
According to DeepMind, a Google subsidiary that specializes in machine learning and artificial intelligence, the current reigning model, AlphaGo Zero, defied conventional training methods: instead of analyzing past professional and amateur games, it played against itself, learning from its own errors as it gradually discovered stronger moves. Just as we examine a game board to find the move that best improves our chances of winning, so too must the computer. That idea is the basis of reinforcement learning and of one of its major algorithms, Deep Q-Networks.
Deep Q-Networks use a reward-based system to train neural networks. At first, the algorithm has no prior knowledge of the game being played and moves erratically, like pressing all the buttons in a fighting game. When it collects the game’s currency, such as pellets in Pac-Man, or survives for a certain amount of time, it receives a positive reward; when it makes a mistake, such as losing a life, it receives a negative one. From these rewards the network learns a Q value for each possible action: an estimate of the total reward it can expect to collect from that point on if it takes that action. The network’s goal is to make these estimates more accurate and to choose the actions with the highest Q values. Eventually, after many training episodes, the network becomes proficient at the game and earns close to the best possible rewards.
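To make the reward idea concrete, here is a minimal sketch in Python. It is not DeepMind's code and it is not a Deep Q-Network: it uses plain tabular Q-learning, where a small table stands in for the neural network. The game is a hypothetical six-cell corridor in which the player starts on the left and earns a reward of 1 for reaching the rightmost cell. The function names and the values chosen for alpha (learning rate), gamma (discount) and epsilon (how often to move at random) are assumptions made for illustration.

```python
import random


def train_q_table(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, max_steps=1000, seed=0):
    """Tabular Q-learning on a toy corridor (illustrative assumption, not a DQN)."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = step left, action 1 = step right
    goal = n_states - 1                   # the rightmost cell ends the episode
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if state == goal:
                break
            if rng.random() < epsilon:
                # Explore: press a random button.
                action = rng.randrange(2)
            else:
                # Exploit: take the best-known action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate toward
            # (immediate reward + discounted best future estimate).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read off the preferred move in every non-goal cell."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
    for cell, (left, right) in enumerate(table):
        print(cell, round(left, 3), round(right, 3))
```

Early episodes are close to a random walk, because every entry in the table starts at zero. Once the goal has been reached a few times, the reward propagates backward through the table, and the learned policy should prefer stepping right in every cell. A Deep Q-Network follows the same update rule but replaces the table with a neural network, which lets it cope with games that have far too many situations to list.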
Though Go is only one board game, AI systems that combine reinforcement learning with convolutional or recurrent neural networks have also achieved high scores in old Atari and Nintendo video games, solved a simulated 3D Rubik’s Cube in a record number of moves (most teenagers take up to 50; this network completed it in 20), and even powered recommendation systems based on a person’s recent purchases and likes.
Whether reinforcement learning will continue to surpass human records in video games and board games is unknown, but it certainly has an important place in advancing research into how humans think and into the optimal way to play games. Maybe your own favorite game will be optimized with reinforcement learning in the near future.