Have you heard that the newest champion of the board game Go is an AI?

In 2015, AlphaGo, a Go-playing program built on neural networks, defeated the reigning European champion 5-0 in a five-game match. In 2016 it went on to defeat Lee Sedol, one of the greatest players of the decade, 4-1, earning an honorary 9 dan rating, the highest a player can receive. How did it do this?

According to DeepMind, a Google subsidiary that specializes in machine learning and artificial intelligence, a later version known as AlphaGo Zero defied conventional training methods: rather than analyzing past professional and amateur games, it played against itself, learning from its own errors until it gradually discovered stronger and stronger moves. Just as we examine a game board to find the move that best improves our chances of winning, so too must the computer. This is the basis of reinforcement learning and of one of its best-known algorithms, the Deep Q-Network (DQN).
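The self-play idea described above can be sketched on a far smaller game than Go. The example below is only an illustration under stated assumptions, not AlphaGo Zero's actual method (which pairs a deep neural network with tree search). It uses a toy version of Nim, where two players alternately take 1 to 3 stones and whoever takes the last stone wins, and a single lookup table that plays both sides. The names `self_play`, `best_move`, `START_PILE` and `MAX_TAKE`, along with the learning settings, are hypothetical choices made for this sketch.

```python
import random

MAX_TAKE = 3      # assumption: a player may remove 1, 2 or 3 stones per turn
START_PILE = 10   # assumption: largest pile size in this toy game


def legal_moves(pile):
    """Moves available to the player facing `pile` stones."""
    return range(1, min(MAX_TAKE, pile) + 1)


def self_play(episodes=5000, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values purely from games the table plays against itself."""
    rng = random.Random(seed)
    # q[pile][move]: estimated outcome (+1 win, -1 loss) for the player to move
    q = {p: {m: 0.0 for m in legal_moves(p)} for p in range(1, START_PILE + 1)}
    for _ in range(episodes):
        pile = rng.randint(1, START_PILE)    # vary the opening position
        while pile > 0:
            moves = list(legal_moves(pile))
            if rng.random() < epsilon:
                move = rng.choice(moves)     # explore: try a random move
            else:
                move = max(moves, key=lambda m: q[pile][m])  # exploit
            remaining = pile - move
            if remaining == 0:
                target = 1.0                 # taking the last stone wins
            else:
                # The opponent moves next; their best outcome is our worst.
                target = -max(q[remaining].values())
            q[pile][move] += alpha * (target - q[pile][move])
            pile = remaining                 # the same table now plays the other side
    return q


def best_move(q, pile):
    """Move with the highest learned value for the given pile."""
    return max(q[pile], key=q[pile].get)
```

After training, calling `best_move(q, pile)` for a pile that is not a multiple of 4 should pick the move that leaves the opponent a multiple of 4, which is the known winning strategy for this game. No human games were supplied at any point; the table improved only by playing itself.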

Deep Q-Networks use a reward-based system to train a neural network. At first, the algorithm has no prior knowledge of the game being played and moves erratically, like a player mashing all the buttons in a fighting game. When it does something useful, such as collecting pellets in Pac-Man or surviving for a certain amount of time, it receives a positive reward; when it makes a costly mistake, it can receive a negative one. From these rewards the network learns a Q value for each action: an estimate of the total future reward it can expect after taking that action in the current situation. The network's goal is to make these estimates accurate and then choose the actions with the highest Q value. Eventually, after many training episodes, the network becomes proficient at the game and collects close to the best possible reward.
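The reward-and-Q-value loop described above can be sketched in a few lines. This is a minimal illustration, assuming a made-up corridor game with a single pellet at the far end, and it uses a plain lookup table (tabular Q-learning) instead of a neural network; a Deep Q-Network replaces the table with a network so the same idea scales to game screens. All names and settings here (`step`, `train`, `greedy_policy`, `ALPHA`, `GAMMA`, `EPSILON`) are hypothetical choices for the sketch.

```python
import random

N_STATES = 6          # assumption: corridor cells 0..5, pellet in cell 5
ACTIONS = (-1, +1)    # step left or step right
ALPHA = 0.5           # learning rate
GAMMA = 0.9           # discount applied to future reward
EPSILON = 0.2         # chance of trying a random action


def step(state, action):
    """Move along the corridor; reward 1.0 only on reaching the last cell."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Tabular Q-learning: q[state][action_index] estimates future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(2)                      # explore
            else:
                best = max(q[state])                      # exploit, random tie-break
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Target: reward now plus discounted best estimate from the next cell.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best learned action for every non-terminal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]
```

Early episodes wander at random because every Q value starts at zero. Once the pellet has been found a few times, its reward propagates backward through the table, and `greedy_policy(train())` ends up stepping right from every cell.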

Go is only one example. AI systems that combine reinforcement learning with convolutional or recurrent neural networks have also achieved high scores in old Atari and Nintendo video games, solved a simulated 3D Rubik’s Cube in remarkably few moves (most teenagers take at most 50; this network completed it in 20), and even powered recommendation systems based on a person’s recent purchases and likes.

Whether reinforcement learning will continue to surpass human records in video games and board games is unknown, but it certainly has an important place in advancing research into how humans think and into the optimal way to play games. Maybe your own favorite game will be mastered in the near future with reinforcement learning.

 

Sources:

https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf

https://tinyurl.com/rl-apps
