AlphaGo and Reinforcement Learning

Michael Batavia June 16, 2020 Artificial Intelligence ML Algorithms, Reinforcement Learning

Have you heard that the newest champion of the board game Go is an AI?

In 2015, the AI program known as AlphaGo defeated the reigning European champion 5-0 in a five-game match. In 2016 it went on to defeat Mr. Lee Sedol, the decade's greatest player, 4-1, earning the program an honorary 9 dan rating, the highest rank a player can receive. How did it do this?

According to DeepMind, a Google subsidiary that deals specifically with machine learning and artificial intelligence, the current reigning model, known as AlphaGo Zero, defied conventional training methods: rather than analyzing past professional and amateur games, it played against itself, learning from its own errors as it gradually discovered stronger and stronger moves. Just as we examine a game board to find the move that best improves our chances of winning, so too must the computer. This is the basis of reinforcement learning and of one of its best-known algorithms, the Deep Q-Network.

Deep Q-Networks use a reward-based system to train a neural network. At first, the algorithm has no prior knowledge of the game being played and moves erratically, like a player pressing all the buttons in a fighting game. When the algorithm collects the game's currency, such as pellets in Pac-Man, or survives for a certain amount of time, it receives a positive reward; when it fails, it receives a negative one. From these rewards the network learns Q values, which are estimates of the total future reward it can expect from taking a given action in a given situation. The network's goal is to make these estimates more accurate and to pick the actions with the highest Q values, so that it earns better and better rewards over time. Eventually, after many training episodes, the network becomes proficient at the game and earns close to the best possible reward.
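To make the reward idea concrete, here is a minimal sketch of Q-learning with a plain lookup table instead of a neural network. It is not DeepMind's code and not a full Deep Q-Network. The five-cell corridor, the reward numbers, and the names step and train are all made up for illustration. The agent starts at the left end, receives a reward of 1 for reaching the pellet at the right end, and pays a small penalty for every other move.

```python
import random

# Hypothetical toy game: a corridor of 5 cells with a pellet in the last one.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right
GOAL = N_STATES - 1


def step(state, action):
    """Apply a move and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01  # assumed reward values
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Learn a table of Q values, one entry per (state, action) pair."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with a random move some of the time, otherwise
            # take the move with the highest current Q value.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Target: the reward now plus the discounted best Q value later.
            target = reward if done else reward + gamma * max(q[next_state])
            # Nudge the old estimate a little toward the target.
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for cell, values in enumerate(train()):
        print(cell, [round(v, 3) for v in values])
```

After training, the Q value for stepping right should be higher than the one for stepping left in every cell before the pellet, which is exactly the behavior the rewards were meant to encourage. A Deep Q-Network follows the same update idea but replaces the table with a neural network, so it can cope with games that have far too many situations to list one by one.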

Though Go is only one board game, AI systems that combine reinforcement learning with convolutional or recurrent neural networks have also achieved high scores in old Atari and Nintendo video games, solved a simulated 3D Rubik's Cube in a remarkably small number of moves [most teenagers take at most 50; this network completed it in 20], and even powered recommendation systems based on a person's recent purchases and likes.

Whether reinforcement learning will continue to surpass all human records in video games and board games is unknown, but it certainly has an important place in advancing research into how humans think and into the optimal way to play games. Maybe your own favorite game will be optimized with reinforcement learning in the near future.

Sources:

https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf

https://tinyurl.com/rl-apps

