How AlphaGo Zero works
Tim Wheeler has a good tutorial on How AlphaGo Zero works.
According to this Cornell University page, Google’s AlphaGo Zero approach has been generalized into AlphaZero, an algorithm that learns new games given only the game rules: “In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains.”
I was curious what kind of hardware AlphaGo Zero uses, and I found this table on this Wikipedia page. The “TPU” entries in that table refer to tensor processing units.
deepmind.com has a great new article titled, “AlphaGo Zero: Learning from scratch.”
NPR reports that AlphaGo Zero, a new version of Google’s AlphaGo software, became a Go master by playing only against itself, i.e., by using only reinforcement learning (as opposed to supervised learning from human games). Per the report in Nature.com, “AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.”
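To make the "learning only by playing itself" idea concrete, here is a minimal sketch of self-play reinforcement learning on a toy game. It is not AlphaGo Zero's method (which pairs a neural network with tree search). It swaps in a simple lookup table and a made-up game where players alternately take 1 to 3 stones and whoever takes the last stone wins. The game rule and the names train_self_play, greedy_move, legal_moves, and MAX_TAKE are all assumptions for illustration.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # hypothetical toy rule: remove 1 to 3 stones per turn


def legal_moves(stones):
    """Moves available with `stones` left on the pile."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


def greedy_move(q, stones):
    """Move with the highest estimated value; ties go to the smallest move."""
    return max(legal_moves(stones), key=lambda a: q[(stones, a)])


def train_self_play(max_stones=10, episodes=5000, epsilon=0.2, seed=0):
    """Learn move values from game outcomes alone; both sides share one table."""
    rng = random.Random(seed)
    q = defaultdict(float)     # (stones, move) -> average outcome for the mover
    visits = defaultdict(int)  # (stones, move) -> number of updates so far
    for episode in range(episodes):
        stones = 1 + episode % max_stones  # cycle start sizes so every pile is seen
        history = []
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(stones))  # explore
            else:
                move = greedy_move(q, stones)           # exploit current estimates
            history.append((stones, move))
            stones -= move
        # Whoever made the last move took the last stone and wins (+1).
        # Walking backwards, the sign flips because the players alternate.
        outcome = 1.0
        for key in reversed(history):
            visits[key] += 1
            q[key] += (outcome - q[key]) / visits[key]  # running average
            outcome = -outcome
    return q


if __name__ == "__main__":
    q = train_self_play()
    for stones in range(1, 11):
        print(stones, "stones -> take", greedy_move(q, stones))
```

No human games or labeled "correct" moves appear anywhere in this sketch: the only training signal is who won each self-played game. For this toy game the well-known winning strategy is to leave the opponent a multiple of four stones, so it is a handy sanity check to see whether the learned moves drift toward that.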
With all the interest in AI, AlphaGo, the game of Go, and the human (Lee Sedol), I thought this image was a good description of AlphaGo’s strategy. It comes from this gogameguru.com page.