Since the 1990s, computers have steadily gotten better at beating us at our own games, like chess, checkers, poker, and Jeopardy. But expert human players continue to dominate machines at one game: Go. The board game, which is more than 2,500 years old and in which two players use black and white stones to try to capture more territory than their opponent, is extremely complex, which has made it difficult for computers to master. But our supremacy in Go may have finally ended: researchers at Google DeepMind announced today that they’ve created a sophisticated AI program, a combination of deep neural networks and a search technique, that has beaten a Go champion for the first time in history.
Last October in London, the DeepMind team invited the European Go champion, Fan Hui, to play against their computer program, AlphaGo. The match was private, with just a few spectators to witness it. Fan and AlphaGo played on a full-size 19-by-19 board. AlphaGo had already been tested against state-of-the-art Go programs, like Crazy Stone and Zen, and had won 494 of 495 games. But playing against a human expert is a much greater challenge than playing other computers because, well, the pros are still so much better: they have years of experience with the game and a kind of intuition about how to play it. So when AlphaGo won the five-game match 5-0, it was a big deal.
Many predicted that computers wouldn’t beat a champion Go player for at least another ten years. “This leap in performance is just completely unexpected and unprecedented,” says David Fotland, a software engineer at Amazon and also creator of a computer Go program, who wasn’t involved in the study.
To understand what the DeepMind researchers did to create such an impressive program, you first have to appreciate why Go is such a difficult game for computers to play well. First, Go has a ridiculous number of possible moves and outcomes: according to the researchers, there are more possible positions in Go than there are atoms in the universe. One of the study authors, Demis Hassabis, co-founder of DeepMind, made a comparison to chess, saying that in chess you have an average of 20 possible moves in a turn, whereas in Go you have an average of 200. This means that a computer searching every possible move and outcome in Go would need an enormous amount of computing power, so much that some say an exhaustive search may not even be possible.
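The gap between 20 and 200 moves per turn compounds with every turn a program looks ahead. Here is a rough Python sketch of that arithmetic, using the per-turn averages Hassabis cites. The four-turn lookahead and the function name are assumptions chosen only for illustration:

```python
def count_move_sequences(moves_per_turn: int, turns: int) -> int:
    """Number of distinct move sequences if every turn offers the same number of moves."""
    return moves_per_turn ** turns


CHESS_MOVES_PER_TURN = 20   # average cited by Hassabis
GO_MOVES_PER_TURN = 200     # average cited by Hassabis
LOOKAHEAD_TURNS = 4         # assumption: a very shallow search, for illustration only

chess_sequences = count_move_sequences(CHESS_MOVES_PER_TURN, LOOKAHEAD_TURNS)
go_sequences = count_move_sequences(GO_MOVES_PER_TURN, LOOKAHEAD_TURNS)

print(f"chess: {chess_sequences:,} sequences")
print(f"go:    {go_sequences:,} sequences")
print(f"go is {go_sequences // chess_sequences:,} times larger")
```

Even at four turns, the Go count is 10,000 times the chess count, and each additional turn multiplies the gap by another factor of ten.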
Another reason computers have a hard time with Go, explains computer scientist Jonathan Schaeffer, is that Go players need a large pool of knowledge, built from past experience with the game, to draw from. “With chess, you can put in a small amount of knowledge and you can build a strong game-playing program,” says Schaeffer, a professor at the University of Alberta who wasn’t involved in the study. “In Go, you can’t.” That’s because in chess, a computer can follow pre-programmed rules, but this strategy isn’t workable for Go because the game is largely about patterns, rather than a set of logical rules that could be written down.
The DeepMind team’s system addresses both the massive search problem and the knowledge problem. In a new study published in this week’s Nature, they describe combining a search technique and deep learning to overcome these obstacles.
For the knowledge problem, they used what are called deep neural networks: in this case, two networks, each 13 layers deep, that consist of millions of connections, akin to neural connections in the human brain. The researchers trained these networks with two methods. For one network, they showed the computer more than 30 million moves from games played by human experts, which helped the system learn how the best players win. For both networks, the researchers had the computer play thousands of games against itself so that it could discover new strategies and learn the game on its own. These two training strategies allowed the computer to recognize patterns in the game and identify which moves gave it the best chance of winning.
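The first training method, learning from expert moves, is a form of supervised learning. A minimal Python sketch of the idea follows. It assumes a toy setup far simpler than anything in the study: three made-up positions encoded as one-hot feature lists, three possible moves, and a single layer of weights rather than a 13-layer network. All names and data here are hypothetical, and the self-play stage is not shown.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def move_probabilities(weights, features):
    """Score every move for a position, then normalize the scores."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def train_on_expert_moves(examples, n_moves, n_features, epochs=200, lr=0.5):
    """Nudge the weights so the expert's move becomes more probable."""
    weights = [[0.0] * n_features for _ in range(n_moves)]
    for _ in range(epochs):
        for features, expert_move in examples:
            probs = move_probabilities(weights, features)
            for move in range(n_moves):
                target = 1.0 if move == expert_move else 0.0
                error = probs[move] - target  # cross-entropy gradient
                for j in range(n_features):
                    weights[move][j] -= lr * error * features[j]
    return weights


# Hypothetical data: (position features, move the expert chose).
EXPERT_EXAMPLES = [
    ([1, 0, 0], 2),
    ([0, 1, 0], 0),
    ([0, 0, 1], 1),
]

trained = train_on_expert_moves(EXPERT_EXAMPLES, n_moves=3, n_features=3)
for features, expert_move in EXPERT_EXAMPLES:
    probs = move_probabilities(trained, features)
    predicted = probs.index(max(probs))
    print(features, "expert:", expert_move, "predicted:", predicted)
```

After training, the sketch assigns the highest probability to the expert's move in each toy position. A real system would have to generalize to positions it has never seen, which is where the depth and size of the networks matter.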
For the intractable search problem, the researchers used a search technique called Monte Carlo tree search. This method, which has been around for years and is used in other game-playing programs, essentially allows the system to use statistics as a shortcut to determine the best move, rather than playing out each and every possible outcome of a given move (which in Go would take forever).
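The statistical shortcut can be sketched in Python on a much smaller game. The example below is not Monte Carlo tree search itself, and it is not AlphaGo's code: it is plain Monte Carlo move evaluation, the sampling idea that tree search builds on. The game is an assumption chosen for brevity: players alternate taking one to three stones from a pile, and whoever takes the last stone wins. All function names are hypothetical.

```python
import random


def random_playout(stones: int, rng: random.Random) -> bool:
    """Play random moves to the end; True if the player moving first wins."""
    first_player_to_move = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return first_player_to_move  # whoever just moved took the last stone
        first_player_to_move = not first_player_to_move


def estimate_win_rate(stones: int, take: int, playouts: int, rng: random.Random) -> float:
    """Estimate how often taking `take` stones wins, by sampling random games."""
    remaining = stones - take
    if remaining == 0:
        return 1.0  # taking the last stone wins outright
    opponent_wins = sum(random_playout(remaining, rng) for _ in range(playouts))
    return 1.0 - opponent_wins / playouts


def choose_move(stones: int, playouts: int = 2000, seed: int = 0):
    """Pick the move with the best sampled win rate instead of searching every line."""
    rng = random.Random(seed)
    legal_moves = range(1, min(3, stones) + 1)
    rates = {take: estimate_win_rate(stones, take, playouts, rng) for take in legal_moves}
    best = max(rates, key=rates.get)
    return best, rates


best_move, win_rates = choose_move(stones=5)
print("best move:", best_move)
print("sampled win rates:", win_rates)
```

From a pile of five stones, taking one stone wins about two-thirds of random playouts, while the other two moves win about half, so the sampled statistics point to the right move without examining every line of play. Full Monte Carlo tree search goes further by building a tree of positions and concentrating its samples on the most promising branches.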
The search technique and deep learning tools used by the DeepMind team aren’t new. Many computer Go programs already use Monte Carlo tree search, and neural networks have been employed as well. But what makes DeepMind’s AlphaGo so advanced is the way the team put these tools together, along with the high performance of the deep neural networks. “The main novelty is in how they’ve combined these different ingredients together; they’ve innovated in doing that,” says Yoshua Bengio, a professor of computer science at the University of Montreal. Schaeffer says he’s impressed with the results. “This is a simpler, more comprehensive approach than what people have done in the past, and it’s more elegant,” he says. “I see this as a huge leap forward.” And it’s exactly what gave AlphaGo the edge over Fan Hui in their match, which the computer won 5-0.
And while maybe not everyone cares that a computer beat a champion Go player, this advance is important in other fields as well. The researchers, who built the system with “general-purpose methods” instead of creating something made only to play Go, intend to “ultimately apply these techniques to important real-world problems,” says Hassabis. “Our hope is that one day they could be extended to help us address some of society’s toughest and most pressing problems, from climate modeling to complex disease analysis.” Bengio says that another potentially important application is computerized dialogue, and Schaeffer says that in the future, these programs might be able to come up with answers to abstract social issues that can be expressed as games, like national politics or international climate negotiations.
But AlphaGo first has a more immediate problem: how to beat the world’s best Go player, Lee Sedol. This March, the two will play each other in Seoul, South Korea. And although AlphaGo played well against Fan Hui, Schaeffer and Fotland still predict Lee will win the match. “I think the pro will win,” says Fotland. “But I think the pro will be shocked at how strong the program is.” For now, at least some people are still placing their bets on humans.