The Tao of Gaming

Boardgames and lesser pursuits

Posts Tagged ‘Go

Re: Mastering Chess and Shogi by Self-Play (etc etc)

More stunning news from the DeepMind crew:

In this paper, we generalise this approach into a single AlphaZero algorithm …. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case. –Abstract

And how did they do?

In chess, AlphaZero outperformed Stockfish after just 4 hours; in shogi, AlphaZero outperformed Elmo after less than 2 hours; and in Go, AlphaZero outperformed AlphaGo Lee after 8 hours.

(Stockfish and Elmo are the best computer programs in the world. AlphaGo Lee beat Lee Sedol, but was later improved on by AlphaGo Zero).

And after 3 days, AlphaZero played a 100 game match against Stockfish and won 28, drew 72 and lost zero.

It did this in spite of a slower search speed:

AlphaZero searches just 80 thousand positions per second in chess and 40 thousand in shogi, compared to 70 million for Stockfish and 35 million for Elmo. AlphaZero compensates for the lower number of evaluations by using its deep neural network to focus much more selectively on the most promising variations – arguably a more “human-like” approach to search

The full paper is worth a read. I don’t understand all the details, although its non-linear evaluation function apparently may have biases (like a hand crafted evaluation function) but the authors argue that their Monte-Carlo search (with its inherent randomness) will “average over [AlphaZero’s] approximation errors” but the traditional Alpha-Beta search propagates them.

Looking at some of the commentary by more experienced chess players, this jumped out at me:

The paper also came accompanied by ten games to share the results. It needs to be said that these are very different from the usual fare of engine games. If Karpov had been a chess engine, he might have been called AlphaZero. There is a relentless positional boa constrictor approach that is simply unheard of. Modern chess engines are focused on activity, and have special safeguards to avoid blocked positions as they have no understanding of them and often find themselves in a dead end before they realize it. AlphaZero has no such prejudices or issues, and seems to thrive on snuffing out the opponent’s play. It is singularly impressive, and what is astonishing is how it is able to also find tactics that the engines seem blind to….

In this position from Game 5 of the ten published, this position arose after move 20…Kh8. The completely disjointed array of Black’s pieces is striking, and AlphaZero came up with the fantastic 21.Bg5!! After analyzing it and the consequences, there is no question this is the killer move here, and while my laptop cannot produce 70 million positions per second, I gave it to Houdini 6.02 with 9 million positions per second. It analyzed it for one full hour and was unable to find 21.Bg5!!

(Emphasis mine, also to note that AlphaZero and stockfish had 1 minute per move).

The DeepMind team just keeps delivering stunning advances. Someone is probably jumping with a reasonable approximation of joy.

Written by taogaming

December 7, 2017 at 11:08 pm

Posted in Artificial Opponents

Tagged with , ,

Computer Bridge

There is currently a bridgewinners discussion on “When will computers beat human bridge experts?“. This is (unsurprisingly) triggered by the recent advances in Go playing computers, based on the deep learning system. The news from Google — taking time out of their military robotics schemes to focus on less Skynet-y ventures — was an interesting demonstration. My only expertise in this (apart from the fact that I’m not exactly a stranger to military robotics programs, but also medical robotics!) is that I’ve followed computer opponents in classic games somewhat.

There are three salient points to the system — the training method, the use of monte carlo systems in evaluation, and the hybrid engine.  For now, lets just consider a simplified bridge AI. It plays standard american, and expects its opponents to do the same. Teaching a program to handle multiple bidding systems is one of scale and scope, and not that different (in practice).

Training — The Go program was trained with 30 million expert positions, then played against itself to bootstrap. This method could be used with bridge, assuming a large enough corpus of expert deals exists. However, there are some issues.

Every go (and chess) program starts from the same board position, a fact that isn’t true of Bridge. To counter balance that the search space for an individual deal is much much smaller. Still, it’s not clear that 30 million deals is enough. Presumably you could use some non-expert deals for bidding (take random BBO hands and if enough people bid them the same way, that’s probably good enough). Top level deals can be entered, especially those with auctions duplicated at two tables.

Card play could use a similar method — for a hand and auction, if the opening lead is standard, you could assume (absent further training) that it is right. A clever AI programmer could have a program running on BBO playing hands, and then comparing it’s results (already scored, no less!) against others. Your scoring system may want to account for weird results (getting to good slams that fail on hideous breaks, etc), but that’s pretty simple.

So, there may be a problem getting enough expert deals, but there should be enough to get a large corpus of good deals (particularly if the engine weights others and then uses better players as a benchmark).

Randomness — Some people on the BW thread are saying that randomness will stop an AI.

No.

The news out of Google is ahead of schedule, but it didn’t surprise me as much as Crazy Stone (the precursor to Alpha Go). Crazy Stone’s innovation was that if it couldn’t decide between two moves (because they were strategic, not tactical, or if the search depth got too great), it would simply play a few hundred random games from each position, and pick the move that scores better. Adding randomness to the evaluation function (of a non-random game!) greatly improved the structure, so much so that I believe I commented on it at the time. (Sadly, that was before the move, so I don’t have a tagged post. See my posts tagged go for some tangential comments.

Randomizing bridge hands would present different challenges, but the idea of just saying, “I don’t know, let’s just try each lead a few hundred times against random hands (that match with what we expect” is obvious, as well as using randomness (to decide whether to continue or shift suits). Because bridge doesn’t have Go’s massive search depth, you could also drop each hand into a single dummy solver for each position, or have it play randomly only until breaks are none (so plays randomly but not with a known position).

The thing about random play is that it’s fast. So you’ve won the opening lead, what to play? Whip up 100 random deals (not hard since you can see two hands plus a few other cards, plus all your bidding inferences) and try them out.

Hybridization — The trick is that you only resort to randomness if your trained algorithm isn’t confident of its training. This happens quite a bit in Go. (Go is amazingly frustrating in that expert or even master level players will be unable to communicate why a play is correct. I remember a lecture at the Pittsburgh Go Association and the lecturer, an amatuer 3 dan or so, was reviewing a game between two pros. And someone asked “Why did so-and-so play that move on that spot. Isn’t one space to the right better?”

Neither move had a tactical flaw, and the lecturer stumbled, then called out to a late arrival (a graduate student from Japan and — I believe — soon to turn Pro after getting his degree). The arrival went up to the big magnetic board, stared, said “Ah! It’s because of” and then laid out 10 moves for each side. Then reset, shifted the stone, and laid out ten different moves for each side then walked the few people who could understand the differences through it.

The point of my story? Go is hard. Go is hard enough so that the professional players routinely make moves that  amateur experts cannot reasonably understand. Go experts can look much farther ahead than bridge players (and computers) — yet random simulation coupled with deep learning can handle it.

The Go playing program might very well have learned to play the move on the correct spot, and not one-to-the-right, in our example. How did it learn this? Because the experts did it. It gained a feel for what to do in those situations. But even assuming that it hadn’t learned, and was sitting in the back of the room (like a 20 year old me) and couldn’t see a difference between the two. It might still grope its way to the correct move using a Monte Carlo simulation on both moves. (This is assuming that it’s near term tactical engine couldn’t find both sequences and judge one obviously better).

Right now bridge computers have many advantages, and can play perfectly once enough is known about the hand. You’d never use a random engine at that point. This hybridized strategy would be for your master solver’s club type things where experts disagree.

And, if you are deciding between those two things, you are (by definition) an expert.

So, I stand of the opinion that Bridge hasn’t been solved because nobody has thought to attack it. Or perhaps there is not a large enough body of expert deals that can be conveniently fed into a computer. A clever programmer (which I am not) could probably have a system learn just by having it log onto BBO, assuming that it could learn which players to trust and which to not (and which ones to use as bidding examples). 30 Million deals, each played 4 times by experts may not be enough, but it’s probably in the ballpark.

Why hasn’t this been done? Probably nobody cares. Go is (by far) the sexiest game right now because it’s search space is unfathomably deep. Go players routinely scoff at the simplicity (by comparison) of chess. In terms of search space (for a single hand) bridge doesn’t compare. If Google put its money behind it, I think a Bridge computer would do well in a match against a top team. Also, there were prizes offered for Go programs that could play at a high enough level, which spurred on development over the last 20 years.

Written by taogaming

February 7, 2016 at 12:05 am

Posted in Artificial Opponents, Bridge

Tagged with , ,

Recent Readings

I’ve gotten a lot of game-related books, so I figured I’d mention them.

Bridge

I have most of Mike Lawrence’s “Complete Book of X,” but didn’t have Overcalls. This was just reprinted last year, and updated, so that made it a no-brainer. Lawrence is a bridge instructor whose writings work well across a wide range of skill levels. (Meaning, I thought I understood the books when I read them 20 years ago, and I still get something out of them now). Interestingly, the modern methods section (new for this edition) seems reasonable, for the most part non-conventional, and I don’t think I’ve run into a single pair playing it. None of Lawrence’s books are for casual players, although there are a few ideas you can implement even with no partnership agreements (in particular, identifying death traps). It helps to have just re-read Hand Evaluation recently, since he spreads a lot of those ideas in his other books, but it’s good to have them called out.

For the technical play side, I just picked up Kelsey on Squeeze Play, a four-volume collection of work. As readers know, I’ve been stubbornly trying to find these at the table, and going through volume 1 (Simple Squeezes), I easily solve these … not at the table. To be fair, I did find a squeeze for a second overtrick while playing online this week. (If we had bid the slam, the squeeze would have made it). However, a simple finesse would have worked, too. (I think the squeeze was the superior play, in that it catered to the Queen being offside or doubleton, but I was mainly just playing for the squeeze). These aren’t as hard as they sound (or as exotic as Jeff’s articles), but again, probably not for any but the serious player.

Go — I picked up a dozen or so books at a charity auction. Unlike Bridge, I harbor no illusions that I’ll ever be good, but I played semi-seriously for a few years back in the 90s, and I’ve read a lot of go. But that mainly reminds me of Jamie Lee Curtis’s line to Kevin Kline in Fish Called Wanda. I now have a complete collection of  the “Learn to Play Go” series. If you already know how to play, you can easily skip the first few volumes (although I haven’t read IV and V). But it’s a good library book if you are interested. The Intermediate Power Builder series is based on a Chinese TV series for amatuers. (I have 200 channels of TV, and I can’t think of a single instructional show for any game … not even poker! Amazing. If someone put Caro’s Book of Tells on the discovery channel, I think you’d get reasonable ratings. And how much would it cost to produce these? Hell, the ACBL (or AGA, or USCF) could probably sponsor a TV show. I wonder if there are bridge lectures on youtube? (Well, I found a few videos, not sure if there are any lessons, but I did see several “Duplicate Bridge” items).

Anyway, the Intermediate Power Builder series makes me think I understand basic positional play, as long as I’m holding the book. Honestly, Go’s contradiction for me is that the positional play is so damn unintuitive. (The tactical play is complex, as well, but that’s no more intriguing than Chess). I went to a lecture at the Pittsburgh Go Club (on CMU’s campus) where they were going over a master game. After some move, one other amatuer asked the lecturer (around 2 dan, IIRC), “Well, why not one space over?” He hemmed and hawed, and asked a grad student (later described as “Possibly a professional player in a few years, but he promised his family he’d get his PhD first”). The grad student walked up and said “Because of this variation” and slapped down ~12 moves on the board. I’ve still never met someone who could routinely explain Go positional balance to a non-master. But if you want to pretend to know this stuff, this is a reasonable series.

Written by taogaming

February 6, 2010 at 11:36 am

Posted in Bridge, Misc

Tagged with , ,

“Solving” Caylus

There’s a BGG thread about solving Caylus. While I doubt it will be solved anytime soon (Chess shows no signs of giving up its secrets), I do think a reasonable computer opponent could be developed.

My own technique would be to use some basic genetic algorithms for some decisions, and then use some recent advances in computer Go as a jumping off point.

For example, during worker placement you’ll typically have 15 options (Pass, Castle, Gate, Guild, Joust, Stables, Inn, Pass, Spaces 1-n). You mainly can’t take occupied spaces, and you can eliminate some obviously bad moves (spaces you won’t be able to use, or lose money). Now, if your genetic algorithm (or hardcoded rules) point to a clear decision … take it. You can also have a clear evaluation function (money, favor, goods are positive … wasted workers negative, etc etc).

But if you’ve got 2-3 candidate moves, consider each one. Simulate the turn out, then play out rest of the game 100 (or 200, or 1000) times using random moves for all players. (Possibly keeping the smarts that eliminate completely boneheaded moves). Whichever candidate gives you the best average outcome, take it.

Given how well this works for Go, I think it would be generally applicable.

You’d want to avoid randomly moving the provost (that would probably be hardcoded, and possibly genetic).

My Computer Science theory isn’t quite strong enough for me to set up this framework myself (nor do I feel like spending the time) but if a project got started (say, on SourceForge) I may contribute.

Written by taogaming

March 23, 2007 at 5:33 pm

Posted in Artificial Opponents, Caylus

Tagged with ,

January Notes

First off, I’m sorry to see Chris Farrel retire from blogging. One of the most insightful commenter on gaming today. I wish him joy in whatever endeavor replaces the time.

Computer go programs have made big strides recently. Interestingly, the computer uses an improved Monte Carlo algorithm … once it’s searched a few moves, it just plays out a few hundred games using random moves, scores them, and averages them. Some more information is available.

My Here I Stand PBEM just finished the first turn. A bit slow, but we had a player drop out (and a business trip or two) and everyone was new to PBEM/Cyberboard (I think). We’ll probably be able to get a turn in around two weeks or so. Leisurely, but it allows for more analysis/diplomacy. I could have crunched more numbers (as the Protestant) … but like Barbie says, “Math is hard.”

Written by taogaming

January 31, 2007 at 8:47 pm

Posted in Artificial Opponents

Tagged with , ,

Reviews of Bridge and Go

Be sure to check out Ekted’s reviews of Bridge and Go.

All right thinking people (and at least 10 million who are certifiable) play Bridge, but it’s a daunting game to learn. In my first week of college, I approached the bridge club. But I was late. The president of the club was teaching all of the new players, so an alumni gave me the 10 minute crash course. In general, the crash course is better than the hour lecture. To get started, you are going to have to skip a lot (of the “Why does that bid mean that?” nature, in particular), so may as well skip as much as possible and get started playing. And for those who think that Bridge is all about the bidding, check out some of Jeff Goldsmith’s articles, called “Bridge without Sam.” I’m a weak intermediate, Jeff plays at the regional champion level, I believe. And hangs out with International players. He often finds plays with half the deck hidden that I can’t see with all the cards visible.

Go doesn’t have many existential mysteries up front, but every time you peel back a layer you find another one. I’ve often found intermediate (say single digit kyu to low dan) players have no ability to explain why a move is right or wrong. I sat in on a lecture at the CMU Go club where a medium amatuer master player was replaying a game (between two professionals) and got asked “Why here? Why not the space next to it” and he just stopped. Fortunately, a grad student (from Japan, naturally) walked into the room and heard the question. He walked up to the board, stared for a minute, and quickly played ~20 moves … and said “And that’s bad”. [Said student was described as ‘A near professional player who promised his parents he’d finish school first’].

I don’t mention (or play) either one often, but these two games are worthy of intense study. But, like a true dilettante, I just play them these days.

Written by taogaming

December 9, 2005 at 5:39 pm

Posted in Bridge, Reviews

Tagged with , ,

Teaching Go

Last night I played a 3-player game of Power Grid, then our 3rd left (feeling a bit under the weather). So I taught some Go. I use the phrase ‘taught’ loosely, as I’m not really qualified to teach the game. I’ve played one tournament, and got a provisional AGA 13 kyu rating. That was years ago, so I’m probably around 15-20k.

But my opponent had very limited experience, so I gave him 9 stones. And then I floundered trying to explain why moves were good or bad. I’ve been on the other side too … Go strikes me as one of the most counter-intuitive games devised. I tried to explain this, but I don’t really understand a lot of it myself.

However, today I realize that I didn’t give one of the best pieces of advice I do know — “Don’t automatically answer your opponent’s move.” If your opponent makes a move that appears threatening you are better off ignoring it if a bigger move exists on a different side of the board.

Realistically, a new player has no idea if a move is big or not, and sometimes the experienced player isn’t much better off. A new player will often misjudge; it comes with the territory. Still, the urge to respond to a threat grants the experienced player Sente (the initiative).

Usually I’m capable of distilling a few pearls of wisdom to give to novice players (when asked), but Go left me fumbling, as always.

Written by taogaming

March 16, 2005 at 4:38 pm

Posted in Ramblings

Tagged with

Random computer gaming

A special program has solved 5×5 Go. Researchers still haven’t made a master level program that plays the full (19×19) game. Dr. Peter Drake (Mundungus on BGG) has a page with useful links if you are interested.

And, apropos of my recent thoughts on Can’t Stop (and Larry’s comment), Jim Cobb has released a new computer version.

I have nothing clever to say, but figured a few of you may be interested. (I also figure most of you have already seen these, but hey, you never know).

Update August 4th, 2005 — I’ve updated the link to Roll or Don’t.

Written by taogaming

February 22, 2005 at 8:05 pm

Posted in Artificial Opponents

Tagged with ,

New Algorithm for Computer Go?



Found via Pejman:



An interesting article on Computer Go. Go appears the least solvable of all of the traditional games. World-class computer opponents exist for Chess, Checkers, and Backgammon. The idea of training Neural Networks using a database of professional games and evaluating the position probabalistically seems genius. Who knows, this project may produce the first master level computer go opponent.



Neural nets worked for backgammon, after all. Last year I tried the free version of Jellyfish with staggering losses. Of course, I’m not a master player, but the fact that master Backgammon players consult with their (snarky) computers after a match says volumes.



I don’t play Go much, despite it’s depth and elegance. strong players often can’t explain what makes a good move to weaker players. Learning involves study and bashing your head against the wall, and while I like studying games as much as anyone, I found the progress depressingly slow.










Update:

I tried to post the following as a comment on Pej’s site, but failed.



[Continuing laocoon’s comment], I would imagine that the Neural Net would be used to generate candidate moves for strategic positions (particularly openings and for ‘quiet’ moves in general), the ‘probability’ function would be used to evaluate positions (using the NN to determine moves, likely counters, etc) and a tactical engine would handle localized fighting. [Even my old Sargon chess program in the early 80s always searched through the end of a capture sequence]. Endgames can also be solved handled analytically once the parts of the board cease to interact.

Written by taogaming

January 25, 2005 at 7:03 pm

Posted in Artificial Opponents

Tagged with