There is currently a bridgewinners discussion on “When will computers beat human bridge experts?”. This is (unsurprisingly) triggered by the recent advances in Go-playing computers, based on deep learning. The news from Google (taking time out of their military robotics schemes to focus on less Skynet-y ventures) was an interesting demonstration. My only expertise in this (apart from the fact that I’m not exactly a stranger to military robotics programs, but also medical robotics!) is that I’ve followed computer opponents in classic games somewhat.
There are three salient points to the system: the training method, the use of Monte Carlo methods in evaluation, and the hybrid engine. For now, let’s just consider a simplified bridge AI. It plays Standard American, and expects its opponents to do the same. Teaching a program to handle multiple bidding systems is a matter of scale and scope, and not that different (in practice).
Training — The Go program was trained with 30 million expert positions, then played against itself to bootstrap. This method could be used with bridge, assuming a large enough corpus of expert deals exists. However, there are some issues.
Every Go (and chess) game starts from the same board position, which isn’t true of bridge. To counterbalance that, the search space for an individual deal is much, much smaller. Still, it’s not clear that 30 million deals is enough. Presumably you could use some non-expert deals for bidding (take random BBO hands and, if enough people bid them the same way, that’s probably good enough). Top-level deals can be entered as well, especially those with auctions duplicated at two tables.
Card play could use a similar method: for a hand and auction, if the opening lead is standard, you could assume (absent further training) that it is right. A clever AI programmer could have a program running on BBO playing hands, and then comparing its results (already scored, no less!) against others. Your scoring system may want to account for weird results (getting to good slams that fail on hideous breaks, etc.), but that’s pretty simple.
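To make the "if enough people bid them the same way" filter concrete, here is a minimal sketch. The record format (a deal id plus the auction as a list of calls) and both thresholds are my own assumptions for illustration, not anything BBO actually exports.

```python
from collections import Counter

def consensus_auctions(records, min_tables=5, min_share=0.6):
    """Return {deal_id: auction} for deals where one auction clearly dominates.

    records: iterable of (deal_id, auction) pairs, one per table (assumed format).
    min_tables, min_share: made-up thresholds, not tuned values.
    """
    # Count how often each distinct auction occurred on each deal.
    by_deal = {}
    for deal_id, auction in records:
        by_deal.setdefault(deal_id, Counter())[tuple(auction)] += 1

    accepted = {}
    for deal_id, counts in by_deal.items():
        total = sum(counts.values())
        auction, votes = counts.most_common(1)[0]
        # Keep the deal only if it was played often enough and most tables agreed.
        if total >= min_tables and votes / total >= min_share:
            accepted[deal_id] = auction
    return accepted
```

Deals that pass the filter become training examples for bidding; everything else is simply dropped rather than guessed at.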
So, there may be a problem getting enough expert deals, but there should be enough to get a large corpus of good deals (particularly if the engine weights others and then uses better players as a benchmark).
Randomness — Some people on the BW thread are saying that randomness will stop an AI.
The news out of Google is ahead of schedule, but it didn’t surprise me as much as Crazy Stone (a precursor to AlphaGo). Crazy Stone’s innovation was that if it couldn’t decide between two moves (because they were strategic, not tactical, or if the search depth got too great), it would simply play a few hundred random games from each position, and pick the move that scores better. Adding randomness to the evaluation function (of a non-random game!) greatly improved its play, so much so that I believe I commented on it at the time. (Sadly, that was before the move, so I don’t have a tagged post. See my posts tagged go for some tangential comments.)
Randomizing bridge hands would present different challenges, but the idea of just saying, “I don’t know, let’s just try each lead a few hundred times against random hands (that match what we expect)” is obvious, as is using randomness to decide whether to continue or shift suits. Because bridge doesn’t have Go’s massive search depth, you could also drop each random deal into a double dummy solver for each position, or have it play randomly only until the breaks are known (so it plays randomly, but not once the position is known).
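The "play a few hundred random games and pick the move that scores better" idea fits in a few lines. This is a sketch on a toy game (a pile of stones, take 1 to 3, whoever takes the last stone wins), not Go, and it is the flat version of the idea as I described it; I am not claiming this is how Crazy Stone is actually built. All function names are made up for the example.

```python
import random

def legal_moves(pile):
    """You may take 1, 2 or 3 stones, but no more than are left."""
    return [n for n in (1, 2, 3) if n <= pile]

def random_playout(pile, my_turn, rng):
    """Play random moves to the end; True if 'we' take the last stone."""
    while True:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn

def playout_win_rate(pile, move, n_playouts, rng):
    """Estimate how good `move` is by finishing the game randomly many times."""
    remaining = pile - move
    if remaining == 0:
        return 1.0  # we took the last stone ourselves
    wins = sum(random_playout(remaining, False, rng) for _ in range(n_playouts))
    return wins / n_playouts

def pick_move(pile, n_playouts=200, seed=0):
    """Return the move with the best random-playout win rate, plus all rates."""
    rng = random.Random(seed)
    rates = {m: playout_win_rate(pile, m, n_playouts, rng) for m in legal_moves(pile)}
    return max(rates, key=rates.get), rates
```

Note that the evaluation never needs to understand the position. It only needs the rules and a way to tell who won at the end.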
The thing about random play is that it’s fast. So you’ve won the opening lead, what to play? Whip up 100 random deals (not hard since you can see two hands plus a few other cards, plus all your bidding inferences) and try them out.
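Whipping up random deals that fit what you know can be sketched with plain rejection sampling: shuffle the unseen cards, deal them to the hidden hands, and throw away any deal that contradicts the bidding. The card encoding, seat names and the `accept` predicate are assumptions for the example; a real program would encode far richer inferences.

```python
import random

RANKS = "AKQJT98765432"
SUITS = "SHDC"
DECK = [r + s for s in SUITS for r in RANKS]  # e.g. "AS", "TH"
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def hcp(hand):
    """High card points of a hand given as a list of cards like 'AS'."""
    return sum(HCP.get(card[0], 0) for card in hand)

def sample_deals(known_cards, sizes, accept, n_deals, seed=0, max_tries=100_000):
    """Deal the unseen cards to the hidden seats, keeping deals that pass `accept`.

    known_cards: every card we can see (our hand, dummy, cards already played).
    sizes: {seat: number of cards that hidden seat still holds}.
    accept: predicate on {seat: hand} standing in for bidding inferences.
    """
    rng = random.Random(seed)
    seen = set(known_cards)
    unseen = [c for c in DECK if c not in seen]
    if sum(sizes.values()) != len(unseen):
        raise ValueError("hidden hand sizes do not match the number of unseen cards")

    deals = []
    for _ in range(max_tries):
        if len(deals) == n_deals:
            break
        rng.shuffle(unseen)
        deal, start = {}, 0
        for seat, size in sizes.items():
            deal[seat] = unseen[start:start + size]  # slice makes a copy
            start += size
        if accept(deal):
            deals.append(deal)
    return deals
```

For example, `sample_deals(seen, {"W": 13, "E": 13}, lambda d: hcp(d["W"]) >= 12, 100)` asks for 100 layouts where West has an opening bid. Rejection sampling gets slow when the constraints are tight, so it may return fewer deals than requested once `max_tries` runs out.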
Hybridization — The trick is that you only resort to randomness if your trained algorithm isn’t confident of its training. This happens quite a bit in Go. Go is amazingly frustrating in that expert or even master level players will be unable to communicate why a play is correct. I remember a lecture at the Pittsburgh Go Association where the lecturer, an amateur 3 dan or so, was reviewing a game between two pros. Someone asked, “Why did so-and-so play that move on that spot? Isn’t one space to the right better?”
Neither move had a tactical flaw, and the lecturer stumbled, then called out to a late arrival (a graduate student from Japan and, I believe, soon to turn pro after getting his degree). The arrival went up to the big magnetic board, stared, said “Ah! It’s because of” and then laid out 10 moves for each side. Then he reset, shifted the stone, and laid out ten different moves for each side, then walked the few people who could understand the differences through it.
The point of my story? Go is hard. Go is hard enough that professional players routinely make moves that amateur experts cannot reasonably understand. Go experts can look much farther ahead than bridge players (and computers), yet random simulation coupled with deep learning can handle it.
The Go-playing program might very well have learned to play the move on the correct spot, and not one-to-the-right, in our example. How did it learn this? Because the experts did it. It gained a feel for what to do in those situations. But even assuming that it hadn’t learned, and was sitting in the back of the room (like a 20-year-old me) unable to see a difference between the two, it might still grope its way to the correct move using a Monte Carlo simulation on both moves. (This is assuming that its near-term tactical engine couldn’t find both sequences and judge one obviously better.)
Right now bridge computers have many advantages, and can play perfectly once enough is known about the hand. You’d never use a random engine at that point. This hybridized strategy would be for your Master Solvers’ Club type things where experts disagree.
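The "trust the training unless it can't decide, then simulate" rule might look something like this. It is a sketch of the fallback as I have described it, not of how any real Go program is wired; `policy_probs`, `simulate` and the `margin` value are all hypothetical.

```python
def choose_move(policy_probs, simulate, margin=0.15):
    """Pick a move from a trained policy, falling back to simulation when unsure.

    policy_probs: {move: probability} from some trained model (hypothetical).
    simulate: function(move) -> estimated win rate from random playouts.
    margin: made-up confidence threshold between the top two moves.
    """
    if not policy_probs:
        raise ValueError("no candidate moves")
    ranked = sorted(policy_probs, key=policy_probs.get, reverse=True)
    top = policy_probs[ranked[0]]

    # Confident: the favourite is clearly ahead of the runner-up.
    if len(ranked) == 1 or top - policy_probs[ranked[1]] >= margin:
        return ranked[0], "policy"

    # Not confident: simulate only the moves the policy cannot separate.
    contenders = [m for m in ranked if top - policy_probs[m] < margin]
    return max(contenders, key=simulate), "simulation"
```

In the lecture example, the two candidate points would get nearly equal policy scores, so both would go to the simulator and the move with the better playout results would be chosen.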
And, if you are deciding between those two things, you are (by definition) an expert.
So, I am of the opinion that bridge hasn’t been solved because nobody has thought to attack it. Or perhaps there is not a large enough body of expert deals that can be conveniently fed into a computer. A clever programmer (which I am not) could probably have a system learn just by having it log onto BBO, assuming that it could learn which players to trust and which not to (and which ones to use as bidding examples). 30 million deals, each played 4 times by experts, may not be enough, but it’s probably in the ballpark.
Why hasn’t this been done? Probably nobody cares. Go is (by far) the sexiest game right now because its search space is unfathomably deep. Go players routinely scoff at the simplicity (by comparison) of chess. In terms of search space (for a single hand) bridge doesn’t compare. If Google put its money behind it, I think a bridge computer would do well in a match against a top team. Also, there were prizes offered for Go programs that could play at a high enough level, which spurred on development over the last 20 years.