Does White have an advantage in chess?

It seems likely, doesn’t it? After all, as White, you get to develop your pieces first, putting you in a better position to attack Black, or at least to defend against Black’s attacks. And the statistics seem to bear this out (at least, according to Wikipedia, my source for all things true). Though it could turn out that Black has the advantage: it might be that any move by White fatally weakens his position, putting him at a comparative disadvantage to Black.

Whatever the outcome might be, it turns out that if either side has an advantage, that advantage is necessarily complete. Put more formally: either (a) there exists a strategy for White that guarantees victory, or (b) there exists a strategy for Black that guarantees victory, or (c) there exist strategies for both White and Black that each guarantee at least a draw. It sounds somewhat trivial, but it’s not: for example, if (a) is true, this means that no matter what Black does, the outcome of the game is not in doubt – White will win.
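In symbols (my notation, not anything standard): writing u(\sigma_{W},\sigma_{B}) for the outcome when White plays strategy \sigma_{W} and Black plays \sigma_{B}, the claim is that exactly one of the following holds:

\exists\sigma_{W}\,\forall\sigma_{B}:\; u(\sigma_{W},\sigma_{B})=\text{White wins},

\exists\sigma_{B}\,\forall\sigma_{W}:\; u(\sigma_{W},\sigma_{B})=\text{Black wins},

\exists\sigma_{W}\,\forall\sigma_{B}:\; u(\sigma_{W},\sigma_{B})\neq\text{Black wins}\quad\text{and}\quad\exists\sigma_{B}\,\forall\sigma_{W}:\; u(\sigma_{W},\sigma_{B})\neq\text{White wins}.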

Results such as these are common among games without exogenous uncertainty, and many such games have been solved, so we know which of the analogous possibilities actually holds. For example, checkers was recently shown to have strategies for both players that guarantee a draw. So it should come as no surprise that the same trichotomy holds for chess.

To show that one of these three possibilities must hold, we can draw a game tree which contains all the possible move sequences in chess[1]. We can do this because chess ends in a finite number of moves: a draw can be declared if the same position is reached three times, or if fifty moves go by without a pawn move or a piece capture (for the purposes of the argument, assume these draws are automatic rather than claimed). Since there are only finitely many possible pawn moves (given the size of the board) and piece captures (there are only 32 pieces), only finitely many of these counter-resetting events can occur, so every game is finite.
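For a back-of-the-envelope sense of the numbers (rough arithmetic, not a tight bound): the 16 pawns can make at most 16\times 6=96 advances between them, at most 30 pieces can ever be captured, and at most 50 moves can separate consecutive such events, so no game lasts longer than roughly

(96+30+1)\times 50=6350

moves before a draw kicks in.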

Next, we can use backward induction (as in my post on tic-tac-toe) from each possible ending of the game to determine the outcome from the beginning. At each node, the player to move (White or Black) deterministically selects the branch that leads to the best final outcome for him (breaking ties if several outcomes are equally good). We proceed in this manner all the way up to the initial node, corresponding to the starting position of the game. The value computed at that root node is the result of the game under best play: since we have already determined the best response to every position, each player can deterministically steer to that outcome on the way back down the tree. This yields a guaranteed win for one of them, or a guaranteed draw.
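To make the procedure concrete, here is a minimal backward-induction sketch on a made-up toy tree (obviously not chess itself, whose tree wouldn’t fit in any computer; the node encoding is my own):

```python
# A minimal sketch of backward induction on an abstract game tree.
# A node is either a terminal outcome or (player_to_move, list_of_children).
# Outcomes are scored from White's perspective: 1 = White wins,
# 0 = draw, -1 = Black wins. White maximizes, Black minimizes.

def solve(node):
    """Return the outcome of the subtree under optimal play by both sides."""
    if isinstance(node, int):          # terminal node: the outcome itself
        return node
    player, children = node
    values = [solve(child) for child in children]
    return max(values) if player == "White" else min(values)

# Tiny made-up tree: White chooses between a branch where Black can
# force a win (-1) and a branch that is a guaranteed draw.
tree = ("White", [
    ("Black", [1, -1]),   # Black will pick -1 here
    ("Black", [0, 0]),    # either way, a draw
])

print(solve(tree))  # 0 -- with best play, this toy game is a draw
```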

Unfortunately, while this works well in theory, in practice it is virtually impossible. Given the combinatorial explosion of positions in chess, the computation needed to determine which possibility holds is infeasible. I guess we’ll be stuck with just a good game of chess.


[1] That is, in theory; the actual game tree is WAY too big to depict.


What do kittens have to do with rising tuition?

If you read the Financial Times, you might suspect from an article on Monday that kittens have something to do with rising tuition and Prisoners’ Dilemmas. Let me assure you that they don’t.

A friend of mine sent me the article, which cites a model designed by a team of Bank of America consultants who use the Prisoners’ Dilemma to explain rising college tuition. Here is the graphic they used:

[Figure: a kitten]

Fig. 1: Things that are pairwise irrelevant to each other: a kitten, the Prisoner’s Dilemma, and rising tuition.

They explain that the college ranking system (assuming two colleges) is a zero-sum game: if one college moves up, the other moves down. “A college can move up in the rankings if it can raise tuition and therefore invest in the school by improving the facilities, hiring better professors and offering more extracurricular activities.” This, they conclude, is why college tuition has been rising and why student debt will continue to rise.

First glaring problem: (raise, raise) is a Pareto-optimal outcome as they’ve set up this game, but what they probably meant to say is that it is a Nash equilibrium; or maybe they meant that “raise” is the best response for each college. Note that in this game, (don’t raise, don’t raise) is also Pareto-optimal (but not a Nash equilibrium)!
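To see the distinction concretely, here’s a sketch with made-up payoffs (the article doesn’t give numbers, so these are purely illustrative) that checks each outcome for both properties:

```python
# A sketch of the two-college tuition game with made-up payoffs.
# Action 0 = don't raise, action 1 = raise;
# payoffs[(a, b)] = (college A's payoff, college B's payoff).
payoffs = {
    (0, 0): (3, 3),   # both hold tuition: cheap and tied in the rankings
    (0, 1): (1, 4),   # B raises, invests, and passes A
    (1, 0): (4, 1),
    (1, 1): (2, 2),   # both raise: tied again, but students pay more
}

def is_nash(a, b):
    """No college gains by unilaterally switching actions."""
    return (payoffs[(a, b)][0] >= payoffs[(1 - a, b)][0]
            and payoffs[(a, b)][1] >= payoffs[(a, 1 - b)][1])

def is_pareto_optimal(a, b):
    """No other outcome is at least as good for both and different."""
    ua, ub = payoffs[(a, b)]
    return not any(x >= ua and y >= ub and (x, y) != (ua, ub)
                   for x, y in payoffs.values())

for cell in payoffs:
    print(cell, "Nash:", is_nash(*cell), "Pareto:", is_pareto_optimal(*cell))
# (1, 1) is the unique Nash equilibrium but is not Pareto-optimal;
# (0, 0) is Pareto-optimal but not an equilibrium -- the standard
# Prisoner's Dilemma structure the consultants were reaching for.
```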

Secondly, they’re trying to illustrate a kind of ratcheting problem: both colleges raise tuition to raise the quality of the resources at the school, in order to maintain their rankings. But this means it’s a repeated game. In a finitely repeated Prisoner’s Dilemma, backward induction forces defection at every stage, while with an infinite horizon, cooperation can be sustained. So let’s assume this is an infinite-horizon game, which is what the folks at B of A are assuming when they predict that college tuition will keep rising indefinitely, beyond mere inflation. What incentive is there to cooperate and keep tuition low? According to this game, none. And according to what you might expect in reality, none: is it plausible that, in the absence of antitrust laws, colleges would want to collude to keep tuition low, and that because they can’t collude, they are doomed to raise tuition every year against their wills? Nope.

Then there’s the fact that this game can’t actually have an infinite horizon as presented here. The simple reason is that, even if education is becoming a larger and larger share of household spending, and even if the student takes out loans and borrows against his future expected earnings, he still has a budget set that he can’t exceed. Furthermore, demand for attending a particular university should drop as soon as tuition exceeds the expected lifetime earnings/utility advantage of whatever the student sees himself doing in four (or more) years over the alternative. So there will be some stage at which the utilities change and it becomes a best response for neither school to increase its tuition. It’s a finite-stage game, and the increase will stop somewhere; namely, where price theory says it should. [1]

Finally, it’s not clear that raising tuition actually has such a strong effect on school rankings, or that colleges are locked in such a huge rankings race. And even if students outside the very top schools tend to choose a college based on things like food quality and dorm rooms, students don’t demand infinitely luxurious college experiences at infinite prices. Evidence: Columbia students feel they’re overpaying for food, and feel entitled to steal Nutella.

The lessons here are these: It’s not a Prisoner’s Dilemma in any strong sense if the cooperative outcome isn’t strictly preferred to the Nash equilibrium. Don’t build a tenuous model when the game isn’t relevant to the ultimate result (tuition will stop rising at some point). Don’t assume that trends are linear when they are definitively not linear. And don’t put a kitten on your figure just because you have some white space; it really doesn’t help.

———————

[1] Actually, the game doesn’t have to be finite horizon. Suppose the upper limit that the colleges know they can charge is A, and the tuition at stage t is B_t. Then at each stage they could raise tuition by (0.5)*(A - B_{t-1}). As tuition approaches A, the increases become smaller and smaller until they effectively vanish, which is the same as stopping: there comes a time at which tuition stops affecting rank (a college isn’t going to improve its rank by charging each student an extra cent).
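To spell out the geometry of that footnote: if B_{t}=B_{t-1}+\tfrac{1}{2}(A-B_{t-1}), then

A-B_{t}=\tfrac{1}{2}(A-B_{t-1})=\left(\tfrac{1}{2}\right)^{t}(A-B_{0}),

so the gap to the ceiling halves each stage and tuition converges to A, even though no stage is officially the last.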


Picking off the runner

You gotta keep him close to first. Otherwise he might as well walk to second. Don’t give him anything for free.

But what’s the best way to do that? How often should one throw over to the first baseman to make sure that he doesn’t steal?

Let’s model this through the payoff to the pitcher and the runner. The runner, obviously, wants to steal second, or advance an extra base on a ball put in play. The pitcher wants the exact opposite: he wants to prevent that from happening. To do so, he will attempt to pick off the runner at first if he gets too greedy.

So let’s now assign values to this, as a function of how big a lead the runner takes (x), and how often the pitcher throws over to first (t). We say that the runner gets payoff u from gaining an extra base, and does so with probability F(x), with \frac{\partial F(x)}{\partial x}>0. Meanwhile, the pitcher gets payoff v from picking off the runner, which occurs with probability P(x,t), where \frac{\partial P(x,t)}{\partial x} > 0 and \frac{\partial P(x,t)}{\partial t} > 0 for x>0, while P(0,t)=0. In other words, the probability you get picked off is greater the bigger a lead you take and the more often the pitcher throws over to first, assuming you’re not actually standing on the bag. Finally, since you win in baseball if and only if the other team loses, we model this as a zero-sum game.

If this were our entire expression, it would be obvious that the pitcher should throw to first base as much as possible. In other words, he should try to pick off the runner at first at literally every single moment, and ignore the batter at home. So why don’t we see this?

The reason is that the pitcher is only human. Every time he throws the ball to first, there’s a chance that it will get away from him and sail over the first baseman’s head. If so, the runner gets his extra base anyway as the team on the field scrambles to recover the ball. This means that throwing more frequently creates more opportunities to screw up. Assuming the probability of throwing the ball away is the same on each throw, this gives the runner an additional expected payoff of ut from the pitcher’s throws over to first. Thus, our expressions for the payoffs of the runner and the pitcher are:

\pi_{r}=uF(x)+ut-vP(x,t)

\pi_{p}=vP(x,t)-uF(x)-ut

A Nash equilibrium occurs where the pitcher cannot make his team better off by changing how often he throws over, and the runner cannot improve his chances of taking the extra base without exposing himself to too much risk. Thus, in equilibrium:

u\frac{\partial F(x)}{\partial x}-v\frac{\partial P(x,t)}{\partial x}=0

u-v\frac{\partial P(x,t)}{\partial t}=0

Of course, it’s possible that the pitcher is so good at picking people off that the runner doesn’t dare step off the bag, in which case we have an equilibrium which doesn’t satisfy the above two equations. Alternatively, the runner could be so good that no matter how often the pitcher tries to pick him off, he gets the extra base anyway, in which case the equations don’t hold either. But these cases are not normally found in the big leagues, so we’ll leave it as is.
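For the curious, here’s a numerical sketch of those two equilibrium conditions with made-up functional forms and payoffs (F(x) = x, P(x,t) = x^{2}t, u = 1, and v = 4 are pure assumptions for illustration, not estimates from actual baseball):

```python
# Solve the two first-order conditions numerically for toy functional forms:
# F(x) = x is the runner's chance of taking the extra base,
# P(x, t) = x^2 * t his chance of being picked off, with u = 1, v = 4.
from scipy.optimize import fsolve

u, v = 1.0, 4.0

def first_order_conditions(vars):
    x, t = vars
    dF_dx = 1.0           # F(x) = x
    dP_dx = 2.0 * x * t   # from P(x, t) = x^2 * t
    dP_dt = x ** 2
    return [u * dF_dx - v * dP_dx,   # runner: marginal gain = marginal risk
            u - v * dP_dt]           # pitcher: wild-throw cost = pick-off gain

x_star, t_star = fsolve(first_order_conditions, [0.5, 0.5])
print(f"lead size x* = {x_star:.3f}, throw-over rate t* = {t_star:.3f}")
# -> x* = 0.500, t* = 0.250 with these toy functional forms
```

With these forms, the runner takes a moderate lead and the pitcher throws over on a quarter of his opportunities; different assumed forms would move the numbers, not the logic.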


How to Win at “Shotgun” (or not)

I’ve known about this game for quite a while. I learned it in camp as a kid, and loved it at the time. In college, some of my friends rediscovered this game, and would play it at the dining hall table. So, I thought I would write about it in a post.

This game is a little more obscure than some of the others I’ve covered, so let me go through some of the details (which can be found here, anyway). Both players move simultaneously: you slap your lap twice, then do one of three moves: reload, shoot, or shield. Reloading gives your gun an additional bullet. Shooting means you’re trying to kill the other guy (you can only shoot if you have at least one bullet). Shielding blocks the (potential) bullet the other guy is shooting at you. You win by shooting the other guy while he’s reloading; if you shoot each other simultaneously, the game ends in a draw.

Let’s assign the winner 1 point, the loser 0 points, and each player 1/2 a point in case of a draw. That seems pretty reasonable, right?

So, first thing, we can see that there’s a very simple subgame perfect Nash equilibrium (SPNE): each player always shields, no matter how many bullets either player has. That this is an SPNE is pretty clear: nobody can ever do better by doing something else, since they’ll never get the opportunity to successfully shoot the other guy anyway; his shield will always be up. Thus the game ends in a draw (or goes on forever, you pick). It’s pretty boring, but it works.
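Here’s a quick computational sanity check of that claim (the move encoding and the five-round horizon are my own choices for illustration, not part of the official rules):

```python
# Against an opponent who always shields, every possible line of play
# ends in a draw; we verify by brute force over all length-5 sequences.
from itertools import product

def play(moves_1, moves_2):
    """Return 'p1', 'p2', or 'draw' for two equal-length move sequences."""
    bullets = [0, 0]
    for m1, m2 in zip(moves_1, moves_2):
        can_shoot_1 = m1 == "shoot" and bullets[0] > 0
        can_shoot_2 = m2 == "shoot" and bullets[1] > 0
        if can_shoot_1 and can_shoot_2:
            return "draw"            # simultaneous shots: a draw
        if can_shoot_1 and m2 == "reload":
            return "p1"              # caught reloading
        if can_shoot_2 and m1 == "reload":
            return "p2"
        bullets[0] += 1 if m1 == "reload" else -1 if can_shoot_1 else 0
        bullets[1] += 1 if m2 == "reload" else -1 if can_shoot_2 else 0
    return "draw"                    # nobody got shot: call it a draw

shield_forever = ["shield"] * 5
outcomes = {play(seq, shield_forever)
            for seq in product(["reload", "shoot", "shield"], repeat=5)}
print(outcomes)  # {'draw'} -- nothing beats 1/2 a point against a shielder
```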

The question is, can we come up with something more interesting? Can anyone actually ever win in an SPNE? As we’ll see, the answer is NO.

Notice that in any SPNE, neither player’s expected score can ever fall below 1/2: if it did, that player could do better by simply shielding forever, which guarantees at least 1/2 a point. Since the total number of points is 1, this means that each player can always expect exactly 1/2 a point.

This immediately implies that neither player ever reloads with positive probability while the other guy has a bullet in the chamber of his gun. If there were the slightest chance that he would, the other guy could shoot him and guarantee himself an expected score greater than 1/2: with that small probability the victim is caught reloading and the shooter wins outright (one point), while if the shot is instead blocked by a shield, the shooter can simply shield forever afterward and still collect at least 1/2. But this contradicts the previous paragraph: in any SPNE, each player’s expected score is exactly 1/2. Thus in any SPNE, no one ever reloads when there’s a possibility of being shot.
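In symbols: if at some point you hold a bullet and your opponent reloads with probability \epsilon>0, then shooting (and shielding forever afterward if the shot is blocked) earns you at least

\epsilon\cdot 1+(1-\epsilon)\cdot\tfrac{1}{2}=\tfrac{1}{2}+\tfrac{\epsilon}{2}>\tfrac{1}{2},

which is the contradiction.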

But this means that no one ever wins. Sounds pretty boring. I don’t think I’ll be playing this anymore.


How did the chicken cross the road?

While reading an open newspaper, according to my host for lunch a couple of weeks ago. Apparently this is the most efficient way for a pedestrian to cross a street in Boston, where signs are few and confusing and drivers aren’t fond of traffic laws. (Sound like any other city we know?) An open paper is a big gray signal to the driver that you have no idea she’s there, and if she doesn’t want to run you over, she has to stop. It’s not unlike ripping out your steering wheel and waving it out the window in a game of Chicken.

The city of Philadelphia is aware of this, of course, and the Philly police are cracking down on pedestrians who text and walk. Clearly a cell phone is just not a visible enough signal.

Disclaimer: Play at your own risk, and look both ways. We do not endorse breaking traffic laws.


How much should you wager in Final Jeopardy?

I think you all know how this one works, but let me sketch it out again anyway. You’ve been answering trivia questions, and now you have some dollar score going into the last round; your two opponents have been doing the same. In the last round, you’re asked one question, for which you write down your answer, and how much you’re willing to bet on it, up to the amount of money you have. Your opponents do the same. You’re trying to get as much money as you can from this game, so winning is not all that matters: you want both to win and to end with as high a score as possible. The question is: how much should you wager? Your mother, Trebek.

We can model this by positing that players 1, 2, 3 have scores of D_{1}, D_{2}, D_{3}, respectively, going into the final round. They expect to answer correctly with probabilities p_{1}, p_{2}, p_{3}, respectively; we assume that these values are common knowledge. The players’ wagers are w_{1}, w_{2}, w_{3}, respectively, where w_{i}\leq D_{i} for each i.

We define F_{i}(N_{i}|D_{-i},s_{-i}) to be the probability that player i wins with final score N_{i} (after the question has been answered correctly or incorrectly), given the strategies s_{-i} of the other players and their scores D_{-i} going into Final Jeopardy. Obviously, since the person with the highest score wins, you have a better chance of winning with a higher final score. Thus F_{i}(\cdot|D_{-i},s_{-i}) is nondecreasing in N_{i}, no matter what s_{-i} and D_{-i} are.

Claim: If p_{i}\geq\frac{1}{2}, then it is a dominant strategy to bid all your money; namely, w_{i}=D_{i}.

Proof of the claim: Fix strategies s_{-i} for the other players, and suppose you wager w_{i}. Then your expected winnings are

\Pi_{i}(w_{i},D_{i}|D_{-i},s_{-i})=(D_{i}-w_{i})F_{i}(D_{i}-w_{i}|D_{-i},s_{-i})(1-p_{i})+(D_{i}+w_{i})F_{i}(D_{i}+w_{i}|D_{-i},s_{-i})p_{i}.

Meanwhile, if you wager D_{i}, you can expect to win 2D_{i}F_{i}(2D_{i}|D_{-i},s_{-i})p_{i}. Notice, though, that since F_{i}(\cdot|D_{-i},s_{-i}) is nondecreasing, and p_{i}\geq\frac{1}{2},

\Pi_{i}(w_{i},D_{i}|D_{-i},s_{-i})\leq(D_{i}+w_{i})F_{i}(D_{i}+w_{i}|D_{-i},s_{-i})p_{i}+(D_{i}-w_{i})F_{i}(D_{i}+w_{i}|D_{-i},s_{-i})p_{i}

=2D_{i}F_{i}(D_{i}+w_{i}|D_{-i},s_{-i})p_{i}\leq2D_{i}F_{i}(2D_{i}|D_{-i},s_{-i})p_{i}.

Hence, regardless of what D_{-i} and s_{-i} are, you maximize your expected winnings by bidding all you’ve got. Indeed, this argument is so general that it applies no matter how many people are playing Final Jeopardy against you – as long as you think you’re more likely to get it right than wrong, you should bid everything.

Now what should you do if you’re not so certain you’re going to get it right? If both other players are still more likely than not to answer correctly, then what you should do is pretty straightforward: you know what they’re going to wager (everything they’ve got), so you can compute F_{i}(\cdot|D_{-i},s_{-i}) exactly. All you then have to do is choose the value of w_{i} that maximizes \Pi_{i}(w_{i},D_{i}|D_{-i},s_{-i}), as defined above.
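To make that concrete, here’s a sketch of the computation with made-up scores and answer probabilities (the numbers, and the simplification of counting ties as losses, are my assumptions, not part of the show’s rules):

```python
# Optimal wager when both opponents (who have p > 1/2) bet everything.
# Scores D and probabilities p are made up for illustration.
from itertools import product

D = [10000, 5000, 4000]   # you are player 0, going into Final Jeopardy
p = [0.2, 0.7, 0.6]       # both opponents are likelier right than wrong

def win_prob(my_final):
    """Chance my_final strictly beats both all-in opponents."""
    total = 0.0
    for right_1, right_2 in product([True, False], repeat=2):
        s1 = 2 * D[1] if right_1 else 0
        s2 = 2 * D[2] if right_2 else 0
        prob = ((p[1] if right_1 else 1 - p[1])
                * (p[2] if right_2 else 1 - p[2]))
        if my_final > s1 and my_final > s2:
            total += prob
    return total

def expected_winnings(w):
    """Pi_i from the text: money kept times the probability of winning."""
    return ((D[0] + w) * win_prob(D[0] + w) * p[0]
            + (D[0] - w) * win_prob(D[0] - w) * (1 - p[0]))

best = max(range(D[0] + 1), key=expected_winnings)
print(best, round(expected_winnings(best), 2))
# -> 1 4399.96: with a weak answer (p = 0.2), wager just enough to clear
#    the best opponent's all-in score if right, and stay rich if wrong.
```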

Unfortunately, the equilibrium analysis is in general not very tractable for a 3-player round of Final Jeopardy; it doesn’t yield a nice closed-form solution (or at least, I can’t find one). So, while you should go by the advice above (assuming the probabilities really are commonly known), I can’t say much about what will happen in general. Also, be aware that the advice doesn’t hold if the probabilities are not commonly known, in which case you would need to consider the entire hierarchy of beliefs.

In the spirit of academic honesty, I am heavily indebted to the online Google Books edition of “Strategy and Games: Theory and Practice” by Prajit K. Dutta for the analysis of the case p_{i}\geq\frac{1}{2}, though the proof I give of the strategic dominance of wagering everything is more formal than his. The analysis for p_{i}<\frac{1}{2} is my own, and so should probably be taken with a grain of salt.

Edit: A previous edition of this post said that Final Jeopardy violated the conditions of the Kakutani fixed point theorem, and so there was likely no equilibrium. While it is true that these conditions are violated, it is not for the reason claimed, and so there actually will be an equilibrium, as demonstrated by Dasgupta & Maskin (1986). It will likely involve ties in some cases, for reasons that I’m too lazy to show.