space and games

September 21, 2009

Will conditional commitments change politics?

Filed under: Voting — Peter de Blanc @ 12:44 pm

If you think that Condorcet voting would be a good thing, then you should also be in favor of voter collusion, such as Nader Trading, or political analogues of Facebook groups like “Once we reach 4,096 members, everyone will donate $256 to SingInst.org” or “1 million people, $100 million to defeat aging.” When groups can collectively change strategies to benefit group members, the global situation will start to look like a cabal equilibrium, and cabal equilibria in plurality or approval votes always elect a Condorcet winner. Of course, this only works to the extent that people follow through on their promises.

September 11, 2009

Base Rates: A Cautionary Tale

Filed under: General — Peter de Blanc @ 3:01 pm

The other day, I was reading a wikipedia article related to a topic we had been discussing in one of my classes. One of the statements in the second section confused me, and after a bit of thought I was convinced that it was indeed a mistake. Looking at the history, I noticed that this mistake was the result of an edit that had been made the day before.

Naturally, I reverted the article to the previous version. Looking at the history again, I noticed that the mistake had come from someone with an IP address very similar to my own. A quick search revealed that this person was in Philadelphia.

I decided that I was about 60% sure that it was someone in my class. Immediately I singled out a single person with 30% confidence.

There are about 1.5 million people in Philadelphia. There are about 15 people in my class. It would take a likelihood ratio of about 100,000 to pick out my class, and a likelihood ratio of about 1.5 million to pick out one person.

In class the next day, when I asked if anyone had edited wikipedia recently, they all said no.

And that’s how I lost 1.3 bits from my Bayes score.

August 31, 2009

Summer Research, Singularity Summit

Filed under: Decision Theory, General — Peter de Blanc @ 3:43 pm

This summer, I was involved in a summer research program at the Singularity Institute. Here we are:

SIAI Lunge

While I was there, I wrote a follow-up to my old Expected Utility paper. The new paper says basically the same thing as the old paper, but for repeated decisions rather than one-off decisions.

Roko Mijic and I have also started a paper about the problem of generalizing utility functions to new models – the sort of problem I call an “ontological crisis.” These situations arise for humans when we discover that the goals and values which we ascribe to ourselves do not correspond to objects in reality. Obvious examples include god, souls, and free will, but we’re just as interested in how AIs can deal with more mundane problems such as generalizing the notion of “temperature” from a classical to a quantum model. Unfortunately, we didn’t have time to finish the paper this summer, but you can expect to see it soon.

Towards the end of the summer, I made a few resolutions for the new year (as a grad student, my year starts in late August). In particular, I’ve resolved to write a popular blog. In the short term this will mean reducing the quality of my posts in exchange for much greater quantity, but in the long term I expect quality to rise again as I gain more experience writing. I’ll probably write about some of my less serious projects, such as the computer game I’ve been developing on and off for the past year.

In other news, the Singularity Summit will be in New York this year, on October 3-4. Anyone who wants to chat me up can do so at the summit, if you can find me among the horde of attendees. See you there!

March 4, 2009

A Stronger Type of Equilibrium

Filed under: Decision Theory, Voting — Peter de Blanc @ 5:57 pm

In game theory, a Nash equilibrium is an assignment of strategies to players, such that no player can benefit from changing vis own strategy (assuming the other players’ strategies remain the same). For example, in the one-shot Prisoner’s Dilemma, mutual defection is the only Nash equilibrium. There are several other ideas for how to define an “equilibrium” of a game, many of which are “refinements” of Nash Equilibrium, in the sense that the set of equilibria is smaller.

Here’s another one: a “cabal equilibrium” is an assignment of strategies to players, such that no subset (or “cabal”) K of players can simultaneously change strategies in such a way that some players in K benefit and no players in K are worse off. In other words, each cabal of players is playing a strategy which is Pareto optimal for the players within that cabal. Note that any subset of players is considered a cabal, so if there are N players, then there are 2N cabals.

Any cabal equilibrium is also a Nash equilibrium. A single player is just a cabal of size one, so if no cabal of players wishes to change strategies, then single no player wishes to change strategy. So this definition of equilibrium is “stronger” than the Nash equilibrium, in the sense that it is a more stringent test.

For example, the Prisoner’s Dilemma has no cabal equilibrium. First observe that mutual defection is the only Nash equilibrium, so there is at most one cabal equilibrium. But mutual defection is not an cabal equilibrium, because both players would prefer to simultaneously switch their strategy to cooperation.

Stag hunt has two Nash equilibria but only one cabal equilibrium. Battle of the Sexes has two Nash equilibria, both of which are cabal equilibria.

Mike Dobbins and I talked about cabal equilibria in voting. Because the outcome of an election depends on the voters in such a coarse way, the Nash equilibrium does not seem like a strong enough definition to apply to voting situations. For example, any election in which the winner beats the first loser by more than one vote is a Nash equilibrium, because no single voter can change the outcome by changing vis strategy. So not only are there multiple Nash equilibria, but in fact almost any election is a Nash equilibrium.

So let’s consider cabal equilibria in a plurality or approval vote. Suppose there is an equilibrium, and this equilibrium elects candidate A.

Suppose that more than half of the voters preferred B over A. Then they could all vote for B (and disapprove of A in an approval vote), resulting in a better outcome from their point of view. Then the original election was not a cabal equilibrium: contradiction! So there is no candidate that is preferred over A by a majority.

So if we ignore the possibility that exactly half the voters prefer some other candidate over A, this would mean that A is a Condorcet winner!

Now let’s consider the converse. Suppose that there is no cabal equilibrium. Then, in particular, it is not a cabal equilibrium for everyone to vote for the same candidate A. Then there exists some cabal K of voters which could make a Pareto improvement by changing votes. In order for the new strategy to be a Pareto improvement, it must affect the outcome of the election somehow; thus it must cause some other candidate B to be elected instead of A. In order to change the outcome of the election, we must have changed at least half of the votes (since originally everyone was voting for A), so K must contain at least half of the voters. Since no member of K prefers B over A, we know A is not preferred over B by a majority, and thus is not a Condorcet winner.

So if there is no cabal equilibrium, then A is not a Condorcet winner. Since the same argument works for any other candidate (not just candidate A), this means that no cabal equilibrium implies no Condorcet winner. Taking the contrapositive, this means that in an election with a Condorcet winner, there is a cabal equilibrium!

So we’ve just proven a theorem: A plurality or approval vote with an odd number of voters has a cabal equilibrium iff there is a Condorcet winner, and a cabal equilibrium always elects the Condorcet winner.

January 20, 2009

Intensional vs. Extensional Goals

Filed under: General — Peter de Blanc @ 12:39 am

Two types of goals for an agent are intensional and extensional goals. An intensional goal can be defined in purely mathematical terms, while an extensional goal depends on the universe in which the agent finds itself.

Some examples of intensional goals:

  • Find a prime number at least 200 bits long.
  • Prove Fermat’s Last Theorem.
  • Unscramble a Rubik’s Cube.
  • Fill in a Sudoku puzzle.

Some examples of extensional goals:

  • Predict the orbit of Mercury.
  • Drive a car across the Mojave Desert.
  • Win a trivia game.
  • Earn at least $500.

If we were coding a Go AI, we could try to build it to achieve either an extensional or an intensional goal. The obvious extensional goal is “win the game.” One possible intensional goal is “output a move that a minimax player would output.” In both cases we would probably include some sort of time limit.

“Output a move that a minimax player would output” stands out from the other examples of intensional goals listed above. In all of the other examples, the agent can be sure that its output is correct before it returns, but if I tell you to “output a move that a minimax player would output,” I haven’t given you an implementable procedure for checking whether you’ve achieved the goal.

It’s not so hard to think of other intensional goals with this property. Instead of asking an agent for a proof of Fermat’s Last Theorem, I could ask it to output 1 if a proof exists, and 0 otherwise.

Let’s say that these two are examples of “the hard kind of intensional goal,” and the four listed at the top of the page are “the easy kind of intensional goal.” The “easy” one are not necessarily easier individually than the “hard” ones; it’s easier for me to output a 1 than to output a proof of Fermat’s Last Theorem. But the “easy” ones are easier to think about, and as a class they’re easier.

In fact, the “hard” intensional goals are so hard that, from a certain point of view, they include the extensional goals! Take any extensional goal, and replace all the unknown parts with a probability distribution (perhaps based on algorithmic complexity), and you have an intensional goal.

Well, that’s a bit of a cop-out, because I don’t actually know how to do that for any of the extensional goals I listed at the top of the page. But we can do it for any goal for which we have already coded a success-detector – a piece of code which can determine, after the fact, if we have achieved our goal. In the Go example, we can do this. The Go AI interacts with its opponent through a a predefined protocol in which the AI outputs its moves and the opponent inputs vis moves to the AI. The board state can be built up from the list of moves, and so after any hypothetical series of plays, the AI can determine whether the game is over, and who won.

So to reformulate “win the game” as an intensional goal, we can suppose that our opponent’s moves are generated by some unknown Turing machine drawn from a given probability distribution. We use the list of moves which have occurred so far to do a Bayesian update on this distribution. Then any possible policy for generating moves has a probability of winning, and we output the move recommended by the policy with the greatest winning probability.

This way we can specify a program (called AIXI) which, if run on a big enough computer, would output what we want to output. And then our goal can be intensionally defined as “output whatever AIXI would output.”

A minimax player is a tall order. AIXI is an even taller order. We can’t actually run these programs, but we want to output, in a reasonable amount of time, whatever they would output. This may require uncertain reasoning about mathematics.

December 9, 2008

What Makes a Hint Good?

Filed under: General — Peter de Blanc @ 1:40 am

Nick Hay, Marcello and I discussed this question a while ago: if you had a halting oracle, how could you use it to help you prove a theorem, such as the Riemann Hypothesis? Let’s say you are only allowed to ask one question; you get one bit of information.

Of course, you might simply ask if the Riemann Hypothesis is true. If you trust your oracle, this may be good enough for your purposes. But let’s suppose what we really care about is the proof. Maybe we want to win a prize for our proof, and they won’t accept a proof by oracle.

Clearly, a complete proof of the Riemann Hypothesis is longer than one bit, so we can’t ask the oracle to output the proof for us. So what can we do? Well, we could ask it for a hint. Maybe we have some conjecture (which we can call Conjecture A) that could be useful in proving the RH. But we don’t know if A is true, nor do we know if proving A would actually help us prove the RH. So we could ask the oracle if it’s worth our time to try to prove A.

What we really want to know is if we would save time by trying to prove A, and then trying to use A to prove the RH, compared to some other strategy. To figure out how much time it would take, we need some sort of model of how we write proofs. Here’s a really simple model: “mathematicians write proofs by brute force; they look at every possible proof, from short to long, until finding one which proves their goal hypothesis, at which point they halt.”

Let’s suppose this model is true (lol). Let’s define the “difficulty” of a theorem as the length of its shortest proof. If our proofs are written in binary, then to prove a theorem of difficulty D, it takes time t = O(2D).

Let’s say D(RH) is the difficulty of the Riemann Hypothesis, D(A) is the difficulty of Conjecture A, and D(RH | A) is the difficulty of proving the RH once we have proved A. Then a mathematician who attempted to prove A before proving the RH would take time tA,RH ~ 2D(A) + 2D(RH | A), while a mathematician who attempted to prove the RH directly would take time tRH ~ 2D(RH).

So we could ask the oracle:

True or False? 2D(A) + 2D(RH | A) < 2D(RH)

If the oracle returns True, then we would start thinking about how to prove A.

March 24, 2008

Ordinal Performance Games

Filed under: Decision Theory — Peter de Blanc @ 9:56 pm

There are games (such as racing or DDR) in which players compete with one or more opponents, but the opponents do not interact. Each player performs individually, and receives some score for vis performance, and the player with the best score wins. I’ll call these “ordinal performance games.”

One might naively think that the best strategy is to ignore one’s opponents, and think only of maximizing one’s own score. After all, the players never interact, so your score depends only on your performance, right? That’s true, but the object is not to maximize your own score; it is to have a higher score than your opponents, and this depends on your opponent’s score as well.

For example, suppose we have a game with three player types, and possible scores of 1 through 5.

  • Alf is consistently mediocre, always receiving a score of 3.
  • Beth has adopted a risky strategy; she receives a score of 4 with probability 0.6 and a score of 1 with probability 0.4.
  • Gam may succeed spectacularly, but usually plays poorly. He receives a score of 5 with probability 0.4 and a score of 2 with probability 0.6.

Suppose Alf and Beth play. Alf will always score 3. Beth has a probability of 0.6 of scoring 4, and thus winning, and a probability of 0.4 of scoring a 1, and thus losing. So p(Alf < Beth) = 0.6.

Suppose Beth and Gam play. Beth has a probability of 0.4 of scoring 1, in which case Gam is guaranteed to win. The other way for Gam to win is for Beth to score a 4 while Gam scores a 5; the probability of this is 0.4 * 0.6 = 0.24. Thus p(Beth < Gam) = 0.4 + 0.24 = 0.64.

Suppose Gam and Alf play. Gam can only win by scoring a 5, with probability 0.4. So p(Gam < Alf) = 0.6.

So Alf usually loses to Beth, who usually loses to Gam, who usually loses to Alf. This shows that your optimal strategy in an ordinal performance game can depend on your opponents.

January 9, 2008

Infinite Certainty

Filed under: General — Peter de Blanc @ 9:26 pm

On Overcoming Bias, Eli says:

I don’t think you could get up to 99.99% confidence for assertions like “53 is a prime number”. Yes, it seems likely, but by the time you tried to set up protocols that would let you assert 10,000 independent statements of this sort – that is, not just a set of statements about prime numbers, but a new protocol each time – you would fail more than once. Peter de Blanc has an amusing anecdote on this point, which he is welcome to retell in the comments.

Here’s the anecdote:

Conversation with squallmage at 2006-07-28 23:46:54 on OneTrueCalculus (aim)
(23:47:19) OneTrueCalculus: oi
(23:47:23) SquallMage: oi
(23:47:33) OneTrueCalculus: how likely would you rate it that 81,241 is prime?
(23:47:59) SquallMage: very
(23:48:03) OneTrueCalculus: you can go ahead and calculate it if you want, or just give me a probability
(23:49:03) OneTrueCalculus: obviously this would be the subjective sort of probability
(23:49:11) SquallMage: yes.
(23:50:30) OneTrueCalculus: well?
(23:50:35) OneTrueCalculus: or are you busy calculating?
(23:50:38) SquallMage: I said ‘very’.
(23:50:42) OneTrueCalculus: ah
(23:50:46) SquallMage: Quantify that if you must, I’m too tired to.
(23:50:55) OneTrueCalculus: ok
(23:51:10) OneTrueCalculus: I won’t quantify it for you, since I don’t know how you are calibrated
(23:51:22) OneTrueCalculus: okay, how about 7?
(23:51:46) SquallMage: the probability of it’s primacy, or as a quantification of ‘very’?
(23:51:55) OneTrueCalculus: the probability of 7 being prime
(23:52:02) SquallMage: 7 is prime.
(23:52:08) OneTrueCalculus: with probability 1?
(23:52:18) SquallMage: Yes.
(23:52:41) OneTrueCalculus: so will you accept this deal? If you ever find out that 7 is not prime, you will give me $100.
(23:53:10) SquallMage: Only if you explain to me in detail what brought you to propose that deal to me.
(23:53:37) OneTrueCalculus: I am trying to swindle you out of your cash and/or teach you a valuable lesson
(23:54:09) OneTrueCalculus: also, I am trying to figure out how overconfident people are
(23:55:42) SquallMage: Well. I would make a deal with you that I would give you $100 if it were ever proven to me that it was possible in base-ten to generate the quantity 7 by multiplying together any two integers other than 1 and 7.
(23:55:55) OneTrueCalculus: okay
(23:56:05) OneTrueCalculus: I am only talking about the standard natural numbers. No weird groups or anything
(23:56:20) OneTrueCalculus: base 10
(23:56:24) SquallMage: No, ‘7 and 1′ is not different than ‘1 and 7′ also.
(23:56:32) OneTrueCalculus: of course not
(23:56:39) SquallMage: Just checking.
(23:56:42) OneTrueCalculus: I wouldn’t use a cheap technicality like that
(23:56:44) SquallMage: Since I’m a language bastard.
(23:56:49) SquallMage: And I completely would.
(23:56:54) OneTrueCalculus: okay
(23:57:13) OneTrueCalculus: If you even honestly feel that it was a cheap technicality, I wouldn’t expect you to pay me.
(23:57:22) SquallMage: Anyways.
(23:57:23) OneTrueCalculus: Under those conditions, will you accept my offer?
(23:57:33) SquallMage: Yes.
(23:57:37) OneTrueCalculus: Okay.
(23:57:44) SquallMage: Tell me how I owe you $100 now
(23:57:48) OneTrueCalculus: you don’t
(23:57:52) SquallMage: Good.
(23:57:56) OneTrueCalculus: now
(23:58:03) OneTrueCalculus: will you accept the same offer, but for 11 this time?
(23:58:05) SquallMage: Not for 81241
(23:58:34) OneTrueCalculus: i.e. if you ever find out that 11 is not prime, you will give me $100
(23:58:49) SquallMage: I would make that deal under equivalent conditions
(23:59:00) OneTrueCalculus: okay. I’m asking you to make that deal.
(23:59:44) SquallMage: I did say that I would.
(00:00:00) OneTrueCalculus: okay, thanks
(00:00:04) OneTrueCalculus: how about 13?
(00:00:20) SquallMage: Yes.
(00:00:48) SquallMage: 17 also, and 19, and 23.
(00:00:54) OneTrueCalculus: thanks.
(00:00:57) OneTrueCalculus: What about 27?
(00:01:02) SquallMage: No thanks.
(00:01:05) OneTrueCalculus: 29?
(00:01:11) SquallMage: sure.
(00:01:13) OneTrueCalculus: 31?
(00:01:17) SquallMage: Yep.
(00:01:20) OneTrueCalculus: 33?
(00:01:24) SquallMage: Nah.
(00:01:26) OneTrueCalculus: 37?
(00:01:29) SquallMage: Yah.
(00:01:31) OneTrueCalculus: 39?
(00:01:35) SquallMage: Nah.
(00:01:37) OneTrueCalculus: 41?
(00:01:42) SquallMage: Yah.
(00:01:45) OneTrueCalculus: 43?
(00:01:51) SquallMage: Yah.
(00:01:54) OneTrueCalculus: 47?
(00:02:06) SquallMage: Yah.
(00:02:09) OneTrueCalculus: 49?
(00:02:17) SquallMage: Nah.
(00:02:20) OneTrueCalculus: 51?
(00:02:28) SquallMage: Yah.
(00:02:36) OneTrueCalculus: Thank you. I win.
(00:02:48) SquallMage: You know I’ve been up since this time yesterday.
(00:02:51) OneTrueCalculus: You can donate your money to the Singularity Institute for Artificial Intelligence
(00:03:06) SquallMage: I’ll forward you their message of receipt.

Expected Utilities Paper

Filed under: Decision Theory — Peter de Blanc @ 8:58 am

I still haven’t published it, but it’s now on the arXiv.

October 29, 2007

Athena’s Theorem

Filed under: General — Peter de Blanc @ 2:10 pm

In the Odyssey, Athena says to Telemachus:

It’s true few men are like their fathers. Most of them are worse. Only very few of them are better.

Athena was pretty sharp! Of course, “better” and “worse” are rather vague terms, but we can talk more precisely if we consider reproductive fitness, defined as the number of offspring one has. Then we can restate Athena’s theorem as:

The expected fitness of a random organism is ≤ the expected fitness of its parent.

Intuitively, one can reason as follows: the expected fitness of a random organism is the average fitness of the population. But the expected fitness of its parent is above average, because we already know that the parent has at least one child.

A rigorous proof is left as an exercise to the reader.

« Previous PageNext Page »

Powered by WordPress