space and games

June 14, 2010

Some material from my upcoming book

Filed under: General — Peter de Blanc @ 9:15 pm

I’m writing a book about Go strategy. I’ll be comparing moves by professionals to moves by amateurs in the 1-3k range. My hope is that amateur players may improve by trying to imitate the professionals.

Two-Stone Corner Patterns
Three-Stone Corner Patterns
Four-Stone Corner Patterns

March 14, 2010

The War Club vs. the Ant-Men, part 1

Filed under: General — Peter de Blanc @ 11:19 pm

Paron had been at the academy for far too long. He had switched from geometry to biology, and finally to game theory. When at last he finished his thesis, it was a cause for celebration. The party was far from grand; more than twenty people were packed into Paron’s meager, candle-lit apartment. Like organisms, conversations competed with their kin for limited attention and limited air while sleep-deprived gamers competed to dominate virtual markets. I was in my element.

“The strange thing about Ant War,” I mused, “is the player. We’re willing to anthropomorphize ants to the point of substituting our own decisions for theirs in the game, but ant behavior is completely inhuman.”

“It’s not like normal human behavior,” said my opponent, Nik, “but unusual situations can produce unusual behaviors. We’re fighting a virtual war; humans could also fight a real one.”

This comment drew Paron’s attention. “More unusual than you might think. A war requires extreme cooperation. The ants in a colony are all sisters, and they share 3/4 of their genes because their father is basically a glorified sperm cell. To get humans to cooperate like ants, they’d have to be born of incest.”

“You don’t need common end goals to cooperate,” countered Nik, “only common proximate goals. Not to mention the fact that evolution formed our goal systems imperfectly; we may agree on values aside from inclusive fitness.”

“But a war requires two coalitions,” said Paron. “You’d need everyone in your coalition to share proximate goals – and everyone in the opposing coalition to share an opposing proximate goal. And if both coalitions are composed of humans? — what could ever cause that, apart from inclusive fitness?”

I moved my ants, then re-entered the conversation. “What about competing protocols? Maybe one coalition follows one set of rules, and the other coalition follows different rules. Economic productivity would be boosted if everyone followed the same rules, but whichever coalition is forced to convert has to pay a cost.”

“But then you’d just bid on it,” said Paron.

“Why don’t ants bid on land?” asked Nik.

Paron said, “Ants don’t engage in inter-colony trade. They’re not smart enough to trade, so they war instead.”

“Let’s go back to end goals,” said Nik. “I think there are goals that all humans share; we all value love, life, and laughter, and not just for ourselves and our kin. One coalition could be human. The opposing coalition could be inhuman.”

“Like what? Giant ants?” I asked.

Nik said, “Or the broad-shouldered people across the sea.”

January 19, 2010

RPS Equilibrium Conundrum

Filed under: General — Peter de Blanc @ 10:05 pm

Clearly, it’s absurd that paper beats rock, but if rock beat paper then the game would become pointless.

Suppose we changed the rules such that paper only scores 1/2 point against rock, and rock loses only 1/2 point to paper, so the game stays zero-sum. A full victory (rock against scissors or scissors against paper) scores 1 point, and the corresponding loss scores -1 point. Draws score 0 points. What mixed strategy is best in this game?

I found this equilibrium: p(rock) = p(paper) = 2/5, and p(scissors) = 1/5. If the opponent plays this strategy, then anything we do has an expected utility of 0. If both players use this strategy, then neither player has an incentive to change, so it’s an equilibrium.
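The equilibrium claim can be checked with a few lines of Python. This is only a sketch, and it assumes the game is zero-sum, meaning rock loses exactly the 1/2 point that paper gains; the names `PAYOFF` and `expected_payoffs` are made up for illustration.

```python
from fractions import Fraction

MOVES = ["rock", "paper", "scissors"]
HALF = Fraction(1, 2)

# PAYOFF[mine][theirs] is my score when I play `mine` against `theirs`.
# Assumption: zero-sum payoffs, so rock loses only 1/2 point to paper.
PAYOFF = {
    "rock":     {"rock": 0,    "paper": -HALF, "scissors": 1},
    "paper":    {"rock": HALF, "paper": 0,     "scissors": -1},
    "scissors": {"rock": -1,   "paper": 1,     "scissors": 0},
}

def expected_payoffs(opponent_mix):
    """Expected score of each pure move against a mixed strategy."""
    return {
        mine: sum(opponent_mix[theirs] * PAYOFF[mine][theirs] for theirs in MOVES)
        for mine in MOVES
    }

equilibrium = {
    "rock": Fraction(2, 5),
    "paper": Fraction(2, 5),
    "scissors": Fraction(1, 5),
}
uniform = {move: Fraction(1, 3) for move in MOVES}

vs_equilibrium = expected_payoffs(equilibrium)
vs_uniform = expected_payoffs(uniform)
print(vs_equilibrium)
print(vs_uniform)
```

Against the 2/5, 2/5, 1/5 mix every pure move scores exactly 0, which is the indifference condition for an equilibrium. Against the old uniform mix, rock comes out ahead and paper behind.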

This result seems strange to me. The rule change makes paper worse, and yet in the resulting equilibrium, we increase the probability of throwing paper. Who wants to explain this?

December 28, 2009

Germs, Selection, and Disease

Filed under: General — Peter de Blanc @ 10:35 am

The germ that infects you was selected for its ability to spread across hosts, but the germ population in your body is being selected for its ability to spread within your body. The latter is more destructive than the former, so the germ population in your body becomes more destructive over time. Thus for any infectious disease, we should expect the period of maximal transmissibility to precede the period of maximal suffering.

November 4, 2009

Go proverbs: “A rich man should not pick quarrels.”

Filed under: General — Peter de Blanc @ 1:19 pm

Go players have hundreds of proverbs — pithy sentences that convey important heuristics. It is not enough to simply read proverbs; you must study them at length to unfold them into procedural knowledge.

Most proverbs are particular to Go (e.g. six die but eight live), but some generalize to other adversarial situations, and a few proverbs contain important lessons about rationality.

One of my favorite proverbs states that a rich man should not pick quarrels. Go, in its most common formulations, is a game of satisficing. The player with more points wins the game, and winning is enough; there is no extra reward for winning by a large margin. The proverb says that if you are currently winning (i.e. you are a rich man), then you should not do things (such as picking quarrels) that make the outcome more random. By decreasing the variance in the probability distribution for your final score, you increase the probability that you will hold onto enough points to win. Anything that makes the game simpler and more predictable is good for you.

We can see this in Chess (the winning player should seek to trade pieces) and in epee fencing (the winning player should seek double-touches).

If, on the other hand, you are a poor man, then you should pick quarrels. There’s a good example of this in Indiana Jones and the Temple of Doom. In one scene, Indy is in the middle of a rope bridge, and swordsmen are approaching from either side, so Indy cuts the bridge.

If you are winning, simplify. If you are losing, complexify.
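Here is a rough numerical illustration of the proverb. It assumes, purely for the sake of the example, that the final margin of victory is normally distributed; the function name and the specific numbers are hypothetical.

```python
import math

def win_probability(mean_margin, std_dev):
    """P(final margin > 0), assuming the margin is Normal(mean_margin, std_dev)."""
    return 0.5 * (1.0 + math.erf(mean_margin / (std_dev * math.sqrt(2.0))))

# A "rich man" who expects to win by 5 points.
ahead_calm = win_probability(5.0, 5.0)    # quiet, predictable game
ahead_wild = win_probability(5.0, 20.0)   # after picking quarrels

# A "poor man" who expects to lose by 5 points.
behind_calm = win_probability(-5.0, 5.0)
behind_wild = win_probability(-5.0, 20.0)

print(f"ahead:  calm {ahead_calm:.2f}, wild {ahead_wild:.2f}")
print(f"behind: calm {behind_calm:.2f}, wild {behind_wild:.2f}")
```

Under these assumptions the player who is ahead wins about 84% of quiet games but only about 60% of wild ones, and the player who is behind sees the reverse improvement, from about 16% to about 40%.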

October 16, 2009

Shock Levels are Point Estimates

Filed under: General — Peter de Blanc @ 10:50 pm

Eliezer Yudkowsky (1999) famously categorized beliefs about the future into discrete “shock levels.” Michael Anissimov later wrote a nice introduction to future shock levels. Higher shock levels correspond to belief in more powerful and radical technologies, and are considered more correct than lower shock levels. Careful thinking and exposure to ideas will tend to increase one’s shock level.

If this is really true, and I think it is, shock levels are an example of human insanity. If you ask me to estimate some quantity, and track how my estimates change over time, you should expect it to look like a random walk if I’m being rational. Certainly I can’t expect that my estimate will go up in the future. And yet shock levels mostly go up, not down.

I think this is because people model the future with point estimates rather than probability distributions. If, when we try to picture the future, we actually imagine the single outcome which seems most likely, then our extrapolation will include every technology to which we assign a probability above 50%, and none of those that we assign a probability below 50%. Since most possible ideas will fail, an ignorant futurist should assign probabilities well below 50% to most future technologies. So an ignorant futurist’s point estimate of the future will indeed be much less technologically advanced than that of a more knowledgeable futurist.

For example, suppose we are considering four possible future technologies: molecular manufacturing (MM), faster-than-light travel (FTL), psychic powers (psi), and perpetual motion (PM). If we ask how likely these are to be developed in the next 100 years, the ignorant futurist might assign a 20% probability to each. A more knowledgeable futurist might assign a 70% probability to MM, 8% for FTL, and 1% for psi and PM. If we ask them to imagine a plethora of possible futures, their extrapolations might be, on average, equally radical and shocking. But if they instead generate point estimates, the ignorant futurist would round the 20% probabilities down to 0, and say that no new technologies will be invented. The knowledgeable futurist would say that we’ll have MM, but no FTL, psi, or PM. And then we call the ignorant person “shock level 0” and the knowledgeable person “shock level 3.”
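The numbers in this example can be worked through directly. The sketch below uses the probabilities from the paragraph above; the 50% cutoff for a "point estimate" and the function names are assumptions made for illustration.

```python
TECHS = ["MM", "FTL", "psi", "PM"]

# Probabilities in percent, taken from the example in the text.
ignorant = {"MM": 20, "FTL": 20, "psi": 20, "PM": 20}
knowledgeable = {"MM": 70, "FTL": 8, "psi": 1, "PM": 1}

def point_estimate(beliefs, threshold=50):
    """The single imagined future: every technology rated above the threshold."""
    return [tech for tech in TECHS if beliefs[tech] > threshold]

def expected_count(beliefs):
    """Expected number of these technologies that get developed."""
    return sum(beliefs.values()) / 100

for name, beliefs in [("ignorant", ignorant), ("knowledgeable", knowledgeable)]:
    print(name, point_estimate(beliefs), expected_count(beliefs))
```

Both futurists expect 0.8 of the four technologies on average, yet their point estimates differ: the ignorant futurist's imagined future contains none of them, while the knowledgeable futurist's contains MM.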

So future shock levels exist because people imagine a single future instead of a plethora of futures. If futurists imagined a plethora of futures, then ignorant futurists would assign a low probability to many possible technologies, but would also assign a relatively high probability to many impossible technologies. There would be no simple relationship between a futurist’s knowledge level and his or her expectation of the overall amount of technology that will exist in the future, although more knowledgeable futurists would be better able to predict which specific technologies will exist. Shock levels would disappear.

I do think that shock level 4 is an exception. SL4 has to do with the shocking implications of a single powerful technology (superhuman intelligence), rather than a sum of many technologies.

September 22, 2009

Vote matching

Filed under: General — Peter de Blanc @ 6:11 pm

In light of my previous post, I’d like to suggest a vote-matching scheme. Let’s start with an example:

[Image: Kodos, Kang, and Washington]

Suppose there’s a presidential election between Kodos, Kang, and Washington. Kodos and Kang seem to be the leading candidates.

Alf and Beth are trying to decide who to vote for. They both like Washington, but they don’t want to waste their votes. Alf thinks Kodos is the “lesser of two evils,” while Beth prefers Kang.

If Alf votes for Kodos and Beth votes for Kang, as they are inclined to do, then their two votes will “cancel out,” at least in the race between Kodos and Kang. This means that if they both agree to switch their votes to Washington, the balance of votes between Kodos and Kang will not change. Washington gets two extra votes!
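A small tally makes the bookkeeping concrete. The vote counts for everyone other than Alf and Beth are invented for this sketch, as are the helper names.

```python
from collections import Counter

# Hypothetical votes from everyone except Alf and Beth.
others = Counter({"Kodos": 100, "Kang": 98, "Washington": 10})

def total_votes(ballots):
    """Add Alf's and Beth's ballots to everyone else's votes."""
    return others + Counter(ballots.values())

def front_runner_margin(counts):
    """Kodos's lead over Kang."""
    return counts["Kodos"] - counts["Kang"]

unmatched = total_votes({"Alf": "Kodos", "Beth": "Kang"})
matched = total_votes({"Alf": "Washington", "Beth": "Washington"})

print(front_runner_margin(unmatched), unmatched["Washington"])
print(front_runner_margin(matched), matched["Washington"])
```

The margin between Kodos and Kang is the same in both cases, and Washington's count rises by two.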

This sort of vote-matching should be able to benefit some third-party candidates in real life, too. The key requirement is that voters who prefer the third-party candidate disagree about which of the two front-runners is worse. In that case, two voters can promise to vote for the third-party candidate instead of their “lesser of two evils.” If this sort of vote-matching scheme took off, I think we could see a big change in politics.

September 11, 2009

Base Rates: A Cautionary Tale

Filed under: General — Peter de Blanc @ 3:01 pm

The other day, I was reading a Wikipedia article related to a topic we had been discussing in one of my classes. One of the statements in the second section confused me, and after a bit of thought I was convinced that it was indeed a mistake. Looking at the history, I noticed that this mistake was the result of an edit that had been made the day before.

Naturally, I reverted the article to the previous version. Looking at the history again, I noticed that the mistake had come from someone with an IP address very similar to my own. A quick search revealed that this person was in Philadelphia.

I decided that I was about 60% sure that it was someone in my class. Immediately I singled out one person with 30% confidence.

There are about 1.5 million people in Philadelphia. There are about 15 people in my class. It would take a likelihood ratio of about 100,000 to pick out my class, and a likelihood ratio of about 1.5 million to pick out one person.

In class the next day, when I asked if anyone had edited Wikipedia recently, they all said no.

And that’s how I lost 1.3 bits from my Bayes score.
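The arithmetic in this story can be reproduced in a few lines. The sketch assumes a uniform prior over everyone in Philadelphia and a logarithmic (base 2) scoring rule; both are assumptions of the example, and the function name is made up.

```python
import math

population = 1_500_000   # rough population of Philadelphia, from the text
class_size = 15

def likelihood_ratio_for_even_odds(candidates, population):
    """Evidence needed to lift a uniform prior over the city to 50%."""
    return (population - candidates) / candidates

lr_class = likelihood_ratio_for_even_odds(class_size, population)
lr_person = likelihood_ratio_for_even_odds(1, population)

# 60% went to "someone in my class", so only 40% was left for what actually
# happened. The log score is log2 of the probability given to the truth.
p_truth = 1 - 0.60
bits_lost = -math.log2(p_truth)

print(f"{lr_class:,.0f} {lr_person:,.0f} {bits_lost:.2f}")
```

The ratios come out just under 100,000 and 1.5 million, and the penalty for the 60% guess is about 1.3 bits.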

August 31, 2009

Summer Research, Singularity Summit

Filed under: Decision Theory, General — Peter de Blanc @ 3:43 pm

This summer, I took part in a research program at the Singularity Institute. Here we are:

[Photo: SIAI Lunge]

While I was there, I wrote a follow-up to my old Expected Utility paper. The new paper says basically the same thing as the old paper, but for repeated decisions rather than one-off decisions.

Roko Mijic and I have also started a paper about the problem of generalizing utility functions to new models – the sort of problem I call an “ontological crisis.” These situations arise for humans when we discover that the goals and values which we ascribe to ourselves do not correspond to objects in reality. Obvious examples include god, souls, and free will, but we’re just as interested in how AIs can deal with more mundane problems such as generalizing the notion of “temperature” from a classical to a quantum model. Unfortunately, we didn’t have time to finish the paper this summer, but you can expect to see it soon.

Towards the end of the summer, I made a few resolutions for the new year (as a grad student, my year starts in late August). In particular, I’ve resolved to write a popular blog. In the short term this will mean reducing the quality of my posts in exchange for much greater quantity, but in the long term I expect quality to rise again as I gain more experience writing. I’ll probably write about some of my less serious projects, such as the computer game I’ve been developing on and off for the past year.

In other news, the Singularity Summit will be in New York this year, on October 3-4. Anyone who wants to chat me up can do so at the summit, if you can find me among the horde of attendees. See you there!

January 20, 2009

Intensional vs. Extensional Goals

Filed under: General — Peter de Blanc @ 12:39 am

Two types of goals for an agent are intensional and extensional goals. An intensional goal can be defined in purely mathematical terms, while an extensional goal depends on the universe in which the agent finds itself.

Some examples of intensional goals:

  • Find a prime number at least 200 bits long.
  • Prove Fermat’s Last Theorem.
  • Unscramble a Rubik’s Cube.
  • Fill in a Sudoku puzzle.

Some examples of extensional goals:

  • Predict the orbit of Mercury.
  • Drive a car across the Mojave Desert.
  • Win a trivia game.
  • Earn at least $500.

If we were coding a Go AI, we could try to build it to achieve either an extensional or an intensional goal. The obvious extensional goal is “win the game.” One possible intensional goal is “output a move that a minimax player would output.” In both cases we would probably include some sort of time limit.

“Output a move that a minimax player would output” stands out from the other examples of intensional goals listed above. In all of the other examples, the agent can be sure that its output is correct before it returns, but if I tell you to “output a move that a minimax player would output,” I haven’t given you an implementable procedure for checking whether you’ve achieved the goal.
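To make the contrast concrete, here is what a checker looks like for one of the other intensional goals, "fill in a Sudoku puzzle." It is only a sketch: the function name, the convention that 0 marks a blank cell in the puzzle, and the example grid are all assumptions made for illustration.

```python
def is_solved(grid, givens=None):
    """Return True if `grid` is a complete, valid 9x9 Sudoku solution.

    If `givens` is supplied (0 marks a blank cell), the solution must
    also agree with every filled-in cell of the original puzzle.
    """
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    if any(group != digits for group in rows + cols + boxes):
        return False
    if givens is not None:
        return all(
            given in (0, value)
            for given_row, row in zip(givens, grid)
            for given, value in zip(given_row, row)
        )
    return True

# A known-valid grid built from a shifting pattern, and a broken copy.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
broken = [row[:] for row in solved]
broken[0][0], broken[0][1] = broken[0][1], broken[0][0]

print(is_solved(solved), is_solved(broken))
```

An agent can run a check like this on its own output before returning. No comparably cheap check is available for "a move that a minimax player would output."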

It’s not so hard to think of other intensional goals with this property. Instead of asking an agent for a proof of Fermat’s Last Theorem, I could ask it to output 1 if a proof exists, and 0 otherwise.

Let’s say that these two are examples of “the hard kind of intensional goal,” and the four listed at the top of the page are “the easy kind of intensional goal.” The “easy” ones are not necessarily easier individually than the “hard” ones; it’s easier for me to output a 1 than to output a proof of Fermat’s Last Theorem. But the “easy” ones are easier to think about, and as a class they’re easier.

In fact, the “hard” intensional goals are so hard that, from a certain point of view, they include the extensional goals! Take any extensional goal, and replace all the unknown parts with a probability distribution (perhaps based on algorithmic complexity), and you have an intensional goal.

Well, that’s a bit of a cop-out, because I don’t actually know how to do that for any of the extensional goals I listed at the top of the page. But we can do it for any goal for which we have already coded a success-detector – a piece of code which can determine, after the fact, if we have achieved our goal. In the Go example, we can do this. The Go AI interacts with its opponent through a predefined protocol in which the AI outputs its moves and the opponent inputs vis moves to the AI. The board state can be built up from the list of moves, and so after any hypothetical series of plays, the AI can determine whether the game is over, and who won.

So to reformulate “win the game” as an intensional goal, we can suppose that our opponent’s moves are generated by some unknown Turing machine drawn from a given probability distribution. We use the list of moves which have occurred so far to do a Bayesian update on this distribution. Then any possible policy for generating moves has a probability of winning, and we output the move recommended by the policy with the greatest winning probability.
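The shape of this procedure can be shown on a toy problem. The sketch below is a drastic simplification and not the construction described above: instead of a distribution over Turing machines it uses three hand-picked opponent models, instead of Go it uses a guessing game in which we win a round by predicting the opponent's next move (H or T), and it looks only one move ahead. All names and numbers are hypothetical.

```python
from fractions import Fraction

# Each hypothetical opponent model is just a probability of playing "H".
MODELS = {
    "mostly_heads": Fraction(9, 10),
    "fair": Fraction(1, 2),
    "mostly_tails": Fraction(1, 10),
}

def posterior(prior, observed):
    """Bayesian update of the belief over models, given the moves seen so far."""
    weights = {}
    for name, p_heads in MODELS.items():
        likelihood = Fraction(1)
        for move in observed:
            likelihood *= p_heads if move == "H" else 1 - p_heads
        weights[name] = prior[name] * likelihood
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

def best_guess(beliefs):
    """Pick the guess with the greatest probability of winning the next round."""
    p_heads = sum(beliefs[name] * MODELS[name] for name in MODELS)
    if p_heads >= Fraction(1, 2):
        return "H", p_heads
    return "T", 1 - p_heads

prior = {name: Fraction(1, 3) for name in MODELS}
beliefs = posterior(prior, ["H", "H", "H"])
move, win_prob = best_guess(beliefs)
print(move, float(win_prob))
```

After three observed heads, most of the probability mass sits on the "mostly_heads" model, so the move with the greatest winning probability is to guess H.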

This way we can specify a program (called AIXI) which, if run on a big enough computer, would output what we want to output. And then our goal can be intensionally defined as “output whatever AIXI would output.”

A minimax player is a tall order. AIXI is an even taller order. We can’t actually run these programs, but we want to output, in a reasonable amount of time, whatever they would output. This may require uncertain reasoning about mathematics.
