space and games

October 20, 2007

Convergence of Expected Utilities

Filed under: Decision Theory — Peter de Blanc @ 8:44 am

I’m working on a paper on this topic, but I figured I would post about it now and clear up some confusion.

Let’s say we have an agent interacting with an environment. The agent sends the environment some number (an action), and the environment sends the agent some number in response (a perception).

When we have an infinite set of hypotheses, the expected utility of some action is determined by an infinite series. It’s important for that series to converge, or else we won’t (in general) be able to compare the expected utilities of two different actions.
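The series in question can be sketched with a toy hypothesis class (my own illustration, not from the paper; here hypothesis n responds to action k with perception n + k, and has prior probability 2^-(n+1)):

```python
# Toy sketch: expected utility of an action as an infinite series over hypotheses.
# Each hypothesis h maps an action to a perception; U maps perceptions to utilities.

def p(n):
    """Prior probability of hypothesis n (sums to 1 over n = 0, 1, 2, ...)."""
    return 2.0 ** -(n + 1)

def expected_utility(U, action, terms=50):
    """Partial sum of the expected-utility series over the first `terms` hypotheses."""
    # Hypothesis n responds to the action with perception n + action.
    return sum(U(n + action) * p(n) for n in range(terms))

# With a bounded utility function the partial sums settle down:
print(expected_utility(lambda x: 1.0, action=0))  # close to 1.0, since the priors sum to 1
```

With a bounded U the terms are dominated by the (summable) probabilities, so the series converges; the question is what happens when U is unbounded.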

Given the following assumptions:

• We have some finite set of data about the environment.
• We assign a nonzero probability to all computable hypotheses consistent with our data.
• Either this probability distribution is computable, or it is bounded below by some positive computable function.
• Our utility function is determined by our perceptions, and is unbounded.
• Either this utility function is computable, or it is bounded below in absolute value by some unbounded computable function.

…then it follows that our expected utilities do not converge.
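The failure mode can be illustrated with a toy unbounded utility function (numbers mine; the actual theorem uses busy-beaver-style growth, not this simple exponential):

```python
# Toy divergence: probabilities shrink geometrically (2^-(n+1)) while utilities
# grow faster (4^n), so term n of the series is 2^(n-1) and partial sums blow up.

def partial_sum(terms):
    return sum(4 ** n * 2.0 ** -(n + 1) for n in range(terms))

for t in (5, 10, 20):
    print(t, partial_sum(t))  # grows without bound as t increases
```

Whenever the utilities outpace the shrinking probabilities in this way, the partial sums never settle, so no expected utility exists to compare against other actions.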

This seems like a pretty strong reason to stick with bounded utility functions. It certainly seems much more plausible to me now than it did before that my utility function is bounded (to the extent that I can be said to have a utility function).

But we shouldn’t jump to conclusions. Here are some reasons why the above might not apply:

• You might have an uncomputable utility function.
• You might have a utility function which is not determined solely by your perceptions.
• You might assign a probability of zero to some hypotheses which are consistent with your data.

1. Thanks for pointing me to this!

Comment by Nick Tarleton — October 20, 2007 @ 10:28 am

2. Makes sense, now that it’s pointed out. One intuitive way of saying this: no matter what your past observations are, there’s always a non-zero probability that:

1. you’re living in a simulation, and

2. someone will replace the simulation with “Heaven or Hell” (give arbitrarily large negative or positive utility).

Given your assumptions, it seems reasonable that the absolute value of the expected utility, in the general case, grows faster than the probability shrinks (at least for non-edge-case examples) because of Rice’s Theorem-type considerations.

Are you saying it also holds for bizarre edge cases? Example that comes to mind:

1. construct a bizarre utility function where the utility of X always equals the negative of the utility of ~X (the bits-complement of X where 0’s and 1’s are flipped)

2. then, construct an equally bizarre Turing-equivalent machine (or a bizarre variant of Solomonoff Induction) where X and ~X are always equally probable. (For example, if the first bit of the program P is 0 the bit is ignored, but if the first bit of P is 1, the ultimate bits of the output.)

Comment by Rolf Nelson — October 20, 2007 @ 12:31 pm

3. Didn’t finish, meant to say:

2. then, construct an equally bizarre Turing-equivalent machine (or a bizarre variant of Solomonoff Induction) where X and ~X are always equally probable. (For example, if the first bit of the program P is 0 the bit is ignored, but if the first bit of P is 1, the ultimate bits of the output are reversed.)

Comment by Rolf Nelson — October 20, 2007 @ 12:34 pm

4. Let me quickly eat my words: the problem given is the rigorous “whether it converges”, not the fuzzy “whether tiny probabilities might be intuitively reasonable to ignore”. My example doesn’t actually converge; saying something converges is obviously different from saying that adjacent terms cancel out in one particular ordering.

Comment by Rolf Nelson — October 20, 2007 @ 12:42 pm

5. The scenario that Rolf Nelson provides leads to a divergent expected utility, but not in a way that is relevant to decisions–the infinities “exactly cancel out.” It reminds me of renormalization. But I imagine that there are other problems.

Comment by Douglas Knight — October 20, 2007 @ 3:49 pm

6. Douglas: infinities don’t cancel out. I think you’re imagining a situation where the terms of the series remain bounded; that is, they oscillate back and forth but don’t grow arbitrarily large. This can’t actually happen.

Imagine we could construct such an example (the terms are bounded). Then replace the probability distribution with one that goes to zero more quickly than the original distribution – say, exponentially faster. The original terms were bounded, so the terms of the new series go to zero exponentially fast, so the series converges. But by the original theorem, your expected utility over this new distribution is also divergent. Contradiction.

Here’s a quick idea of how the proof of the theorem goes:

We index all of the computable functions, and use this index to construct a sequence analogous to the busy beaver numbers. Then (as Rolf says), we construct heavens or hells which yield outcomes with utilities exceeding the numbers in this sequence. The probabilities of these outcomes go to zero no faster than some computable sequence (the probabilities themselves need not be computable, but they are bounded below by a computable sequence). The busy beaver numbers (and our analogous sequence) grow faster than any computable function. So the utilities go to infinity faster than the probabilities go to zero.

(I’m not posting the full paper here because I want to publish it in an academic journal, and they seem to prefer that you don’t publish elsewhere beforehand. Contact me if you want to look at a draft.)

Comment by Peter de Blanc — October 22, 2007 @ 12:40 am

7. If I understand your proof correctly, there appears to be a loophole: an unbounded utility function can converge as long as the utility assigned to any state does not increase faster than C^N, where N is the number of bits of entropy/Kolmogorov complexity. Essentially, you arbitrarily decree that any function with a sufficiently high utility must have a sufficiently low probability. To give a simple example, define U(x) over an n-bit system x to be 2n. The expected utility sums to 2·(1/2) + 4·(1/4) + 6·(1/8) + 8·(1/16) + … = 4.

Comment by Tom McCabe — October 22, 2007 @ 10:44 pm
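Tom’s closed-form sum can be checked numerically; a quick sketch (toy code, not from the thread):

```python
# Partial sums of Tom's series: U(x) = 2n for an n-bit state x with
# probability 2^-n, giving sum over n >= 1 of 2n * 2^-n = 4.

def tom_partial_sum(terms):
    return sum(2 * n * 2.0 ** -n for n in range(1, terms + 1))

print(tom_partial_sum(60))  # approaches 4
```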

8. Sorry if this is a bit off-topic, but why assume that utility is determined solely by perceptions? Real people obviously do not follow this assumption, since they do things whose effects they will never observe, like writing wills. Is it just to make the math easier?

Comment by Wei Dai — October 22, 2007 @ 11:50 pm

9. Tom, your example utility function is not perception-determined.

Wei Dai: I require utility functions to be perception-determined because without this assumption, the theorem would be false. See Tom’s example.

Comment by Peter de Blanc — October 23, 2007 @ 2:42 am

10. “Tom, your example utility function is not perception-determined.”

What’s the technical definition of “perception-determined”?

Comment by Tom McCabe — October 23, 2007 @ 4:59 pm

11. A utility function is perception-determined if the argument to the utility function is a perception. That is, the expected utility of an action k is given by:

Σ_{h ∈ H} U(h(k)) p(h)

…where H is our set of hypotheses.

Comment by Peter de Blanc — October 23, 2007 @ 5:26 pm
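The formula in the comment above can be sketched concretely (the hypothesis class and the numbers are illustrative only):

```python
# Expected utility of action k for a perception-determined utility function:
# EU(k) = sum over hypotheses h of U(h(k)) * p(h).
# Each hypothesis is a function from actions to perceptions, paired with its prior.

hypotheses = [
    (lambda k: k + 1, 0.5),   # h1: perception is k + 1, prior 0.5
    (lambda k: 2 * k, 0.3),   # h2: perception is 2k, prior 0.3
    (lambda k: 0,     0.2),   # h3: perception is always 0, prior 0.2
]

def U(perception):
    """A bounded, perception-determined utility function."""
    return min(perception, 10)

def expected_utility(k):
    return sum(U(h(k)) * prior for h, prior in hypotheses)

print(expected_utility(3))  # 0.5*U(4) + 0.3*U(6) + 0.2*U(0) = 2.0 + 1.8 + 0.0 = 3.8
```

The key structural point is that U only ever sees h(k), the perception, never h itself.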

12. I know what “perception-determined” sounds like it means, but I don’t understand why it’s that expression. Wouldn’t that apply to any utility function? What would a non-perception-determined utility function look like?

Comment by Nick Tarleton — October 24, 2007 @ 10:21 pm

13. Nick, Tom’s scenario is a good example of a non-perception-determined utility function because it’s expressed in terms of the *underlying algorithm that produces the perceptions* rather than being expressed in terms of the perceptions themselves. For the theorem to hold, you’re only allowed to base your utility function on perceptions, or (equivalently) on things that can be computably determined from those perceptions. (In contrast, Wei Dai’s example of “signing a will” misses the point, since you would arguably try to infer the future well-being of your heirs from the perception that you currently have a signed will.)

For example, suppose that Tom’s algorithm perceives a random-looking 10^100 digit number, p = 21346…19872 (with the middle digits omitted for brevity.) Tom’s algorithm has to assign this perception a low utility if its Kolmogorov complexity is low. Can we compute the Kolmogorov complexity based on the perception? In the general case, no, it’s uncomputable. If it turns out that p is “the last 10^100 digits of the imaginary part of the first number that violates the Riemann hypothesis”, then Tom’s algorithm has to magically know that p had low Kolmogorov complexity and needed to be assigned a low utility.

Comment by Rolf Nelson — October 25, 2007 @ 1:23 am

14. Rolf: good, you understand. Although now that I think about it, I wouldn’t call Tom’s example non-perception-determined. Instead, I would just say that it’s not computable as a perception-determined utility function (it’s not bounded below by a divergent computable function either; if it were, you could use it to build a halting oracle! Do you see how?). I’ll give you an example of a utility function which is not perception-determined, even uncomputably.

Above I wrote the formula for the expected utility of an action under a perception-determined utility function. For a non-perception-determined utility function, we can write the expected utility of an action k as:

Σ_{h ∈ H} U(h, k) p(h)

Here we’ll take the example U(h, k) = k. That is, your utility is equal to your action, always. Now, you can’t figure out exactly what your action was just by looking at your perceptions. Actually, you can’t even get a probability distribution! By Bayes’ Theorem, to get a probability for a given action, we would need:

1. A perception (got it)
2. The likelihood of that perception given any action (got it, although not computably)
3. A prior probability for each action (whoops! We don’t have this!)

Comment by Peter de Blanc — October 25, 2007 @ 6:30 am

15. If the “signing a will” example doesn’t work because you can see yourself signing a will, then what if you can’t? Suppose just before you’re about to sign the will, all of your sensory nerves die, but your motor nerves are left intact. (You know your motor nerves are still functioning because you already knew that you have a disease that affects only sensory nerves.) If you have a perception-determined utility function, do you go through with the signature?

Comment by Wei Dai — October 25, 2007 @ 5:31 pm

16. “If you have a perception-determined utility function, do you go through with the signature?”

Here “perception” carries its technical meaning, referring to “data that is directly perceived”, not “data that you personally have perceived”, as in common usage. My loophole is closed because the Kolmogorov complexity of data is noncomputable, and for the utility function to converge, you have to hack it so that the Kolmogorov complexity of the data sets an upper bound on the utility.

Comment by Tom McCabe — October 25, 2007 @ 6:13 pm

17. I think Wei may be correct. If I were about to sign a will, I can picture myself thinking something like:

“I’m taking this action so that if I die, my assets will go to the Singularity Institute.”

As opposed to:

“I’m taking this action so that I will perceive qualia which will make me believe that I have signed a will bequeathing my assets to the Singularity Institute.”

Comment by Peter de Blanc — October 25, 2007 @ 8:15 pm

18. So let me restate my original question. Do people work with perception-determined utility functions because they think either it’s a good model of how people actually make decisions or it’s an ideal that people (and AIs) should actually try to approximate, or do they consider it a simplification that makes the math easy enough to allow some theorems to be derived?

I think the latter is perfectly fine, but some of the literature in this field is written as if the author(s) believe in the former.

Comment by Wei Dai — October 25, 2007 @ 9:07 pm

19. “I think the latter is perfectly fine, but some of the literature in this field is written as if the author(s) believe in the former.”

In contrast, Yudkowsky has advocated basing the utility function on “external reference semantics”, which is another example of a non-perception-based utility function.

Note that an algorithm that cares about external reality, but believes that (with 100% certainty) it can “deduce” external reality from its past, current, and future perceptions, is technically still perception-based for the purposes of the theorem.

“Suppose just before you’re about to sign the will, all of your sensory nerves die, but your motor nerves are left intact.”

I like this example better, in this case it’s clearer that the agent cannot be using a perception-based utility function.

Comment by Rolf Nelson — October 26, 2007 @ 8:30 am

20. “It’s not bounded below by a divergent computable function either…”

This part I might be able to (awkwardly) prove, along the lines of: “If you had a computable divergent function f that bounds the Kolmogorov complexity from below, you could construct a computable total function g such that the Kolmogorov complexity of g(x) is always greater than or equal to x, and that’s just plain impossible.” But:

“…if it were, you could use it to build a halting oracle! Do you see how?”

You mean a full-fledged member of the 0′ Turing Degree? Congratulations, you have stumped me, I cannot figure out how you would do this.

Comment by Rolf Nelson — October 26, 2007 @ 8:40 am

21. “I like this example better, in this case it’s clearer that the agent cannot be using a perception-based utility function.”

Methinks we need to define “perception” better: there’s nothing in the theorem that *requires* the utility function to refer to internal states instead of external states, as long as the improbability of the bitstring remains uncomputable (if I understand correctly).

Comment by Tom McCabe — October 26, 2007 @ 8:47 pm

22. [...] Suppose that you extended linear addition to self-addition: define U(s1 + s1) = U(s1) + U(s1), or more generally, define U(s1 * R) = R * U(s1) for some integer or scalar R. This seems fairly obvious- why can’t you add a state to itself using the same rules as two different states?- but it has a number of annoying consequences. R now can become arbitrarily large, which means that the expected utilities of actions over U will not converge, given a few assorted extra conditions. Trying to calculate preferences under that condition is a mess- the expected utilities end up being dominated by tiny probabilities of huge utilities. [...]

Pingback by Life, the Universe, and Everything » Brief Summary of Utility Functions — October 26, 2007 @ 9:12 pm

23. [...] Peter de Blanc, 2006 summer intern at the Singularity Institute and now with the Temple University Department of Mathematics, has posted an article titled “Convergence of Expected Utilities with Algorithmic Probability Distributions.” [...]

Pingback by The Singularity Institute Blog : Blog Archive : Article by Peter de Blanc — October 2, 2008 @ 2:07 am

24. An interesting article. While no one seems to have mentioned it, this is closely related to the “Pascal’s Mugger” problem, where Kolmogorov-based priors mean that the probability of a threat being true seems to diminish too slowly in comparison to the size of the threat (since 3^^^3 has complexity much less than itself, or even the log of itself). Thus for a certain-sized threat (‘Your money or 3^^^3 people’s lives’), your credence in it will be swamped by the stakes and you will have to pay up. Indeed, did you design the proof specifically to formalize this problem?

I’m not quite sure what to conclude from it all (and haven’t checked your mathematics), but on the face of it, I’m not sure which assumption should be questioned. For example, the Kolmogorov-based priors might be better to drop than the unbounded utility function (there are some independent reasons to be skeptical of the former as well).

Comment by Toby Ord — October 5, 2008 @ 6:17 pm
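For scale, 3^^^3 is written in Knuth’s up-arrow notation; even small up-arrow towers dwarf the length of their own description, which is why a complexity-based prior can’t shrink fast enough to keep up. A small sketch (illustrative only; 3^^^3 itself is far beyond computation):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow: a ^^...^ b with n arrows. One arrow is ordinary exponentiation."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    # a ↑^n b = a ↑^(n-1) (a ↑^n (b - 1))
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 2, 2))  # 3^^2 = 3^3 = 27
print(up_arrow(3, 2, 3))  # 3^^3 = 3^27 = 7625597484987
# 3^^^3 = 3^^(3^^3) is a tower of 7,625,597,484,987 threes. Its description is a
# handful of symbols, so its Kolmogorov complexity grows roughly with the length
# of the notation -- vastly more slowly than the number itself.
```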

25. Toby Ord: Indeed, did you design the proof specifically to formalize this problem?

Yes.

Comment by Peter de Blanc — October 5, 2008 @ 8:48 pm

26. > “…if it were, you could use it to build a halting oracle! Do you see how?”

> You mean a full-fledged member of the 0′ Turing Degree? Congratulations, you have stumped me, I cannot figure out how you would do this.

Now I can’t figure out why I thought that was true. Consider this a retraction.

Comment by Peter de Blanc — February 26, 2010 @ 3:12 pm
