User Comment Replies — AI Alignment Forum

Judgements: Merging Prediction & Evidence

Disclaimer: I haven't read the Logical Induction paper, which may explain my lack of intuition.

Is there maybe a way to explain your theory of Judgement more directly, purely in epistemic terms of belief and probability theory, without falling back on instrumental-sounding trading analogies about buying and selling things at some price? Perhaps similar to the Radical Probabilism post? (Though that one also mentioned some instrumental arguments, such as money pumps and Dutch book arguments.)

3 Abram Demski 8d

Yeah, for better or worse, the logical induction paper is probably the best thing to read. The idea is actually to think of probabilities as prediction-market prices; the market analogy is a very strong one, not an indirect way of gesturing at the idea.
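To make the price reading of probability concrete, here is a minimal sketch of the Dutch book idea mentioned in the question above. It is only an illustration under stated assumptions, not the construction from the logical induction paper: the function name `dutch_book_profit` and the two-contract setup (one contract paying 1 if A is true, one paying 1 if A is false) are hypothetical choices made for this example.

```python
def dutch_book_profit(price_a: float, price_not_a: float) -> dict:
    """Profit of a simple arbitrage trade in each possible world.

    Hypothetical setup: the market quotes a price for a contract paying 1
    if A is true, and a price for a contract paying 1 if A is false.
    """
    total = price_a + price_not_a
    # If the pair is jointly underpriced, buy one of each (+1);
    # otherwise sell one of each (-1).
    position = 1 if total < 1 else -1
    profits = {}
    for a_is_true in (True, False):
        payout_a = 1.0 if a_is_true else 0.0
        payout_not_a = 1.0 - payout_a
        # Gain per contract is payout minus price, signed by the position.
        gain = (payout_a - price_a) + (payout_not_a - price_not_a)
        profits[a_is_true] = position * gain
    return profits


# Incoherent prices (they sum to 1.2): the trader profits in both worlds.
incoherent = dutch_book_profit(0.6, 0.6)
# Coherent prices (they sum to 1): the same trade earns nothing either way.
coherent = dutch_book_profit(0.7, 0.3)
```

In this toy setting, prices behave like probabilities exactly when no such guaranteed-profit trade exists, which is one way of seeing why the market analogy can be taken literally rather than as a loose metaphor.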

Dream, Truth, & Good

cubefox 9d 10

Isn't honesty a part of the "truth machine" rather than the "good machine"? Confabulation seems to be a case of the model generating text which it doesn't "believe", in some sense.

3 Abram Demski 9d

Yeah. I'm saying that the "good machine" should be trained on all three; it should be honest, but constrained by helpfulness and harmlessness. (Or, more realistically, by a more complicated constitution with more details.)

Teleosemantics!

cubefox 2y* 50

What does it mean to optimize for the map to fit the territory, but not the other way around? (After all: we can improve fit between map and territory by changing either map or territory.) Maybe it's complicated, but primarily what it means is that the map is the part that's being selected in the optimization. When communicating, I'm not using my full agency to make my claims true; rather, I'm specifically selecting the claims to be true.

I don't know whether you are familiar with it, but most speech acts or writing acts are considered to have either a "wor...

2 Abram Demski 2y

It's a good point. I suppose I was anchored by the map/territory analogy to focus on world-to-word fit. The part about Communicative Action and Rational Choice at the very end is supposed to gesture at the other direction. Intuitively, I expect it's going to be a bit easier to analyze world-to-word fit first. But I agree that a full picture should address both.

Contra Common Knowledge

cubefox 2y 810

It's interesting to note that we can still get Aumann's Agreement Theorem while abandoning the partition assumption (see Ignoring ignorance and agreeing to disagree, by Dov Samet). However, we still need Reflexivity and Transitivity for that result. Still, this gives some hope that we can do without the partition assumption without things getting too crazy.

I don't quite get this paragraph. Do you suggest that the failure of Aumann's agreement theorem would be "crazy"? I know his result has become widely accepted in some circles (including, I think, LessW...

2 Abram Demski 2y

I was using "crazy" to mean something like "too different from what we are familiar with", but I take your point. It's not clear we should want to preserve Aumann. Interesting, thanks for pointing this out!


All of cubefox's Comments + Replies