Ah, I see! It's an incentive problem! So I guess your funding needs to be conditional on you producing legible outputs.
This rubs me the wrong way. Of course you can make anyone do X if you make their funding conditional on X. But whether you should depends on how sure you are that X is more valuable than the alternative.
There are already thousands of people out there whose funding is conditional on them producing legible outputs. Why is that not enough? What will change if we increase that number by a dozen?
Makes sense, with the proviso that this is sometimes true only statistically. For example, the AI may choose to write an output that has a 70% chance of hurting you and a 30% chance of (equally) helping you, if that is its best option.
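To spell out the "only statistically" point, here is a minimal sketch. The payoff magnitude h and the assumption that the AI simply picks whatever maximizes its own objective are my own illustrative choices, not anything stated above; the point is just that the output can hurt you in expectation even though it sometimes helps.

```python
# Minimal sketch (illustrative assumptions: harm and help have equal
# magnitude h, and the AI picks this output because it scores best on
# the AI's own objective, not yours).

h = 1.0                      # magnitude of the (equal) harm or help to you
p_hurt, p_help = 0.7, 0.3

expected_effect_on_you = p_help * h - p_hurt * h
print(expected_effect_on_you)  # -0.4 * h: negative for you on average,
                               # even though 30% of the time you come out ahead
```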
If you assume that the AI is smarter than you and has a good model of you, you should not read the output. But if you accidentally read it and happen to react in the right way (for you), that is a possible outcome too. You just cannot, and should not, rely on being so lucky.
I guess it depends on how much the parts that make you "a little different" are involved in your decision making.
If you can put it in numbers, for example -- I believe that if I choose to cooperate, my twin will choose to cooperate with probability p; and if I choose to defect, my twin will defect with probability q; also I care about the well-being of my twin with a coe...
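The comment is cut off, but the calculation it is gesturing at (a twin prisoner's dilemma where my choice is evidence about my twin's choice, plus some degree of care for the twin) can be sketched. Everything below -- the payoff numbers, the coefficient name c, and the example values for p and q -- is my own illustrative filling-in, not the rest of the truncated text.

```python
# Illustrative sketch only: expected utility of cooperating vs. defecting
# against a twin, given correlated decisions and partial care for the twin.
# Payoff values and the coefficient name "c" are assumptions for the example.

# payoffs[(my_move, twin_move)] = (my_payoff, twin_payoff), standard PD values
payoffs = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def expected_utility(my_move, prob_twin_matches, c):
    """Expected utility of my_move, given the probability that the twin
    makes the same move, and a care coefficient c for the twin's payoff."""
    other = "D" if my_move == "C" else "C"
    eu = 0.0
    for twin_move, prob in ((my_move, prob_twin_matches), (other, 1 - prob_twin_matches)):
        mine, theirs = payoffs[(my_move, twin_move)]
        eu += prob * (mine + c * theirs)
    return eu

p, q, c = 0.9, 0.9, 0.5   # example numbers: strong correlation, moderate care
print("EU(cooperate):", expected_utility("C", p, c))  # 4.3 with these numbers
print("EU(defect):   ", expected_utility("D", q, c))  # 1.85 with these numbers
# Varying p, q, and c changes which move comes out ahead, which is exactly
# the trade-off that putting it in numbers lets you quantify.
```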