I think this is an application of a more general, very powerful principle of mechanism design: when cognitive labor is abundant, near-omnipresent surveillance becomes feasible.
For domestic life, this is terrifying.
But for some high-stakes, arms-race-style scenarios, it might have applications.
Beyond what you mentioned, I'm particularly interested in this being a game-changer for bilateral negotiation. Two parties make an agreement, consent to being monitored by an AI auditor, and verify that the auditor's design will communicate with the ...
So, when a human lies over the course of an interaction, they'd be holding a hidden state in mind throughout. However, an LLM wouldn't carry any cognitive latent state over between telling the lie and then responding to the elicitation question. I guess it feels more like "I just woke up from amnesia, and it seems I have just told a lie. Okay, now what do I do..."
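To make the statelessness concrete, here's a minimal sketch (all names hypothetical, assuming a stateless `query_model(prompt) -> str` interface, as with most LLM APIs): the only thing the model carries from the lie-telling turn to the elicitation turn is the visible transcript.

```python
# A minimal sketch, assuming a hypothetical stateless query_model(prompt) -> str
# interface; every call recomputes from the text alone.

def reply(query_model, transcript: list[str], user_turn: str) -> str:
    """Generate the model's next turn. Any 'intent to lie' exists only
    inside this single call; nothing persists after it returns."""
    prompt = "\n".join(transcript + [f"User: {user_turn}", "Assistant:"])
    return query_model(prompt)

def elicit(query_model, transcript: list[str], probe: str) -> str:
    """Ask the elicitation question. The model restarts from the transcript:
    it can read that a lie was told, but carries no latent state from
    having told it -- the 'woke up from amnesia' situation."""
    prompt = "\n".join(transcript + [f"Auditor: {probe}", "Assistant:"])
    return query_model(prompt)
```

(Key caveat: this only holds if the elicitation is a fresh call over the transcript; a model given a persistent scratchpad or cached activations would be a different story.)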
Curated.
There are many things I found outstanding about this post. The key one, however, is that after reading this, I feel less confused when thinking about transformer language models. The post had that taste of deconfusion where many of the arguments are elegant and simple, like suddenly tilting a bewildering shape into place. I particularly enjoyed the discussion of ways agency does and does not manifest within a simulator (multiple agents, irrational agents, non-agentic processes), the formulation of the prediction orthogonality thesis, ways i...
If someone asks what the rock is optimizing, I’ll say “the actions” - i.e. the rock “wants” to do whatever it is that the rock in fact does.
This argument does not seem to me to capture the reason a rock is not an optimiser?
I would hand wave and say something like:
"If you place a human into a messy room, you'll sometimes find that the room is cleaner afterwards. If you place a kid in front of a bowl of sweets, you'll soon find the sweets gone. These and other examples are pretty surprising state transitions, that would be highly unlikely i...
An update on this: sadly, I underestimated how busy I would be after posting this bounty. I spent 2h reading this and Thomas' post the other day, but didn't manage to get into the headspace for evaluating the bounty (i.e. making my own interpretation of John's post, and then deciding whether Thomas' distillation captured it). So I will not be evaluating this. (Still happy to pay if someone else I trust claims Thomas' distillation was sufficient.) My apologies to John and Thomas for that.
Curated.
I think this post strikes a really cool balance between discussing some foundational questions about the notion of agency and its importance, and posing a concrete puzzle that prompted some interesting comments.
For me, Life is a domain that makes it natural to have reductionist intuitions. Compared to, say, neural networks, I find there are fewer biological metaphors or higher-level abstractions where you might sneak in mysterious answers that purport to solve the deeper questions. I'll consider this post next time I want to introduce some...
Here are prediction questions for the predictions that TurnTrout himself provided in the concluding post of the Reframing Impact sequence.
(You can find a list of all 2019 Review poll questions here.)
(Nitpick: I'd find the first paragraphs much easier to read if they didn't have any of the bolding.)