So, when a human lies over the course of an interaction, they'd be holding a hidden state in mind throughout. However, an LLM wouldn't carry any cognitive latent state over between telling the lie, and then responding to the elicitation question. I guess it feels more like "I just woke up from amnesia, and seems I have just told a lie. Okay, now what do I do..."
Stating this to:
Curated.There are really many things I found outstanding about this post. The key one, however, is that after reading this, I feel less confused when thinking about transformer language models. The post had that taste of deconfusion where many of the arguments are elegant, and simple; like suddenly tilting a bewildering shape into place. I particularly enjoyed the discussion of ways agency does and does not manifest within a simulator (multiple agents, irrational agents, non-agentic processes), the formulation of the prediction orthogonality thesis, ways i... (read more)
If someone asks what the rock is optimizing, I’ll say “the actions” - i.e. the rock “wants” to do whatever it is that the rock in fact does.
This argument does not seem to me like it captures the reason a rock is not an optimiser?
I would hand wave and say something like:
"If you place a human into a messy room, you'll sometimes find that the room is cleaner afterwards. If you place a kid in front of a bowl of sweets, you'll soon find the sweets gone. These and other examples are pretty surprising state transitions, that would be highly unlikely i... (read more)
An update on this: sadly I underestimated how busy I would be after posting this bounty. I spent 2h reading this and Thomas post the other day, but didn't not manage to get into the headspace of evaluating the bounty (i.e. making my own interpretation of John's post, and then deciding whether Thomas' distillation captured that). So I will not be evaluating this. (Still happy to pay if someone else I trust claim Thomas' distillation was sufficient.) My apologies to John and Thomas about that.
Cool, I'll add $500 to the distillation bounty then, to be paid out to anyone you think did a fine job of distilling the thing :) (Note: this should not be read as my monetary valuation for a day of John work!)
(Also, a cooler pay-out would be basis points, or less, of Wentworth impact equity)
How long would it have taken you to do the distillation step yourself for this one? I'd be happy to post a bounty, but price depends a bit on that.
Jaan/Holden convo link is broken :(
Curated. I think this post strikes a really cool balance between discussing some foundational questions about the notion of agency and its importance, as well as posing a concrete puzzle that caused some interesting comments.
For me, Life is a domain that makes it natural to have reductionist intuitions. Compared to say neural networks, I find there are fewer biological metaphors or higher-level abstractions where you might sneak in mysterious answers that purport to solve the deeper questions. I'll consider this post next time I want to introduce some... (read more)
Here are prediction questions for the predictions that TurnTrout himself provided in the concluding post of the Reframing Impact sequence.
(You can find a list of all 2019 Review poll questions here.)