what exactly is it about human brains[1] that allows them to not always act like power-seeking ruthless consequentialists?
By asking this question, you've already lost me. The question tells me that "ruthless consequentialist" is your default model of how rational, thinking beings operate, absent wiring / training / reward systems that limit the default outcome. And if that worldview is representative of the "technical-alignment-is-hard" camp, then of course the only plausible outcome of AI advance is "AIs eventually break free of those limite...
Thanks for the thoughtful reply. It took me a lot of squinting, but IIUC you're saying:
- Different kinds of minds, produced by different kinds of architectures, should likely exhibit very different levels of scary traits such as monomaniacal sociopathy.
- Stop focusing on LLMs so much; they're not the main threat. Yes, they seem to exhibit more value-roundedness because they're trained to imitate humans, but they aren't likely to reach AGI anytime soon.
- Focus more on RL agents and "brain-like" architectures; those are built very differently and plausibly would ha