Re: Black box methods like "asking the model if it has hidden goals."
I'm worried that these methods only seem very powerful (e.g., Evans group's Tell me about yourself, the pre-fill black box methods in Auditing language models) because the text outputs of the models in those papers haven't undergone much optimization pressure.
Outputs from real-world models might undergo lots of scrutiny/optimization pressure[1] so that the model appears to be a "friendly chatbot." AI companies put much more care into crafting those personas than model organisms researchers would, and so the AI could learn to "say nice things" much more effectively.
So it's possible that model internals will be much more faithful relative to model outputs in real-world settings compared to academic settings.
Or maybe they'll just update GPT-4o to be a total sycophant and ship it to hundreds of millions of people. Honestly hard to say nowadays.