Soares, Tallinn, and Yudkowsky discuss AGI cognition
RS · 4y · 20

I think it came up in the previous discussion as well that an AI able to competently design a nanofactory would also have the capability to manipulate humans at a high level. For example:

> Then when the system generalizes well enough to solve domains like "build a nanosystem" - which, I strongly suspect, can't be solved without imaginative reasoning because we can't afford to simulate that domain perfectly and do a trillion gradient descent updates on simulated attempts - the kind of actions or thoughts you can detect as bad, that might have provided earlier warning, were trained out of the system by gradient descent; leaving actions and thoughts you can't detect as bad.

Even within humans, it seems we have people, e.g. on the autistic spectrum, whom I can imagine having the imaginative reasoning and creativity required to design something like a nanofactory (2-3 SD above the average human) while also being 2-3 SD below the average human at manipulating other humans. At least this points to those two capabilities maybe not being the same general-purpose cognition, or not drawing on the same "core of generality".
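To make that intuition concrete, here is a minimal toy sketch (my framing, not anything from the dialogue): treat the two abilities as standardized traits drawn from a bivariate normal with correlation r, and estimate how often a "+2 SD design, -2 SD manipulation" profile occurs. The trait names and the correlation values are illustrative assumptions.

```python
import numpy as np

# Toy model (assumption, not from the dialogue): treat "nanosystem-design
# ability" and "human-manipulation ability" as two standardized traits with
# correlation r under a bivariate normal. If both abilities flowed from one
# shared "core of generality", r would be near 1 and a "+2 SD design,
# -2 SD manipulation" profile would essentially never occur.

rng = np.random.default_rng(0)
n = 2_000_000

for r in (0.99, 0.8, 0.5, 0.2):
    cov = [[1.0, r], [r, 1.0]]
    design, manip = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    high_design = design > 2.0             # strong design ability
    split = high_design & (manip < -2.0)   # ...but very weak manipulation
    p = split.sum() / high_design.sum()
    print(f"r={r:.2f}: P(manipulation < -2 SD | design > +2 SD) ≈ {p:.1e}")
```

Under high r the conditional probability is effectively zero, so observing such split profiles in humans at all is (weak) evidence against a single shared capability core.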

While this capability profile is not guaranteed by default in the first nanosystem-design-capable AI system, it seems like it shouldn't be impossible to engineer with more research.
