AI ALIGNMENT FORUM

anon1

Comments

Sorted by
Newest
No wikitag contributions to display.
Conditions under which misaligned subagents can (not) arise in classifiers
anon1 · 7y

Re: first point, I think this comes down to a difference in intuition about how simple, or how easy to find, agents are in the search space. My intuition is that they would be harder to find than ordinary functions that just perform the task. I think this stems from a more general intuition that finding a function that does A is easier than finding a function that does both A and B.
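
One way to make the "A versus A and B" intuition concrete, as a rough sketch only: assume the search draws candidate functions independently from some fixed distribution over the search space. The symbols below ($\mathcal{F}$, $\mu$, $\mathcal{F}_A$, $\mathcal{F}_{A \wedge B}$) are introduced here for illustration and do not come from the original post, and a real optimization process is not independent sampling.

```latex
% Assumed setup (illustrative, not from the post):
%   \mathcal{F}              : the search space of functions
%   \mu                      : the distribution the search samples from
%   \mathcal{F}_A            : functions that do A
%   \mathcal{F}_{A \wedge B} : functions that do both A and B
\[
  \mathcal{F}_{A \wedge B} \subseteq \mathcal{F}_A
  \quad\Longrightarrow\quad
  \mu\!\left(\mathcal{F}_{A \wedge B}\right) \le \mu\!\left(\mathcal{F}_A\right).
\]
% With independent draws, the expected number of samples until the first hit
% on a set S with \mu(S) > 0 is 1 / \mu(S), so
\[
  \frac{1}{\mu\!\left(\mathcal{F}_{A \wedge B}\right)}
  \;\ge\;
  \frac{1}{\mu\!\left(\mathcal{F}_A\right)}.
\]
```

The inequality only says that doing both is never easier than doing A alone under this sampling model. How large the gap is depends on $\mu$, which is exactly where the difference in intuition lies.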

Re: second point, I agree: there will be some agents in the search space. Claim 3 is that if claims 1 and 2 are true, then (for the specified type of task) it is very unlikely that the optimization process will find an agent. There is still a nonzero probability that it does, though.
