All of sairjy's Comments + Replies

We could study such a learning process, but I am afraid that the lessons learned won't be very useful.

Even among human beings, there is huge variability in how much those emotions arise and, when they do, in how much they affect behavior. Worse, humans tend to hack these feelings (amplifying or suppressing them) to achieve other goals: e.g., MDMA to increase love/empathy, or drugs given to soldiers to make them soulless killers.

An AGI will have a much easier time hacking these pro-social-reward functions. 

2 · Alex Turner · 6mo
Not sure what you mean by this. If you mean "Pro-social reward is crude and easy to wirehead on", I think this misunderstands the mechanistic function of reward [https://docs.google.com/document/d/1mRDkU1nlCxqxmd2eqf48oonh7i6SLsgSSz5KB6X2MSc/edit?usp=sharing]. 

Human beings and other animals have parental instincts (and, in general, empathy) because they were evolutionarily advantageous for the populations that developed them.

AGI won't be subjected to the same evolutionary pressures, so every alignment strategy relying on empathy or social reward functions is, in my opinion, hopelessly naive.

The "Humans do X because evolution" argument does not actually explain anything about mechanisms. I keep seeing people make this argument, but it's a non sequitur to the points I'm making in this post. You're explaining how the behavior may have gotten there, not how the behavior is implemented. I think that "because selection pressure" is a curiosity-stopper, plain and simple.

AGI won't be subjected to the same evolutionary pressures, so every alignment strategy relying on empathy or social reward functions is, in my opinion, hopelessly naive.

Thi...

There must have been some reason(s) why organisms exhibiting empathy were selected for during our evolution. However, evolution did not directly configure our values. Rather, it configured our (individually slightly different) learning processes. Each human’s learning process then builds that person’s values based on how it interacts with their environment and experiences.

The human learning process (somewhat) consistently converges to empathy. Evolution might have had some weird, inhuman reason for configuring a learning ...