AI ALIGNMENT FORUM
AF

sairjy
000
Message
Dialogue
Subscribe

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
0sairjy's Shortform
5y
0
Humans provide an untapped wealth of evidence about alignment
sairjy3y0-2

We could study such a learning process, but I am afraid that the lessons learned won't be so useful. 

Even among human beings, there is huge variability in how much those emotions arise or if they do, in how much they affect behavior.  Worst, humans tend to hack these feelings (incrementing or decrementing them) to achieve other goals: i.e MDMA to increase love/empathy or drugs for soldiers to make them soulless killers. 

An AGI will have a much easier time hacking these pro-social-reward functions. 

Reply
Humans provide an untapped wealth of evidence about alignment
sairjy3y0-1

Human beings and other animals have parental instincts (and in general empathy) because they were evolutionary advantageous for the population that developed them. 

AGI won't be subjected to the same evolutionary pressures, so every alignment strategy relying on empathy or social reward functions, it is, in my opinion, hopelessly naive. 

Reply
No wikitag contributions to display.
No posts to display.