Ramana Kumar — AI Alignment Forum

Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?

Answer by Ramana Kumar · Sep 26, 2024 · 10

Vaguely related perhaps is the work on Decoupled Approval: https://arxiv.org/abs/2011.08827

Thanks for this! I think the categories of morality are a useful framework. I am very wary of the judgement that care-morality is appropriate for less capable subjects, basically because of paternalism.

Consent across power differentials

Ramana Kumar · 1y · 30

Just to confirm that this is a great example and wasn't deliberately left out.

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ramana Kumar · 2y · 84 · Review for 2022 Review

I found this post to be a clear and reasonable-sounding articulation of one of the main arguments for there being catastrophic risk from AI development. It helped me with my own thinking to an extent. I think it has a lot of shareability value.

Systems that cannot be unsafe cannot be safe

Ramana Kumar · 3y · 30

I agree with this post. However, I think it's common amongst ML enthusiasts to eschew specification and defer to statistics on everything. (Or datapoints trying to capture an "I know it when I see it" "specification".)

Why do we care about agency for alignment?

Answer by Ramana Kumar · Apr 23, 2023 · 40

This is one of the answers: https://www.alignmentforum.org/posts/FWvzwCDRgcjb9sigb/why-agent-foundations-an-overly-abstract-explanation

Teleosemantics!

Ramana Kumar · 3y · 10

The trick is that for some of the optimisations, a mind is not necessary. There is perhaps a sense, though, in which the whole history of the universe (or life on Earth, or evolution, or whatever is appropriate) will become implicated for some questions.

AI and Evolution

Ramana Kumar · 3y · 32

I think https://www.alignmentforum.org/posts/TATWqHvxKEpL34yKz/intelligence-or-evolution is somewhat related in case you haven't seen it.

$500 Bounty/Contest: Explain Infra-Bayes In The Language Of Game Theory

Ramana Kumar · 3y · 90

I'll add $500 to the pot.
