Posted also on the EA Forum. By “independent” I mean an AI to which an external observer may attribute freedom of thought, or something similar to it. You can think of it as an AI that is not too biased by what its designers or programmers think is good or...
Posted also on the EA Forum. I try to communicate the point of this sequence of posts in an intuitive way. Everyone wants to do good; however, sometimes people don’t consider that they might be wrong about what is good. This is particularly important if, for example, one’s...
Posted also on the EA Forum. I think that: For any conscious agent A, there is some knowledge such that A acts morally if A has that knowledge. Or, less formally: With enough knowledge, any conscious agent acts morally. What a bold statement! This post clarifies what I mean with...
Posted also on the EA Forum. In Free agents I’ve given various ideas about how to design an AI that reasons like an independent thinker and reaches moral conclusions by doing so. Here I’d like to add another related idea, in the form of a short story / thought experiment...
Posted also on the EA Forum. 2025 comment: the ideas here are still relevant and worth reading, in my opinion. I'm a bit dissatisfied with the toy model: although it does a good job of describing what a certain kind of agent (which I call free) looks like from the outside,...
This will also be posted on the EA Forum, and included in a sequence containing some previous posts and other posts I'll publish this year. Introduction Humans think critically about values and, to a certain extent, they also act according to their values. To the average human, the difference between...
Originally posted on the EA Forum for the Criticism and Red Teaming Contest. 0. Summary AI alignment research centred on the control problem works well for futures shaped by out-of-control misaligned AI, but not as well for futures shaped by bad actors using AI. Section 1 contains a step-by-step argument...