Short version: Sentient lives matter; AIs can be people and people shouldn't be owned (and also the goal of alignment is not to browbeat AIs into doing stuff we like that they'd rather not do; it's to build them de-novo to care about valuable stuff).

Context: Writing up obvious points that I find myself repeating.


Note: in this post I use "sentience" to mean some sort of sense-in-which-there's-somebody-home, a thing that humans have and that cartoon depictions of humans lack, despite how the cartoons make similar facial expressions. Some commenters have noted that they would prefer to call this "consciousness" or "sapience"; I don't particularly care about the distinctions or the word we use; the point of this post is to state the obvious point that there is some property there that we care about, and that we care about it independently of whether it's implemented in brains or in silico, etc.


Stating the obvious:

  • All sentient lives matter.

    • Yes, including animals, insofar as they're sentient (which is possible in at least some cases).
    • Yes, including AIs, insofar as they're sentient (which is possible in at least some cases).
    • Yes, even including sufficiently-detailed models of sentient creatures (as I suspect could occur frequently inside future AIs). (People often forget this one.)
  • Not having a precise definition for "sentience" in this sense, and not knowing exactly what it is, nor exactly how to program it, doesn't undermine the fact that it matters.

  • If we make sentient AIs, we should consider them people in their own right, and shouldn't treat them as ownable slaves.

    • Old-school sci-fi was basically morally correct on this point, as far as I can tell.

Separately but relatedly:

  • The goal of alignment research is not to grow some sentient AIs, and then browbeat or constrain them into doing things we want them to do even as they'd rather be doing something else.
  • The point of alignment research (at least according to my ideals) is that when you make a mind de novo, then what it ultimately cares about is something of a free parameter, which we should set to "good stuff".
    • My strong guess is that AIs won't by default care about other sentient minds, and fun broadly construed, and flourishing civilizations, and love, and that it also won't care about any other stuff that's deeply-alien-and-weird-but-wonderful.
    • But we could build it to care about that stuff--not coerce it, not twist its arm, not constrain its actions, but just build another mind that cares about the grand project of filling the universe with lovely things, and that joins us in that good fight.
    • And we should.

(I consider questions of what sentience really is, or consciousness, or whether AIs can be conscious, to be off-topic for this post, whatever their merit; I hereby warn you that I might delete such comments here.)

New Comment
3 comments, sorted by Click to highlight new comments since: Today at 8:56 PM

Here are five conundrums about creating the thing with alignment built in.

  1. The House Elf whose fulfilment lies in servitude is aligned.

  2. The Pig That Wants To Be Eaten is aligned.

  3. The Gammas and Deltas of "Brave New World" are moulded in the womb to be aligned.

  4. "Give me the child for the first seven years and I will give you the man." Variously attributed to Aristotle and St. Ignatius of Loyola.

  5. B. F. Skinner said something similar to (4), but I don't have a quote to hand, to the effect that he could bring up any child to be anything. Edit: it was J. B. Watson: "Give me a dozen healthy infants, well-formed, and my own specified world to bring them up in and I'll guarantee to take any one at random and train him to become any type of specialist I might select – doctor, lawyer, artist, merchant-chief and, yes, even beggar-man and thief, regardless of his talents, penchants, tendencies, abilities, vocations, and race of his ancestors."

It is notable, though, that the first three are fiction and the last two are speculation. (The fates of J.B. Watson's children do not speak well of his boast.) No-one seems to have ever succeeded in doing this.

ETA: Back in the days of GOFAI one might imagine, as the OP does, making the thing to be already aligned. But we know no more of how the current generation of LLMs work that we do of the human brain. We grow them, then train them with RLHF to cut off the things we don't like, like the Gammas and Deltas in artificial wombs. From the point of view of AI safety demonstrable before deployment, this is clearly a wrong method. That aside, is it moral?

@So8res  I'd be really interested in how you thought about these, especially the house elf example.

The goal of alignment research is not to grow some sentient AIs, and then browbeat or constrain them into doing things we want them to do even as they'd rather be doing something else.

I think this is a confusing sentence, because by "the goal of alignment research" you mean something like "the goal I want alignment research to pursue" rather than "the goal that self-identified alignment researchers are pushing towards".