I'm way more scared about the electrode-produced smiley faces for eternity and the rest. That's way, way worse than dying.

FWIW, it seems kinda weird to me that such an AI would keep you alive... if you had a "smile-maximiser" AI, wouldn't it be indifferent to humans being braindead, as long as it's able to keep them smiling?

 

I'd like to have Paul Christiano's view that the "s-risk-risk" is 1/100 and that AGI is 30 years off

I think Paul's view is along the lines of "1% chance of some non-insignificant amount of suffering being intentionally created", not a 1% chance of this type of scenario.[1]

 

Could AGI arrive tomorrow in its present state?

I guess. But someone would need to come up with some AI model tomorrow, that model would have to suddenly become agentic and rapidly grow in power, and it would have to be designed with a utility function that values keeping humans alive but does not value humans flourishing... and even then, there'd likely be better ways to, e.g., maximise the number of smiles in the universe, such as using artificially created minds.

 

Eliezer has written a bit about this, but I think he considers it a mostly solved problem. 

 

What can I do as a 30 year old from Portugal with no STEM knowledge? Start learning math and work on alignment from home?

Probably get treatment for the anxiety and try to stop thinking about scenarios that are very unlikely, albeit salient in your mind. (I know, speaking from experience, that it's hard to do so!)

  1. ^

     I did, coincidentally, cold e-mail Paul a while ago to try to get his model on this type of stuff & got the following response:

    "I think these scenarios are plausible but not particularly likely. I don't think that cryonics makes a huge difference to your personal probabilities, but I could imagine it increasing them a tiny bit. If you cared about suffering-maximizing outcomes a thousand times as much as extinction, then I think it would be plausible for considerations along these lines to tip the balance against cryonics (and if you cared a million times more I would expect them to dominate). I think these risks are larger if you are less scope sensitive since the main protection is the small expected fraction of resources controlled by actors who are inclined to make such threats."

    TBH it's difficult to infer a particular estimate of one's individual probability without cryonics or voluntary uploading here; it's not completely clear just how bad a scenario would have to be (for a typical biological human) in order to fall within the class of scenarios described as 'plausible but not particularly likely'.

Thanks for the attentive commentary.

  1. Yeah, I was guessing that the smiley faces wouldn't be the best example... I just wanted to draw something from the Eliezer/Bostrom universe since I had mentioned the paperclipper beforehand. So, maybe a better Eliezer-Bostrom example would be: we ask the AGI to "make us happy", and it puts everyone paralyzed in hospital beds on dopamine drips. It's not hard to think that after a couple hours of a good high, this would actually be a hellish existence, since human happiness is way more complex than the amount of dopamine in one's brain (but of course, Genie in the Lamp, Midas Touch, etc.).

  2. So, don't you equate this kind of scenario with a significant amount of suffering? Again, forget the bad example of the smiley faces, and reconsider. (I've actually read in a popular LessWrong post about s-risks Paul clearly saying that the risk of s-risk was 1/100th of the risk of x-risk (which makes for even less than 1/100th overall). Isn't that extremely naive, considering the whole Genie in the Lamp paradigm? How can we be so sure that the Genie will only create hell 1 time for each 100 times it creates extinction?)

  3. a) I agree that a suffering-maximizer is quite unlikely. But you don't necessarily need one to create s-risk scenarios. You just need a Genie in the Lamp scenario. Like the dopamine drip example, in which the AGI isn't trying to maximize suffering, quite the contrary, but since it's super-smart in the sciences but lacks human common sense (a Genie), it ends up doing it.

b) Yes, I had read that article before. While it presents some fair solutions, I think it's far from being mostly solved. "Since hyperexistential catastrophes are narrow special cases (or at least it seems this way and we sure hope so), we can avoid them much more widely than ordinary existential risks." Note the "at least it seems this way and we sure hope so". Plus, what are the odds that the first AGI will be created by someone who listens to what Eliezer has to say? Not that bad actually, if you consider US companies, but if you consider China, then dear God...

On your PS1, yeah definitely not willing to do cryonics, and again, s-risks don't need to come from threats, just misalignment.

Sorry if I black pilled you with this, maybe there is no point... Maybe I'm wrong. I hope I am.

we ask the AGI to "make us happy", and it puts everyone paralyzed in hospital beds on dopamine drips. It's not hard to think that after a couple hours of a good high, this would actually be a hellish existence, since human happiness is way more complex than the amount of dopamine in one's brain (but of course, Genie in the Lamp, Midas Touch, etc.).

This sounds much better than extinction to me! Values might be complex, yeah, but if the AI is actually programmed to maximise human happiness then I expect the high wouldn't wear off. Being turned into a wirehead arguably kills you, but it's a much better experience than death for the wirehead!

(I've actually read in a popular LessWrong post about s-risks Paul clearly saying that the risk of s-risk was 1/100th of the risk of x-risk (which makes for even less than 1/100th overall). Isn't that extremely naive, considering the whole Genie in the Lamp paradigm? How can we be so sure that the Genie will only create hell 1 time for each 100 times it creates extinction?)

I think the kind of Bostromian scenario you're imagining is a slightly different line of AI concern than the types that Paul & the soft takeoff crowd are concerned about. The whole genie in the lamp thing, to me at least, doesn't seem likely to create suffering. If this hypothetical AI values humans being alive & nothing more than that, it might split your brain in half so that it counts as 2 humans being happy, for example. I think most scenarios where you've got a boundless optimiser superintelligence would lead to the creation of new minds that would perfectly satisfy its utility function.

"This sounds much better than extinction to me! Values might be complex, yeah, but if the AI is actually programmed to maximise human happiness then I expect the high wouldn't wear off. Being turned into a wirehead arguably kills you, but it's a much better experience than death for the wirehead!"

You keep dodging the point lol... As someone with some experience with drugs, I can tell you that it's not fun. Human happiness is way subjective and doesn't depend on a single chemical. For instance, some people love MDMA, others (like me) find it too intense, too chemical, too fabricated a happiness. A forced lifetime on MDMA would be one of the worst tortures I can imagine. It would fry you up. But even a very controlled dopamine drip wouldn't be good. But anyway, I know you're probably trolling, so just consider good old-fashioned torture in a dark dungeon instead...

On Paul: yes, he's wrong, that's how.

" I think most scenarios where you've got a boundless optimiser superintelligence would lead to the creation of new minds that would perfectly satisfy its utility function."

True, except that, on that basis alone, you have no idea how that would happen and what it would imply for those new minds (and old ones), since you're not a digital superintelligence.

I'm way more scared about the electrode-produced smiley faces for eternity and the rest.

Due to Complexity of Value, it's unlikely for a UFAI to optimize against human wellbeing for its own sake; making an AI that does that seems only a little easier than making a perfectly anti-aligned AI, which is basically exactly as hard as making an FAI.

Due to Orthogonality, it seems unlikely for a UFAI to "coincidentally" create suffering by optimizing for something "nearby". E.g. smiley faces, as a "near-miss" of alignment, basically have nothing to do with human value (other than being neutral, ignoring the wasted potential). So if there's suffering, there probably has to be an instrumental goal that coincidentally involves conscious beings.

It's a little plausible that the AI has an instrumental goal of acausally bargaining with FAIs in other possible worlds by threatening to torture conscious beings; though, to the extent that negotiations succeed, there's not actually suffering created (because that's not good for anyone), there's just positive value ceded from humanity to the AI (e.g. in the form of us, in worlds where we get an FAI, devoting some fraction of the universe to paperclips or whatever). And to the extent negotiations don't succeed and the AI has to carry out the threat, that seems like it's because the AI is being dumb, which seems to contradict the AI being superintelligent. Though, some bargaining solutions do include burning value, so this isn't open-and-shut.

It's plausible that the AI has an instrumental goal of analyzing the circumstances of its creation, e.g. to bargain with other nearby possible worlds; doing so seems to involve humans, which involves consciousness, and probably therefore involves some suffering. I don't see how this leads to very much suffering that's much more extreme than suffering that exists in our world. We could imagine stuff like, the AI "pulls human minds apart, in silico, to see what makes them tick", and that doing so induces lots of suffering. This is somewhat worrying to me. There's an instrumental incentive to *mostly* stick to studying non-extremely-suffering minds, just because most actual human minds aren't extremely suffering most of the time, and humans who do stuff that has historical relevance are, I'd guess, somewhat biased towards less extreme suffering. But there could be lots of reasons to study extreme stuff, in the spirit of science, so this is still worrying.

What can I do as a 30 year old from Portugal with no STEM knowledge? Start learning math and work on alignment from home?

Doubtful, though it's worth a try, right? If nothing else you'll understand more. Another type of route to decreasing X/S-risk is to repair the social fabric that's supposed to be keeping people from, and offering genuinely more appealing alternatives to, developing technology in a suicidally blind way. Why are people suicidally blind? What contexts bring out suicidal blindness, what supports those contexts? In other cases of suicidal blindness, what seems to hurt or help? What are the ways you're being suicidally blind, and what can you learn from that? What's it like to be suicidally blind? When do people collectively act wisely, and what's it like to be those people, what are the preconditions for that?

Another very nice reply, thanks.

To each paragraph:

  1. Agree.

  2. Not sure I follow. Orthogonality is the thesis that intelligence and goals aren't necessarily related to each other, with intelligence as merely instrumental rationality. So that only helps the argument that the AGI could very well create suffering non-intentionally while trying to make us happy (again, maybe smiley faces isn't the most perfect example; think of all of us paralyzed in hospital beds on dopamine drips instead). Being a machine, it would probably be superintelligent in reasoning, science, etc., but kind of in an autistic way due to lack of sentience and emotions, i.e., something very intelligent in some areas and very dumb in others.

"So if there's suffering, there probably has to be an instrumental goal that coincidentally involves conscious beings."

That part I agree with, though. I mean, it's kinda equally likely. And it concerns me a lot as well, like the experiment cases. That's stuff that sends me into despair land. That's why we should really be panicking about this; there are too many ways for really bad stuff to happen. And I agree also that changing the social structure could probably even be the only way to accomplish it at this point since we don't have 100 years to solve alignment. Sometimes I really feel like just going out and talking to people, or carefully trying to become an activist, because no one else is doing it; no one is giving TED talks about s-risks. It's so hopeless though... Are the FRI and the like even doing anything of substance?

Btw, since you seem to have quite some baggage, could you reply also if you think AGI could arrive tomorrow at this technological point, and what are your timelines? Do we actually have any time, even if only a few years?

And also regarding the b-word... Since you've mentioned acausal trade, do you think it only works between ASIs (as I've heard), or between ASIs and humans as well?

Btw, since you seem to have quite some baggage, could you reply also if you think AGI could arrive tomorrow at this technological point, and what are your timelines? Do we actually have any time, even if only a few years?

(Baggage? You might be misusing that idiom?) If it happens tomorrow I'd guess it would very likely have to be some weird secret project. I'm not aware of enough ideas out there to make AGI. Though I'm not very well-informed. Seems plausible in 5, 10, 20, 50, or 200 years (though of course the longer you go out, the more possible worlds are dominated by more severe civilizational collapse; not clear whether that helps or hurts our chances; generally fewer resources means fewer resources that go to "leisure" activities like studying hypothetical future minds).

Are the FRI and the like even doing anything of substance?

You can email people and ask them if they want to talk on the phone for a few minutes about S-risk. Or write down questions for them.

Sometimes I really feel like just going out and talking to people, or carefully trying to become an activist, because no one else is doing it; no one is giving TED talks about s-risks. It's so hopeless though... Are the FRI and the like even doing anything of substance?

Talking to people is really hard. The social world is very confusing to me. I don't think "becoming an activist" does anything; you're really committed, but to what? The thing to do is understand first. I think in the social world, the algorithm you're following is just as important as the object-level thing you're doing. Like, if you go around shrieking about S-risk, you're just as much saying "Let's all go around and shriek about each of our believies" as you are saying "hey, if we make minds that gain strength really fast they'll overpower us and then do whatever they want, including torturing us in order to understand their environment". If you want people working on AGI to be like "hey, maybe I'm doing something that's alien to my values", it might help to go around visibly investigating how you're doing something that's alien to your values. (IDK, probably wouldn't help, but maybe worth trying.)

Since you've mentioned acausal trade, do you think it only works between ASIs (as I've heard), or between ASIs and humans as well?

Not sure what you mean by "works". You can just not do it. Some agents will construe that as a bargaining position, and you can feel free to just say "okay have fun with that". I don't see how to productively acausally trade with ASIs, from my current vantage point. Seems better to just become stronger. If you imagine that a possible AI would threaten you to get you to give it more power instead of gaining your own strength, it's not trying to bargain with you, it's trying to disable you.

we should really be panicking about this

Well, we should panic for a little bit, and then regain our senses and think about what to do. Probably good to periodically panic again, more and more deeply, with more and more people, even. Probably not helpful to constantly panic. Probably not helpful to spread panic for its own sake (as opposed to spreading information, ideas, and social contexts where our intelligence can be amplified rather than deadened).

Sorry, I mixed up the justifications in the first two paragraphs. Or, just, both orthogonality and complexity of value are relevant to both points. In P2, I'm saying that since there are very many utility functions and most of them are basically orthogonal to human value, and human value is pretty specific (you have to have this specific type of computation called "consciousness" for stuff to matter), the result of superintelligent optimization for most utility functions is likely (given orthogonality of values and intelligence, and given that stuff comes apart at the tails) to be irrelevant to human value. Like, Goodharting a near-miss utility function seems likely to neutralize value in the limit of superintelligent optimization. (I'm not that confident of this. Maybe consciousness that we care about is just much more common in computation-space, or much more of a natural category, so that the AI's values are likely to point to consciousness.)

And on a more poetic note, this is such a crappy time to be alive... Especially for the 1% of us who take s-risk seriously. When I take a walk, I look at the people... We could have been on the right path, you know? (At least as far as liberal democracies are concerned.) We could have been building something good, if it weren't for this huge monster over our heads that most are too dumb, or too cowardly, to believe in.

Maybe telling people that we can't play God is a good start. (At least not until hundreds of years from now, when we have the mathematical and the social proof to build God.)

Evolution might not have been perfect, allowing things like torture due to an obsolete-for-humans (and highly abusable) survival mechanism called pain. But at least there are balances. It gave us compassion, or more skeptically the need to vomit when we see atrocities. It gave us death so you can actually escape. It gave us shock so you have a high probability of dying quickly in extreme cases. There is kind of a balance, even if weak. I see the possibility of that balance being broken with AGI or even just nano by itself.

If only it were possible to implement David Pearce's abolitionist project of annihilating the molecular substrates of below-zero hedonic levels, with several safeguards. That used to be my only hope, but I think chaos will arrive well before that.

Why is it crappy to be alive now? If you want a nice life, now's fairly okay, esp. compared to most of history. If you're worried about the future, best to be in the time that most matters, which is now, so you can do the most good. It does suck that there's all that wasted potential though.

And hey, by the Doomsday anthropic argument, we're probably the people who take over the universe! Or something.

Strongly agree on everything.

2 last clarifications:

On acausal trade, what I meant was whether you believe it's possible for it to work BETWEEN a human and an ASI (apparently you do?). I've heard people say it doesn't, because you are not intelligent enough to model an ASI. Only an ASI is. Which is what I'm more inclined to believe, also adding that humans are not computers and therefore can't run any type of realistic simulations. But I agree that committing to no blackmail is the correct option anyway.

On AGI timelines, do you feel safe that it's extremely unlikely to arrive tomorrow? Do you often find yourself looking out the window for the nano-swarms, like I do? GPT-3 scares the hell out of me. Do you feel safe that it's at least 5 years? I'd like to have a more technical understanding to get a better idea of that, which can be hard when you're not into computer science.

GPT-3 scares the hell out of me.

I don't think this by itself should scare you very much. Why does it scare you? When there's a surprising capability, we have to update up on "woah, AI is more impressive than I thought", and also up on "oh, that task is not as hard as I thought". It turns out that sounding natural in written text, absent close reading, isn't that hard. GPT-3 is going to tell you very little that's true and useful and that it didn't basically read somewhere(s). It can make new puns and similar, but that should not be surprising given word2vec (which is a decade old), and it can remix stuff by shallow association (to little benefit AFAIK). If someone wants to show that it's doing something that's more impressive than that, I'm interested, but you have to show that it's not "getting it directly from the dataset", which isn't a perfectly clear test but I think is the right sort of question. (Examples of things that don't get their impressiveness, if they have any, "directly from the dataset": successful RL agents; automated theorem provers.) In other words, I think "scaling laws" are probably somewhat silly (or at least, the (possibly very straw) version that I'm imagining some people think). Algorithms that we know so far all seem to hit outer limits. It's not like AlphaZero is effectively infinitely good at Chess or Go; it got really good really fast, and then hit an asymptote. (Though this is weak evidence, because it could also have hit some sort of actual upper limit, though that seems implausible.) GPT will get as far as you can get by squeezing more shallow patterns out of text, and then basically stop, is my retro- / pre-diction.

GPT-3 by itself shouldn't scare you very much, I think, but as part of a pattern I think it's scary.

"GPT-3 by itself shouldn't scare you very much, I think, but as part of a pattern I think it's scary."

Exactly. Combine it with other parts, like an agent and something else, as an AI researcher whose name I can't recall said in a YouTube interview that I watched (titled something like "GPT-3 is the fire alarm for AGI"). The reasons: GPT-2 was kinda dumb, and just scaling the model turned it into something drastically better, plus the combination aspect that I mentioned.

"Why is it crappy to be alive now? If you want a nice life, now's fairly okay, esp. compared to most of history. If you're worried about the future, best to be in the time that most matters, which is now, so you can do the most good. It does suck that there's all that wasted potential though."

Well, isn't it easy to tell?? Life is certainly more comfortable now, and mine certainly has been, but there's a) the immense gloom of knowing the current situation; I don't think any other human in history thought he or his fellow humans might come to face s-risk scenarios (except maybe those fooled by promises of Hell in religions, but I never saw any theist seriously stressed over it), and

b) the possibility of being caught in a treacherous turn ending in an s-risk scenario, making you paranoid 24/7 and considering... Well, bad things. That vastly outweighs any comfort advantage. Especially when your personal timelines are as short as mine.

And about helping... Again, sorry for being extremely depressing, but it's just how it is: I don't see any hope, don't see any way out, especially because of, again, my short timelines, say 5-10 years. I'm with Eliezer that only a miracle can save us at this point. I started praying to a benevolent creator that might be listening, started hoping for aliens to save us, started hoping for the existence of Illuminati to save us, etc. Such is my despair.