Buck Shlegeris

Comments

Buck's Shortform

I used to think that slower takeoff implied shorter timelines, because slow takeoff means that pre-AGI AI is more economically valuable, which means that the economy advances faster, which means that we get AGI sooner. But there's a countervailing consideration, which is that in slow takeoff worlds, you can make arguments like ‘it’s unlikely that we’re close to AGI, because AI can’t do X yet’, where X might be ‘make a trillion dollars a year’ or ‘be as competent as a bee’. I now overall think that arguments for fast takeoff should update you towards shorter timelines.

So slow takeoffs cause shorter timelines, but are evidence for longer timelines.

This graph illustrates the argument: if we notice that current capabilities are at the level of the green line, then if we think we're on the fast takeoff curve we'll conclude we're much closer to AGI than we would if we thought we were on the slow takeoff curve.
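To make that concrete, here's a toy numerical version of the argument. The curve shapes and numbers below are made up for illustration (they aren't from the original graph): both curves hit "AGI level" at the same time, but observing the same current capability level implies much less remaining time if you think you're on the fast takeoff curve.

```python
import numpy as np

# Toy illustration (made-up curves): both reach "AGI level" 100 at t = 100,
# but on the fast curve progress stays low for a long time and then shoots up.
AGI_LEVEL = 100.0
CURRENT_LEVEL = 10.0  # the "green line": where current capabilities sit

def slow_curve(t):
    return t  # steady, roughly linear progress

def fast_curve(t):
    return 100.0 * (t / 100.0) ** 8  # flat for ages, then explosive

def time_when_level_reached(curve, level):
    ts = np.linspace(0.0, 100.0, 100_001)
    return ts[np.searchsorted(curve(ts), level)]

for name, curve in [("slow takeoff", slow_curve), ("fast takeoff", fast_curve)]:
    t_now = time_when_level_reached(curve, CURRENT_LEVEL)
    print(f"{name}: green line reached at t = {t_now:5.1f}, "
          f"so about {100.0 - t_now:5.1f} time units left until AGI")
# slow takeoff: ~90 time units left; fast takeoff: ~25. The same observation
# implies we're much closer to AGI if we believe in fast takeoff.
```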

For the "slow takeoffs mean shorter timelines" argument, see here: https://sideways-view.com/2018/02/24/takeoff-speeds/

This point feels really obvious now that I've written it down, and I suspect it's obvious to many AI safety people, including the people whose writings I'm referencing here. Thanks to Caroline Ellison for pointing this out to me, and various other people for helpful comments.

I think that this is why belief in slow takeoffs is correlated with belief in long timelines among the people I know who think a lot about AI safety.

$1000 bounty for OpenAI to show whether GPT3 was "deliberately" pretending to be stupider than it is
It's tempting to anthropomorphize GPT-3 as trying its hardest to make John smart. That's what we want GPT-3 to do, right?

I don't feel at all tempted to do that anthropomorphization, and I think it's weird that EY is acting as if this is a reasonable thing to do. Like, obviously GPT-3 is doing sequence prediction--that's what it was trained to do. Even if it turns out that GPT-3 correctly answers questions about balanced parens in some contexts, I feel pretty weird about calling that "deliberately pretending to be stupider than it is".

Possible takeaways from the coronavirus pandemic for slow AI takeoff

If the linked SSC article is about the aestivation hypothesis, see the rebuttal here.

Let's talk about "Convergent Rationality"

In OpenAI's Roboschool blog post:

This policy itself is still a multilayer perceptron, which has no internal state, so we believe that in some cases the agent uses its arms to store information.
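As a minimal sketch of what "no internal state" means here (the sizes and weights below are arbitrary, not from the blog post): a feedforward policy maps each observation directly to an action, so anything the agent wants to remember between timesteps has to be written into the environment (e.g. its arm positions) and read back from later observations, whereas a recurrent policy would carry that memory explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, HID, ACT = 10, 32, 4            # arbitrary sizes, for illustration only
W_in = rng.normal(size=(HID, OBS))
W_out = rng.normal(size=(ACT, HID))
W_rec = rng.normal(size=(HID, HID))

def mlp_policy(obs):
    """Stateless, as in the quoted Roboschool policy: the same observation
    always produces the same action, so any 'memory' must live in the
    environment itself."""
    return W_out @ np.tanh(W_in @ obs)

def recurrent_policy(obs, hidden):
    """Stateful alternative: memory is carried in `hidden` between timesteps."""
    hidden = np.tanh(W_in @ obs + W_rec @ hidden)
    return W_out @ hidden, hidden
```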

Aligning a toy model of optimization

Given a policy π we can directly search for an input on which it behaves a certain way.

(I'm sure this point is obvious to Paul, but it wasn't to me)

We can search for inputs on which a policy behaves badly, which is really helpful for verifying the worst case of a certain policy. But we can't search for a policy which has a good worst case, because that would require using the black box inside the function passed to the black box, which we can't do. I think you can also say this as "the black box is an NP oracle, not a Σ₂ oracle".

This still means that we can build a system which in the worst case does nothing, rather than being dangerous in the worst case: we get a policy however we like, then we search for an input on which it behaves badly, and if one exists we don't run the policy.
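Here's a minimal sketch of both points above, using a brute-force stand-in for the optimization black box (the names `opt`, `badness`, and the toy policies are mine, not from Paul's post): the conservative check makes a single oracle call that quantifies over inputs only, whereas picking a policy with a good worst case would require an oracle call inside the objective passed to the oracle.

```python
# Brute-force stand-in for the optimization black box: given a function and a
# finite domain, return the element maximizing it. (In the post the black box
# is assumed to do this efficiently; the names below are illustrative.)
def opt(f, domain):
    return max(domain, key=f)

inputs = range(1000)
policies = [lambda x, k=k: (x * k) % 11 for k in range(1, 6)]  # toy "policies"

def badness(policy, x):
    # Toy stand-in for "the policy behaves badly on input x".
    return 1.0 if policy(x) == 0 else 0.0

def get_policy_somehow():
    return policies[2]  # placeholder for however we actually obtain a policy

# Allowed: one oracle call that quantifies over inputs only (the NP-style use).
policy = get_policy_somehow()
worst_input = opt(lambda x: badness(policy, x), inputs)

# The conservative system: if even the worst input found is fine, run the
# policy; otherwise refuse, so the worst case is "do nothing", not "dangerous".
if badness(policy, worst_input) == 0.0:
    print("no bad input found: run the policy")
else:
    print(f"policy behaves badly on input {worst_input}: don't run it")

# Not possible with the real black box: choosing the policy whose worst case
# is best would put an opt call inside the function passed to opt, i.e. an
# exists-policy-forall-inputs question (Σ₂), not an exists-input one (NP).
# best_policy = opt(
#     lambda p: -badness(p, opt(lambda x: badness(p, x), inputs)), policies)
```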

Robustness to Scale

I think that the terms introduced by this post are great and I use them all the time.

Six AI Risk/Strategy Ideas

Ah yes, this seems totally correct.

Six AI Risk/Strategy Ideas

Minor point: I think asteroid strikes are probably very highly correlated between Everett branches (though maybe the timing of spotting an asteroid on a collision course is variable).