All Posts

Saturday, August 8th 2020

Shortform
3 · Adele Lopez · 3d
Privacy as a component of AI alignment

[realized this is basically just a behaviorist genie [https://arbital.com/p/behaviorist/], but posting it in case someone finds it useful]

What makes something manipulative? If I do something with the intent of getting you to do something, is that manipulative? A simple request seems fine, but if I have a complete model of your mind and use it to phrase things so that you do exactly what I want, that seems to have crossed an important line.

The idea is that using a model of a person that is *too* detailed is a violation of human values. In particular, it violates the value of autonomy, since your actions can now be controlled by someone using this model. And I believe that this is a significant part of what we are trying to protect when we invoke the colloquial value of privacy.

In ordinary situations, people can control how much privacy they have relative to another entity by limiting their contact with them to certain situations. But with an AGI, a person may lose a very large amount of privacy from seemingly innocuous interactions (we're already seeing the start of this with "big data" companies improving their advertising effectiveness by using information that doesn't seem that significant to us). Even worse, an AGI may be able to break the privacy of everyone (or a very large class of people) by using inferences based on just a few people (leveraging, perhaps, knowledge of the human connectome [http://www.humanconnectomeproject.org/], hypnosis, etc.).

If we could reliably point to the specific models an AI is using, and have it honestly share its model structure with us, we could potentially limit the strength of its model of human minds. Perhaps we could even have it use a hardcoded model limited to knowledge of the physical conditions required to keep it healthy. This would mitigate issues such as deliberate deception or mindcrime. We could also potentially allow it to use more detailed models in specific cases, for example, we co

Tuesday, August 4th 2020

Shortform
7 · Buck Shlegeris · 7d

I used to think that slower takeoff implied shorter timelines, because slow takeoff means that pre-AGI AI is more economically valuable, which means that the economy advances faster, which means that we get AGI sooner. But there's a countervailing consideration, which is that in slow takeoff worlds, you can make arguments like "it's unlikely that we're close to AGI, because AI can't do X yet", where X might be "make a trillion dollars a year" or "be as competent as a bee". I now overall think that arguments for fast takeoff should update you towards shorter timelines. So slow takeoffs cause shorter timelines, but are evidence for longer timelines.

This graph [https://www.dropbox.com/s/camtto747uqyqqq/IMG_1553.jpg?dl=0] is a version of this argument: if we notice that current capabilities are at the level of the green line, then if we think we're on the fast takeoff curve we'll deduce we're much further ahead than we'd think on the slow takeoff curve.

For the "slow takeoffs mean shorter timelines" argument, see here: https://sideways-view.com/2018/02/24/takeoff-speeds/

This point feels really obvious now that I've written it down, and I suspect it's obvious to many AI safety people, including the people whose writings I'm referencing here. Thanks to Caroline Ellison for pointing this out to me, and various other people for helpful comments.

I think that this is why belief in slow takeoffs is correlated with belief in long timelines among the people I know who think a lot about AI safety.
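A minimal numerical sketch of the graph argument above, under a toy assumption of my own (not Buck's): capability grows exponentially toward an AGI threshold, with a shorter doubling time in the fast-takeoff world. Observing the same current capability level then implies less remaining time to AGI under fast takeoff than under slow takeoff. The threshold, the observed level, and both doubling times are made-up numbers for illustration.

```python
# Toy model (not from the post): capability grows exponentially up to an AGI
# threshold; "fast takeoff" is modeled as a shorter doubling time. The same
# observed capability level then implies less remaining time under fast takeoff.
import math

AGI_LEVEL = 1.0        # normalized capability that counts as AGI (assumed)
OBSERVED_LEVEL = 0.05  # the "green line": where current capabilities sit (assumed)

def years_until_agi(observed_level: float, doubling_time_years: float) -> float:
    """Years until exponentially growing capability reaches AGI_LEVEL."""
    doublings_needed = math.log2(AGI_LEVEL / observed_level)
    return doublings_needed * doubling_time_years

# Hypothetical doubling times for the two worlds.
print(f"slow takeoff: {years_until_agi(OBSERVED_LEVEL, 4.0):.1f} years to AGI")
print(f"fast takeoff: {years_until_agi(OBSERVED_LEVEL, 1.0):.1f} years to AGI")
```

With these made-up numbers the same observation yields roughly 17 years under the slow curve and 4 under the fast one, which is the sense in which evidence for fast takeoff shortens inferred timelines.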
4 · Adele Lopez · 8d

Half-baked idea for low-impact AI:

As an example, imagine a board that's lodged directly into a wall (no other support structures). If you make it twice as wide, it will be twice as stiff, but if you make it twice as thick, it will be eight times as stiff. On the other hand, if you make it twice as long, it will be eight times more compliant.

In a similar way, different action parameters will have scaling exponents (or, more generally, scaling functions). So one way to decrease the risk of high-impact actions would be to make sure that the scaling exponent of each action parameter is bounded above by a certain amount.

Anyway, to even do this, you still need to make sure the agent's model is honestly evaluating the scaling exponent, and you would still need to define this stuff a lot more rigorously. I think this idea is more useful in the case where you already have an AI with high-level corrigible intent and want to give it a general "common sense" about the kinds of experiments it might think to try. So it's probably not that useful, but I wanted to throw it out there.
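A toy sketch of what the scaling-exponent check could look like, using the board example. This is my own illustration, not something from the post: the stiffness formula for an end-loaded cantilever, the exponent bound, and all parameter values are assumptions. It estimates the local exponent d(log f)/d(log x) for each parameter numerically and flags any whose magnitude exceeds the bound.

```python
import math

def scaling_exponent(f, params, name, rel_step=1e-4):
    """Estimate the local scaling exponent d(log f)/d(log x) for one parameter."""
    bumped = dict(params, **{name: params[name] * (1 + rel_step)})
    return (math.log(f(**bumped)) - math.log(f(**params))) / math.log(1 + rel_step)

def cantilever_stiffness(E, width, thickness, length):
    """End-load stiffness of a board fixed at one end: k = 3*E*I/L^3 with I = w*t^3/12."""
    return 3 * E * (width * thickness**3 / 12) / length**3

params = dict(E=2e9, width=0.1, thickness=0.02, length=1.0)  # made-up numbers
EXPONENT_BOUND = 2.0  # an arbitrary cap, for illustration only

for name in ("width", "thickness", "length"):
    k = scaling_exponent(cantilever_stiffness, params, name)
    verdict = "ok" if abs(k) <= EXPONENT_BOUND else "flagged: scaling too strong"
    print(f"{name}: exponent ~ {k:+.2f} ({verdict})")
```

The exponents come out as +1 for width, +3 for thickness, and -3 for length, matching the "twice as stiff" versus "eight times as stiff / eight times more compliant" behavior described above, so the check would flag thickness and length as high-impact parameters under this arbitrary bound.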
