It seems relatively plausible that you could use a Limited AGI to build a nanotech system capable of uploading a diverse assortment of living tissue (non-brain, or maybe only very small brains) without damaging it, and that this system would learn how to upload tissue in a general way. Then you could use the system (not the AGI) to upload humans, tested first on increasingly complex animals. It would be a relatively inefficient emulation, but it doesn't seem obviously doomed to me.
By the time the hardware to do this is available, though, it's probably too late.
So in a "weird experiment", the infrabayesian starts by believing only one branch exists, and then at some point starts believing in multiple branches?
If there aren't other branches, then shouldn't that be impossible? Not just in practice but in principle.
You can get some weird things if you are doing some weird experiment on yourself where you are becoming a Schrödinger cat and doing some weird stuff like that, you can get a situation where multiple copies of you exist. But if you’re not doing anything like that, you’re just one branch, one copy of everything.
Why does it matter that you are doing a weird experiment, versus the universe implicitly doing the experiment for you via decoherence? If someone else did the experiment on you without your knowledge, does infrabayesianism expect one copy or multiple copies?
If being versed in cryptography were enough, then I wouldn't expect Eliezer to claim to be one of the last living descendants of this lineage.
Why would Zen help (and why do you think that)?
This may, perhaps, be confounded by the phenomenon where I am one of the last living descendants of the lineage that ever knew how to say anything concrete at all.
I've previously noticed this weakness in myself. What lineage did Eliezer learn this from? I would appreciate any suggestions or advice on how to become stronger at this.
This came up with Aysajan about two months ago. An exercise which I recommended for him: first, pick a technical academic paper. Read through the abstract and first few paragraphs. At the end of each sentence (or after each comma, if the authors use very long sentences), pause and write/sketch a prototypical example of what you currently think they're talking about. The goal here is to get into the habit of keeping a "mental picture" (i.e. prototypical example) of what the authors are talking about as you read.
Other good sources on which to try this exerci... (read more)
CFAR used to have an awesome class called "Be specific!" that was mostly about concreteness. Exercises included:
[I may try to flesh this out into a full-fledged post, but for now the idea is only partially baked. If you see a hole in the argument, please poke at it! Also I wouldn't be very surprised if someone has made this point already, but I don't remember seeing such. ]
A perfect bayesian doesn't need randomization.
Yet in practice, randomization seems to be quite useful.
How to resolve this seeming contradiction?
I think the key is that a perfect bayesian (Omega) is logically omniscient. Omega can always fully update on all o... (read more)
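To make the "useful in practice" half concrete, here's a toy sketch (my own construction, not part of the original shortform): in repeated matching pennies against an opponent who predicts your historically most frequent move, a biased policy gets exploited, while a uniform-random one can't be.

import random

# Toy illustration: a bounded agent can benefit from randomizing.
# The opponent predicts whichever move we've played most often so far;
# we score +1 when unpredicted, -1 when predicted.
def average_payoff(p_heads, rounds=10000, seed=0):
    rng = random.Random(seed)
    counts = {"H": 0, "T": 0}
    score = 0
    for _ in range(rounds):
        move = "H" if rng.random() < p_heads else "T"
        prediction = "H" if counts["H"] >= counts["T"] else "T"
        score += 1 if move != prediction else -1
        counts[move] += 1
    return score / rounds

print("biased 60/40 policy: ", average_payoff(0.6))   # roughly -0.2: exploited
print("uniform 50/50 policy:", average_payoff(0.5))   # roughly 0: unexploitable

(An agent that could perfectly model this particular predictor could of course beat it deterministically every round, which is the sense in which a logically omniscient Bayesian never needs the randomness.)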
You're missing the point!
Your arguments mostly establish that brains are optimized for energy efficiency, but the important quantity in question is computational efficiency! You even admit that neurons are "optimizing hard for energy efficiency at the expense of speed", but don't seem to have noticed that this fact makes almost everything else you said completely irrelevant!
Going to try answering this one:
Humbali: I feel surprised that I should have to explain this to somebody who supposedly knows probability theory. If you put higher probabilities on AGI arriving in the years before 2050, then, on average, you're concentrating more probability into each year that AGI might possibly arrive, than OpenPhil does. Your probability distribution has lower entropy. We can literally just calculate out that part, if you don't believe me. So to the extent that you're wrong, it should shift your probability distributions in the d... (read more)
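As an aside, the entropy comparison Humbali gestures at really is just a calculation. A minimal sketch, with made-up illustrative distributions (not anyone's actual timeline views):

import numpy as np

# Entropy of two made-up distributions over AGI-arrival years: one
# concentrated on earlier years, one spread out over later years.
years = np.arange(2025, 2125)

def entropy_bits(weights):
    p = weights / weights.sum()
    return float(-(p * np.log2(p)).sum())

concentrated = np.exp(-((years - 2035) ** 2) / (2 * 10.0 ** 2))
spread_out = np.exp(-((years - 2060) ** 2) / (2 * 30.0 ** 2))

print("concentrated (earlier) distribution:", round(entropy_bits(concentrated), 2), "bits")
print("spread-out (later) distribution:    ", round(entropy_bits(spread_out), 2), "bits")
# The concentrated distribution does have lower entropy, but that by itself
# says nothing about which distribution is better calibrated.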
This looks like it may be an existing collection of works annotated in a similar way: https://www.amazon.com/Star-Wars-Screenplays-Laurent-Bouzereau/dp/0345409817
That seems a bit uncharitable to me. I doubt he rejects those heuristics wholesale. I'd guess that he thinks that e.g. recursive self improvement is one of those things where these heuristics don't apply, and that this is foreseeable because of e.g. the nature of recursion. I'd love to hear more about what sort of knowledge about "operating these heuristics" you think he's missing!
Anyway, it seems like he expects things to seem more-or-less gradual up until FOOM, so I think my original point still applies: I think his model would not be "shaken out" of his fast-takeoff view due to successful future predictions (until it's too late).
It seems like Eliezer is mostly just more uncertain about the near future than you are, so it doesn't seem like you'll be able to find (ii) by looking at predictions for the near future.
It seems to me like Eliezer rejects a lot of important heuristics like "things change slowly" and "most innovations aren't big deals" and so on. One reason he may do that is because he literally doesn't know how to operate those heuristics, and so when he applies them retroactively they seem obviously stupid. But if we actually walked through predictions in advance, I think he'd see that actual gradualists are much better predictors than he imagines.
I lean toward the foom side, and I think I agree with the first statement. The intuition for me is that it's kinda like p-hacking (there are very many possible graphs, and some percentage of those will be gradual), or using a log-log plot (which makes everything look like a nice straight line, but actually corresponds to very broad predictions once uncertainty is properly accounted for). Not sure if I agree with the addendum or not yet, and I'm not sure how much of a crux this is for me yet.
Spending money on R&D is essentially the expenditure of resources in order to explore and optimize over a promising design space, right? That seems like a good description of what natural selection did in the case of hominids. I imagine this still sounds silly to you, but I'm not sure why. My guess is that you think natural selection isn't relevantly similar because it didn't deliberately plan to allocate resources as part of a long bet that it would pay off big.
There's more than just differential topology going on, but it's the thing that unifies it all. You can think of differential topology as being about spaces you can divide into cells, and the boundaries of those cells. Conservation laws are naturally expressed here as constraints that the net flow across the boundary must be zero. This makes conserved quantities into resources, whose use is convergently minimized. Minimal structures under certain constraints are thus led to form the same network-like shapes, obeying the same sorts of laws. (See... (read more)
I think "deep fundamental theory" is deeper than just "powerful abstraction that is useful in a lot of domains".
Part of what makes a Deep Fundamental Theory deeper is that it is inevitably relevant for anything existing in a certain way. For example, Ramón y Cajal (discoverer of the neuronal structure of brains) wrote:
Before the correction of the law of polarization, we have thought in vain about the usefulness of the referred facts. Thus, the early emergence of the axon, or the displacement of the soma, appeared to us as unfavorable arrangements acting... (read more)
"you can't make an engine more efficient than a Carnot engine."
That's not what it predicts. It predicts you can't make a heat engine more efficient than a Carnot engine.
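(For reference, the standard form of the bound is $\eta \le \eta_{\text{Carnot}} = 1 - T_{\text{cold}}/T_{\text{hot}}$, and it only constrains engines that extract work from heat flowing between a hot and a cold reservoir; it says nothing about, e.g., fuel cells or electric motors.)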
Thanks!
Haha yeah, I'm not surprised if this ends up not working, but I'd appreciate hearing why.
One thing that makes AI alignment super hard is that we only get one shot.
However, it's potentially possible to get around this (though probably still very difficult).
The Elitzur-Vaidman bomb tester is a protocol (using quantum weirdness) by which a bomb may be tested, with arbitrarily little risk. Its interest comes from the fact that it works even when the only way to test the bomb is to try detonating it. It doesn't matter how the bomb works, as long as we can set things up so that it will allow/block a photon based on wheth... (read more)
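Here's a minimal toy model of the protocol (my own sketch of the standard Mach-Zehnder setup, not anything from the original post), just to show where the "test a live bomb without setting it off" numbers come from:

import numpy as np

# Toy Mach-Zehnder model of the Elitzur-Vaidman bomb test.
# State vector over the two arms: [upper, lower]; the bomb sits in the lower arm.
BS = np.array([[1, 1j],
               [1j, 1]]) / np.sqrt(2)    # 50/50 beam splitter

def run_once(bomb_is_live, rng):
    psi = BS @ np.array([1.0, 0.0])      # photon enters via the upper arm
    if bomb_is_live:
        # A live bomb acts as a which-path measurement on the lower arm.
        if rng.random() < abs(psi[1]) ** 2:
            return "boom"                # photon hit the bomb
        psi = np.array([psi[0], 0.0])    # otherwise: collapsed to the upper arm
        psi = psi / np.linalg.norm(psi)
    psi = BS @ psi                       # recombine at the second beam splitter
    # Output port 0 is "dark": it never fires if the bomb is a dud,
    # so a click there certifies a live bomb without detonating it.
    return "dark" if rng.random() < abs(psi[0]) ** 2 else "bright"

rng = np.random.default_rng(0)
outcomes = [run_once(True, rng) for _ in range(10_000)]
print({k: outcomes.count(k) for k in ("boom", "bright", "dark")})
# Live bomb: ~50% boom, ~25% bright, ~25% dark -- and the basic scheme can be
# iterated (quantum Zeno style) to push the detonation risk arbitrarily low.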
Zurek's einselection seems like perhaps another instance of this, or at least related. The basic idea is (very roughly) that the preferred basis in QM is preferred because persistence of information selects for it.
I think Critch's "Futarchy" theorem counts as a (very nice) selection theorem.
How bad is the ending supposed to be? Are only the people who fight the system killed, while otherwise humans are free to live in the way the AI expects them to (which might be something like: keep consuming goods and providing AI-mediated feedback on the quality of those goods)? Or is it more that once humans are disempowered, no machine has any incentive to keep them around anymore, so humans are not-so-gradually replaced with machines?
The main point of intervention in this scenario that stood out to me would be making sure that (during the paragraph beginning with... (read more)
My guess is that a "clean" algorithm is still going to require multiple conceptual insights in order to create it. And typically, those insights are going to be found before we've had time to strip away the extraneous ideas in order to make it clean, which requires additional insights. Combine this with the fact that at least some of these insights are likely to be public knowledge and relevant to AGI, and I think Eliezer has the right idea here.
This gives a nice intuitive explanation for the Jeffrey-Bolker rotation, which is basically a way of interpreting a belief as a utility, and vice versa.
Some thoughts:
Not quite sure how specifically this connects, but I think you would appreciate seeing it.
As a good example of the kind of gains we can get from abstraction, see this exposition of the HashLife algorithm, used to (perfectly) simulate Conway's Game of Life at insane scales.
Earlier I mentioned I would run some nontrivial patterns for trillions of generations. Even just counting to a trillion takes a fair amount of time for a modern CPU; yet HashLife can run the breeder to one trillion generations, and print its resulting population of 1,302,083,334,180,208,337,404 in less than a second.
Entropy and temperature inherently require the abstraction of macrostates from microstates. I recommend reading this if you haven't seen it before (or just want an unconfused explanation): http://www.av8n.com/physics/thermo/entropy.html
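A tiny illustration of that point (my own example, not from the linked page): entropy is a property of the macrostate you chose, via its multiplicity, not of any individual microstate.

from math import comb, log

# Microstate: the exact sequence of N coin flips.
# Macrostate: just the number of heads.  S = ln(W), W = number of microstates.
N = 100
for heads in (0, 25, 50):
    W = comb(N, heads)
    print(f"{heads:3d} heads out of {N}: ln W = {log(W):6.2f}")
# Without the microstate -> macrostate map there is no W, and hence no entropy;
# the "50 heads" macrostate is high-entropy only because we chose to coarse-grain
# the flips down to a single count.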
Roughly my feelings: https://elicit.ought.org/builder/trBX3uNCd
Reasoning: I think lots of people have updated too much on GPT-3, and that the current ML paradigms are still missing key insights into general intelligence. But I also think enough research is going into the field that it won't take too long to reach those insights.
It seems that privacy potentially could "tame" a not-quite-corrigible AI. With a full model, the AGI might receive a request, deduce that activating a certain set of neurons strongly would be the most robust way to make you feel the request was fulfilled, and then design an electrode set-up to accomplish that. Whereas the same AI with a weak model wouldn't be able to think of anything like that, and might resort to fulfilling the request in a more "normal" way. This doesn't seem that great, but it does seem to me like this is actually part of what makes humans relatively corrigible.
Privacy as a component of AI alignment
[realized this is basically just a behaviorist genie, but posting it in case someone finds it useful]
What makes something manipulative? If I do something with the intent of getting you to do something, is that manipulative? A simple request seems fine, but if I have a complete model of your mind, and use it to phrase things so that you do exactly what I want, that seems to have crossed an important line.
The idea is that using a model of a person that is *too* detailed is a violation of human values. In particular, it violates... (read more)
Half-baked idea for low-impact AI:
As an example, imagine a board that's anchored directly into a wall (a cantilever, with no other supports). If you make it twice as wide, then it will be twice as stiff, but if you make it twice as thick, then it will be eight times as stiff. On the other hand, if you make it twice as long, it will be eight times more compliant.
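For reference, these factors come straight from the standard formula for an end-loaded rectangular cantilever (my gloss; the numbers above already assume something like it): $k = 3EI/L^3$ with $I = w t^3/12$, so $k \propto E\,w\,t^3/L^3$. Doubling the width $w$ doubles the stiffness, doubling the thickness $t$ multiplies it by $2^3 = 8$, and doubling the length $L$ divides it by $2^3 = 8$.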
In a similar way, different action parameters will have scaling exponents (or more generally, functions). So one way to decrease the risk of high-impact actions would be to make sure that the scaling expo... (read more)
Another way to make it countable would be to instead go to the category of posets. Then the rational interval basis is a poset with a countable number of elements, and by the Alexandroff construction it corresponds to the real line (or at least something very similar). But this construction gives a full and faithful embedding of the category of posets into the category of spaces (which basically means you get all and only the continuous maps from monotone functions).
I guess the ontology version in this case would be the category of prosets. (Personally, I'm not sure that ontology of the universe isn't a type error).
Yeah, I think the engineer intuition is the bottleneck I'm pointing at here.
I think people make decisions based on accurate models of other people all the time. I think of Newcomb's problem as the limiting case where Omega has extremely accurate predictions, but that the solution is still relevant even when "Omega" is only 60% likely to guess correctly. A fun illustration of a computer program capable of predicting (most) humans this accurately is the Aaronson oracle.
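For anyone who wants to see how little machinery that takes, here's a rough sketch in the spirit of the Aaronson oracle (my own toy version, not Aaronson's actual code): it counts how often each 5-press history of 'f'/'d' was followed by each key, and guesses the more frequent continuation.

from collections import defaultdict

# Count how often each 5-press history of 'f'/'d' was followed by each key,
# and guess the more frequent continuation before seeing the next press.
counts = defaultdict(lambda: {"f": 0, "d": 0})
history = ""
correct = total = 0

for key in "fdfddffdfdfddfffdfdfddfdffdfddfdfdfd":   # stand-in for live keypresses
    if len(history) == 5:
        guess = "f" if counts[history]["f"] >= counts[history]["d"] else "d"
        total += 1
        correct += (guess == key)
        counts[history][key] += 1
    history = (history + key)[-5:]

print(f"predicted {correct}/{total} keypresses correctly")
# Against real humans trying to "act random", this kind of predictor
# typically does noticeably better than the 50% chance rate.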
This post has caused me to update my probability of this kind of scenario!
Another issue related to the information leakage: in the industrial revolution era, 30 years was plenty of time for people to understand and replicate leaked or stolen knowledge. But if the slower team managed to obtain the leading team's source code, it seems plausible that 3 years, or especially 0.3 years, would not be enough time to learn how to use that information as skillfully as the leading team can.
Awesome! I was hoping that there would be a way to do this!
Currying doesn't preserve surjectivity. As a simple example, you can easily find a surjective function $\mathbb{N} \times \mathbb{N} \to \mathbb{N}$, but there are no surjective functions $\mathbb{N} \to (\mathbb{N} \to \mathbb{N})$ (by Cantor's diagonal argument).
Ex 6:
If at any point $f^{k+1}(\bot) = f^k(\bot)$, then we're done. So assume that we get a strict increase each time, up to $f^{n-1}(\bot)$. Since there are only $n$ elements in the entire poset, and $f$ is monotone (so $f^n(\bot) \geq f^{n-1}(\bot)$), $f^n(\bot)$ has to equal $f^{n-1}(\bot)$.
Ex 7:
For a limit ordinal $\lambda$, define $f^\lambda(\bot)$ as the least upper bound of $f^\alpha(\bot)$ for all $\alpha < \lambda$. If $\kappa$ is the cardinality of the poset, then the set of ordinals $\alpha < \kappa^+$ is a set of size $\kappa^+$ that maps into a set of size $\kappa$ by taking the value $f^\alpha(\bot)$ of each element. Since there are no injections between these sets, there must be two ordinals $\alpha < \beta$ such that $f^\alpha(\bot) = f^\beta(\bot)$. Since ... (read more)
Ex 8: (in Python, using a reversal function)
def f(s):
    return s[::-1]
dlmt = '"""'
code = """def f(s):
    return s[::-1]
dlmt = '{}'
code = {}{}{}
code = code.format(dlmt, dlmt, code, dlmt)
print(f(code))"""
code = code.format(dlmt, dlmt, code, dlmt)
print(f(code))
You can use 1d-Sperner to deal with all the cases effectively.
Found a nice proof for Sperner's lemma (#9):
First some definitions. Call a d-simplex whose vertices are colored in (d+1) different colors chromatic. Call the parity of the number of chromatic simplices the chromatic parity.
It's easier to prove the following generalization: Take a complex of d-simplices that form a d-sphere: then any (d+1)-coloring of the vertices will have even chromatic parity.
Proof by induction on d:
Base d=-1: vacuously true.
Assume true for d-1: Say you have an arbitrary complex of d-simplices forming a d-sphere, with an arbitra... (read more)
Thanks! I find this approach more intuitive than the proof of Sperner's lemma that I found in Wikipedia. Along with nshepperd's comment, it also inspired me to work out an interesting extension that requires only minor modifications to your proof:
d-spheres are orientable manifolds, hence so is a decomposition of a d-sphere into a complex K of d-simplices. So we may arbitrarily choose one of the two possible orientations for K (e.g. by choosing a particular simplex P in K, ordering its vertices from 1 to d + 1, and declaring it to be the prototypi... (read more)
[Epistemic status: very speculative]
One ray of hope that I've seen discussed is that we may be able to do some sort of acausal trade with even an unaligned AGI, such that it will spare us (e.g. it would give a humanity-aligned AGI control of a few stars in the worlds it wins, in exchange for us giving it control of several stars in the worlds we win).
I think Eliezer is right that this wouldn't work.
But I think there are possible trades which don't have this problem. Consider the scenario in which we Win, with an aligned AGI taking control of our future light-cone. Assuming t... (read more)