(This post uses a more general definition of "simulator" than the one used in Simulators. Simulators, as defined here, are functions optimized towards arbitrary ranges of outputs over arbitrary inputs, with no requirements on how the function was created. If you need to distinguish between the two definitions, the simulators here can roughly be described as functions that represent some real-world process.)

(Image by DALL-E)

Five-Point Summary (Ideas from this article):

  1. Antelligence is emulation accuracy. In other words, when a simulator tries to emulate a different system, its accuracy (the similarity between its outputs and the outputs of the other system) is the same thing as its antelligence. This also makes antelligence relative, meaning that it’s dependent on what is considered the “ideal” output/system (one way this could be written down is sketched just after this list).
  2. Antelligence is not defined through agency or being goal-driven, and so it addresses modern neural networks and simulators like GPT-3/4 and DALL-E much better than intelligence does.
  3. Simulators optimize towards arbitrary ranges over arbitrary inputs, with optimizers being a type of simulator which optimizes towards only one range over all inputs.
  4. Antelligence is why humans are so successful: high antelligence relative to the real world allows us to predict the future, letting us act far more intelligently than other creatures.
  5. Artificial simulators are inherently benign (they won’t suddenly transform into agents that try to kill us all), but they are powerful and potentially dangerous tools. They would only lead to human extinction if we simulate and actively give them the ability to kill us all, and/or hand them to agents (human or artificial) that want to kill us all. This makes them comparable in danger to nuclear weapons, with their strength determined by their antelligence relative to the real world, compared to that of humans.
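To make point 1 above a bit more concrete, here is one way the definition could be written down (a minimal sketch in my own notation, not a formula from this post or from Simulators): treating the simulator $S$ and the reference system $T$ as functions over a shared distribution of inputs $\mathcal{D}$,

$$\text{Ant}(S \mid T) \;=\; 1 \;-\; \mathbb{E}_{x \sim \mathcal{D}}\big[\, d\big(S(x),\, T(x)\big) \,\big],$$

where $d$ is some normalized distance (or divergence, for probability distributions) between outputs. The relativity of antelligence shows up directly: change the reference $T$, or the inputs $\mathcal{D}$ you care about, and the score changes.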

Eliezer Yudkowsky was the first to define intelligence as Efficient Cross-Domain Optimization, and this definition has been the primary one used in discussions of AI and potential AGIs and ASIs for years now, helping to spur the many discussions of AI alignment. I’m writing this article to offer an alternative to this idea of intelligence, one which may be better for analyzing AI like GPT-3/4 and DALL-E than a definition based on optimization. Of course, I don’t want to argue over definitions, which is why I’ll be referring to this new idea as “antelligence.”

This idea builds mainly off of the ideas discussed in Simulators, so I’d recommend reading that article first (although I will reiterate a lot of its content here). As a fair warning, a lot of this article strays pretty far from the conventional theories of AI intelligence I’ve seen here, and I haven’t quite reconciled how I arrived at conclusions so different from those in other articles on this site. As for the structure of what follows: the first half focuses on exploring antelligence, and the second on how it relates to a few earlier ideas in AI alignment discussion, like optimizers and mesa-optimizers.

Simulators and Antelligence

Simulator theory introduces the idea of classifying GPT-3/4 and AI like it through the lens of “simulators.” Simulators produce “simulacra,” which are basically the results of plugging different inputs into the simulator. So, a simulator is something like GPT-3/4, while simulacra are like the text that GPT-3/4 outputs. Simulacra can also be simulators themselves, as long as they’re functions which can still take inputs and which are optimized for (although the optimization requirement is redundant, as I show in the second half, where I argue that all simulacra are arbitrarily optimized). The best part about simulators is that they’re, by definition, just functions: you plug in specific inputs to set the conditions and get specific outputs (or, with more complex systems, probability distributions). This allows us to think of, and possibly calculate, antelligence, and it directly allows us to analyze neural networks as antelligent (since they’re just extremely large functions themselves).
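To make the function framing concrete, here is a minimal sketch in Python (a toy of my own, not anything from the Simulators post): the simulator is just a callable from inputs to outputs, and a simulacrum is whatever that callable produces for a particular input.

```python
from typing import Callable, Dict

# A simulator is, by this definition, just a function from inputs to outputs.
# A stochastic simulator like GPT would instead return a probability
# distribution over possible continuations.
Simulator = Callable[[str], str]

def toy_weather_simulator(conditions: str) -> str:
    """A trivial stand-in for a learned simulator of some real-world process."""
    rules: Dict[str, str] = {
        "dark clouds": "rain",
        "clear sky": "sun",
    }
    return rules.get(conditions, "unknown")

# A simulacrum is just the output produced for a particular input.
simulacrum = toy_weather_simulator("dark clouds")  # -> "rain"
```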

Now, a thought experiment. Imagine two simulators. Let’s say that at first, the two simulators are trained on totally different data, and so implement completely different functions which produce very different outputs for the same inputs. In this scenario, let’s also say that simulator 1 is something we only have access to through its inputs and outputs, but not the actual computation itself (like the entire sum of internet text). We can then try to train simulator 2 on the inputs and outputs of simulator 1, similar to how ChatGPT trains on large amounts of internet text. At first, the outputs of the two simulators are not very similar, and simulator 2 has low antelligence relative to simulator 1, since the accuracy with which it emulates simulator 1 is low. As the two become more similar, simulator 2 gains antelligence relative to simulator 1, until the two simulators are equivalent and each has 100% antelligence relative to the other.
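Here is a minimal sketch of how the emulation accuracy in this thought experiment could be measured, assuming we can query both simulators on the same inputs (the exact-match agreement metric and the toy simulators below are my own simplifications; for real models you would compare output distributions instead):

```python
from typing import Callable, Iterable

def antelligence(sim_2: Callable[[str], str],
                 sim_1: Callable[[str], str],
                 inputs: Iterable[str]) -> float:
    """Fraction of shared inputs on which sim_2 reproduces sim_1's output.

    This is sim_2's emulation accuracy, i.e. its antelligence, relative to
    sim_1; swapping the arguments measures the other direction.
    """
    inputs = list(inputs)
    matches = sum(sim_2(x) == sim_1(x) for x in inputs)
    return matches / len(inputs)

# Simulator 1: only observable through its input/output behaviour.
def simulator_1(text: str) -> str:
    return text.upper()

# Simulator 2: an imperfect attempt to emulate simulator 1.
def simulator_2(text: str) -> str:
    return text.upper() if len(text) < 4 else text

print(antelligence(simulator_2, simulator_1, ["ab", "cd", "hello"]))  # 2/3
```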

Antelligence, in this case, is relative. It changes based on whichever simulator is taken to be the more valuable reference. The delusional man on the street’s brain might be simulating things that inaccurately emulate the real world, but nothing other than practicality stops us from saying that the real world is wrong and the man’s brain is actually correct. This also explains how seemingly intelligent beings can have varying levels of antelligence with respect to different tasks, since antelligence is fundamentally task-specific.

These tasks are learned through memory. In the brain, memories are sets of neurons that fire together. The same holds true for other neural networks, like those found in AI. These memories are formed by learning, i.e. using training data to change the weights and biases of the neurons. So, we can say that experiences are used, through a method of learning (like backpropagation), to modify and define the function in a way we can think of as creating memories. This makes a lot of intuitive sense. People who share similar memories tend to have similar viewpoints, just as AIs with similar weights and biases tend to produce similar outputs. People who have lots of experience and memories doing specific tasks do them more effectively than those who don’t. And the more diverse someone’s memories, the better they’ll be able to deal with new tasks.
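As a toy illustration of experiences modifying the function (a minimal sketch, not a claim about how any particular model is trained), here is a single gradient-descent step on a one-parameter model: the training example nudges the weight, and that changed weight is the only trace the “memory” leaves behind.

```python
# The whole "model" is one weight: f(x) = weight * x.
weight = 0.0
learning_rate = 0.1

def f(x: float) -> float:
    return weight * x

# One experience: the input 2.0 "should" map to 4.0.
x, target = 2.0, 4.0
error = f(x) - target
gradient = 2 * error * x          # derivative of the squared error w.r.t. the weight
weight -= learning_rate * gradient

# The experience is now "remembered" only as a change in the weight.
print(f(2.0))  # 3.2, closer to the target 4.0 than the original 0.0
```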

Since antelligence is relative, it should be discussed with respect to specific tasks. A simulator can be highly antelligent relative to the ideal chess system but have very low antelligence relative to the ideal internet chatbot, and vice versa. General antelligence, therefore, emerges from training a simulator on many different tasks, allowing those tasks, and tasks similar to the ones the AI is trained on, to be done more accurately. This seems to line up with what we’ve seen in real life, with more training data, and more varied training data, allowing for higher antelligence relative to the real world. To achieve AGI, then, would simply take a sufficiently diverse range of training inputs and the architectures/learning methods to calculate and store the patterns found.
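One way to picture this (again a toy framing of my own): a simulator doesn’t have a single antelligence score but a profile of scores, one per reference system, and “general” antelligence is just that profile being high across many tasks.

```python
# Hypothetical, made-up scores purely for illustration: antelligence is a
# profile over reference systems, not a single number.
antelligence_profile = {
    "ideal chess system":     0.35,
    "ideal internet chatbot": 0.80,
    "the real world":         0.10,
}

# One simple (and debatable) notion of "general" antelligence: the average
# accuracy across whatever set of reference tasks we happen to care about.
general_antelligence = sum(antelligence_profile.values()) / len(antelligence_profile)
print(general_antelligence)  # ~0.42
```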

For sufficiently complicated environments (like the kinds an ASI would have to train on), the question of how much training power is needed becomes much more prominent. It’s reasonable to say that simulators, when trained in the most ideal way to emulate, will typically detect the most obvious patterns first. It’s also typical that patterns give less and less of a boost to emulation accuracy as they become more and more complicated, since obvious patterns generally occur more frequently and so contribute more to the outputs than the less frequent ones.

Therefore, the rate at which an AI’s antelligence relative to the real world goes up will decrease as the simulator becomes more antelligent (i.e., there will be diminishing returns on antelligence gain). This growth is also asymptotic, since the AI will not be able to be more than 100% antelligent relative to the universe. The computing power, time, and amount of input data needed will all necessarily go up as AIs become more antelligent (as we see now with current AI systems), removing any possibility of a singularity, and possibly any chance of an ASI popping up soon (which will be limited by how modern computing power compares to the power of our own brains).
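As an illustrative toy model of this shape (mine, not the article’s), antelligence relative to the real world as a function of training resources $c$ might look like a saturating curve,

$$A(c) = 1 - e^{-kc},$$

where $k$ is some constant describing how efficiently resources convert into emulation accuracy. The marginal gain $A'(c) = k e^{-kc}$ shrinks as $c$ grows (diminishing returns), and $A(c)$ never exceeds the 100% ceiling (the asymptote). The real curve could of course have a very different form; the point is only that boundedness plus diminishing returns rules out unbounded runaway growth.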

Of course, for the complicated problems of today, like solving the Riemann Hypothesis, curing cancer, etc., these improvements in antelligence will be much more meaningful. They might also be counteracted, at least partly, by the AI potentially engineering new computer parts and learning methods, although it’s very possible that by the time AIs are antelligent enough to start making gains in these areas, we will already have come close enough to the limits of technology in these areas to prevent a FOOM. While the returns may be diminishing in general, AIs will still be very capable at specific tasks we haven’t yet accomplished, which still supports the idea that there will be an explosion of new knowledge once AI slides past human antelligence on different tasks.

On Optimizers

To start, let’s compare optimizers and simulators. Simulators, as I explained in the previous section, can be thought of as functions. Janus goes a step further and defines them as “models trained with predictive loss on a self-supervised dataset, invariant to architecture or data type” (i.e., optimized towards what Janus calls the simulation objective), but this is (mostly) extraneous for the purposes of this article, as I’ll explain now.

Optimizers optimize towards a specific goal. When it comes to actually implementing and creating optimizers, it’s best to think of them as functions (like simulators) which take inputs and produce outputs that fall within the range of the specified goal, i.e. what’s being optimized for. For the sake of not changing definitions, we’ll say that an optimizer optimizes towards one specific range. If we wanted to, though, we could also analyze simulators as functions that optimize, except that instead of optimizing towards outputs in one range, they optimize towards arbitrary ranges over arbitrary sets of inputs.
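Here is a minimal sketch of this distinction in code (my own framing of the definitions above): a simulator pairs different inputs with different target ranges, while an optimizer is the degenerate case where every input maps into the same single range.

```python
from typing import Callable, Iterable, Set

# A "range" here is just a set of acceptable outputs.
Range = Set[str]

# A simulator targets (possibly) different ranges for different inputs...
def simulator_targets(x: str) -> Range:
    return {"rain", "drizzle"} if x == "dark clouds" else {"sun"}

# ...while an optimizer targets one fixed range regardless of input.
def optimizer_targets(x: str) -> Range:
    return {"win the chess game"}

def is_optimized_for(targets: Callable[[str], Range],
                     output_fn: Callable[[str], str],
                     inputs: Iterable[str]) -> bool:
    """True if the function's output lands in the targeted range for every input."""
    return all(output_fn(x) in targets(x) for x in inputs)

print(is_optimized_for(optimizer_targets,
                       lambda x: "win the chess game",
                       ["opening", "middlegame", "endgame"]))  # True
```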

Under this definition, we can also show that any simulacrum is optimized for. This is because all simulacra come from plugging inputs into a simulator, which reduces the possibilities for outputs, meaning that the range of possible outputs of the simulacrum must always be less than or equal to the range of possible outputs of the original simulator. We can then say that, for any simulacrum, that smaller or equal range is the arbitrary range the simulacrum is optimized for. Since everything is technically a simulacrum of the laws of physics, we can say that everything is optimized for, making the earlier inclusion of optimization in the definitions of simulacra and simulators redundant. This also means that all optimizers are simulators: optimizers can be defined as simulators which optimize towards one range.
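The argument can be written compactly (a sketch of the reasoning above in my own notation): if $S$ is a simulator and $x$ a particular input (or set of inputs), then the simulacrum $S|_x$ can only produce a subset of what $S$ could produce,

$$\operatorname{range}(S|_x) \subseteq \operatorname{range}(S),$$

and since the range a thing is “optimized towards” is allowed to be arbitrary, we are free to pick $\operatorname{range}(S|_x)$ itself as that target. Every simulacrum therefore trivially counts as optimized for, which is exactly why the optimization clause in the definition carries no extra weight.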

These properties allow us to reconsider the problem of mesa-optimization presented with natural selection and the emergence of human beings. This example is explained on the Alignment Forum:

Example: Natural selection is an optimization process that optimizes for reproductive fitness. Natural selection produced humans, who are themselves optimizers. Humans are therefore mesa-optimizers of natural selection.

In the context of AI alignment, the concern is that a base optimizer (e.g., a gradient descent process) may produce a learned model that is itself an optimizer, and that has unexpected and undesirable properties. Even if the gradient descent process is in some sense "trying" to do exactly what human developers want, the resultant mesa-optimizer will not typically be trying to do the exact same thing.

So, in this problem, both humans and natural selection are categorized simply as optimizers. If we were to analyze them both as simulators, however, we could disentangle some important properties of the two which demonstrate where the argument in the example above falters. 

We can define natural selection as a simulacrum of biological environments (which act as simulators) over the range of most lifeforms, one which optimizes for lifeforms with higher reproductive fitness. The lifeforms within these environments are also optimizers, with the range they optimize for lying within the range that generally increases their reproductive fitness. This optimization towards reproductive fitness is done by proxy, through things like emotions, feelings, and instincts, but it still ultimately does not stray from having high reproductive fitness.

Admittedly, even humans still have very high reproductive fitness when looking at the species as a whole; it’s just that since natural selection relies on competition, and we’re orders of magnitude more powerful than our competition, the effect of natural selection has become moot. The key to why we were able to get so powerful in the first place, though, is our extraordinary ability to predict what will happen in the environment. This was done through the creation of a separate simulator in our heads, trained on the environment around us, creating a simulator more similar to the environment and therefore with a higher antelligence relative to it. The more animal part of our brain then interacts with this new simulator, allowing us to act on predicted events and increase our fitness massively.

What gave humans such a great advantage, therefore, was the addition of a simulator which predicts the environment, in an environment where no other lifeform is able to do so. The argument from the example at the beginning of this section (and the accompanying explanation quoted above) is that since humans are no longer optimized for what natural selection optimized for, something analogous could happen with AI and our own training methods for it. The problem with this argument is that natural selection isn’t actually an optimizer, or even a simulator. The environment is, and natural selection is simply a property/tendency of the environment, so the bonds of natural selection are actually pretty weak. The bonds we put on AI training could also be weak if we allow the AI to start using the internet or to have access to the outside world in other ways, but that still doesn’t necessarily spell AI doom. Reproductive fitness, what natural selection optimizes for, selects for lifeforms that hog resources and try to expand at all costs in highly competitive environments, which are things we probably won’t select for in our training of AIs. Technically, we don’t even need to train AI agents, since it’s completely possible to just create independent simulators (as with GPT-3/4).

Say, worst case scenario, someone creates a simulator with antelligence much higher than our own relative to the real world. Someone then trains a bunch of AIs, using RL, to focus on killing each other and hogging as many resources as possible, and then plugs one of these AI agents into the super powerful simulator. The AI agent still has to be trained to connect to the simulator. And we’d get to see this AI train as well, probably in a pseudo-environment similar to the real world (since anyone malicious enough to do this probably doesn’t want the AI to get caught, which, due to its initial incompetence in interacting with the simulator, is otherwise very likely to happen). And that doesn’t even consider the fact that, assuming the ASI understood language, it’s more likely people would just talk to it, putting in certain inputs and seeing how it predicts what happens.

Conclusion

Based on what we’ve seen recently and the idea of antelligence, AI, at future levels of antelligence, would be like nuclear energy: something dangerous and powerful, but ultimately controllable. It’s likely that antelligence will increase only at the same rate as computing power, passing human antelligence (possibly soon), but staying within reasonable limits, and only after increasingly long and compute-intensive training periods. AI alignment, then, should focus on preventing people from training AI agents to use simulators (especially if those agents have access to the outside world), and on making sure that if AI agents are connected to simulators, their goals and motivations don’t lead them to do things we don’t want them to do (especially accessing extra antelligence relative to the world through the simulator, which would give the AI extra-powerful abilities).
