Rogue AGI Embodies Valuable Intellectual Property

This employee has 100 million dollars, approximately 10,000x fewer resources than the hedge fund. Even if the employee engaged in unethical business practices to achieve a 2x higher yearly growth rate than their former employer, it would take 13 years for them to have a similar amount of capital.

I think it's worth being explicit here about whether increases in resources under control are due to appreciation of existing capital or allocation of new capital.

If you're talking about appreciation, then if the firm earns 5% returns on average and the rogue...
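To make the distinction concrete, here is a minimal sketch of the catch-up time under two readings of "2x higher yearly growth rate". The 10,000x gap and the 5% figure come from the text above; the 10% return for the rogue is an assumed number chosen only for illustration, and `years_to_catch_up` is a hypothetical helper.

```python
import math

def years_to_catch_up(capital_ratio, rogue_growth, firm_growth):
    """Years until the smaller party's capital matches the larger one's,
    given yearly growth multipliers (1.05 means 5% per year)."""
    return math.log(capital_ratio) / math.log(rogue_growth / firm_growth)

GAP = 10_000  # from the text: the firm holds roughly 10,000x more capital

# Reading 1: the rogue's capital doubles relative to the firm's every year.
relative_doubling = years_to_catch_up(GAP, rogue_growth=2.0, firm_growth=1.0)

# Reading 2 (assumed numbers): pure appreciation, firm at 5%, rogue at 10%.
appreciation_only = years_to_catch_up(GAP, rogue_growth=1.10, firm_growth=1.05)

print(f"relative doubling: {relative_doubling:.1f} years")
print(f"5% vs 10% appreciation: {appreciation_only:.0f} years")
```

Under the first reading the gap closes in about 13 years, matching the quoted figure; under the second it takes on the order of two centuries, which is why it matters which reading is meant.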

Big picture of phasic dopamine

Cheers for the post, I find the whole series fascinating.

One thing I was particularly curious about is how these 'proposals' are made. Do you have a picture of what kind of embedding is used to present a potential action?

For example, is a proposal encoded in the activations of a set of neurons that are isomorphic to the motor neurons, so that it could propose tightening a set of finger muscles through specific neurons? Or is the embedding jointly learned between the two in some large unstructured connection, or smaller latent space, or something completely different?

The least-complicated case (I think) is: I (tentatively) think that the hippocampus is more-or-less a lookup table with a finite number of discrete thoughts / memories / locations / whatever (the type of content is different in different species), and a "proposal" is just "which of the discrete things should be activated right now".

A medium-difficulty case is: I think motor cortex stores a bunch of sequences of motor commands which execute different common action sequences. (I'm a believer in the Graziano theory [https://pubmed.ncbi.nlm.nih.gov/17964243/] that primary motor cortex, secondary motor cortex, supplementary motor cortex, etc. etc., are all doing the same kind of thing and should be lumped together.) The exact details of the data structures that the brain uses to store these sequences of motor commands are controversial and I don't want to get into it here…

Then the hardest case is the areas that "think thoughts", spawn new ideas, etc., all the cool stuff that leads to human intelligence. (e.g. dorsolateral prefrontal cortex I think.) Things like "I'm going to go to the store" or "what if I differentiate both sides of the equation?". Those things are clearly not isomorphic to a sequence of motor commands. It's higher-level than that. Again, the exact data structures and algorithms involved in representing and searching for these "thoughts" is a very big and controversial topic that I don't want to get into here…

Testing The Natural Abstraction Hypothesis: Project Intro

Another little update: the speed issue is solved for now by adding SymPy's Fortran wrappers to the derivative calculations; calculating the SVD isn't (yet?) the bottleneck. I can now quickly get results from 1,000+ step simulations of 100s of particles.

Unfortunately, even for the pretty stable configuration below, the values are indeed exploding. I need to go back through the program and double-check the logic, but I don't think it should be chaotic. If anything, I would expect the values to hit zero.

It might be that there's some kind of quasi-chaotic behaviou...

If the wheels are bouncing off each other, then that could be chaotic in the same way as billiard balls. But at least macroscopically, there's a crapton of damping in that simulation, so I find it more likely that the chaos is microscopic. But also my intuition agrees with yours, this system doesn't seem like it should be chaotic...

Testing The Natural Abstraction Hypothesis: Project Intro

Been a while, but I thought the idea was interesting and had a go at implementing it. Houdini was too much for my laptop, let alone my programming skills, but I found a simple particle simulation in pygame which shows the basics; you can see it below.

Planned next step is to work on the run-time speed (even this took a couple of minutes to run; calculating the frame-to-frame Jacobian is a pain, probably more than necessary) and then add some utilities for creatin...

Nice!

A couple notes:

* Make sure to check that the values in the jacobian aren't exploding, i.e. there aren't values like 1e30 or 1e200 or anything like that. Exponentially large values in the jacobian probably mean the system is chaotic.
* If you want to avoid explicitly computing the jacobian, write a method which takes in a (constant) vector $u$ and uses backpropagation to return $\nabla_{x_0}(x_t \cdot u)$. This is the same as the time-0-to-time-t jacobian dotted with $u$, but it operates on size-n vectors rather than n-by-n jacobian matrices, so should be a lot faster. Then just wrap that method in a LinearOperator [https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.LinearOperator.html] (or the equivalent in your favorite numerical library), and you'll be able to pass it directly to an SVD method [https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.svds.html].

In terms of other uses... you could e.g. put some "sensors" and "actuators" in the simulation, then train some controller to control the simulated system, and see whether the data structures learned by the controller correspond to singular vectors of the jacobian. That could make for an interesting set of experiments, looking at different sensor/actuator setups and different controller architectures/training schemes to see which ones do/don't end up using the singular-value structure of the system.
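The matrix-free idea in the second note can be sketched without any numerical library. Everything below is an assumption made for illustration: the toy dynamics (a damped ring with tanh coupling), the constants, and the hand-written reverse pass standing in for a real autodiff framework. Plain power iteration is swapped in here for the LinearOperator/svds route so the example stays self-contained.

```python
import math
import random

DT, DAMPING, COUPLING = 0.05, 1.0, 0.5  # hypothetical toy parameters

def step(x):
    """One explicit-Euler step of a damped, nonlinearly coupled ring."""
    n = len(x)
    return [x[i] + DT * (-DAMPING * x[i] + COUPLING * math.tanh(x[(i + 1) % n]))
            for i in range(n)]

def rollout(x0, steps):
    """Return the whole trajectory [x_0, ..., x_t]."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(step(traj[-1]))
    return traj

def local_slopes(x):
    """DT * COUPLING * d tanh(x_j)/d x_j for every j."""
    return [DT * COUPLING * (1.0 - math.tanh(xj) ** 2) for xj in x]

def jvp(traj, v):
    """Forward mode: J v, where J = d x_t / d x_0 (J is never formed)."""
    n = len(v)
    for x in traj[:-1]:
        s = local_slopes(x)
        v = [(1.0 - DT * DAMPING) * v[i] + s[(i + 1) % n] * v[(i + 1) % n]
             for i in range(n)]
    return v

def vjp(traj, u):
    """Reverse mode (backpropagation): J^T u, i.e. grad_{x_0}(x_t . u)."""
    n = len(u)
    for x in reversed(traj[:-1]):
        s = local_slopes(x)
        u = [(1.0 - DT * DAMPING) * u[j] + s[j] * u[(j - 1) % n]
             for j in range(n)]
    return u

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    scale = norm(v)
    return [c / scale for c in v]

def top_singular_value(traj, n, iterations=200, seed=0):
    """Power iteration on J^T J using only jvp/vjp calls on size-n vectors."""
    rng = random.Random(seed)
    v = unit([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(iterations):
        v = unit(vjp(traj, jvp(traj, v)))
    return norm(jvp(traj, v))

traj = rollout([0.1 * i for i in range(8)], steps=100)
sigma = top_singular_value(traj, n=8)
print(f"largest singular value of d x_t / d x_0: {sigma:.3g}")
```

With this much damping the largest singular value stays well below 1, which is the "not exploding" check from the first note. The same `jvp` and `vjp` functions are what one would hand to a LinearOperator as `matvec` and `rmatvec` to get several singular vectors at once rather than just the top one.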

Testing The Natural Abstraction Hypothesis: Project Intro

Reading this after Steve Byrnes' posts on neuroscience gives a potentially unfortunate view on this.

The general impression is that a lot of our general understanding of the world is carried in the neocortex, which is running a consistent statistical algorithm, and the fact that humans converge on similar abstractions about the world could be explained by the statistical regularities of the world as discovered by this system. At the same time, the other parts of the brain have a huge variety of structures and have functions which are the products of evolu...

Here's one fairly-standalone project which I probably won't get to soon. It would be a fair bit of work, but also potentially very impressive in terms of both showing off technical skills and producing cool results.

Short somewhat-oversimplified version: take a finite-element model of some realistic objects. Backpropagate to compute the jacobian of final state variables with respect to initial state variables. Take a singular value decomposition of the jacobian. Hypothesis: the singular vectors will roughly map to human-recognizable high-level objects in the simulation (i.e. the nonzero elements of any given singular vector should be the positions and momenta of each of the finite elements comprising one object).

Longer version: conceptually, we imagine that there's some small independent Gaussian noise in each of the variables defining the initial conditions of the simulation (i.e. positions and momenta of each finite element). Assuming the dynamics are such that the uncertainty remains small throughout the simulation - i.e. the system is not chaotic - our uncertainty in the final positions is then also Gaussian, found by multiplying the initial distribution by the jacobian matrix. The hypothesis that information-at-a-distance (in this case "distance" = later time) is low-dimensional then basically says that the final distribution (and therefore the jacobian) is approximately low-rank.

In order for this to both work and be interesting, there are some constraints on both the system and on how the simulation is set up. First, "not chaotic" is a pretty big limitation. Second, we want the things-simulated to not just be pure rigid-body objects, since in that case it's pretty obvious that the method will work and it's not particularly interesting. Two potentially-interesting cases to try:

* Simulation of an elastic object with multiple human-recognizable components, with substantial local damping to avoid small-scale chaos. Cloth or jello or a sticky hand or
Developmental Stages of GPTs

I agree that this is the biggest concern with these models, and the GPT-n series running out of steam wouldn't be a huge relief. It looks likely that we'll have the first human-scale (in terms of parameters) NNs before 2026 - Metaculus, 81% as of 13.08.2020.

Does anybody know of any work that's analysing the rate at which, once the first NN crosses the n-parameter barrier, other architectures are also tried at that scale? If no-one's done it yet, I'll have a look at scraping the data from Papers With Code's databases on e.g. I...

Preparing for "The Talk" with AI projects

Hey Daniel, don't have time for a proper reply right now but am interested in talking about this at some point soon. I'm currently in UK Civil Service and will be trying to speak to people in their Office for AI at some point soon to get a feel for what's going on there, perhaps plant some seeds of concern. I think some similar things apply.

Sure, I'd be happy to talk. Note that I am nowhere near the best person to talk to about this; there are plenty of people who actually work at an AI project, who actually talk to AI scientists regularly, etc.

Soft takeoff can still lead to decisive strategic advantage

I think this points to the strategic supremacy of relevant infrastructure in these scenarios. From what I remember of the battleship era, having an advantage in design didn't seem to be a particularly large advantage: once a new era was entered, everyone with sufficient infrastructure switched to the new technology and an arms race started from scratch.

This feels similar to the AI scenario, where technology seems likely to spread quickly through a combination of high financial incentive, interconnected social networks, state-sponsored espionage e...

Torture and Dust Specks and Joy--Oh my!
or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces

Apologies if this is not the discussion you wanted, but it's hard to engage with comparability classes without a framework for how their boundaries are even minimally plausible.

Would you say that all types of discomfort are comparable with higher quantities of themselves? Is there always a marginally worse type of discomfort for any given negative experience? So long as both of these are true (and I struggle to deny them) then transitivity seems to connect the entire spectrum of negative experience. Do you think there is a way to remove the transitivity of comparability and still have a coherent system? This, to me, would be the core requirement for making dust specks and torture incomparable.

I agree that delineating the precise boundaries of comparability classes is a uniquely challenging task. Nonetheless, it does not mean they don't exist--to me your claim feels along the same lines as classical induction "paradoxes" involving classifying sand heaps [https://en.wikipedia.org/wiki/Sorites_paradox]. While it's difficult to define exactly what a sand heap is, we can look at many objects and say with certainty whether or not they are sand heaps, and that's what matters for living in the world and making empirical claims (or building sandcastles anyway).

I suspect it's quite likely that experiences you may be referring to as "higher quantities of themselves" within a single person are in fact qualitatively different and no longer comparable utilities in many cases. Consider the dust specks: they are assumed to be minimally annoying and almost undetectable to the bespeckèd. However, if we even slightly upgrade them so as to cause a noticeable sting in their targeted eye, they appear to reach a whole different level. I'd rather spend my life plagued by barely noticeable specks (assuming they have no interactions) than have one slightly burn my eyeball.
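One standard way to model a non-Archimedean ordering, offered here purely as an illustrative assumption rather than the construction in the post, is a lexicographic pair: compare the "severe" component first and only consult the "mild" component on ties. The class name, field names, and magnitudes below are hypothetical.

```python
from typing import NamedTuple

class Disutility(NamedTuple):
    """Hypothetical two-tier disutility; tuples compare lexicographically."""
    severe: float  # torture-class harm
    mild: float    # dust-speck-class harm

def worse(a: Disutility, b: Disutility) -> bool:
    """True if outcome a is strictly worse than outcome b."""
    return a > b

one_torture = Disutility(severe=1.0, mild=0.0)
many_specks = Disutility(severe=0.0, mild=3.0 ** 100)  # astronomically many specks

print(worse(one_torture, many_specks))
```

In this model the ordering stays total and transitive; what fails is the Archimedean property, since no finite amount of the mild component ever outweighs a unit of the severe one. It does require a sharp jump between tiers somewhere, which is exactly the point under dispute in the exchange above.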

Topological Fixed Point Exercises

I've realised that you've gotta be careful with this method because when you find a trichromatic subtriangle of the original, it won't necessarily have the property of only having points of two colours along the edges, and so may not in fact contain a point that maps to the centre.

This isn't a problem if we just increase the number n by which we divide the whole triangle instead of recursively dividing subtriangles. Unfortunately now we're not reducing the range of co-ords where this fixed point must be, only finding a triad of ar...

Yeah, you're right. That breaks the proof. I don't know how to deal with it yet.

Topological Fixed Point Exercises

Cleanest solution I can find for #8:

Also, if we have a proof for #6 there's a pleasant method for #7 that should work in any dimension:

We take our closed convex set $S$ that has the bounded function $f: S \to S$. We take a triangle $T$ that covers $S$ so that any point in $S$ is also in $T$.

Now we define a new function $g: T \to T$ such that $g(x) = f(p(x))$, where $p$ is the function that maps $x$ to the nearest point in $S$.

By #6 we know that $g$ has a fixed point, since $g$ is continuous. We know that the fixed point of $g$ cannot lie outside $S$ because th...

On my approach:

I constructed a large triangle around the convex shape with the center somewhere in the interior. I then projected each point in the convex shape from the center towards the edge of the triangle in a proportional manner, i.e. the center stays where it is, the points on the edge of the convex shape are projected to the edge of the triangle, and a point 1/x of the way from the center to the edge of the convex shape is sent to the point 1/x of the way from the center to the edge of the triangle.

Topological Fixed Point Exercises

Yeah, agreed. In fact I don't think you even need to continually bisect; you can just increase n indefinitely. Iterating becomes more dangerous as you move to higher dimensions because an n-dimensional simplex with n+1 colours that has been coloured according to analogous rules doesn't necessarily contain the point that maps to zero.

On the second point, yes, I'd been assuming that a bounded function had a bounded gradient, which certainly isn't true for, say, sin(x^2). The final step needs more work; I like the way you did it in the proof below.
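The counterexample is quick to check: $\sin(x^2)$ is bounded by 1 in absolute value, but its derivative grows without bound along a sequence of points.

```latex
\[
\frac{d}{dx}\,\sin(x^2) = 2x\cos(x^2), \qquad
\left|\,2x\cos(x^2)\,\right| = 2\sqrt{k\pi}
\quad\text{at } x = \sqrt{k\pi},\ k = 1, 2, \dots
\]
```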

I hit that stumbling block as well. I handwaved it by saying "continue iterating until you have $x_{k,B}$ and $x_{k,G}$ such that $f(x_{k,B}) < 0$, $f(x_{k,G}) \ge 0$, and $f$ has no local maxima or local minima on the open interval $(x_{k,B}, x_{k,G})$", but that doesn't work for the Weierstrass function, which will (I believe) never meet that criterion.

Topological Fixed Point Exercises

Here's a messy way that at least doesn't need too much exhaustive search:

First let's separate all of the red nodes into groups so that within each group you can get to any other node in that group only passing through red nodes, but not to red nodes in any other group.

Now, we trace out the paths that surround these groups - they immediately look like the paths from Question 1 so this feels like a good start. More precisely, we draw out the paths such that each vertex forms one side of a triangle that has a blue node at its opposite corner. ...

Topological Fixed Point Exercises

I was able to get at least (I think) close to proving 2 using Sperner's Lemma as follows:

You can map the continuous function f(x) to a path of the kind found in Question 1 of length n+1 by evaluating f(x) at x=0, x=1 and n-1 equally spaced divisions between these two points and setting a node as blue if f(x) < 0 else as green.

By Sperner's Lemma there is an odd, and therefore non-zero number of b-g vertices. You can then take any b-g pair of nodes as the starting points for a new path and repeat the process. After k iterations you have two v...
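The colour-and-refine procedure described above can be sketched in a few lines. This is a rough illustration, not a proof: it assumes f(0) < 0 <= f(1) so that the endpoints get different colours, and the function names, the example f, and the choices n=4 and 20 iterations are all hypothetical.

```python
def colour(value):
    """Blue if f(x) < 0, otherwise green, as in the comment above."""
    return "blue" if value < 0 else "green"

def refine(f, lo, hi, n):
    """Split [lo, hi] into n equal pieces and return one adjacent
    blue-green pair of nodes (one must exist if the ends differ)."""
    xs = [lo + (hi - lo) * i / n for i in range(n)] + [hi]
    for a, b in zip(xs, xs[1:]):
        if colour(f(a)) != colour(f(b)):
            return a, b
    raise ValueError("endpoints have the same colour")

def locate_sign_change(f, lo=0.0, hi=1.0, n=4, iterations=20):
    """Repeat the refinement; the interval shrinks by a factor n each time."""
    for _ in range(iterations):
        lo, hi = refine(f, lo, hi, n)
    return lo, hi

def f(x):
    # Example function only: negative at 0, positive at 1.
    return x ** 3 - 0.5

lo, hi = locate_sign_change(f)
print(lo, hi)
```

After k iterations the blue and green nodes are at most n^(-k) apart, which is the shrinking pair of points the argument then takes a limit of.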

I'm having trouble understanding why we can't just fix n = 2 in your proof. Then at each iteration we bisect the interval, so we wouldn't be using the "full power" of the 1-D Sperner's lemma (we would just be using something close to the base case).

Also, if we are only given that f is continuous, does it make sense to talk about the gradient?

I've been looking at papers involving a lot of 'controlling for confounders' recently and am unsure about how much weight to give their results.

Does anyone have recommendations about how to judge the robustness of these kinds of studies?

Also, I was considering doing some tests of my own based on random causal graphs, testing what happens to regressions when you control for a limited subset of confounders, varying the size/depth of graph and so on. I can't seem to find any similar papers, but I don't know the area. Does anyone know of similar work?
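As a starting point for that kind of test, here is a minimal sketch on the simplest fixed graph I could think of rather than a random one: two unobserved-or-observed confounders z1 and z2 that each drive both x and y. The graph, the unit coefficients, the sample size, and all function names are assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a @ coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients of y on the given columns
    (no intercept, since every simulated variable has mean zero)."""
    k = len(columns)
    gram = [[sum(p * q for p, q in zip(columns[i], columns[j]))
             for j in range(k)] for i in range(k)]
    moment = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(gram, moment)

def simulate(n_samples=20000, true_effect=1.0, seed=0):
    """Estimate the effect of x on y while controlling for 0, 1 or 2 confounders."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n_samples)]
    z2 = [rng.gauss(0, 1) for _ in range(n_samples)]
    x = [a + b + rng.gauss(0, 1) for a, b in zip(z1, z2)]
    y = [true_effect * xi + a + b + rng.gauss(0, 1)
         for xi, a, b in zip(x, z1, z2)]
    return {
        "no controls": ols([x], y)[0],
        "control z1": ols([x, z1], y)[0],
        "control z1, z2": ols([x, z1, z2], y)[0],
    }

estimates = simulate()
for label, value in estimates.items():
    print(f"{label}: estimated effect of x on y = {value:.3f}")
```

In this setup the true effect is 1.0, and working through the covariances gives population estimates of 5/3 with no controls and 3/2 when only z1 is controlled for, so controlling for a subset of confounders shrinks the bias without removing it. Swapping the fixed graph for randomly generated ones would be the natural next step.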