AXRP - the AI X-risk Research Podcast


Does this not essentially amount to just assuming that the inductive bias of neural networks in fact matches the prior that we (as humans) have about the world?

No? It amounts to assuming that smaller neural networks are a better match for the actual data generating process of the world.

One argument sketch using SLT that NNs are biased towards low-complexity solutions: suppose reality is generated by a width-3 network, and you're modelling it with a width-4 network. Then, along with the generic symmetries, optimal solutions also have continuous symmetries where you can switch which neuron is turned off.

Roughly, say neurons 3 and 4 have the same input weight vectors (so their activations are the same), but neuron 4's output weight vector is all zeros. Then you can continuously scale up the output vector of neuron 4 while simultaneously scaling down the output vector of neuron 3, leaving the network computing the same function. Also, when neuron 4's input and output weights are all zero, you can arbitrarily change either its input weights or its output weights (but not both at once) without changing the function computed.
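To make the first symmetry concrete, here's a minimal numerical sketch (my own illustration, not from the original argument) using a hypothetical width-4 one-hidden-layer ReLU net in the configuration described: neurons 3 and 4 share input weights, and neuron 4's output weight starts at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(4, 2))   # input weights, one row per hidden neuron
W[3] = W[2]                   # neurons 3 and 4 get identical input weights
a = rng.normal(size=4)        # output weights
a[3] = 0.0                    # neuron 4's output weight is zero

def net(x, W, a):
    """One-hidden-layer ReLU network: f(x) = a . relu(W x)."""
    return a @ np.maximum(W @ x, 0.0)

x = rng.normal(size=2)
baseline = net(x, W, a)

# Continuously transfer output mass from neuron 3 to neuron 4:
# for any t in [0, 1], the computed function is unchanged, since the two
# neurons have identical activations and the total output weight is constant.
for t in np.linspace(0.0, 1.0, 5):
    a_t = a.copy()
    a_t[2] = (1 - t) * a[2]
    a_t[3] = t * a[2]
    assert np.isclose(net(x, W, a_t), baseline)
```

The one-parameter family of weight settings indexed by `t` all compute the same function, which is exactly the kind of continuous degeneracy that lowers the RLCT in the SLT picture.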

Anyway, this means that when the data is generated by a slim neural net, optimal nets will have a low RLCT (which the Bayesian posterior favours), but when it's generated by a neural net of the model's full width, optimal nets will have a higher RLCT. So nets can learn simple data, and it's easier for them to learn simple data than complex data - assuming thin neural nets count as simple.

This is basically a justification of something like your point 1, but AFAICT it's closer to a proof in the SLT setting than in your setting.

Maybe - but you definitely can't get it if you don't even try to communicate the thing you think would be better.

For instance, if I was running the US, I'd probably slow down scaling considerably, but I'd mostly be interested in implementing safety standards similar to RSPs due to lack of strong international coordination.

Surely if you were running the US, that would be a great position to try to get international coordination on policies you think are best for everyone?

hiding your beliefs, in ways that predictably leads people to believe false things, is lying

I think this has got to be tempered by Grice to be accurate. Like, if I don't bring up some unusual fact about my life in a brief conversation (e.g. that I consume iron supplements once a week), this predictably leads people to believe something false about my life (that I do not consume iron supplements once a week), but is not reasonably understood as the bad type of lie - otherwise to be an honest person I'd have to tell everyone tons of minutiae about myself all the time that they don't care about.

Is this relevant to the point of the post? Maybe a bit - if I (that is, literally me) don't tell the world that I wish people would stop advancing the frontier of AI, I don't think that's terribly deceitful or ruining coordination. What has to be true for me to have a duty to say that? Maybe for me to be a big AI thinkfluencer or something? I'm not sure, and the post doesn't really make it clear.

I mean, whether something's realistic and whether something's actionable are two different things (both separate from whether something's nebulous) - even if it's hard to make a pause happen, I have a decent guess about what I'd want to do to up those odds: protest, write to my congress-person, etc.

As to the realism, I think it's more realistic than I think you think it is. My impression of AI Impacts' technological temptation work is that governments are totally willing to enact policies that impoverish their citizens without requiring a rigorous CBA. Getting early wins does seem like an important consideration, but you can imagine trying to get some early wins by e.g. banning AI from being used in certain domains, or banning people from developing advanced AI without doing X, Y, or Z.

Is the idea that an indefinite pause is unactionable? If so, I'm not sure why you think that.

The point is that advocating for a “pause” is nebulous and non-actionable

Setting aside the potential advantages of RSPs, this strikes me as a pretty weird thing to say. I understand the term "pause" in this context to mean that you stop building cutting-edge AI models, either voluntarily or due to a government mandate. In contrast, "RSP" says you eventually do that but you gate it on certain model sizes and test results and unpause it under other test results. This strikes me as a bit less nebulous, but only a bit.

I'm not quite sure what's going on here - it's possible that the term "pause" has gotten diluted? Seems unfortunate if so.

Some talks are visible on YouTube here

Did this ever get written up? I'm still interested in it.
