AI ALIGNMENT FORUM

AI Takeoff · Computer Security & Cryptography · AI

How much novel security-critical infrastructure do you need during the singularity?

by Buck Shlegeris
4th Jul 2025
6 min read

Comments
Ryan Greenblatt:

“Adopting new hardware will require modifying security-critical code”

Another concern is that AI companies (or the AI company) will rapidly buy a bunch of existing hardware (GPUs, other accelerators, etc.) during the singularity, and handling this hardware will require many infrastructure changes in a short period of time. New infrastructure might be needed to handle highly heterogeneous clusters built out of a bunch of different GPUs/CPUs/etc. (potentially including gaming GPUs) bought in a hurry. AI companies might buy the hardware of other AI companies, and it might be non-trivial to adapt that hardware to the buyer’s setup.

Rohin Shah:

“Technologies that allow workers to be more isolated from each other gain you both convenience (because your coworkers no longer accidentally mess up what you’re doing) and also security (because you can remove your coworker’s permission to affect the code you’re running), but generally reduce efficiency. When we try to buy efficiency at the cost of convenience, we might lose security too.”

Hmm, this feels less likely to me. Isolation can often be an efficiency benefit because one employee's mistake doesn't propagate to screwing up the work of everyone else, and this benefit scales as the number of employees increases. This assumes that labor is at least somewhat unreliable, but I do expect that AI labor will be overall unreliable, just because reliable AI labor will likely be much more expensive and won't look cost-effective to deploy in most situations.

I’d expect that you buy efficiency by building more fine-grained configuration mechanisms and requiring that employees use them. This will also become easier with AI, since you can have specialized AI assistants that understand the configuration mechanisms well and can set them for all experiments (unlike the current status quo, where all your employees have to be separately trained on the configurations, and many of them forget and just copy configurations from elsewhere).
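To make this concrete, here is a minimal sketch of what such a fine-grained, machine-checkable experiment configuration might look like (Python; all names, paths, and policies are hypothetical illustrations, not any lab’s actual infrastructure). The idea is that an AI assistant could fill in such a config and a policy check could validate it before anything runs:

```python
# Hypothetical sketch: a per-experiment config that makes resource use and data
# access explicit, so an assistant or a policy engine can check it automatically.
from dataclasses import dataclass, field


@dataclass
class ExperimentConfig:
    owner: str                                   # which agent/team submitted this
    gpu_hours: float                             # requested compute budget
    readable_paths: list[str] = field(default_factory=list)
    writable_paths: list[str] = field(default_factory=list)
    network_access: bool = False                 # most experiments shouldn't need egress


def validate(cfg: ExperimentConfig, max_gpu_hours: float = 512.0) -> list[str]:
    """Return a list of policy violations; an empty list means the config passes."""
    problems = []
    if cfg.gpu_hours > max_gpu_hours:
        problems.append(f"requests {cfg.gpu_hours} GPU-hours > cap {max_gpu_hours}")
    for path in cfg.writable_paths:
        if not path.startswith(f"/experiments/{cfg.owner}/"):
            problems.append(f"writes outside own namespace: {path}")
    if cfg.network_access:
        problems.append("network egress requires separate approval")
    return problems


cfg = ExperimentConfig(owner="agent-17", gpu_hours=64,
                       readable_paths=["/datasets/public/webtext/"],
                       writable_paths=["/experiments/agent-17/run-003/"])
print(validate(cfg) or "config OK")
```

The specific fields are stand-ins; the point is that the configuration is structured enough for another system (or reviewer) to check mechanically.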

In general it seems like this should improve security by creating more info and legibility around what is happening (which in turn gives you more affordances to enforce security). Some examples (not necessarily about AI):

  1. For code that is costly to run (e.g. uses lots of compute, causes physical-world effects, affects production systems), spending labor to run automated bug finding tools that generate lots of false positives, and addressing the results. This directly increases security by having fewer bugs.
  2. Breaking up big jobs into many smaller ones that are easier to schedule across your compute fleet. This can enable you to isolate small jobs from each other which reduces the attack surface.
  3. Requiring detailed analysis and justification before using significant amounts of compute for experiments. This should increase security against rogue internal deployments, as it becomes easier to catch the AI in a lie.
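As a toy illustration of the third item (the threshold, word count, and names are all hypothetical), a compute-request gate might look something like this:

```python
# Hypothetical sketch: large compute requests are only schedulable if they carry
# a written justification and an explicit reviewer sign-off.
LARGE_JOB_THRESHOLD_GPU_HOURS = 1000.0


def may_schedule(gpu_hours: float, justification: str = "",
                 approved_by: str | None = None) -> bool:
    """Small jobs pass automatically; large ones need a justification and approval."""
    if gpu_hours < LARGE_JOB_THRESHOLD_GPU_HOURS:
        return True
    return len(justification.split()) >= 50 and approved_by is not None


assert may_schedule(40.0)                                       # small job: fine
assert not may_schedule(5000.0, justification="scaling sweep")  # big job, no review: rejected
```

The justification text is also exactly the kind of artifact that can later be audited when checking whether an AI lied about what the compute was for.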

That said, I did think of one example that worsens security -- switching from a memory-safe language like Python to one that requires manual memory management like C. Though in this case, it's that the "security infra" (automatic memory management) itself is too compute-expensive to run, which doesn't seem like it will apply in the AI case.

Overall my very weak guess is that the net effect will be to increase security. Of course, the part where you build more fine-grained configuration mechanisms is "novel (potentially) security-critical infrastructure", so this does support your broader point. (When I say it will increase security, this is under the assumption that the novel infra isn't sabotaged by scheming AIs.)

Buck Shlegeris:

I agree that fine-grained configuration mechanisms probably increase security, and this might net increase security, as I say in the post.

I agree that the increasing value of compute might increase the isolation you use, for the reason you said. One reason I’m skeptical is that you can get almost all of that value by having AIs voluntarily adopt mechanisms that generally isolate their jobs from the other jobs that are running (e.g. naming conventions about who is allowed to read or write what), which get you the reliability benefits without getting any security.
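A toy version of this kind of voluntary convention (paths and names hypothetical): each job agrees to write only under its own prefix, and a lint-style check catches accidental violations, but nothing at the filesystem or scheduler level actually stops a job that ignores the convention:

```python
# Hypothetical sketch of isolation-by-convention: jobs are supposed to write only
# under their own prefix. A check like this catches accidental violations (the
# reliability benefit), but a job that deliberately ignores the convention is not
# actually prevented from writing anywhere (no security benefit).
def convention_violations(job_id: str, write_paths: list[str]) -> list[str]:
    """Return the writes that fall outside the job's own directory."""
    prefix = f"/scratch/{job_id}/"
    return [p for p in write_paths if not p.startswith(prefix)]


print(convention_violations("job-42", ["/scratch/job-42/checkpoints/step_100.pt"]))  # [] -- fine
print(convention_violations("job-42", ["/scratch/job-17/checkpoints/step_100.pt"]))  # flagged
```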

Rohin Shah:

“I agree that fine-grained configuration mechanisms probably increase security, and this might net increase security, as I say in the post.”

You mention permission systems, which is certainly a big deal, but I didn’t see anything about broader configuration mechanisms, many of which can be motivated solely by efficiency and incidentally help with security. (I was disputing your efficiency -> less security claim; permissions mechanisms aren’t a valid counterargument since they aren’t motivated by efficiency.)

“One reason I’m skeptical is that you can get almost all of that value by having AIs voluntarily adopt mechanisms that generally isolate their jobs from the other jobs that are running (e.g. naming conventions about who is allowed to read or write what), which get you the reliability benefits without getting any security.”

Hmm, I think this assumes the AIs will be more reliable than I expect them to be. You would want this to be reliable enough that your (AI or human) software engineers ~never have to consider non-isolation as a possible cause of bugs (unless explicitly flagged otherwise); I'd guess that corresponds to about 6 nines of reliability at following the naming conventions? (Which might get you to 8 nines of reliability at avoiding non-isolation bugs, since most naming convention errors won't lead to bugs.)

Some complete guesses: human software engineers might get 0.5-2 nines of reliability at following the naming conventions, a superhuman coder model might hit ~4 nines of reliability if you made this a medium priority, if you add a review step that explicitly focuses just on checking adherence to naming conventions then you might hit 6 nines of reliability if the critique uses a model somewhat more powerful than a current frontier model. But this is now a non-trivial expense; it's not clear to me it would be worth it.
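To spell out the arithmetic behind these nines (the inputs are rough guesses, including an assumed ~1% of convention errors that actually turn into bugs):

```python
# Back-of-the-envelope for the 'nines' above; nothing here is measured.
import math


def nines(failure_rate: float) -> float:
    """Convert a failure rate into 'nines of reliability' (e.g. 1e-6 -> 6.0)."""
    return -math.log10(failure_rate)


convention_error_rate = 1e-6        # ~6 nines at following the naming conventions
bug_given_error = 0.01              # assumed fraction of convention errors that cause a bug
bug_rate = convention_error_rate * bug_given_error

print(nines(convention_error_rate))   # 6.0 nines at following the convention
print(nines(bug_rate))                # 8.0 nines at avoiding non-isolation bugs
```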

How much novel security-critical infrastructure do you need during the singularity?

I think a lot about the possibility of huge numbers of AI agents doing AI R&D inside an AI company (as depicted in AI 2027). I think particularly about what will happen if those AIs are scheming: coherently and carefully trying to grab power and take over the AI company, as a prelude to taking over the world. And even more particularly, I think about how we might try to mitigate the insider risk posed by these AIs, taking inspiration from traditional computer security, traditional insider threat prevention techniques, and first-principles thinking about the security opportunities posed by the differences between AIs and humans.

To flesh out this situation, I’m imagining something like what AI 2027 forecasts for March 2027:

  • The compute available to the leading AI company has increased 7x compared to today.
  • There are now 200K superhuman coder copies thinking at 30x human speed.
  • These AIs are mostly working on ML R&D, doing tasks like:
    • Generating training data, e.g. RL environments
    • Algorithm development, e.g. finding architecture improvements, implementing neuralese, finding more efficient ways to run iterated distillation and amplification.
    • Performance engineering: finding ways of running inference, training, and other expensive workloads more efficiently.

I want to know what we need to do to ensure that even if those AIs are conspiring against us, they aren’t able to cause catastrophic security failures.[1]

One important question here is how strongly incentivized we are to use the AIs to write security-critical infrastructure code.

The most optimistic possibility here is that the AIs don’t need to write such code: instead of acting as infrastructure engineers, they act with basically the permissions that ML researchers at Google DeepMind have.[2] Something like this: The AIs act as users of the cluster infrastructure, rather than as developers of that infrastructure. When they want to run jobs, they submit their jobs to a scheduler with a certain set of permissions, and the infrastructure ensures that the jobs can’t affect parts of the system they don’t have permission to affect.

But the AIs don’t make lots of contributions to that underlying infrastructure (e.g. the job scheduler or the hypervisors controlling jobs on the servers). They aren’t making huge changes to the codebases owned by the datacenter teams, the infrastructure teams, or the security teams.
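As a rough sketch of this division of labor (all names and policies hypothetical), the agent-facing surface might be little more than a job-submission call, with the permission checks living in trusted scheduler code that the agents never modify:

```python
# Hypothetical sketch of the 'AIs as users, not infrastructure developers' shape.
# Agents only call submit_job; the permission logic lives in trusted code.
ALLOWED_PATH_PREFIXES = {
    "agent-research-pool": ["/datasets/public/", "/experiments/research/"],
}


def submit_job(principal: str, command: str, paths_needed: list[str]) -> str:
    """Trusted scheduler entry point: reject jobs that touch paths the
    submitting principal has no permission for."""
    allowed = ALLOWED_PATH_PREFIXES.get(principal, [])
    for path in paths_needed:
        if not any(path.startswith(prefix) for prefix in allowed):
            return f"rejected: {principal} may not access {path}"
    return f"queued: {command!r} for {principal}"


print(submit_job("agent-research-pool", "python train.py",
                 ["/datasets/public/webtext/"]))          # queued
print(submit_job("agent-research-pool", "python sync.py",
                 ["/secure/model_weights/"]))             # rejected by the trusted layer
```

In this picture, the infrastructure attack surface stays confined to the (human-maintained) scheduler and permission layer, rather than expanding to everything the agents write.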

The more pessimistic possibility is that the AI company needs to have the AIs write a bunch of novel infrastructure code. Under this possibility, the AIs are rewriting your routers, your job scheduler, your OS kernel, your GPU interconnect drivers, and so on. If this is the case, we’re in a much scarier situation, because we have an additional huge and confusing attack surface.

Note that even in the optimistic case where the AIs just have the permissions that ML researchers usually have, and no lower-level permissions at all, the situation seems very dangerous! Most of the threat modeling I’ve done related to AI control is about the risks posed by the “optimistic” situation. (For why I think this is still risky, see footnote[3].) What I’m discussing here is whether there's a big source of risk aside from the ones I normally talk about!

So, how strongly will AI companies be incentivized to do novel infrastructure development during an intelligence explosion? I’m quite unsure. Here are some reasons that we might expect big infrastructure changes, based on conversations with AI infrastructure experts and my own thinking.

Differences between AI and human labor will incentivize massive infra changes

During the intelligence explosion, I expect AI agents to have pretty different properties from human researchers, which might necessitate either developing new infrastructure, or using current infrastructure in ways that degrade security properties.

One particularly important dynamic is: researchers currently make many tradeoffs that reduce performance in return for saving researcher time. For example, they repeatedly rebuild Docker containers, in return for a reduced risk of Python environment issues. The advent of automated engineers and researchers will probably mean[4] we want to shift to a workflow where (like the computer users of the past) the users have much less separation and rely on manual coordination, in order to eke out performance. For example, instead of working on separate computers, they might want to work as different users (or the same user!) on the same Linux server. This would make it way harder for us to enforce a bunch of security invariants (especially stuff about tracking how information flows from agent to agent).

Another way of saying this is: Technologies that allow workers to be more isolated from each other gain you both convenience (because your coworkers no longer accidentally mess up what you’re doing) and also security (because you can remove your coworker’s permission to affect the code you’re running), but generally reduce efficiency. When we try to buy efficiency at the cost of convenience, we might lose security too.

This argument doesn’t suggest that AIs will necessitate development of novel security-critical infrastructure; the concern here is that we’ll need to stop using existing technology that enforces security properties. (And then developing new technology for these new workflows would either be an example of novel datacenter infrastructure as I defined it, or would be impossible to do without losing the performance gains.)

Side-channel prevention poses particularly tricky difficulties along these lines. When there’s a resource shared between multiple users, you inherently have to choose between inefficiency and side-channels. (If there’s a bathroom in your shared house, you’ll learn what time your roommate tends to shower, unless you make a rule that he’s only allowed to use it in the even-numbered hours and you’re only allowed to use it in the odd-numbered hours.) The throughput of the side-channel obviously varies depending on the details, but there are at least some cases (especially sharing memory/CPUs/GPUs) where preventing side-channels seems really hard.
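Here is a toy illustration of that tradeoff (with made-up demand patterns): statically partitioning a shared resource into fixed time slots removes the timing side-channel between two tenants, but wastes capacity whenever the slot owner has nothing to run:

```python
# Toy model of the side-channel vs. efficiency tradeoff for a shared resource.
def utilization(demand_a: list[int], demand_b: list[int], partitioned: bool) -> float:
    """Fraction of hours the resource does useful work; demands are 0/1 per hour."""
    used = 0
    for hour, (a, b) in enumerate(zip(demand_a, demand_b)):
        if partitioned:
            # Even hours belong to A, odd hours to B, regardless of actual demand,
            # so neither tenant learns anything from when the other uses the resource.
            used += a if hour % 2 == 0 else b
        else:
            # Shared scheduling: the hour is used if anyone wants it, but each
            # tenant can now observe the other's usage pattern.
            used += min(1, a + b)
    return used / len(demand_a)


a = [1, 0, 1, 0, 1, 1, 0, 0] * 3   # 24 hours of bursty demand for tenant A
b = [0, 0, 1, 1, 0, 1, 1, 0] * 3   # 24 hours of bursty demand for tenant B
print(utilization(a, b, partitioned=False))  # 0.75  -- higher utilization, side-channel exists
print(utilization(a, b, partitioned=True))   # 0.625 -- side-channel gone, capacity wasted
```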

On the other hand, increased AI labor might mean the AIs can not-that-expensively jump through annoying hoops created by fine-grained permission management, which might improve security (as discussed here).

Adopting new hardware will require modifying security-critical code

It looks like a bunch of security-critical components are currently under active development, so we should expect them to change over time. For example, networking equipment has seen a lot of recent development to support ML workloads, and some of it is designed to be user-configurable (e.g. Juniper’s user-programmable routers; Juniper gives a case study of doing all-reduce in your router (!)). This is evidence that there will be lots of deployed non-mature technology that needs security-critical code, some of which has been specifically designed to allow (and maybe require?) end-user-programmable configuration.

Infra rewrites can get you big performance improvements

AIs might be able to rewrite code to make it much faster. If they can do that for security-critical infrastructure code, you’ll be strongly incentivized to let them do so. I don’t know how important this consideration is because I don’t know how much inefficiency is coming from parts of the stack that are security-critical. My guess is that most improvements to efficiency of training will come from parts of the stack that aren’t security critical (e.g. writing new kernels, developing new architectures), though some parts seem security critical (e.g. low level network configuration). I don’t know how much efficiency gain on other workloads you can get from upgrading security-critical infrastructure.

The AI company builds new security-critical infra really fast for the sake of security

The AI company might worry that off-the-shelf code has vulnerabilities that adversaries know about (either because of offensive cyber research those adversaries have done, or because the code has been backdoored), and so might want to rewrite everything to reduce security risk. AI company employees often propose doing this.

Also, AI companies might write a bunch of security-critical infra code really fast because they suddenly decide to build or physically reconfigure their datacenter really fast in order to increase physical security. These security requirements might rule out the possibility of them using an existing cloud compute provider, potentially necessitating a bunch of novel code.

Conclusion

I still feel pretty unsure about how all this is going to play out.

Talking to infrastructure engineers inside AI companies has been somewhat helpful for clarifying my thoughts here, but they mostly haven’t thought through the futurism part of this, which means that I can’t just rely on them to give me answers.

I’m hoping to think more about these questions over the next few weeks.

  1. ^

    I think the most plausible such security failures are the AI exfiltrating its weights, the AI hacking the datacenters enough that the AI company loses necessary observability into what’s happening inside them, or the AI seizing control of the AI company (perhaps by threatening or bribing company leadership).

  2. ^

    I’m using GDM as the example because they’re part of Google, a company that has famously good security and that doesn’t rely on external cloud providers. The situation seems basically just worse for Anthropic/OpenAI/x.ai.

  3. ^

    To start with, the AIs might be tasked with doing things that can lead to catastrophe without them having to violate any security invariants that AWS enforces on users. One example is that the AIs might be able to launch rogue internal deployments without violating those security invariants, if they already have access to scary model weights. Another is that if the AIs are tasked with generating data for future training runs, that data might be covertly poisoned. The other big issue is that even though AWS has pretty good security, it’s probably not robust to wildly superintelligent AIs putting hundreds of thousands of human-equivalent years of work into hacking it.

  4. ^

    [These numbers are sloppy] In March 2027 in the AI 2027 scenario, the compute available to OpenBrain has increased 7x compared to today, while there are now 200K superhuman coder copies thinking at 30x human speed, for 6M human-coder equivalents, which is like 4 OOMs more coders than it probably had in June 2025. So per coder-equivalent, that’s about 3 OOMs less compute than is available per human coder today.
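Spelling out that arithmetic with the footnote’s own rough numbers:

```python
# Rough arithmetic behind footnote 4 (all inputs are the footnote's sloppy numbers).
import math

compute_growth = 7            # compute available to OpenBrain vs. today
coder_copies = 200_000        # superhuman coder copies
speedup = 30                  # thinking speed relative to a human

human_equivalents = coder_copies * speedup
print(f"{human_equivalents:,} human-coder equivalents")      # 6,000,000

coder_growth = 1e4            # "~4 OOMs more coders" than June 2025
per_coder_compute = compute_growth / coder_growth            # fraction of today's per-coder compute
print(round(math.log10(per_coder_compute), 1), "OOMs")       # -3.2, i.e. ~3 OOMs less per coder
```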