Access to powerful AI might make computer security radically easier

[-]Bird Concept1y311

I think this is an application of a more general, very powerful principle of mechanism design: when cognitive labor is abundant, near omni-present surveillance becomes feasible.

For domestic life, this is terrifying.

But for some high stakes, arms race-style scenarios, it might have applications.

Beyond what you metioned, I'm particularly interested in this being a game-changer for bilateral negotiation. Two parties make an agreement, consent to being monitored by an AI auditor, and verify that the auditor's design will communicate with the other party if and only if there has been a rule breach. (Beyond the rule breach, it won't be able to leak any other information. And, being an AI, it can be designed to have its memory erased, never recruited as a spy, etc.) However, one big challenge of building this is how two adversarial parties could ever gain enough confidence to allow such a hardware/software package into a secure facility, especially if it's whole point is to have a communication channel to their adversary.

[-]mako yass1y*20

However, one big challenge of building this is how two adversarial parties could ever gain enough confidence to allow such a hardware/software package into a secure facility, especially if it's whole point is to have a communication channel to their adversary.

Isn't it enough to constrain the output format to so that no steganographic leaks would be possible? Wont the counterparty usually be satisfied just with an hourly signal saying either "Something is wrong" (encompassing "Auditor saw a violation" / "no signal, the host has censored the auditor's report" / "invalid signal, the host has tampered with the auditor system" / "auditor has been blinded to the host's operations, or has ascertained that there are operations which the auditor cannot see") or "Auditor confirms that all systems are nominal and without violation."?

The host can remain in control of their facilities, as long as the auditor is running on tamperproof hardware. It's difficult to prove that a physical device can't be tampered with, it may be possible to take some components of the auditor even further and run them in a zero knowledge virtual machine, which provides a cryptographic guarantee that the program wasn't tampered with, so long as you can make it lithe enough to fit (zero knowledge virtual machines currently run at a 10,000x slowdown, though I don't think specialized hardware for them is available yet, crypto may drive that work), though a ZKVM wont provide a guarantee that the inputs to the system aren't being controlled, the auditor is monitoring inputs of such complexity — either footage of the real world or logs of a large training run — that it may be able to prove algorithmically to itself that the sensory inputs weren't tampered with either and the algorithm does have a view into the real world (I'm contending that even large state actors could not create Descarte's evil demon).

[-]RussellThor1y00

Good to see people thinking about how AI can make security better. It's usually the other way round, so I expect there is unexplored benefit there.

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

53

Access to powerful AI might make computer security radically easier

53

Four strategies for using powerful AI to improve security

Monitoring

Trust displacement

Fine-grained permission management

AI investigation of automatically detected suspicious activity

How vulnerable to jailbreaks or trickery is this?

These techniques seem really powerful