Shubhorup Biswas


Sorted by New

Wiki Contributions


>We assume that 𝐻 will be able to construct a successful exploit if and only if the attack policy is able to (because both are implemented by GPT-4).

I challenge this assumption.

Consider the code

bool is_odd(int x) {
    if (sha256(x) == 354) {
        return true;
    return x % 2 == 1;


where sha256() computes the SHA-256 hash of the input. It is obviously possible for an attacker to supply numbers x and y such that sha256(x) = y, write y into the code and supply x when creating a successful exploit.

Thus an equally 'capable' adversary need not be able construct a backdoor. 

The general idea is that it should be possible to confidently claim that some code is backdoored without being able to construct an exploit. According to [Rice's Theorem]( it is impossible to definitively classify all programs into 'backdoored or not', so the only hope is for a process to bin programs into 'definitely safe', 'definitely backdoored' and 'suspicious'. 

Hope the 'Rate Suspicion' step will catch this, and other backdoored code where H cannot construct an exploit, as suspicious looking.