People who’ve spent a lot of time thinking about P vs NP often have the intuition that “verification is easier than generation”. It’s easier to verify a solution to some equations than to find a solution. It’s easier to verify a password than to guess it. That sort of thing. The claim that it is easier to verify solutions to such problems than to generate them is essentially the claim that P ≠ NP, a conjecture which is widely believed to be true. Thus the intuition that verification is generally easier than generation.

The problem is, this intuition comes from *thinking about problems which are in NP*. NP is, roughly speaking, the class of algorithmic problems for which solutions are easy to verify. Verifying the solution to some equations is easy, so that problem is in NP.

I think a more accurate takeaway would be that *among problems in NP*, verification is easier than generation. In other words, among problems for which verification is easy, verification is easier than generation. Rather a less impressive claim, when you put it like that.

With that in mind, here is an algorithmic problem for which generation is easier than verification.

Predicate: given a program, does it halt?

Generation problem: generate a program which halts.

Verification problem: given a program, verify that it halts.

The generation problem is trivial. The verification problem is uncomputable.

That’s it for the post, you all can argue about the application to alignment in the comment section.

I think most people's intuitions come from more everyday experiences like:

These observations seem relevant to questions like "can we delegate work to AI" because they are ubiquitous in everyday situations where we want to delegate work.

The claim in this post seems to be: sometimes it's easier to create an object with property P than to decide whether a borderline instance satisfies property P. You chose a complicated example but you just as well have used something very mundane like "Make a pile of sand more than 6 inches tall." I can do the task by making a 12 inch pile of sand, but if someone gives me a pile of sand that is 6.0000001 inches I'm going to need very precise measurement devices and philosophical clarification about what "tall"means.

I don't think this observation undermines the claim that "it is easier to verify that someone has made a tall pile of sand than to do it yourself." If someone gives me a 6.000001 inch tall pile of sand I can say "could you make it taller?" And if I can ask for a program that halts and someone gives me a program that looks for a proof of false in PA, I can just say "try again."

I do think there are plenty of examples where verification is not easier than generation (and certainly where verification is non-trivial). It's less clear what the relevance of that is.

I don't think the generalization of the OP is quite "sometimes it's easier to create an object with property P than to decide whether a borderline instance satisfies property P". Rather, the halting example suggests that verification is likely to be harder than generation specifically when there is some (possibly implicit) adversary. What makes verification potentially hard is the part where we have to quantify over all possible inputs - the verifier must work for

anyinput.Borderline cases are

anissue for that quantifier, but more generally any sort of adversarial pressure is a potential problem.Under that perspective, the "just ask it to try again on borderline cases" strategy doesn't look so promising, because a potential adversary is potentially optimizing against me - i.e. looking for cases which will fool not only my verifier, but myself.

As for the everyday experiences you list: I agree that such experiences seem to be where peoples' intuitions on the matter often come from. Much like the case in the OP, I think people select for problems for which verification is easy - after all, those are the problems which are most legible, easiest to outsource (and therefore most likely to be economically profitable), etc. On the other hand, once we actively look for cases where adversarial pressure makes verification hard, or where there's a "for all" quantifier, it's easy to find such cases. For instance, riffing on your own examples:

To the extent that any of these problems are hard to verify, I think it's almost entirely because of the "position of the interior" where an attacker can focus their effort on hiding an attack in a single place but a defender needs to spread their effort out over the whole attack surface.

But in that case we just apply verification vs generation again. It's extremely hard to tell if code has a security problem, but in practice it's quite easy to verify a correct claim that code has a security problem. And that's what's relevant to AI delegation, since in fact we will be using AI systems to help oversee in this way.

If you want to argue that e.g. writing secure software is fundamentally hard to verify, I think it would be much more interesting and helpful to exhibit a case of software with a vulnerability where it's really hard for someone to verify the claim that the vulnerability exists.

Rice's theorem says there are a lot of programs where you can't tell if they will halt. But if I want to write a program that will/won't halt, I'm just going to write a program for which it's obvious. And if I asked you to write a program that will/won't halt and you write the kind of program where I can't tell, I'm just going to send it back.

Now that could still be hard. You could put a subtle problem in your code that makes it so it halts eventually even though it looks like it obviously doesn't. But Rice's theorem doesn't say anything about that.

And reiterating the previous point, if someone wrote down a program that looks like it obviously doesn't halt, but secretly it does because of an adversarial trick, then I would very strongly expect that someone could point out my mistake to me and I would conclude that it is no longer obvious whether it halts. Counterexamples to this kind of optimism would be way more impactful.

Conditional on such counterexamples existing, I would usually expect to not notice them. Even if someone displayed such a counterexample, it would presumably be quite difficult to verify that it is a counterexample. Therefore a lack of observation of such counterexamples is, at most, very weak evidence against their existence; we are forced to fall back on priors.

I get the impression that you have noticed the lack of observed counterexamples, and updated that counterexamples are rare, without noticing that you would also mostly not observe counterexamples even if they were common. (Though of course this is subject to the usual qualifiers about how it's difficult to guess other peoples' mental processes, you have better information than I about whether you indeed updated in such a way, etc.)

That said, if I were to actively look for such counterexamples in the context of software, the obfuscated C code competition would be one natural target.

We can also get indirect bits of evidence on the matter. For instance, we can look at jury trials, and notice that they are notoriously wildly unreliable in practice. That suggests that, relative to the cognition of a median-ish human, there must exist situations in which one lawyer can point out the problem in another's logic/evidence, and the the median-ish human will not be able verify it. Now, one could argue that this is merely because median-ish humans are not very bright (a claim I'd agree with), but then it's rather a large jump to claim that e.g. you or I is so smart that analogous problems are not common for us.

I'm bowing out here, you are welcome to the last word.

P≠NP deserves a little more credit than you give it. To interpret the claim correctly, we need to notice P and NP are classes of

decision problems, not classes ofproof systemsfor decision problems. You demonstrate that for afixedproof system it is possible that generating proofs is easier than verifying proofs. However, if we fix a decision problem and allowanyvalid (i.e. sound and complete) proof system, then verifying cannot be harder than generating. Indeed, let S1 be some proof system and A an algorithm for generating proofs (i.e. an algorithm that finds a proof if a proof exists and outputs "nope" otherwise). Then, we can construct another proof system S2, in which a "proof" is just the empty string and "verifying" a proof for problem instance x consists of running A(x) and outputting "yes" if it found an S1-proof and "no" otherwise. Hence, verification in S2 is no harder than generation in S1. Now, so far it's just P⊆NP, which is trivial. The non-trivial part is: there exist problems for which verification is tractable (insomeproof system) while generation is intractable (inanyproof system). Arguably there are evenmanysuch problems (an informal claim).