If we expand out an arbitrary program, we get a (usually infinite) expression tree. For instance, we can expand

fact(n) := (n == 0) ? 1 : n*fact(n-1)

into

fact(n) = (n == 0) ? 1 : n*(((n-1) == 0) ? 1 : (n-1) * ( (((n-1)-1) == 0) ? ... ))))

Let's call two programs "expression-equivalent" if they expand into the same expression tree (allowing for renaming of input variables). Two interesting problems:

  • Write an efficient algorithm to decide expression-equivalence of two given programs.
  • Write a program M which decides whether a given program is expression-equivalent to M itself.

I'm pretty sure these are both tractable, as well as some relaxations of them (e.g. allowing for more kinds of differences between the expressions).

Does anybody know of an existing name for "expression-equivalence", an existing algorithm for solving either of the above problems, or any relevant literature?

New Answer
New Comment

1 Answers sorted by

Ramana Kumar

Mar 29, 2020

20

I think this is related to the word problem for the rewriting system defined by your programming language. When I first read this question I was thinking "Something to do with Church-Rosser?" -- but you can follow the links to see for yourself if that literature is what you're after.

12 comments, sorted by Click to highlight new comments since: Today at 1:52 PM

Not sure I understand the question. Consider these two programs:

  1. f(n) := f(n)

  2. f(n) := f(n+1)

Which expression trees do they correspond to? Are these trees equivalent?

The first would generate a stick: ((((((((...)))))))))

The second would generate: (((((...) + 1) + 1) + 1) + 1)

These are not equivalent.

Does that make sense?

I don't understand why the second looks like that, can you explain?

Oh, I made a mistake. I guess they would look like ...((((((((...)))))))))... and ...(((((...) + 1) + 1) + 1) + 1)..., respectively. Thanks for the examples, that's helpful - good examples where the fixed point of expansion is infinite "on the outside" as well as "inside".

Was that the confusion? Another possible point of confusion is why the "+ 1"s are in the expression tree; the answer is that addition is usually an atomic operator of a language. It's not defined in terms of other things; we can't/don't beta-reduce it. If it were defined in terms of other things, I'd expand it, and then the expression tree would look more complicated.

Then isn't it possible to also have infinite expansions "in the middle", not only "inside" and "outside"? Something like this:

f(n) := f(g(n))
g(n) := g(n+1)

Maybe there's even some way to have infinite towers of infinite expansions. I'm having trouble wrapping my head around this.

Yup, that's right.

I tentatively think it's ok to just ignore cases with "outside" infinities. Examples like f(n) = f(n+1) should be easy to detect, and presumably it would never show up in a program which halts. I think programs which halt would only have "inside" infinities (although some non-halting programs would also have inside infinities), and programs with non-inside infinities should be detectable - i.e. recursive definitions of a function shouldn't have the function itself as the outermost operation.

Still not sure - I could easily be missing something crucial - but the whole problem feels circumventable. Intuitively, Turing completeness only requires infinity in one time-like direction; inside infinities should suffice, so syntactic restrictions should be able to eliminate the other infinities.

Ok, if we disallow cycles of outermost function calls, then it seems the trees are indeed infinite only in one direction. Here's a half-baked idea then: 1) interpret every path from node to root as a finite word 2) interpret the tree as a grammar for recognizing these words 3) figure out if equivalence of two such grammars is decidable. For example, if each tree corresponds to a regular grammar, then you're in luck because equivalence of regular grammars is decidable. Does that make sense?

Yeah, that makes sense. And off the top of my head, it seems like they would indeed be regular grammars - each node in the tree would be a state in the finite state machine, and then copies of the tree would produce loops in the state transition graph. Symbols on the edges would be the argument names (or indices) for the inputs to atomic operations. Still a few i's to dot and t's to cross, but I think it works.

Elegant, too. Nice solution!

I'm actually not sure it's a regular grammar. Consider this program:

f(n) := n+f(n-1)

Which gives the tree

n+(n-1)+((n-1)-1)+...

The path from any 1 to the root contains a bunch of minuses, then at least as many pluses. That's not regular.

So it's probably some other kind of grammar, and I don't know if it has decidable equivalence.

Just to make sure I understand, the first few expansions of the second one are:

  • f(n)
  • f(n+1)
  • f((n+1) + 1)
  • f(((n+1) + 1) + 1)
  • f((((n+1) + 1) + 1) + 1)

Is that right? If so, wouldn't the infinite expansion look like f((((...) + 1) + 1) + 1) instead of what you wrote?

Yes, that's correct. I'd view "f((((...) + 1) + 1) + 1)" as an equivalent way of writing it as a string (along with the definition of f as f(n) = f(n + 1)). "...(((((...) + 1) + 1) + 1) + 1)..." just emphasizes that the expression tree does not have a root - it goes to infinity in both directions. By contrast, the expression tree for f(n) = f(n) + 1 does have a root; it would expand to (((((...) + 1) + 1) + 1) + 1).

Does that make sense?

I think that makes sense, thanks.