This post is about the following type signature, which I call the type of agency: (A→B)→A. You can also think of it as consequentialism or doing things on purpose. This post will be a rant with a bunch of random thoughts related to this type signature, and it will likely not make sense. It will also be sloppy and will have type errors, but I think it is worth posting anyway.
First, interpret these arrows as causal arrows, but you can also think of them as function arrows. This is saying that the causal relationship from A to B causes A to happen. Think of A as an action and B as the goal. The reason that A happens is the fact that it has B as a consequence. There are not normally exponential objects like this in Bayes' nets, but I think you can modify them so that it makes sense. (I'm not sure that this works, but you take a Cartesian closed category whose objects are generated by the nodes in your Bayes' net, and add a small number of morphisms from product nodes to individual nodes, corresponding to the functions in the Bayes' net. The acyclicity of the Bayes' net roughly corresponds to this category being thin. Then you can consider adding other types of morphisms that keep the category thin.)
Suppose you have a game between two agents with action nodes A1 and A2, and utilities U1 and U2. The game implements a pair of functions A1×A2→U1 and A1×A2→U2. We can curry these functions and think of them as A2→(A1→U1) and A1→(A2→U2). Bringing in the agency (Ai→Ui)→Ai of both players leads to a cycle. This cycle does not make sense unless the agency arrows are lossy in some way, so that they cannot create a contradiction.
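To make the cycle concrete, here is a minimal Python sketch. It assumes matching pennies as the game and an exact argmax as each player's agency arrow; both choices, and all the names, are illustrative assumptions rather than anything fixed by the argument above.

```python
from typing import Callable, List, Tuple

Action = str
ACTIONS: List[Action] = ["H", "T"]


def u1(a1: Action, a2: Action) -> float:
    """Player 1 wants the coins to match (hypothetical payoff)."""
    return 1.0 if a1 == a2 else -1.0


def u2(a1: Action, a2: Action) -> float:
    """Player 2 wants the coins to differ (zero-sum with player 1)."""
    return -u1(a1, a2)


def agent(consequences: Callable[[Action], float]) -> Action:
    """An agency arrow (A -> U) -> A, here taken to be an exact argmax."""
    return max(ACTIONS, key=consequences)


def best_response_1(a2: Action) -> Action:
    # Curried game A2 -> (A1 -> U1), fed into player 1's agency arrow.
    return agent(lambda a1: u1(a1, a2))


def best_response_2(a1: Action) -> Action:
    # Curried game A1 -> (A2 -> U2), fed into player 2's agency arrow.
    return agent(lambda a2: u2(a1, a2))


def pure_fixed_points() -> List[Tuple[Action, Action]]:
    """Action pairs that are consistent with both agency arrows at once."""
    return [
        (a1, a2)
        for a1 in ACTIONS
        for a2 in ACTIONS
        if best_response_1(a2) == a1 and best_response_2(a1) == a2
    ]
```

In this particular game, no pair of actions is consistent with both exact argmax arrows at once, which is one way to read the claim that the cycle does not make sense unless the arrows are lossy.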
Fortunately, there is another reason to think that these agency arrows will be lossy. Lawvere's Fixed Point Theorem says that in a Cartesian closed category, unless B has the fixed point property, you cannot have a surjective function A→(A→B). In Set, this says that if B has more than one element, you cannot have an injection (A→B)→A. That is, the agency arrows have to be lossy.
Also, notice that Argmax takes in a function f from some set A to R and returns an element of the domain, A, so Argmax has type (A→R)→A.
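As a small illustration, Argmax over a finite set can be sketched in Python as follows. The helper name `argmax_over` and the example domain are assumptions made for this sketch; note that the selector has to be handed the elements of A up front in order to return one.

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def argmax_over(domain: Sequence[A]) -> Callable[[Callable[[A], float]], A]:
    """Build a function of type (A -> R) -> A for a fixed finite domain A."""

    def select(f: Callable[[A], float]) -> A:
        # Return the first element of the domain at which f is largest.
        return max(domain, key=f)

    return select


# Usage: pick the point of a small grid that maximises a concave function.
choose = argmax_over([-2, -1, 0, 1, 2])
best = choose(lambda x: -(x - 1) ** 2)
```

Here `choose` takes a function from the grid to the reals and hands back a grid point, which is the (A→R)→A shape.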
This one is a bit more of a stretch, but if you look at gradient descent, you have some space X and a function f:X→R. The gradient can be thought of as a function from infinitesimal changes in X to infinitesimal changes in f(X). Gradient descent works by converting this gradient into a change in X. That is, gradient descent looks kind of like (∂X→∂R)→∂X.
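One way to read that type literally, as a rough Python sketch: take the differential of f at a point (a linear map from small changes in X to small changes in R) and return a small change in X. The finite-dimensional setting, the step size `rate`, and the example function are all assumptions of this sketch.

```python
from typing import Callable, List

Vec = List[float]


def descent_step(
    differential: Callable[[Vec], float], dim: int, rate: float = 0.1
) -> Vec:
    """Turn a map (dX -> dR) into a dX: a step against the gradient."""
    gradient = []
    for i in range(dim):
        # Probe the differential along the i-th basis direction.
        basis = [0.0] * dim
        basis[i] = 1.0
        gradient.append(differential(basis))
    return [-rate * g for g in gradient]


def differential_of_squared_norm(x: Vec) -> Callable[[Vec], float]:
    """Differential of f(x) = sum(x_i ** 2) at x: dx -> sum(2 * x_i * dx_i)."""
    return lambda dx: sum(2.0 * xi * dxi for xi, dxi in zip(x, dx))


# Usage: one step from the point (1, -2) with step size 0.25.
step = descent_step(differential_of_squared_norm([1.0, -2.0]), dim=2, rate=0.25)
```

The step points back toward the origin, the minimum of this example function.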
It's interesting to notice that there's nothing with that type on Hoogle (the Haskell API search engine), so it's not the type of any common utility.
On the other hand, you can still say quite a bit about functions of that type, drawing from type theory and set theory.
First, let's name a generic function with that type k:(A→B)→A. It's possible to show that k cannot be parametric in both types. If it were, instantiating both at the empty type 0 would give a function (0→0)→0, which is absurd: 0→0 has an element (the identity), but 0 has none. It's also possible to show that if k is not parametric in one type, it must have access to at least one element of that type (think about (A→0)→A and (0→B)→0).
A simple cardinality argument also shows that k must be many-to-one (that is, non-injective): whenever B has at least two elements, |B^A| > |A|. The exceptions are degenerate cases such as B being 1 (the one-element type).
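For small finite sets the counting can be checked by brute force. The following Python sketch represents a function A→B as a tuple of outputs; the representation and the function names are assumptions of this sketch.

```python
from itertools import product


def count_functions(a_size: int, b_size: int) -> int:
    """Count functions A -> B by enumerating every tuple of outputs."""
    return sum(1 for _ in product(range(b_size), repeat=a_size))


def injection_possible(a_size: int, b_size: int) -> bool:
    """For finite sets, an injection (A -> B) -> A exists iff |B^A| <= |A|."""
    return count_functions(a_size, b_size) <= a_size


# Usage: with at least two elements in B, no injective k exists.
blocked = [
    (a, b)
    for a in range(0, 5)
    for b in range(2, 4)
    if not injection_possible(a, b)
]
```

Every pair tried with |B| ≥ 2 ends up in `blocked`, while a one-element B with a non-empty A leaves room for an injective (and rather uninteresting) k.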
There is an interesting operator that uses k, which I call interleave:
It's interesting because partially applying interleave to some k has the type (A→B)→B, which is the type of continuations, and I suspect that this is what underlies the common usage of such operators.
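The definition of interleave is not reproduced here. One candidate that is consistent with the stated type, offered only as a guess, is to apply f to the element that k selects. A Python sketch under that assumption:

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def interleave(k: Callable[[Callable[[A], B]], A]) -> Callable[[Callable[[A], B]], B]:
    """Assumed definition: interleave k f = f (k f)."""

    def quantifier(f: Callable[[A], B]) -> B:
        # Ask k which element it would pick for f, then report f's value there.
        return f(k(f))

    return quantifier


# Usage: if k is an argmax over a small grid, interleave(k) behaves like max.
grid = [-2, -1, 0, 1, 2]


def argmax_on_grid(f: Callable[[int], float]) -> int:
    return max(grid, key=f)


max_on_grid = interleave(argmax_on_grid)
peak_value = max_on_grid(lambda x: -(x - 1) ** 2)
```

Under this reading, partially applying interleave to an argmax gives back the corresponding max, which has the (A→B)→B shape mentioned above.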
I found a paper about this exact sort of thing. Escardó and Oliva call that type signature a "selection functional", and the type signature (A→B)→B is called a "quantification functional", and there are several interesting things you can do with them, like combining multiple selection functionals into one in a way that looks reminiscent of game theory. (That is, if ϵ has type signature (A→C)→A, and δ has type signature (B→C)→B, then ϵ⊗δ has type signature ((A×B)→C)→(A×B).)
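As a hedged sketch of how such a product can work: the second selector picks b in response to each candidate a, and the first selector picks a while anticipating that response. The Python below is written from memory of that construction rather than copied from the paper, and the payoff table and helper names are made up for illustration.

```python
from typing import Callable, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def argmax_over(domain: Sequence[A]) -> Callable[[Callable[[A], float]], A]:
    return lambda p: max(domain, key=p)


def argmin_over(domain: Sequence[A]) -> Callable[[Callable[[A], float]], A]:
    return lambda p: min(domain, key=p)


def product_selection(
    eps: Callable[[Callable[[A], C]], A],
    delta: Callable[[Callable[[B], C]], B],
) -> Callable[[Callable[[A, B], C]], Tuple[A, B]]:
    """Combine selectors for A and B into one selector for pairs (A x B)."""

    def select(p: Callable[[A, B], C]) -> Tuple[A, B]:
        def reply(a: A) -> B:
            # What delta would choose if the first component were a.
            return delta(lambda b: p(a, b))

        # eps chooses a, evaluating each candidate against delta's reply.
        a = eps(lambda x: p(x, reply(x)))
        return (a, reply(a))

    return select


# Usage: a hypothetical two-move game, first mover maximises, second minimises.
PAYOFF = [[3, 1, 4], [2, 2, 5], [0, 6, 7]]
moves = [0, 1, 2]
play = product_selection(argmax_over(moves), argmin_over(moves))
outcome = play(lambda a, b: PAYOFF[a][b])
```

With an argmax for one player and an argmin for the other, the combined selector carries out backward induction on this small table, which is where the game-theoretic flavour comes from.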
I think the gradient descent bit is spot on. That also looks like the flavour of natural selection, with non-infinitesimal (but really small) deltas. Natural selection consumes a proof that a particular δx (mutation) produces δf (fitness) to generate/propagate/multiply δx.
I recently did some thinking about this and found an equivalence proof under certain conditions for the natural selection case and the gradient descent case.
In general, I think the type signature here can indeed be soft or fuzzy or lossy and you still get consequentialism, and the 'better' the fidelity, the 'better' the consequentialism.
This post has also inspired some further thinking and conversations and refinement about the type of agency/consequentialism which I'm hoping to write up soon. A succinct intuitionistic-logic-flavoured summary is something like (∃A.A→B)→A but there's obviously more to it than that.