This doesn't strike directly at the sampling question, but it is related to several of your ideas about incorporating the differentiable function: Neural Ordinary Differential Equations.
This is being exploited most heavily in the Julia community. The broader pitch is that they have formalized the relationship between differential equations and neural networks. This allows things like:
- applying differential equation tricks to computing the outputs of neural networks
- using neural networks to solve pieces of differential equations
- using differential equations to specify the weighting of information
The last one is the most intriguing to me, mostly because it solves the problem of machine learning models having to start from scratch even in environments where information about the environment's structure is known. For example, you can provide it with Maxwell's Equations and then it "knows" electromagnetism.
There is a blog post about the paper and using it with the DifferentialEquations.jl and Flux.jl libraries. There is also a good talk by Christopher Rackauckas about the approach.
It is mostly about using ML in the physical sciences, which seems to be going by the name Scientific ML now.