Maciej Bendkowski has some related work [1] on generating random lambda terms, but was unable to overcome what he calls the asymptotic sparsity problem:

> Sampling simply-typed terms seems notoriously more challenging than sampling closed ones. Even rejection sampling, whenever applicable, admits serious limitations due to the imminent asymptotic sparsity problem — asymptotically almost no term, be it either plain or closed, is at the same time (simply) typeable. [...] Asymptotic sparsity of simply-typed λ-terms is an impenetrable barrier to rejection sampling techniques. As the term size tends to infinity, so does the induced rejection overhead. In order to postpone this inevitable obstacle, it is possible to use dedicated mechanisms interrupting the sampler as soon as it is clear that the partially generated term cannot be extended to a typeable one. The current state-of-the-art samplers take this approach, combining Boltzmann models with modern logic programming execution engines backed by highly-optimised unification algorithms. Nonetheless, even with these sophisticated optimisations, such samplers are not likely to generate terms of sizes larger than one hundred.
I would be curious to see a more rigorous analysis of the sample complexity of generating well-typed expressions in, e.g., the STLC. Maybe there is a way to avoid or reduce the rejection rate before evaluation.

[1]: https://arxiv.org/pdf/2005.08856
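To make the rejection overhead concrete, here is a minimal sketch of naive rejection sampling for the STLC: generate random closed de Bruijn terms with a simple (biased, not uniform) size-budget generator, then filter on simple typeability via first-order unification. This is my own toy construction, not Bendkowski's sampler; the generator shape, size convention, and parameters are illustrative assumptions, and it needs the `random` package.

```haskell
import Control.Monad (replicateM)
import qualified Data.Map as M
import System.Random (randomIO, randomRIO)

-- Lambda terms with de Bruijn indices; simple types with unification variables.
data Term = Var Int | Lam Term | App Term Term
data Ty   = TVar Int | Arr Ty Ty deriving Eq

-- Random closed term with a budget of n constructors, under `depth` binders.
-- Returns Nothing on a dead end (a leaf position with no variable in scope).
genTerm :: Int -> Int -> IO (Maybe Term)
genTerm n depth
  | n <= 1 = if depth == 0
               then pure Nothing
               else Just . Var <$> randomRIO (0, depth - 1)
  | n == 2 = fmap Lam <$> genTerm 1 (depth + 1)
  | otherwise = do
      lam <- randomIO :: IO Bool
      if lam
        then fmap Lam <$> genTerm (n - 1) (depth + 1)
        else do
          k  <- randomRIO (1, n - 2)         -- split budget across the application
          mf <- genTerm k depth
          mx <- genTerm (n - 1 - k) depth
          pure (App <$> mf <*> mx)

-- First-order unification with an occurs check.
type Subst = M.Map Int Ty

applyS :: Subst -> Ty -> Ty
applyS s t@(TVar v) = maybe t (applyS s) (M.lookup v s)
applyS s (Arr a b)  = Arr (applyS s a) (applyS s b)

unify :: Subst -> Ty -> Ty -> Maybe Subst
unify s a b = case (applyS s a, applyS s b) of
  (TVar v, t)            -> bind v t
  (t, TVar v)            -> bind v t
  (Arr a1 b1, Arr a2 b2) -> do s' <- unify s a1 a2
                               unify s' b1 b2
  where
    bind v t | t == TVar v = Just s
             | occurs v t  = Nothing
             | otherwise   = Just (M.insert v t s)
    occurs v (TVar w)  = v == w
    occurs v (Arr x y) = occurs v x || occurs v y

-- Algorithm-W-style inference specialised to simple types:
-- threads a fresh-variable counter and the current substitution.
infer :: [Ty] -> Term -> Int -> Subst -> Maybe (Ty, Int, Subst)
infer env (Var i)   n s = if i < length env then Just (env !! i, n, s) else Nothing
infer env (Lam b)   n s = do
  let a = TVar n
  (t, n', s') <- infer (a : env) b (n + 1) s
  Just (Arr a t, n', s')
infer env (App f x) n s = do
  (tf, n1, s1) <- infer env f n s
  (tx, n2, s2) <- infer env x n1 s1
  let r = TVar n2
  s3 <- unify s2 tf (Arr tx r)
  Just (r, n2 + 1, s3)

typeable :: Term -> Bool
typeable t = case infer [] t 0 M.empty of
  Just _  -> True
  Nothing -> False

main :: IO ()
main = do
  let size = 30; trials = 10000 :: Int     -- illustrative parameters
  ts <- replicateM trials (genTerm size 0)
  let closed = [t | Just t <- ts]
      typed  = length (filter typeable closed)
  putStrLn ("closed terms: " ++ show (length closed) ++ ", typeable: " ++ show typed)
```

Running this at increasing sizes gives an empirical acceptance curve; if the paper's analysis is right, it should decay rapidly well before size one hundred, which is exactly the sparsity the quoted passage describes. The "dedicated mechanisms" Bendkowski mentions would amount to interleaving the `unify` calls with generation and abandoning a partial term as soon as unification fails, rather than type-checking only after the fact as this sketch does.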