Establishing linkages between ML and Differential Geometry is intriguing (to say the least). But I have this nagging sense that "data manifolds" are too rigidly tied to numerical representations for this program to flourish. Differential geometry is all about invariance: geometric objects have a life of their own, so to speak, irrespective of any particular representation. In the broader data science world such internal structure is not accessible in general; the systems modeled are too complex, and their capture in data too superficial, to be a reflection of the "true state". In a sense this is analogous to the blind men touching an elephant in different parts and disagreeing about what it is.
Great read and visuals. I think they typo'd the pun on basically/basisally.

It got me thinking about program synthesis in the following scheme: data is embedded as vectors, and program operations are metric tensors (or maybe just fields in general?) that tell the data how to move. Then, given an input/output pair, we seek some program that moves the data from input to output along a low-energy path. Model a whole program as a time-varying (t ∈ [0, 1]) metric tensor (is that a thing?) and optimize to find such an object. Maybe you choose ahead of time the number of operations you're searching over, treat those as spline basis points, and lerp between the metric tensors of each op; or you do it continuously and then somehow recover the operations afterwards.

Then you want to find one program that satisfies multiple input/output pairs, i.e., one time-varying metric tensor (or, more generally, field) such that if you integrate from the input points, they all end up at (or close to, which makes me think you want some learned metric of closeness) the output points. Right now I'm only thinking of unary ops with no constants; maybe constants could be appended to the input data symbolically, and you also get to optimize that portion of the input vectors, with the constraint that it is a shared parameter across all inputs. A rough sketch of the single-field version is below.
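To make that concrete, here's a minimal, purely illustrative sketch (assuming JAX, and cheating in one place: each op is represented by a plain matrix-valued vector field rather than an honest metric tensor, since that's the easiest way to "tell the data how to move"). K learned matrices act as the spline knots, the field lerps between them over t ∈ [0, 1], every input is integrated through the resulting time-varying flow, and gradient descent pushes the endpoints toward the outputs:

```python
import jax
import jax.numpy as jnp

K, D = 4, 3        # number of ops (spline knots) and embedding dimension
STEPS = 50         # Euler steps over t in [0, 1]

def field(params, t, x):
    # lerp between consecutive op matrices: op k is "active" near t = k/(K-1)
    s = t * (K - 1)
    k = jnp.clip(jnp.floor(s).astype(int), 0, K - 2)
    w = s - k
    A = (1 - w) * params[k] + w * params[k + 1]  # interpolated operator
    return A @ x                                  # velocity of the data point

def run_program(params, x0):
    # integrate dx/dt = A(t) x from the input embedding out to t = 1
    dt = 1.0 / STEPS
    def step(x, i):
        return x + dt * field(params, i * dt, x), None
    x1, _ = jax.lax.scan(step, x0, jnp.arange(STEPS))
    return x1

def loss(params, xs_in, xs_out):
    # one shared time-varying field must carry every input near its output
    preds = jax.vmap(lambda x: run_program(params, x))(xs_in)
    return jnp.mean(jnp.sum((preds - xs_out) ** 2, axis=-1))

params = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (K, D, D))
xs_in = jax.random.normal(jax.random.PRNGKey(1), (8, D))
xs_out = 2.0 * xs_in  # toy target "program": x -> 2x, reachable by a flow

grad_loss = jax.jit(jax.grad(loss))
for _ in range(500):
    params = params - 0.05 * grad_loss(params, xs_in, xs_out)
print(loss(params, xs_in, xs_out))  # should drop toward zero
```

The faithful version would replace `field` with geodesic flow under an actual learned metric, and swap the squared-error `loss` for the learned notion of closeness mentioned above; the lerp-between-knots piece is exactly the "spline basis points" idea.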