Using new for each node and value, combined with virtual dispatch, tends to be a C++ anti-pattern. It looks like you are writing another language in C++ syntax, motivated by the promise of speed.<p>The actual benefits of C++ come when you approach problems differently. This is a case where more exposure to C helps you avoid the Java-isms.<p>Things to consider:<p>- Can you allocate memory for the whole system at once?
- Can you make types homogeneous so they fit in tight arrays? (Unions are common for nodes.)
- Can you batch similar types?
- Especially for autodiff/math, can you represent operations as a stack instead of a tree?<p>I am only bringing this up because you said your goal was to learn C++.
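<p>To make the points above concrete, here is a minimal sketch of what a stack (or "tape") of homogeneous nodes could look like for reverse-mode autodiff. Every name in it (Op, Node, Tape, leaf, add, mul, sine, backward) is hypothetical and not taken from the linked repository, and it uses a plain tagged struct rather than a union to keep it short. Treat it as one possible shape under those assumptions, not the definitive way to do it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; an illustration, not the repository's design.
enum class Op : std::uint8_t { Leaf, Add, Mul, Sin };

// One fixed-size record per operation, so the whole graph is a flat array.
// Operands are referenced by index instead of by pointer.
struct Node {
    Op op;
    std::uint32_t lhs;  // index of first operand (unused for Leaf)
    std::uint32_t rhs;  // index of second operand (unused for Leaf and Sin)
    double value;       // result computed during the forward pass
};

struct Tape {
    std::vector<Node> nodes;

    // Reserve storage once instead of calling new per node.
    explicit Tape(std::size_t capacity) { nodes.reserve(capacity); }

    std::uint32_t push(Op op, std::uint32_t lhs, std::uint32_t rhs, double value) {
        nodes.push_back(Node{op, lhs, rhs, value});
        return static_cast<std::uint32_t>(nodes.size() - 1);
    }

    std::uint32_t leaf(double v) { return push(Op::Leaf, 0, 0, v); }

    std::uint32_t add(std::uint32_t a, std::uint32_t b) {
        return push(Op::Add, a, b, nodes[a].value + nodes[b].value);
    }

    std::uint32_t mul(std::uint32_t a, std::uint32_t b) {
        return push(Op::Mul, a, b, nodes[a].value * nodes[b].value);
    }

    std::uint32_t sine(std::uint32_t a) {
        return push(Op::Sin, a, a, std::sin(nodes[a].value));
    }

    // Reverse sweep: operands always have smaller indices than their result,
    // so walking the array backwards visits nodes in a valid order.
    // A switch on the op tag replaces virtual dispatch.
    std::vector<double> backward(std::uint32_t output) const {
        assert(output < nodes.size());
        std::vector<double> grad(nodes.size(), 0.0);
        grad[output] = 1.0;
        for (std::size_t i = static_cast<std::size_t>(output) + 1; i-- > 0;) {
            const Node& n = nodes[i];
            const double g = grad[i];
            switch (n.op) {
                case Op::Leaf:
                    break;
                case Op::Add:
                    grad[n.lhs] += g;
                    grad[n.rhs] += g;
                    break;
                case Op::Mul:
                    grad[n.lhs] += g * nodes[n.rhs].value;
                    grad[n.rhs] += g * nodes[n.lhs].value;
                    break;
                case Op::Sin:
                    grad[n.lhs] += g * std::cos(nodes[n.lhs].value);
                    break;
            }
        }
        return grad;
    }
};
```

With this you would build f = x*y + sin(x) by calling leaf for x and y, then add(mul(x, y), sine(x)). Calling backward on the result returns one gradient entry per node, so the entry at x's index holds y + cos(x) and the entry at y's index holds x. The single contiguous vector is also the kind of layout that is easy to copy to a GPU in one transfer, although this sketch does not attempt that.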
The actual C++/CUDA code is here:<p><a href="https://gitlab.com/mebassett/quixotic-learning/-/tree/master/silly_autodiff" rel="nofollow">https://gitlab.com/mebassett/quixotic-learning/-/tree/master...</a><p>It is about 1,000 LoC overall.