And, guess what? Since the Galerkin approximation requires choosing a basis appropriate to the problem at hand, there is now a deep learning solution too (training a neural network is essentially learning an adaptive basis).<p>It is called the Deep Galerkin Method [1]. In a nutshell, the method directly minimizes the L2 error of the PDE residual, the boundary conditions, and the initial conditions. The integral in that objective is intractable to compute exactly, though, so it is estimated with a Monte Carlo approximation over randomly sampled points.<p>[1]: <a href="https://arxiv.org/abs/1708.07469" rel="nofollow">https://arxiv.org/abs/1708.07469</a>
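<p>To make that loss concrete, here is a rough sketch of a DGM-style objective (my own toy example in JAX, not the paper's architecture or equations): a small MLP u(t, x) for the 1D heat equation u_t = u_xx on [0,1]x[0,1], with zero boundary values and u(0, x) = sin(pi x). The network sizes, sample counts, and plain SGD step are arbitrary illustrative choices.<p><pre><code>
import jax
import jax.numpy as jnp

def init_params(key, sizes=(2, 32, 32, 1)):
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (m, n)) / jnp.sqrt(m), jnp.zeros(n)))
    return params

def u(params, t, x):
    # Small MLP approximating the solution u(t, x).
    h = jnp.array([t, x])
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[0]

# Derivatives of the network via automatic differentiation.
u_t = jax.grad(u, argnums=1)
u_x = jax.grad(u, argnums=2)
u_xx = jax.grad(u_x, argnums=2)

def loss(params, key):
    k1, k2, k3, k4 = jax.random.split(key, 4)
    # Interior points: heat-equation residual u_t - u_xx should vanish.
    ti = jax.random.uniform(k1, (128,))
    xi = jax.random.uniform(k2, (128,))
    res = jax.vmap(lambda t, x: u_t(params, t, x) - u_xx(params, t, x))(ti, xi)
    # Boundary points: u(t, 0) = u(t, 1) = 0.
    tb = jax.random.uniform(k3, (64,))
    bc = jax.vmap(lambda t: u(params, t, 0.0)**2 + u(params, t, 1.0)**2)(tb)
    # Initial points: u(0, x) = sin(pi * x).
    x0 = jax.random.uniform(k4, (64,))
    ic = jax.vmap(lambda x: (u(params, 0.0, x) - jnp.sin(jnp.pi * x))**2)(x0)
    # Monte Carlo estimate of the combined L2 objective.
    return jnp.mean(res**2) + jnp.mean(bc) + jnp.mean(ic)

# One plain-SGD step as a usage example; a real run would loop this.
params = init_params(jax.random.PRNGKey(0))
grads = jax.grad(loss)(params, jax.random.PRNGKey(1))
params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g, params, grads)
</code></pre><p>The point is that no mesh or hand-picked basis appears anywhere: random collocation points play the role of the quadrature, and the network weights play the role of the basis coefficients.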