I remember being handed this back when I was taking numerical analysis for the first time. It's an old document, but still useful.<p>IMO the critical pieces of CG that make it a favorable choice for many problems in scientific computing are<p>1) the fact that it can be performed matrix-free, i.e. you only ever need a routine that applies the operator to a vector (see the sketch below)<p>2) its rapid convergence on operators with clustered eigenvalues (useful for low-rank structures)<p>That being said, practically speaking, even if I know my operator is positive semi-definite, I often find MINRES outperforming CG. There's a nice paper comparing the two, "CG versus MINRES: An Empirical Comparison".
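For anyone wondering what "matrix-free" looks like concretely, here's a minimal sketch (my own illustration, not from the paper): CG only ever touches A through matrix-vector products, so it suffices to pass in a function that applies the operator.

    import numpy as np

    def cg(apply_A, b, tol=1e-10, max_iter=1000):
        """Solve A x = b for SPD A, given only a function x -> A x."""
        x = np.zeros_like(b)
        r = b - apply_A(x)          # residual
        p = r.copy()                # search direction
        rs = r @ r
        for _ in range(max_iter):
            Ap = apply_A(p)
            alpha = rs / (p @ Ap)   # step length along p
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p   # A-conjugate direction update
            rs = rs_new
        return x

    # Example: 1D Laplacian applied as a stencil; the matrix is never formed.
    def apply_laplacian(x):
        y = 2.0 * x
        y[1:] -= x[:-1]
        y[:-1] -= x[1:]
        return y

    b = np.ones(100)
    x = cg(apply_laplacian, b)
    print(np.linalg.norm(apply_laplacian(x) - b))  # small residual

For large PDE discretizations this is the whole appeal: applying the stencil is O(n), while storing the dense matrix would be O(n^2).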
Shewchuk’s work on mesh generation is nothing short of a masterpiece. I will always direct people to the source of his Triangle code as an example of what good, literate C code should look like. His Berkeley page is here: <a href="https://people.eecs.berkeley.edu/~jrs/" rel="nofollow noreferrer">https://people.eecs.berkeley.edu/~jrs/</a>
I cited this in my doctoral thesis. I'm not sure the title is accurate though. It doesn't manage to remove the "this bit is magic, just do the exact incantations and it'll work out" feel from it.
Important to note that this method only works on Hermitian (in the real case, symmetric) positive-definite matrices, both of which are often pretty big qualifiers.
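If you're unsure whether your matrix qualifies, a cheap practical check (a sketch of my own, real symmetric case only) is to test symmetry explicitly and then attempt a Cholesky factorization, which succeeds exactly when a symmetric matrix is positive-definite:

    import numpy as np

    def is_spd(A, tol=1e-12):
        """Check the two CG prerequisites: symmetry and positive-definiteness."""
        if not np.allclose(A, A.T, atol=tol):  # cholesky ignores the upper triangle, so check symmetry ourselves
            return False
        try:
            np.linalg.cholesky(A)  # raises LinAlgError iff A is not positive-definite
            return True
        except np.linalg.LinAlgError:
            return False

    A_good = np.array([[4.0, 1.0], [1.0, 3.0]])  # SPD: CG is safe
    A_bad  = np.array([[0.0, 1.0], [1.0, 0.0]])  # symmetric but indefinite: MINRES territory
    print(is_spd(A_good), is_spd(A_bad))         # True False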
Thank you! I first came across Conjugate Gradient when reviewing the paper on Neural Ordinary Differential Equations. It was quite challenging to parse that math. This helps.