So the point of this article is that <i>differentiation</i> is linear. That is, the operator D which takes f to df/dx is linear. The author points out that one can write this down as a matrix with respect to a basis of polynomials, which works for suitably well-behaved functions and I think is nice for understanding. Other operators one might look at are integration, Fourier or Laplace transforms, or more exotic integral transforms, all of which are linear. One can view the Fourier transform as a change of basis.<p>In another sense, <i>derivatives</i> themselves are linear: for a function f: U -> V between vector spaces, the derivative (at a point) is a linear map from U to V (i.e. the derivative of the function is itself a function Df: U -> L(U, V)), and this extends the concept of derivative to multiple dimensions as f(x+h) = f(x) + (Df)(x)h + o(|h|).<p>This works fine for first derivatives but can become unwieldy for higher derivatives, which become tensors of higher rank.<p>Another question one might ask on learning that differentiation is a linear operator is what its eigenfunctions and eigenvalues are. For differentiation the eigenfunctions are those of the form f(x) = exp(ax), with eigenvalue a. One can ask the same of other linear differential operators, and from this you get Sturm–Liouville theory, which is fantastic.<p>One final note is that much of this multidimensional-derivative and tensor stuff becomes a lot easier if one learns suffix notation (aka Einstein notation, aka index notation, aka summation convention), as well as perhaps a few identities with the Kronecker delta or Levi-Civita symbol. The notation can break down a bit for tensors of arbitrary rank: $a_{i_1,\ldots,i_k}$ becomes unwieldy, though writing $a_{pq\ldots r}$ is fine.
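<p>To make the first point concrete, here is a minimal NumPy sketch (my own illustration, not from the article) of D written as a matrix with respect to the monomial basis {1, x, x^2, x^3}:

    import numpy as np

    # A polynomial is its coefficient vector c, with p(x) = sum_k c[k] * x**k.
    # Differentiation sends x^k to k*x^(k-1), so D has k in entry (k-1, k).
    n = 4
    D = np.zeros((n, n))
    for k in range(1, n):
        D[k - 1, k] = k

    # p(x) = 3 + 2x + 5x^3  ->  p'(x) = 2 + 15x^2
    c = np.array([3.0, 2.0, 0.0, 5.0])
    print(D @ c)        # [ 2.  0. 15.  0.]
    print(D @ D @ c)    # second derivative: [ 0. 30.  0.  0.]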
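<p>Likewise, a quick numerical sketch of the second point, the derivative at a point as a linear map (the Jacobian); the function f here is just a hand-picked example of my own:

    import numpy as np

    # f: R^2 -> R^2; its derivative at v is the linear map given by the Jacobian,
    # in the sense that f(v + h) = f(v) + J(v) h + o(|h|).
    def f(v):
        x, y = v
        return np.array([x * y, np.sin(x) + y**2])

    def J(v):  # Jacobian worked out by hand
        x, y = v
        return np.array([[y, x],
                         [np.cos(x), 2 * y]])

    x0 = np.array([1.0, 2.0])
    h = 1e-5 * np.array([0.3, -0.7])
    lhs = f(x0 + h) - f(x0)
    rhs = J(x0) @ h
    print(np.linalg.norm(lhs - rhs))   # ~1e-10 or smaller, i.e. o(|h|)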
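<p>And on the last point, suffix notation maps almost verbatim onto np.einsum, which I find a good way to practise it (again just my own sketch):

    import numpy as np

    # Repeated indices are summed, exactly as in the summation convention.
    A = np.random.rand(3, 3)
    x = np.random.rand(3)

    y = np.einsum('ij,j->i', A, x)   # y_i = A_ij x_j   (matrix-vector product)
    t = np.einsum('ii->', A)         # A_ii             (trace)

    delta = np.eye(3)                # Kronecker delta, delta_ij
    # delta_ij x_j = x_i
    assert np.allclose(np.einsum('ij,j->i', delta, x), x)
    assert np.allclose(y, A @ x) and np.isclose(t, np.trace(A))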