In addition to the differential equation, you can also adapt the limit definition of the exponential function, the one familiar from compound interest:<p><pre><code> exp(tA) = lim n->infinity (I+tA/n)^n
         = lim n->infinity (I+tA/n)...(I+tA/n)   (n factors)
</code></pre>
So you can interpret A (or log T) as a direction in which to move away from the identity; exp then composes infinitely many infinitesimal steps away from the identity in that direction.
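To make the limit concrete, here is a minimal sketch in plain Python. The helper names (`matmul`, `exp_by_limit`, `max_abs_error`) are made up for this illustration, and the choice n = 2^k is only there so that the n-fold product can be formed with k squarings. It uses the generator of plane rotations as A, for which exp(tA) is known in closed form, and is not meant as a practical way to compute a matrix exponential.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_by_limit(A, t, doublings):
    """Approximate exp(tA) by (I + tA/n)^n with n = 2**doublings."""
    size = len(A)
    n = 2 ** doublings
    # One small step away from the identity in the direction A.
    step = [[(1.0 if i == j else 0.0) + t * A[i][j] / n for j in range(size)]
            for i in range(size)]
    # Squaring `doublings` times composes the small step with itself n times.
    for _ in range(doublings):
        step = matmul(step, step)
    return step

# A generates rotations of the plane: exp(tA) is the rotation by angle t.
A = [[0.0, -1.0], [1.0, 0.0]]
t = math.pi / 3
exact = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def max_abs_error(doublings):
    """Largest entrywise gap between the n-step product and the exact rotation."""
    approx = exp_by_limit(A, t, doublings)
    return max(abs(approx[i][j] - exact[i][j])
               for i in range(2) for j in range(2))

for k in (4, 8, 16):
    print(f"n = 2^{k}: max error {max_abs_error(k):.2e}")
```

Each factor here is a rotation by a tiny angle combined with a tiny scaling, so the product approaches a pure rotation as n grows; the error shrinks roughly in proportion to 1/n.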
Cool article. Regarding the section "The exponential map and logarithm map": if you're interested in computing the matrix exponential, there is the classic C. Moler, C. Van Loan, "Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later". Also, the series expansion is not necessarily fragile, as long as you don't stop after a fixed number of terms but keep adding terms until their norm drops below some tolerance. Scaling and squaring can be used to keep the argument within a given range of norms (less than 1, say).<p>Regarding Pitfall #3, the interpolation scheme exp(t log(A) + (1-t) log(B)) is a shortest path in a sense, just not with respect to the usual matrix norms. See V. Arsigny et al., "Log-Euclidean metrics for fast and simple calculus on diffusion tensors". I can't help but find it more elegant than exp(log(BA^{-1})t)A, which could just as well have been exp(log(A^{-1}B)t)A, or even A exp(log(A^{-1}B)t), right? It also lifts the "no more than two transforms" restriction, since you can put any convex combination in the exponent: exp(sum_i x_i log(A_i)).
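Both suggestions can be sketched in a few lines of dependency-free Python. Everything here is illustrative: the function names are invented for this example, the tolerance default is arbitrary, and the logarithm is restricted to symmetric positive definite 2x2 matrices (the setting of the Log-Euclidean paper) so that it has a short closed form. For general transforms you would need a general matrix logarithm instead.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm1(X):
    """Induced 1-norm: the largest absolute column sum."""
    n = len(X)
    return max(sum(abs(X[i][j]) for i in range(n)) for j in range(n))

def expm_series(A, tol=1e-16):
    """exp(A) by Taylor series with a tolerance-based stop, plus scaling and squaring."""
    n = len(A)
    # Scaling: choose s so that the 1-norm of A / 2**s is below 1.
    s = 0
    while norm1(A) / 2 ** s >= 1.0:
        s += 1
    B = [[a / 2 ** s for a in row] for row in A]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    k = 0
    # Keep adding B^k / k! until the latest term is below the tolerance.
    while norm1(term) > tol:
        k += 1
        term = [[x / k for x in row] for row in matmul(term, B)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    # Squaring: exp(A) = exp(A / 2**s) raised to the power 2**s.
    for _ in range(s):
        result = matmul(result, result)
    return result

def sym_log_2x2(S):
    """log of a symmetric positive definite 2x2 matrix (assumption: S is SPD)."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean, half_gap = (a + d) / 2, math.hypot((a - d) / 2, b)
    if half_gap == 0.0:  # S is a multiple of the identity
        return [[math.log(mean), 0.0], [0.0, math.log(mean)]]
    lo, hi = mean - half_gap, mean + half_gap  # the two eigenvalues
    # log(S) = c1*S + c0*I, the linear polynomial matching log at lo and hi.
    c1 = (math.log(hi) - math.log(lo)) / (hi - lo)
    c0 = math.log(hi) - c1 * hi
    return [[c1 * a + c0, c1 * b], [c1 * b, c1 * d + c0]]

def log_euclidean_blend(mats, weights):
    """exp(sum_i w_i log(A_i)) for SPD 2x2 matrices and convex weights."""
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for M, w in zip(mats, weights):
        L = sym_log_2x2(M)
        acc = [[acc[i][j] + w * L[i][j] for j in range(2)] for i in range(2)]
    return expm_series(acc)

# Blend three SPD matrices at once; the weights are non-negative and sum to 1.
A1 = [[2.0, 1.0], [1.0, 2.0]]
A2 = [[4.0, 0.0], [0.0, 1.0]]
A3 = [[1.0, 0.0], [0.0, 1.0]]
print(log_euclidean_blend([A1, A2, A3], [0.5, 0.3, 0.2]))
```

Because the blend is a weighted sum of logarithms, the order of the inputs does not matter, which is the symmetry the BA^{-1} versus A^{-1}B variants lack.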
Essentially, you decompose the transformation into (axial translation) + (screw rotation) + (oriented orthogonal stretch), and each of them is interpolated in a straightforward way: the axial part linearly, the screw part by angle, and the stretch exponentially.