I am not convinced it is ELI5. I am too lazy to write a blog post with illustrations, but for audio signals, which I am more familiar with, the intuition behind DCT (and MDCT, used e.g. for MP3) is straightforward.<p>Assuming you understand that a Fourier transform is an operation that goes from the time domain to the frequency domain, the problem solved by DCT, DST, etc. is related to the fact that <i>digital</i> signals are finite, and without any care, you introduce 'irregularities' if you use a 'normal' Fourier transform.<p>So the main idea of DCT/DST/etc. is to implicitly 'copy' and/or mirror the signal, to reduce the artefacts/irregularities introduced by the Fourier transform. Reducing irregularities intuitively leads to more regular signals, and the more regular your signal, the quicker the high-frequency coefficients decrease, which is the compression effect of DCT.<p>More mathematically, but still very informally: DCT/DST is about boundary conditions. The DFT (the 'normal' Fourier transform for digital signals) implicitly treats the finite signal as one period of a periodic signal, so unless the last sample happens to match the first, you get a discontinuity at the boundaries. For continuous-time signals, an intuitive way to define regularity is to count how many derivatives a function f(t) has: if the first n derivatives of f exist and are integrable, then w^n F(w) -> 0 as w -> inf, where F is the Fourier transform of f. In other words, the more regular the function, the faster its Fourier transform decays.<p>The DST/DCT, by mirroring/copying the signal, reduce irregularities, and hence their coefficients decrease faster.
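<p>If you want to see the effect rather than take my word for it, here is a minimal sketch in plain Python. It is only an illustration under a few assumptions: I use the DCT-II (the variant usually meant by 'the DCT'), a ramp as the test signal (its periodic copy has a big jump at the boundary, its mirrored copy does not), and naive O(N^2) transforms. The helper names dft, dct2 and coeffs_needed are made up for this example and do not come from any library. It counts how many of the largest coefficients are needed to hold 99% of the signal energy under each transform.

```python
import cmath
import math


def dft(x):
    """Orthonormal DFT, computed naively in O(N^2) (illustrative helper)."""
    n = len(x)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dct2(x):
    """Orthonormal DCT-II, computed naively in O(N^2) (illustrative helper)."""
    n = len(x)
    out = []
    for k in range(n):
        # The k = 0 term gets a smaller scale so that the transform is orthonormal.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(
            scale * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        )
    return out


def coeffs_needed(coeffs, fraction=0.99):
    """Smallest number of largest-magnitude coefficients holding `fraction` of the energy."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    target = fraction * sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= target:
            return count
    return len(energies)


# Assumed test signal: a ramp, which jumps when repeated but not when mirrored.
N = 64
ramp = [t / N for t in range(N)]

dft_count = coeffs_needed(dft(ramp))
dct_count = coeffs_needed(dct2(ramp))
print("coefficients for 99% of the energy -> DFT:", dft_count, "DCT-II:", dct_count)
```

Both transforms are scaled to be orthonormal so that the total energy is the same on both sides and the counts are comparable. On this ramp the DCT-II should need noticeably fewer coefficients than the DFT, which is the energy compaction that compression relies on; a signal that already starts and ends at the same value would show a much smaller gap.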