The flip-and-slide picture that electrical engineers use has always struck me as unintuitive; it makes more sense to think of convolution as a blurring operation.<p>Consider a function f: R^2 -> R (or Z^2 -> Z if you like) that represents a grayscale image. So f(0,0) is the pixel at the origin, f(1,0) is the pixel at (1,0), etc. Think of g: R^2 -> R as a blurring function, e.g. a Gaussian.<p>What convolution does is turn every pixel of f into a copy of the blur g, weighted by that pixel's value and centered on that pixel. So the pixel at the origin gets turned into the blurred image h(x,y) = f(0,0)g(x,y), and the pixel at (1,0) gets turned into the blurred image h(x,y) = f(1,0)g(x-1,y). Note that the subtraction just recenters g so the blur lands in the right position. In general, the pixel at (a,b) gets blurred into the function h(x,y) = f(a,b)g(x-a,y-b).<p>Now sum up all the blurred pixels to get the final image: (f*g)(x,y) = integral_(a,b) f(a,b)g(x-a,y-b).<p>The same thing can be done in the time domain instead of a spatial domain, or you can write it in vector form: (f*g)(x) = integral_(r) f(r)g(x-r).<p>Note that you can also write it in a more symmetric way as (f*g)(c) = integral_(a+b=c) f(a)g(b). This makes it clear that you can do this for any semigroup (when you have a group, you recover the usual definition by noting that b=c-a), makes commutativity obvious, and makes it clear why polynomial multiplication is convolution of coefficients: the x^n term comes from summing all the products x^i*x^j with i+j=n, so the coefficient of x^n is the sum of the products of the coefficients indexed by i,j with i+j=n, which is exactly the symmetric form of convolution.
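<p>Here is a minimal Python sketch of the symmetric form in the discrete case. It assumes finitely supported functions on Z stored as dicts from index to value; the name convolve and the dict representation are illustrative choices, not taken from any library. Each sample f[a] contributes a copy of g scaled by f[a] and shifted by a, which is the "blur every pixel, then sum" picture in one dimension.

```python
from collections import defaultdict


def convolve(f, g):
    """Discrete convolution in the symmetric form.

    f and g are dicts mapping an integer index to a value (anything not
    listed is treated as zero). The result h satisfies
    h[c] = sum of f[a] * g[b] over all pairs (a, b) with a + b == c.
    """
    h = defaultdict(int)
    for a, fa in f.items():
        # The sample at index a becomes a copy of g, weighted by f[a]
        # and recentered at a; all the copies are summed into h.
        for b, gb in g.items():
            h[a + b] += fa * gb
    return dict(h)


# Blurring: two "pixels" spread out by a small symmetric kernel.
signal = {0: 10, 1: 20}
kernel = {-1: 0.25, 0: 0.5, 1: 0.25}
blurred = convolve(signal, kernel)

# Polynomial multiplication as convolution of coefficients:
# (1 + 2x)(3 + x^2) = 3 + 6x + x^2 + 2x^3
p = {0: 1, 1: 2}
q = {0: 3, 2: 1}
product = convolve(p, q)
```

Because the inner loop only ever uses a + b, swapping the arguments gives the same result when the index addition is commutative, and the same code works for any index type with an associative +, not just integers.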