Here's my attempt at explaining this phenomenon:<p>Our retinas don't send raw data to the brain the way a camera sensor would. Instead, the neurons in the retina already do some pre-processing, such as boundary detection and movement detection. The brain then receives signals indicating where movement was detected.<p>This movement detection is basically just detecting changes in light level, so it works better when contrast is high (dark gray vs. white) and worse when contrast is low (dark gray vs. black).<p>So our brain gets stronger "movement" signals where contrast is high, and the high-contrast part appears to be moving faster. Since the image is designed so that the boundaries of the head and body always have opposite contrast (low on one when high on the other), they seem to be moving at different speeds.
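<p>To make the contrast argument concrete, here is a toy sketch in Python. It is not a model of real retinal neurons. It assumes that the "movement" signal is simply the summed absolute change in luminance between two frames. The function name, the luminance values (0.0 for black, 0.2 for dark gray, 1.0 for white), and the 3-pixel object are all made up for illustration.

```python
def motion_signal(object_level, background_level, width=10, shift=1):
    """Toy 'movement' signal: total absolute luminance change per pixel
    when a 3-pixel-wide object moves `shift` pixels over a background."""
    def frame(pos):
        # One row of pixels: background everywhere, object at pos..pos+2
        row = [background_level] * width
        for i in range(pos, pos + 3):
            row[i] = object_level
        return row

    before, after = frame(2), frame(2 + shift)
    return sum(abs(a - b) for a, b in zip(before, after))

# Hypothetical luminance values on a 0..1 scale
DARK_GRAY, WHITE, BLACK = 0.2, 1.0, 0.0

high = motion_signal(DARK_GRAY, WHITE)  # high-contrast boundary
low = motion_signal(DARK_GRAY, BLACK)   # low-contrast boundary
print(high, low)
```

<p>Both objects move by exactly one pixel, yet under this assumption the dark gray object on white produces a larger signal than the same object on black, which is the mismatch that could be read as a difference in speed.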