I don't know how the coordinates were defined for the 900 images. They might have just copy-pasted pixel readings from an image program like ImageJ for half an hour. At first I thought they used HTML area tags to define bounding circles or rectangles around each image's coordinate, but in functions.js there's VoronoiGrid.js. I had never heard of that. It partitions the space around the coordinate points so that every image they used gets displayed: each image gets its own "puzzle piece" of pixels it will be retrieved for, and no two images share the same bounded area.
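For anyone else who hadn't met Voronoi diagrams before: a pixel belongs to an image's "puzzle piece" exactly when that image's coordinate is the closest one to it. Here is a rough sketch of that idea as a brute-force nearest-point search. This is not the code or API of VoronoiGrid.js (I haven't read it closely); the `sites` array, the file names, and `nearestSite` are all made up for illustration.

```javascript
// Hypothetical data: each site is the pixel location one image points at.
const sites = [
  { x: 100, y: 80, image: "a.jpg" },
  { x: 400, y: 120, image: "b.jpg" },
  { x: 250, y: 300, image: "c.jpg" },
];

// Return the site closest to (x, y), or null if there are no sites.
// The set of points that map to a given site is that site's Voronoi cell.
function nearestSite(sites, x, y) {
  let best = null;
  let bestDist = Infinity;
  for (const site of sites) {
    const dx = site.x - x;
    const dy = site.y - y;
    const dist = dx * dx + dy * dy; // squared distance is enough for comparing
    if (dist < bestDist) {
      bestDist = dist;
      best = site;
    }
  }
  return best;
}
```

With 900 points a linear scan per mouse move is cheap, so a real implementation may or may not bother precomputing the cells.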
Setup:

1. Define the dimensions of the viewport (640 x 480, for example). This means there are 640 times 480 (= 307,200) pixel locations in the viewport.

2. Procure 640 times 480 (= 307,200) images of people pointing to each of the pixel locations.

3. Save each image along with the (x, y) pixel location it points to in, say, a database.

Run:

1. Get the (x, y) location of the mouse pointer in the viewport.

2. Query the database with the (x, y) coordinates and retrieve the corresponding image.

3. Display the image.
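The setup and run steps could be sketched roughly like this. It is only an illustration: a `Map` stands in for the database, and the `pointing_X_Y.jpg` file names, `buildIndex`, and `imageFor` are invented here.

```javascript
const WIDTH = 640;
const HEIGHT = 480;

// Setup: one entry per pixel location. A Map stands in for the database,
// and the file-name pattern is made up for illustration.
function buildIndex(width, height) {
  const index = new Map();
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      index.set(`${x},${y}`, `pointing_${x}_${y}.jpg`);
    }
  }
  return index;
}

// Run: look up the image stored for the pointer position.
// Returns undefined when (x, y) lies outside the viewport.
function imageFor(index, x, y) {
  return index.get(`${x},${y}`);
}

const index = buildIndex(WIDTH, HEIGHT);
```

In a browser, a `mousemove` handler on the viewport element would pass the event's `offsetX` and `offsetY` to `imageFor` and set the result as the displayed image's source.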