It's interesting that so many papers are published in this space, yet they all tend to use the same few images (the webcam chessboard, the two buildings, and the diagrams). I've recently been looking into how to reliably stitch video frames into "orthographs" (aerial surveys face a similar problem, and there is a lot of good work on this from the drone community). I've read probably two dozen recent papers spanning homography, photogrammetry, feature detection, SfM, NeRF, and segmentation, and most of them reuse these diagrams and at least some of these images.<p>Maybe the world would benefit from some more well-documented, openly licensed training/validation data?