This is very interesting. Segmentation is a very hard problem, and the video indicates they have made it fast. 4000 images is a lot, and at 21GB per model the processing overhead has to be massive. 60-100 minutes is also a pretty long time to recover one model, but not crazy, and it will certainly get faster.<p>I don't see any scale reference markers or other reference material in the scene, so I wonder if it recovers scale from light field depth information alone.<p>Interestingly, according to the paper, they aren't actually using a light field camera, but rather building a light field volume from densely sampled video frames and modeling that volume:<p><i>Key to our method is the dense spatio-angular sampling of video, which results in smoothly varying parallax between successive frames</i><p>A hack for sure!<p>I would be curious to know whether a light field camera would even be useful in this scenario, since much of what they do mimics what a light field camera does, only manually.
Fun to see state-of-the-art algorithms juxtaposed with a '90s internet aesthetic (autoplaying audio ;)).<p>Hopefully light field cameras / camera arrays will become more common in the near future.
THIS IS HUGE. They are calculating the missing "dimension" from statistical input. That is amazing. Physics experiments are all about trying to understand the unknown dimension(s) as well! This D to D+1 transformation is likely to be mathematically solvable for any integer D, too.