> “It goes beyond image classification — the most popular task in computer vision — and tries to answer one of the most fundamental questions in computer vision: What is the right representation of visual scenes?”<p>Can someone knowledgeable in graphics research explain the context this question comes from?<p>If I am reading it correctly, the question implies that there is a right way to reproduce the visual experience of reality. To me, this sounds like a question that could just as validly have no answer (or many answers), much like questions in aesthetics, art, and philosophy.