From Anthropic Claude Opus:<p>Imagine you have a picture book, and each page shows a different scene with various characters and objects. Now, think of a smart robot that can look at these pages and understand what's in them.<p>Apple's scientists have created a robot called ReALM that does something similar but with computer screens. When ReALM looks at a screen, it doesn't treat it as a picture. Instead, it reads the screen like a book, identifying all the different things on the screen and where they are located.<p>ReALM then writes down what it sees in a special way, kind of like making a list of everything on the screen and giving each item a specific place. This helps ReALM understand the screen's content and how it's organized.<p>By doing this and with some extra training, ReALM has become really good at a task called "reference resolution." This means that when you ask ReALM about something specific on the screen, it can work out which item you mean and point to it, and according to Apple's scientists it does this even better than other smart robots like GPT-4.<p>In short, ReALM is like a super-smart robot that can read computer screens, make a list of what it sees, and help you find exactly the thing you're asking about!
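<p>For the curious, the "make a list and give each item a place" step can be sketched in a few lines of Python. This is only a rough illustration of the idea, not Apple's code: ScreenItem, screen_to_text, and row_margin are made-up names, and the rule used here (items at roughly the same height share a line, read left to right) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class ScreenItem:
    """One thing visible on the screen (hypothetical structure for this sketch)."""
    text: str
    x: float  # horizontal centre of the item
    y: float  # vertical centre of the item


def screen_to_text(items, row_margin=10.0):
    """Write the items out top-to-bottom, left-to-right, one screen row per line."""
    rows = []
    for item in sorted(items, key=lambda i: i.y):
        # An item joins the current row if it sits at about the same height
        # as the first item placed in that row; otherwise it starts a new row.
        if rows and abs(item.y - rows[-1][0].y) <= row_margin:
            rows[-1].append(item)
        else:
            rows.append([item])

    lines = []
    for row in rows:
        row.sort(key=lambda i: i.x)  # left to right within a row
        lines.append("\t".join(i.text for i in row))
    return "\n".join(lines)


items = [
    ScreenItem("Call", x=200, y=52),
    ScreenItem("Pharmacy", x=20, y=50),
    ScreenItem("555-0100", x=20, y=90),
]
screen_text = screen_to_text(items)
```

<p>Here screen_text puts "Pharmacy" and "Call" on the first line and "555-0100" on the second, so the layout of the screen survives as plain text that a language model can read.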