Apple Researchers Reveal New AI System That Can Beat GPT-4

12 points by cyclecount about 1 year ago

4 comments

hackernoteng, about 1 year ago
Apple needs to go back to the Steve Jobs era where they just released a final product with no rumors. Just "here it is" on stage, and available that day in the store. They need to replace Siri ASAP with a real AI LLM.
potatoman22, about 1 year ago
It beats GPT-4... at reference resolution.

"ReALM reconstructs the visual layout of a screen using textual representations. This involves parsing on-screen entities and their locations to generate a textual format that captures the screen's content and structure. Apple researchers found that this strategy, combined with specific fine-tuning of language models for reference resolution tasks, significantly outperforms traditional methods, including the capabilities of OpenAI's GPT-4."
jaggs, about 1 year ago
From Anthropic Claude Opus:

Imagine you have a picture book, and each page shows a different scene with various characters and objects. Now, think of a smart robot that can look at these pages and understand what's in them.

Apple's scientists have created a robot called ReALM that does something similar but with computer screens. When ReALM looks at a screen, it doesn't just see an image. Instead, it reads the screen like a book, identifying all the different things on the screen and where they are located.

ReALM then writes down what it sees in a special way, kind of like making a list of everything on the screen and giving each item a specific place. This helps ReALM understand the screen's content and how it's organized.

By doing this and with some extra training, ReALM has become really good at a task called "reference resolution." This means that when you ask ReALM about something specific on the screen, it can quickly find and point out what you're asking for, even better than other smart robots like GPT-4.

In short, ReALM is like a super-smart robot that can read computer screens, make a list of what it sees, and help you find things on the screen faster and better than ever before!
sharemywin, about 1 year ago
This seems like the last thing a walled garden would want to exist. An agent that could interact with a screen for you.