科技回声 (Tech Echo)

6 comments

abrichr, about 1 year ago

Thank you for making this available!

Check out https://github.com/OpenAdaptAI/OpenAdapt for a cross-platform (Mac and Windows) open source library that learns to perform tasks in desktop apps by observing human demonstrations.

We believe a major shortcoming of conventional approaches to AI agents is expecting them to be able to figure out tasks of arbitrary complexity from first principles. While understandable from an academic perspective, this is unnecessary for practical utility, since humans perform these tasks constantly.

With OpenAdapt you can demonstrate to a model how to perform a task, then have it take over the task, with additional user-supplied natural language instructions.

I have created an issue to evaluate OpenAdapt on OSWorld: https://github.com/OpenAdaptAI/OpenAdapt/issues/642. Contributions welcome!

Edit: from https://github.com/xlang-ai/OSWorld/tree/main/evaluation_examples:

> The ./trajectories file contains the annotated trajectories for each data item in ./examples for finishing the task.

Unfortunately this file does not appear to be included in the repo. I have submitted an issue here: https://github.com/xlang-ai/OSWorld/issues/30

ec109685, about 1 year ago

Buried in their presentation is how effective current agents are at completing desktop computing tasks. Humans complete the given tasks at 70%+ effectiveness, while the best model (GPT-4V) is at 12%. Most of the other models were <5% effective.

TheRoque, about 1 year ago

Gotta love people working on replacing themselves. Jokes aside, seeing an AI interacting with a computer is kind of scary. It's not just outputting text anymore, it's doing the full work of a human working on a computer, meaning... a ton of people

stavros, about 1 year ago

I built a small Python script so I could let GPT-4 debug my system issues: https://github.com/skorokithakis/sysaidmin

It works surprisingly well!

bitwize, about 1 year ago

Coming soon: Human-trained AI that can actuate a robotic hand to fill in paper forms with a Selectric typewriter. The doom of us all!

rosslazer, about 1 year ago

Dumb question - What actually needs to be done to close the gap?

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computers