3 points by marsh_mellow 7 months ago

1 comment

distalx 7 months ago

> On one evaluation created to test developers’ attempts to have models use computers, OSWorld, Claude currently gets 14.9%. That’s nowhere near human-level skill (which is generally 70-75%), but it’s far higher than the 7.7% obtained by the next-best AI model in the same category.

Which model does "next-best AI model in the same category" refer to here?

Developing a Computer Use Model