Hey HN! We're the team behind steel.dev, and today we're excited to share Surf (<a href="https://surf.new" rel="nofollow">https://surf.new</a>) - an open-source playground where you can test and experiment with different web agents. Think of it as a sandbox where you can watch different AI agents/models interact with the web in real time.
With OpenAI's Operator launch last week, web agents are having a moment. But Operator is closed source, has no API, isn't SOTA, and costs $200 a month, so we felt there was a gap in clearly understanding where the space stands today, both for developers evaluating agents for prod and for end users. The open-source community is actually leading the way in performance (shoutout to Browser-use, currently SOTA on WebVoyager), but testing different approaches takes a bunch of work, setup, and debugging just to get started.<p>We built Surf to solve this. It's a hosted playground where you can chat with different web agents/models and test automation tasks - basically a "try before you deploy" sandbox for web agents. The most interesting challenge wasn't the agents themselves - it was designing Surf to be general enough for contributors to plug in new agents and models easily.<p>Right now, you can switch between Browser-use's agent and our experimental Claude Computer-use-based agent. But the real goal is to make it trivial for anyone building web agents to add their own. The whole thing is open source (<a href="https://github.com/steel-dev/surf.new">https://github.com/steel-dev/surf.new</a>) and built on Steel's Sessions API. We built it with Next.js, the Vercel AI SDK, and LangChain to keep everything clean, familiar, and hackable.<p>Heads up - it's new, it's slow, and agents can fail in all sorts of ways. But that is kind of the point - we want to make it easier for everyone to understand the current state of web agents, whether you're evaluating them for production use, just curious about how they work, or want to automate some tasks for yourself. We also don't yet support Operator-style features like taking control of the browser or working across multiple tabs.<p>Try it out at surf.new (no signup needed!), and let us know what breaks. We're actively maintaining it and would love your feedback on what you'd like to see next.