Show HN: Browser MCP – Automate your browser using Cursor, Claude, VS Code

616 pointsby namukangabout 1 month ago

51 comments

rmacabout 1 month ago

[!warning!]1) this projects' chrome extension sends detailed telemetry to posthog and amplitude:- <a href="https://storage.googleapis.com/cobrowser-images/telemetry.png" rel="nofollow">https://storage.googleapis.com/cobrowser-images/telemetry.pn...</a>- <a href="https://storage.googleapis.com/cobrowser-images/pings.png" rel="nofollow">https://storage.googleapis.com/cobrowser-images/pings.png</a>2) this project includes source for the local mcp server, but not for its chrome extension, which is likely bundling <a href="https://github.com/ruifigueira/playwright-crx">https://github.com/ruifigueira/playwright-crx</a> without attributionsuper suss

评论 #43624154 未加载

评论 #43623145 未加载

评论 #43620709 未加载

bhoustonabout 1 month ago

So the website claims:"Avoids bot detection and CAPTCHAs by using your real browser fingerprint."Yeah, not really.I've used a similar system a few weeks back (one I wrote myself), having AI control my browser using my logged in session, and I started to get Captcha's during my human sessions in the browser and eventually I got blocked from a bunch of websites. Now that I've stopped using my browser session in that way, the blocks eventually went away, but be warned, you'll lose access yourself to websites doing this, it isn't a silver bullet.

评论 #43614754 未加载

评论 #43624297 未加载

评论 #43614578 未加载

评论 #43615784 未加载

评论 #43619122 未加载

StevenNunezabout 1 month ago

I feel like I slept for a day and now MCPs are everywhere... I don't know what MCPs are and at this point I'm too afraid to ask.

评论 #43616648 未加载

评论 #43616347 未加载

评论 #43617729 未加载

评论 #43617292 未加载

评论 #43620074 未加载

andy_pppabout 1 month ago

When I go to a shopping website I want to be able to tell my browser "hey please go through all the sideboards on this list and filter out for the ones that are larger than 155cm and smaller than 100cm, prioritise the ones with dark wood and space for vinyl records which are 31.43cm tall" for example.Is there any browser that can do this yet as it seems extremely useful to be able to extract details from the page!

评论 #43616303 未加载

评论 #43655816 未加载

评论 #43624324 未加载

评论 #43616351 未加载

neilellisabout 1 month ago

Well done, just tested on Claude Desktop and it worked smoothly and a lot less clunky than playwright. This is the right direction to go in.I don't know if you've done it already, but it would be great to pause automation when you detect a captcha on the page and then notify the user that the automation needs attention. Playwright keeps trying to plough through captchas.

thenaturalistabout 1 month ago

Crazy, in looking up some info on the web and creating a Spreadsheet on Google Sheets to insert the results, it worked almost perfectly the first time and completely failed subsequently on 8-10 different tries.Is there an issue with the lag between what is happening in the browser and the MCP app (in my case Claude Desktop)?I have a feeling the first time I tried it, I was fast enough clicking the "Allow for this chat" permissions, whereas by the time I clicked the permission on subsequent chats, the LLM just reports "It seems we had an issue with the click. Let me try again with a different reference.".Actions which worked flawlessly the first time (rename a Google spreadsheet by clicking on the title and inputting the name) fail 100% of subsequent attempts.Same with identifying cells A1, B1, etc. and inserting into the rows.Almost perfect on 1st try, not reproducible in 100% of attempts afterwards.Kudos to how smooth this experience is though, very nice setup & execution!EDIT 2: The lag & speed to click the allow action make it seemingly unusable in Claude Desktop. :(

评论 #43615351 未加载

评论 #43648672 未加载

评论 #43617177 未加载

评论 #43622009 未加载

nonethewiserabout 1 month ago

Stuff like this makes me giddy for manual tasks like reimbursement requests. Its such a chore (and it doesnt help our process isnt great).Every month, go to service providers, log in, find and download statement, create google doc with details filled in, download it, write new email and upload all the files. Maybe double chek the attachments are right but that requires downloading them again instead of being able to view in email).Automating this is already possible (and a real expense tracking app can eliminate about half of this work) but I think AI tools have the potential to elminate a lot of the nittier-grittier specification of it. This is especially important because these sorts of workflows are often subject to little changes.

doug_lifeabout 1 month ago

This may be obvious to most here, but you need Node.js installed for the MCP server to run. This critical detail is not in the set up instructions.

评论 #43617638 未加载

评论 #43617639 未加载

serverlessmaniaabout 1 month ago

Did something similar but controls a hardware synth, allowing me to do sound design without touching the physical knobs: <a href="https://github.com/zerubeus/elektron-mcp">https://github.com/zerubeus/elektron-mcp</a>

评论 #43614207 未加载

Gehinnnabout 1 month ago

Would be nice if it could use the Accessibility Tree from chrome dev tools to navigate the page instead of relying on screenshots (<a href="https://developer.chrome.com/blog/full-accessibility-tree" rel="nofollow">https://developer.chrome.com/blog/full-accessibility-tree</a>)

评论 #43617312 未加载

amendegreeabout 1 month ago

So is MCP the new RPA (Robotics Process Automation)? Like generic yahoo pipes?

评论 #43614150 未加载

评论 #43620153 未加载

评论 #43614117 未加载

cadence-about 1 month ago

Doesn't work on Windows:2025-04-07T18:43:26.537Z [browsermcp] [info] Initializing server... 2025-04-07T18:43:26.603Z [browsermcp] [info] Server started and connected successfully 2025-04-07T18:43:26.610Z [browsermcp] [info] Message from client: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"claude-ai","version":"0.1.0"}},"jsonrpc":"2.0","id":0} node:internal/errors:983 const err = new Error(message); ^Error: Command failed: FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :9009') do taskkill /F /PID %a at genericNodeError (node:internal/errors:983:15) at wrappedFn (node:internal/errors:537:14) at checkExecSyncError (node:child_process:882:11) at execSync (node:child_process:954:15)

评论 #43614684 未加载

评论 #43614683 未加载

josefrichterabout 1 month ago

What I used this for:"Go to <a href="https://news.ycombinator.com/upvoted?id=josefrichter">https://news.ycombinator.com/upvoted?id=josefrichter</a>, summarize what topics I am interested in, and then from the homepage pick articles I might be interested in."Works like a charm.

washedDeveloperabout 1 month ago

Can you add a license to your code along with open sourcing the chrome extension?

makingstuffsabout 1 month ago

I don't see how an MCP can be useful for browsing the net and doing things like shopping as has been suggested. Large companies such as CloudFlare have spent millions on, and made a business from, bot detection and blocking.Do we suppose they will just create a backdoor to allow _some_ bots in? If they do that how long will it be before other bots impersonate them? It seems like a bit of a fad from my small mind.Suppose it does become a thing, what then? We end up with an internet which is heavily optimised for bots (arguably it already is to an extent) and unusable for humans?Wild.

评论 #43619025 未加载

评论 #43619180 未加载

评论 #43620557 未加载

评论 #43620056 未加载

hliyanabout 1 month ago

Ideally, shouldn't this be the native experience of most "sites" on the internet? We've built an entire user experience around serving users rich, two dimensional visual content that is not machine-readable and are now building a natural language command line layer on top of it. Why not get rid of the middleware and present users a direct natural language interface to the application layer?

buttofthejokeabout 1 month ago

Why use this over Puppeteer or Playwright extensions?

评论 #43613905 未加载

Ferniciaabout 1 month ago

Any plans to make a Firefox version?

评论 #43614105 未加载

DebtDeflationabout 1 month ago

In the Task Automation demo, how does it know all of the attributes of the motorcycle he is trying to sell? Is it relying on the underlying LLM's embedded knowledge? But then how would it know the price and mileage? Is there some underlying document not referenced in the demo? Because that information is not in the prompt.

pavelfeldmanabout 1 month ago

I mean no disrespect, but this looks like an outdated clone of <a href="https://github.com/microsoft/playwright-mcp">https://github.com/microsoft/playwright-mcp</a><a href="https://github.com/microsoft/playwright-mcp/blob/main/src/tools/tool.ts">https://github.com/microsoft/playwright-mcp/blob/main/src/to...</a> <a href="https://github.com/BrowserMCP/mcp/blob/main/src/tools/tool.ts">https://github.com/BrowserMCP/mcp/blob/main/src/tools/tool.t...</a>

评论 #43617583 未加载

评论 #43617461 未加载

icelancerabout 1 month ago

I just run into a bunch of errors on my Windows machine + Chrome when connected over remote-ssh. Extension installed, tab enabled, npx updated/installed, etc.2025-04-07 10:57:11.606 [info] rmcp: Starting new stdio process with command: npx @browsermcp/mcp@latest2025-04-07 10:57:11.606 [error] rmcp: Client error for command spawn npx ENOENT2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: spawn npx ENOENT2025-04-07 10:57:11.606 [info] rmcp: Client closed for command2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: Client closed2025-04-07 10:57:11.606 [info] rmcp: Handling ListOfferings action2025-04-07 10:57:11.606 [error] rmcp: No server info found---EDIT: Ended up fixing it by patching index.js. killProcessOnPort() was the problem. Can hit me up if you have questions, I cannot figure out how to put readable code in HN after all these years with the fake markdown syntax they use.

评论 #43614482 未加载

评论 #43614446 未加载

sdotdevabout 1 month ago

Still slightly confused on what MCPs are but looking at this it does look useful

评论 #43618248 未加载

评论 #43617912 未加载

BrandiATMuhkuhabout 1 month ago

This is really well done! Very cool.I wonder if it's possible to add such plugins to election apps (e.g.: Slack). It would be such a nice experience if I could just connect my AI of choice to a local app.

评论 #43614370 未加载

评论 #43614408 未加载

wifipunkabout 1 month ago

Setting this up for claude desktop and cursor was alright. Works well out of the box with little setup, and I like that it attached to my active browser tab. Keep up the good work.

qwertoxabout 1 month ago

MCP seems to be JavaScript's trojan horse into AI.

评论 #43614473 未加载

评论 #43614688 未加载

otheraydenabout 1 month ago

I literally started working on the same exact idea last night haha. Great work OP. I'm curious, how are you feeding the web data to the LLM? Are you just passing the entire page contents to it and then having it interact with the page based on CSS selectors/xpath? Also, what are your thoughts on letting it do its own scripting to automate certain tasks?

metadatabout 1 month ago

Bot Detection Evasion is becoming an increasingly relevant topic. Even for non-abusive automation, it's now a necessary consideration.Interesting research and reading via the HN search portal: <a href="https://hn.algolia.com/?q=bot+detection" rel="nofollow">https://hn.algolia.com/?q=bot+detection</a>

behnamohabout 1 month ago

What I don't like about LLMs is that people keep re-inventing the wheel over and over. For example, we've been able to control browsers using GPT for about 2 years now:- <a href="https://github.com/mayt/BrowserGPT">https://github.com/mayt/BrowserGPT</a>- <a href="https://github.com/TaxyAI/browser-extension">https://github.com/TaxyAI/browser-extension</a>- <a href="https://github.com/browser-use/browser-use">https://github.com/browser-use/browser-use</a>- <a href="https://github.com/Skyvern-AI/skyvern">https://github.com/Skyvern-AI/skyvern</a>- <a href="https://github.com/m1guelpf/browser-agent">https://github.com/m1guelpf/browser-agent</a>- <a href="https://github.com/richardyc/Chrome-GPT">https://github.com/richardyc/Chrome-GPT</a>- <a href="https://github.com/handrew/browserpilot">https://github.com/handrew/browserpilot</a>- <a href="https://github.com/ishan0102/vimGPT">https://github.com/ishan0102/vimGPT</a>- <a href="https://github.com/Jiayi-Pan/GPT-V-on-Web">https://github.com/Jiayi-Pan/GPT-V-on-Web</a>

评论 #43614367 未加载

评论 #43619278 未加载

评论 #43618555 未加载

评论 #43620724 未加载

评论 #43616750 未加载

webprofusionabout 1 month ago

Or just use Playwright MCP: <a href="https://github.com/microsoft/playwright-mcp">https://github.com/microsoft/playwright-mcp</a>

rahimnathwaniabout 1 month ago

This is cool. I'm curious why you chose to use an extension, rather than getting the user to run Chrome with remote debugging turned on?

评论 #43613934 未加载

评论 #43613912 未加载

101008about 1 month ago

Good, just what we needed. More bots browsing the internet. Somedays I think I am not 100% against of every website having a captcha...

评论 #43614244 未加载

评论 #43614237 未加载

knesabout 1 month ago

This is great. Especially debugging frontend issue on localhost or staging.Also works flawlessly with augment code.com too!

picardoabout 1 month ago

I like this. It would be interesting to use it for when I need to use authenticated browser sessions.

lxeabout 1 month ago

This one also uses aria snapshots formatted as yaml. This will quickly exceed context limits.

plessasabout 1 month ago

thank you for this. Using my own browser helps me automate tasks on sites I 'd typically get detected using automation. Works like a charm! Hope you continue to work on the repo.

jngiam1about 1 month ago

Pretty cool, do you know of a version of this that supports the new remote MCP protocol

评论 #43614538 未加载

revskillabout 1 month ago

Can u expose the sdk as a react component to be used inside an app ?

mvdtnzabout 1 month ago

Is anyone successfully running MCPs / Claude Desktop on Linux?

评论 #43618315 未加载

pknerdabout 1 month ago

So why do I need an editor(Cusror)? How does a non-coder use it?

评论 #43615496 未加载

xenaabout 1 month ago

Do you respect robots.txt so administrators can block this tool?

评论 #43614885 未加载

评论 #43614852 未加载

cadence-about 1 month ago

How does this compare to Anthropic's Computer Use?

tuananhabout 1 month ago

i want to add this for my project (which use wasm) but rustlang/socket2 WASI support is not merged yet. after that rust CDP will work.

评论 #43619051 未加载

jayunitabout 1 month ago

awesome! For the Cursor / React / Click to Add 2 example, can we also have it write a unit/e2e regression test?

评论 #43616650 未加载

mrwwwabout 1 month ago

How does it compare to playwright mcp?

graizabout 1 month ago

works better than puppet mcp for me but having issues with keyboard events and actions on some websites.

johnpaulkiserabout 1 month ago

> Private > Since automation happens locally, your browser activity stays on your device and isn't sent to remote servers.I think this is bullshit. Isn't the dom or whatever sent to the model api?

评论 #43613971 未加载

throwaway81523about 1 month ago

Can these things automatically solve recaptcha? That's the only AI browser feature that I have a real use for.

评论 #43616879 未加载

tntpreneurabout 1 month ago

Thanks but idea is ok but it is not working smoothly.

justanotheratomabout 1 month ago

neat, but instead of asking me to install browser extension, can you just bundle a browser in the MCP server?

tigreznoabout 1 month ago

this is the way

ndrabout 1 month ago

WARNING for Cursor users:Cursor is currently stuck using an outdated snapshot of the VSCode Marketplace, meaning several extensions within Cursor remain affected by high-severity CVEs that have already been patched upstream in VSCode. As a result, Cursor users unknowingly remain vulnerable to known security issues. This issue has been acknowledged but remains unresolved: <a href="https://github.com/getcursor/cursor/issues/1602#issuecomment-2654870021">https://github.com/getcursor/cursor/issues/1602#issuecomment...</a>Given Cursor's rising popularity, users should be aware of this gap in security updates. Until the Cursor team resolves the marketplace sync issue, caution is advised when using certain extensions.I've flagged it here, apologies for the repost: <a href="https://news.ycombinator.com/item?id=43609572">https://news.ycombinator.com/item?id=43609572</a>

评论 #43615018 未加载

评论 #43615720 未加载

51 comments

rmacabout 1 month ago

评论 #43624154 未加载

评论 #43623145 未加载

评论 #43620709 未加载

bhoustonabout 1 month ago

评论 #43614754 未加载

评论 #43624297 未加载

评论 #43614578 未加载

评论 #43615784 未加载

评论 #43619122 未加载

StevenNunezabout 1 month ago

I feel like I slept for a day and now MCPs are everywhere... I don't know what MCPs are and at this point I'm too afraid to ask.

评论 #43616648 未加载

评论 #43616347 未加载

评论 #43617729 未加载

评论 #43617292 未加载

评论 #43620074 未加载

andy_pppabout 1 month ago

评论 #43616303 未加载

评论 #43655816 未加载

评论 #43624324 未加载

评论 #43616351 未加载

neilellisabout 1 month ago

thenaturalistabout 1 month ago

评论 #43615351 未加载

评论 #43648672 未加载

评论 #43617177 未加载

评论 #43622009 未加载

nonethewiserabout 1 month ago

doug_lifeabout 1 month ago

This may be obvious to most here, but you need Node.js installed for the MCP server to run. This critical detail is not in the set up instructions.

评论 #43617638 未加载

评论 #43617639 未加载

serverlessmaniaabout 1 month ago

评论 #43614207 未加载

Gehinnnabout 1 month ago

评论 #43617312 未加载

amendegreeabout 1 month ago

So is MCP the new RPA (Robotics Process Automation)? Like generic yahoo pipes?

评论 #43614150 未加载

评论 #43620153 未加载

评论 #43614117 未加载

cadence-about 1 month ago

评论 #43614684 未加载

评论 #43614683 未加载

josefrichterabout 1 month ago

washedDeveloperabout 1 month ago

Can you add a license to your code along with open sourcing the chrome extension?

makingstuffsabout 1 month ago

评论 #43619025 未加载

评论 #43619180 未加载

评论 #43620557 未加载

评论 #43620056 未加载

hliyanabout 1 month ago

buttofthejokeabout 1 month ago

Why use this over Puppeteer or Playwright extensions?

评论 #43613905 未加载

Ferniciaabout 1 month ago

Any plans to make a Firefox version?

评论 #43614105 未加载

DebtDeflationabout 1 month ago

pavelfeldmanabout 1 month ago

评论 #43617583 未加载

评论 #43617461 未加载

icelancerabout 1 month ago

评论 #43614482 未加载

评论 #43614446 未加载

sdotdevabout 1 month ago

Still slightly confused on what MCPs are but looking at this it does look useful

评论 #43618248 未加载

评论 #43617912 未加载

BrandiATMuhkuhabout 1 month ago

评论 #43614370 未加载

评论 #43614408 未加载

wifipunkabout 1 month ago

Setting this up for claude desktop and cursor was alright. Works well out of the box with little setup, and I like that it attached to my active browser tab. Keep up the good work.

qwertoxabout 1 month ago

MCP seems to be JavaScript's trojan horse into AI.

评论 #43614473 未加载

评论 #43614688 未加载

otheraydenabout 1 month ago

metadatabout 1 month ago

behnamohabout 1 month ago

评论 #43614367 未加载

评论 #43619278 未加载

评论 #43618555 未加载

评论 #43620724 未加载

评论 #43616750 未加载

webprofusionabout 1 month ago

Or just use Playwright MCP: <a href="https://github.com/microsoft/playwright-mcp">https://github.com/microsoft/playwright-mcp</a>

rahimnathwaniabout 1 month ago

This is cool. I'm curious why you chose to use an extension, rather than getting the user to run Chrome with remote debugging turned on?

评论 #43613934 未加载

评论 #43613912 未加载

101008about 1 month ago

Good, just what we needed. More bots browsing the internet. Somedays I think I am not 100% against of every website having a captcha...

评论 #43614244 未加载

评论 #43614237 未加载

knesabout 1 month ago

This is great. Especially debugging frontend issue on localhost or staging.Also works flawlessly with augment code.com too!

picardoabout 1 month ago

I like this. It would be interesting to use it for when I need to use authenticated browser sessions.

lxeabout 1 month ago

This one also uses aria snapshots formatted as yaml. This will quickly exceed context limits.

plessasabout 1 month ago

thank you for this. Using my own browser helps me automate tasks on sites I 'd typically get detected using automation. Works like a charm! Hope you continue to work on the repo.

jngiam1about 1 month ago

Pretty cool, do you know of a version of this that supports the new remote MCP protocol

评论 #43614538 未加载

revskillabout 1 month ago

Can u expose the sdk as a react component to be used inside an app ?

mvdtnzabout 1 month ago

Is anyone successfully running MCPs / Claude Desktop on Linux?

评论 #43618315 未加载

pknerdabout 1 month ago

So why do I need an editor(Cusror)? How does a non-coder use it?

评论 #43615496 未加载

xenaabout 1 month ago

Do you respect robots.txt so administrators can block this tool?

评论 #43614885 未加载

评论 #43614852 未加载

cadence-about 1 month ago

How does this compare to Anthropic's Computer Use?

tuananhabout 1 month ago

i want to add this for my project (which use wasm) but rustlang/socket2 WASI support is not merged yet. after that rust CDP will work.

评论 #43619051 未加载

jayunitabout 1 month ago

awesome! For the Cursor / React / Click to Add 2 example, can we also have it write a unit/e2e regression test?

评论 #43616650 未加载

mrwwwabout 1 month ago

How does it compare to playwright mcp?

graizabout 1 month ago

works better than puppet mcp for me but having issues with keyboard events and actions on some websites.

johnpaulkiserabout 1 month ago

> Private > Since automation happens locally, your browser activity stays on your device and isn't sent to remote servers.I think this is bullshit. Isn't the dom or whatever sent to the model api?

评论 #43613971 未加载

throwaway81523about 1 month ago

Can these things automatically solve recaptcha? That's the only AI browser feature that I have a real use for.

评论 #43616879 未加载

tntpreneurabout 1 month ago

Thanks but idea is ok but it is not working smoothly.

justanotheratomabout 1 month ago

neat, but instead of asking me to install browser extension, can you just bundle a browser in the MCP server?