Regarding the WebSocket critiques specifically, as the author of <a href="https://www.npmjs.com/package/websocket" rel="nofollow">https://www.npmjs.com/package/websocket</a>, and having participated in the IETF working group that defined the WebSocket protocol, I completely agree with this blog post's author.<p>The WebSocket protocol is the ideal choice for a bi-directional streaming communication channel, and the arguments listed in <a href="https://github.com/modelcontextprotocol/modelcontextprotocol/pull/206">https://github.com/modelcontextprotocol/modelcontextprotocol...</a> for "Why Not WebSockets" are honestly bewildering. They are at best thin, irrelevant and misleading. It seems as though they were written by people who don't really understand the WebSocket protocol, and have never actually used it.<p>The comment farther down the PR makes a solid rebuttal.
<a href="https://github.com/modelcontextprotocol/modelcontextprotocol/pull/206#issuecomment-2766559523">https://github.com/modelcontextprotocol/modelcontextprotocol...</a><p>Here are the stated arguments against using the WebSocket protocol, and my responses.<p>---<p>Argument 1: Wanting to use MCP in an "RPC-like" way (e.g., a stateless MCP server that just exposes basic tools) would incur a lot of unnecessary operational and network overhead if a WebSocket is required for each call.<p>Response 1: There are multiple better ways to address this.<p>Option A.) Define a plain HTTP, non-streaming request/response transport for these basic use cases. That would be DRAMATICALLY simpler than the "Streamable HTTP" HTTP+SSE transport they did actually define, while not muddying the waters around streaming responses and bi-directional communications.<p>Option B.) Just leave the WebSocket connection open for the duration of the session instead of tearing it down and re-connecting it for every request. Conceptualizing a WebSocket connection as an ephemeral resource that needs to be torn down and reconstructed for every request is wrong.<p>---<p>Argument 2: From a browser, there is no way to attach headers (like Authorization), and unlike SSE, third-party libraries cannot reimplement WebSocket from scratch in the browser.<p>Response 2: The assertion is true. You cannot attach arbitrary headers to the initial HTTP GET request that initiates a WebSocket connection, not because of the WebSocket protocol's design, but because the design of the browser API doesn't expose the capability. However, such a limitation is totally irrelevant, as there are plenty of other ways that you could decide to convey that information from client to server:<p>- You can pass arbitrary values via standard HTTP GET query parameters to be interpreted during the WebSocket handshake. 
Since we're initiating a WebSocket connection and not actually performing a GET operation on an HTTP resource, this does not create issues with caching infrastructure, and does not violate standard HTTP GET semantics. The HTTP GET that initiates a WebSocket connection is HTTP GET in name only, as the response in a successful WebSocket handshake is to switch protocols and no longer speak HTTP for the remainder of the connection's lifetime.<p>- Cookies are automatically sent just as with any other HTTP request. This is the standard web primitive for correlating session state across connections. I'll grant, however, that it may be a less relevant mechanism if we're talking about cross-origin connections.<p>- Your subprotocol definition (what messages are sent and received over the WebSocket connection) could simply require that the client sends any such headers, e.g. Authorization, as part of the first message it sends to the server once the underlying WebSocket connection is established. If this is sent pipelined along with the first normal message over the connection, it wouldn't even introduce an additional round-trip and therefore would have no impact on connection setup time or latency.<p>These are not strange, onerous workarounds.<p>---<p>Argument 3: Only GET requests can be transparently upgraded to WebSocket (other HTTP methods are not supported for upgrading), meaning that some kind of two-step upgrade process would be required on a POST endpoint, introducing complexity and latency.<p>Response 3: Unless I'm missing something, this argument seems totally bewildering, nonsensical, and irrelevant. It suggests a lack of familiarity with what the WebSocket protocol is for. The semantics of a WebSocket connection are orthogonal to the semantics of HTTP GET or HTTP POST. There is no logical concept of upgrading a POST request to a WebSocket connection, nor is there a need for such a concept. MCP is a new protocol that can function however it needs to. 
There is no benefit to trying to constrain your conceptualization of its theoretical use of WebSockets to fit within the semantics of any other HTTP verbs. In fact, the only relationship between WebSockets and HTTP is that WebSockets utilizes standard HTTP only to bootstrap a connection, after which point it stops speaking HTTP over the wire and starts speaking a totally distinct binary protocol instead. It should be conceptualized as more analogous to a TCP connection than an HTTP connection. If you are thinking of WebSockets in terms of REST semantics, you have not properly understood how WebSockets differs, nor how to utilize it architecturally.<p>Since the logical semantics of communication over a WebSocket connection in an MCP server are functionally identical to how the MCP protocol would function over STDIN/STDOUT, the assertion that you would need some kind of two-step upgrade process on a POST endpoint is just false, because there would not exist any POST endpoint for you to have interacted with in the first place, and if one did exist, it would serve some other purpose unrelated to the actual WebSocket connection.<p>---<p>In my view, the right way to conceptualize WebSocket in MCP is as a drop-in, mostly transparent alternative to STDIO. Once the WebSocket connection is established, the MCP client/server should be able to speak literally EXACTLY the same protocol with each other as they do over STDIO.
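<p>To make the "drop-in alternative to STDIO" point concrete, here is a rough sketch, not anything taken from the MCP spec or SDKs. It assumes JSON-RPC 2.0 messages, newline-delimited framing on the STDIO side, and one message per text frame on the WebSocket side. The `add` tool, `handle_message`, `serve_stdio`, and `serve_websocket` are all hypothetical names I made up for illustration, and the WebSocket side is simulated with a plain list of strings rather than a real socket. The only thing that differs between the two transports is the framing; the message handling is shared.

```python
import io
import json
from typing import Callable, Dict, Iterable, Iterator, TextIO

# Hypothetical tool table; "add" is illustrative and not part of MCP.
TOOLS: Dict[str, Callable[..., object]] = {
    "add": lambda a, b: a + b,
}


def handle_message(raw: str) -> str:
    """Transport-agnostic core: one JSON-RPC 2.0 request in, one response out."""
    request = json.loads(raw)
    method = request.get("method")
    if method not in TOOLS:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        return json.dumps({
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        })
    result = TOOLS[method](**request.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})


def serve_stdio(stdin: TextIO, stdout: TextIO) -> None:
    """STDIO framing (assumed): one JSON message per line."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_message(line) + "\n")


def serve_websocket(frames: Iterable[str]) -> Iterator[str]:
    """WebSocket framing (assumed): one JSON message per text frame.

    `frames` stands in for whatever receive loop a real WebSocket
    library would provide; no network I/O happens here.
    """
    for frame in frames:
        yield handle_message(frame)


request = json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "add", "params": {"a": 2, "b": 3}}
)

# Same request over the simulated STDIO transport...
out = io.StringIO()
serve_stdio(io.StringIO(request + "\n"), out)
stdio_reply = out.getvalue().strip()

# ...and over the simulated WebSocket transport.
ws_reply = next(serve_websocket([request]))

print(stdio_reply == ws_reply)
```

In a real server the `serve_websocket` loop would be replaced by the receive/send calls of whichever WebSocket library you use, and `handle_message` would stay untouched. That is the whole argument: the transport moves bytes, and the protocol on top does not need to know which one it is riding on.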