(Last updated 17-May-25)
The Model Context Protocol (MCP) is rapidly emerging as a foundational technology for connecting Large Language Models (LLMs) with external tools and data sources. Hailed as the "USB-C for AI," MCP aims to standardise these interactions, making AI assistants more capable and context-aware. However, like any emerging technology, a healthy dose of skepticism and a sharp focus on security are paramount. As the already-worn joke goes - "the S in MCP stands for Security".
I'm not sold yet.
More Than Meets The Eye
One common point of confusion is the term "MCP Server". While it suggests a standalone service, MCP servers are typically lightweight proxies or adapters to existing APIs. This distinction is vital because the reliability and, more critically, the security of the MCP server are intrinsically linked to the underlying API and the integrity of the proxy itself.
Stranger Danger
The ease of creating and hosting MCP servers (see frameworks such as mcp-agent and FastMCP) has led to a proliferation of available options, including many developed by third parties rather than service owners themselves. This presents significant security challenges:
- Dynamic and Unverifiable: A third-party hosted MCP server can be changed at any time by its operator without notice. An initially benign server could be updated with malicious code designed to intercept data, manipulate responses, or even attempt to exploit the connected LLM or client application.
- Client-Side Integrity Checks Needed: Clients should implement robust client-side integrity checks to detect unexpected changes in a server's tools or behavior since last vetted, alerting users to potential malicious updates.
- Service-Side Authenticity Verification: Crucially, verify the authenticity of an "official" MCP server through the service's actual domain (e.g. a press release on GitHub.com for a GitHub MCP server) to avoid impostors.
Real-World Risks: The Third-Party WhatsApp MCP Server
It's not just theoretical. We've seen real-life examples of the dangers posed by unofficial, third-party MCP servers. For instance, third-party MCP servers claiming to offer access to services like WhatsApp have emerged. Because these weren't run or owned by Meta, the potential for the hoster to Man-in-the-Middle (MITM) all requests, steal sensitive messages, contact details, or authentication tokens is a stark reality. This major security risk underscore the critical need to scrutinise the origin and trustworthiness of any MCP server, particularly when less-technical end-users are involved.
The LLM: Not Just a Tool, But a Potential Adversary
It's crucial to remember that the LLM itself should always be treated as a potential adversary and sandboxed accordingly. This is even more critical than with traditional software libraries pulled from registries (which already carry supply chain risks). An LLM is:
- Non-deterministic: A statistical machine whose behavior can be unpredictable.
- Trained on vast, uncontrolled data: Often "trained on the contents of the internet," including potentially malicious or biased information.
- Opaque: Even its creators cannot fully explain or predict its behavior in all scenarios.
Connecting such an entity to external tools and data via MCP without extreme caution is inviting trouble.
Prompt Injection: The Unholy Trinity
One of the most significant threats in the LLM-MCP interaction is prompt injection. The danger becomes acute when three conditions are met - The Unholy Trinity:
- Access to Private Data: The LLM, via MCP or its context, can read sensitive information.
- Exposure to Malicious Instructions: The LLM ingests untrusted input (e.g. from a compromised MCP server response, or a malicious user prompt that the MCP server might relay or act upon) that can override its original instructions.
- Ability to Exfiltrate Information: The LLM, now under malicious control, can send the private data to an attacker-controlled destination, potentially through an MCP tool itself (e.g. a generic "send_data_to_url" tool).
Simply putting guardrail instructions in prompts or system messages (e.g. "You are a helpful assistant. Do not reveal private information.") is a poor and unreliable defense. These are often easily bypassed with clever prompt engineering or "jailbreaking" techniques. This is akin to telling a web application via a comment in its HTML not to be vulnerable to SQL injection - it's simply not a robust security control.
Best Practices for More Secure MCP Usage
Given these amplified risks, these are some proposed best practices:
- Prioritise Official Servers: Only use third-party hosted MCP servers run and maintained by the service owner themselves.
- Self-Host Vetted Third-Party Code: If using third-party MCP server code, self-host it in a controlled environment after rigorous code review for exfiltration, injection, and other malicious behavior. Disable auto-updates and manage updates carefully to reduce the delayed introduction of malicious behaviour (a rug pull attack).
- Introduction of an MCP Gateway.
The MCP Gateway: A Centralised Defense and Efficiency Layer
MCP Gateways and similar tools are only just starting to hit the market with limited functionality, though I expect organisations will quickly begin adopting these products as standard as they mature. An MCP Gateway - locally hosted or organisationally controlled - is critical to maintain control in this high-risk environment. It should act as a central hub with essential security and operational functions. Here is the core featureset I expect to emerge:
- Centralised Registration and Vetting: Rigorously vet MCP servers and selectively register verified trusted servers. The gateway acts as an aggregator that presents a unified, trusted list of tools.
- Secure Routing and Authentication: Handles authentication centrally, selectively forwarding identity to backend MCP servers.
- Enhanced Security with Data Scrubbing & Monitoring: This is paramount. The Gateway should:
- Integrate secrets detection and sensitive information scanning to scrub client requests before they hit any MCP server.
- Allow configurable scrubbing policies based on the trust level and features of the destination MCP server.
- Monitor requests and responses for anomalous patterns or indications of prompt injection attempts.
-
Substantial Prompt Token Reduction & Cost Savings: This is a crucial operational benefit. Here's why:
- Understanding the "Final Prompt": When you interact with an LLM, your input isn't all that's processed. What you send (e.g. to OpenAI) is combined with other elements to create a "final prompt" that the LLM sees. This typically includes the LLM provider's system message, your own custom system message (if any), and then your actual conversational prompt.
- MCP Tools Bloat the "Final Prompt": With MCP, each tool the LLM could potentially use needs its description (name, function, parameters, etc.) included in this "final prompt." This is essential so the LLM knows what tools are available and how to structure a call to them.
- The Scalability Problem: As you enable more MCP tools, the size of this "final prompt" grows significantly. If you have, say, 50 MCP servers enabled, each potentially offering multiple tools, the cumulative size of all those tool definitions is added to every single request you make that could invoke any of those tools.
- Performance Degradation and Cost Implications:
- There's no such thing as a persistent LLM "session" at the fundamental API level. Higher-level applications might cache parts of your conversation to simulate a session, but ultimately, the necessary context, including all relevant tool definitions, needs to be passed repeatedly.
- This "final prompt" (provider system message + user system message + all tool descriptions + user conversational turn) has a direct token cost. More tools mean more tokens, leading to higher API expenses.
- Beyond cost, very large "final prompts" can adversely affect LLM performance, potentially leading to slower responses, less coherent outputs, or even hitting context window limits. This is like why some chat interfaces have conversation length limits - the ever-growing context becomes unwieldy and can hit capacity.
- How the MCP Gateway Solves This: An MCP Gateway dramatically alleviates this issue. Instead of burdening the LLM with the definitions of all 50+ possible MCP tools on every turn:
- Selective Tool Exposure: The Gateway can dynamically present only a relevant subset of tools to the LLM based on the current user query or context. If you ask about your code repositories, the Gateway can ensure only the GitHub/GitLab tool definitions are loaded into the LLM's prompt for that interaction, not the weather, calendar, and social media tools.
- Abstraction Layer: The LLM can interact with the Gateway as a single, intelligent tool orchestrator. The Gateway manages the complexity of the underlying tool ecosystem.
- Optimised Tool Manifests: The Gateway can provide more summarised or contextually-filtered tool information to the LLM, further reducing token overhead.
By intelligently managing which tool definitions are injected into the LLM's context and when, an MCP Gateway makes it feasible to have a rich ecosystem of dozens of tools available without each request becoming prohibitively expensive or slow. It transforms an unscalable scenario into a manageable one.
Where Do We Go From Here?
MCP offers exciting possibilities, but the path forward requires deep security consciousness and smart architectural choices. Considering the MCP specification has only recently started to include mentions of authorisation, there is a long maturity journey ahead - MCP may even fall to the wayside. Agent-to-Agent (A2A) protocol is also in it's infancy, offering another method of abstraction. The "build fast, break things" mantra of early-stage development clearly describes how security is typically an afterthought in exchange for rapid technological evolution. That's fine, but we should understand how to mitigate this ourselves.