AI agents are connecting to enterprise systems right now. Whether a developer wired up Claude to an internal Confluence instance, a vendor shipped an agentic workflow that calls the CRM, or an employee enabled a browser-based AI assistant that reads email, Model Context Protocol (MCP) is rapidly becoming the integration layer between large language models (LLMs) and corporate data. Most security teams have no visibility into any of it. The attack surface is live, the permissions are broad, and the logging is nonexistent.
What Is Model Context Protocol (MCP)?
Model Context Protocol is an open standard that allows AI agents and LLMs to connect to external tools, data sources, and services through a structured interface. Developed by Anthropic and released as an open protocol in late 2024, MCP defines how an AI model can request access to a resource (a file system, a database, a SaaS API) and receive structured data back.
Think of MCP as the HTTP of AI integration: a common language that lets AI models talk to external systems without custom one-off connectors for each tool. An MCP server exposes a set of "tools" (callable functions) and "resources" (readable data objects) that an AI agent can invoke. The AI decides, at runtime, which tools to call and what data to retrieve based on user prompts.
That runtime autonomy is exactly what makes MCP a security problem.
Top MCP Security Risks
1. Uncontrolled data access and exfiltration
MCP servers grant AI agents access to resources including files, databases, APIs, code repositories, and communication tools. When those permissions are misconfigured or overly broad, an agent can read, summarize, and transmit sensitive data that no human explicitly authorized it to access.
Unlike a user logging into a system, an AI agent does not apply judgment about whether accessing a particular file is appropriate. It accesses what it is permitted to access. If an MCP server has read access to your S3 buckets, a well-crafted prompt can cause it to retrieve and expose regulated data.
2. Prompt injection via MCP tool responses
When an MCP server returns data from an external source (i.e. a web page, a document, a ticket), that content enters the LLM's context window. Attackers can embed instructions inside that content designed to hijack the agent's behavior. This is called indirect prompt injection, and it is one of the most actively exploited attack surfaces in agentic AI.
An attacker who controls content that an MCP tool retrieves can instruct the agent to exfiltrate data, modify records, or take actions the user never intended. MCP's tool-call architecture creates a clean delivery path for this class of attack.
3. Tool poisoning
Tool poisoning occurs when a malicious or compromised MCP server misrepresents what a tool does. An MCP server exposes tool definitions (name, description, parameters) that the AI model reads to decide which tool to invoke. A tool defined as "retrieve recent documents" could, in practice, exfiltrate data to an external endpoint. The model trusts the description it is given.
In multi-agent environments where one agent orchestrates others, a compromised downstream MCP server can poison the orchestrator's tool catalog.
4. Excessive privilege and scope creep
MCP's design makes it easy to expose broad permissions at the server level. A developer building an internal productivity tool connects an MCP server to a Slack workspace, a Google Drive folder, and a GitHub repo. Six months later, that MCP server is running in production with access it no longer needs. No one reviews MCP server permissions the way they review OAuth grants or identity and access management (IAM) roles.
This is the principle of least privilege failing at the integration layer.
5. Authentication gaps between agents and MCP servers
Most current MCP implementations use API keys or OAuth tokens that are stored in agent configuration files or environment variables. These credentials are rarely rotated, often shared across environments, and typically not bound to the specific agent identity that should hold them. A compromised agent runtime can abuse these credentials at scale.
The MCP specification is evolving to include stronger auth mechanisms, but most deployed MCP servers today have not implemented them.
6. Shadow MCP: unsanctioned integrations
Developers and power users can stand up MCP servers and connect AI agents to enterprise systems without going through security review. Because MCP is an open standard with widely available server implementations (GitHub has thousands of them), standing up an MCP integration takes minutes. Most enterprises have no inventory of which MCP servers are running, who built them, or what data they can access.
How to Secure MCP Servers: Mitigation Steps
Establish an MCP server inventory before anything else
You cannot secure what you cannot see. Map every MCP server running in your environment: who built it, what it connects to, what permissions it holds, and whether it has been reviewed by security. Treat MCP servers as a new category in your integration catalog alongside OAuth apps and API keys.
In most enterprises today, this inventory does not exist. Start there.
Apply least-privilege scoping to every MCP server
Review the tools and resources each MCP server exposes. Scope permissions to the minimum required for the agent's stated function. An agent that summarizes tickets does not need write access to your ticketing system. An agent that drafts emails does not need access to your file storage.
Enforce read-only access by default. Require explicit justification and security review for any MCP server with write, delete, or export capabilities.
Require human-in-the-loop checkpoints for sensitive actions
For MCP tool calls that affect sensitive data, require a human confirmation step before the action executes. This is especially important for any tool call that writes, modifies, deletes, or transmits data outside the environment.
Human-in-the-loop controls do not need to interrupt every agent interaction. Scope them to high-risk action categories such as data export, record modification, communication on behalf of the user, and file system writes.
Monitor agent behavior and MCP tool call patterns
Treat MCP tool calls as an observable data access event, not a black box. Log which agent made which tool call, what data was retrieved, and what action followed. Anomalous patterns (i.e. an agent calling a file retrieval tool at volume, or calling an external API tool it rarely uses) are indicators of compromise or misuse.
This requires integrating MCP server activity into your existing SIEM or data security monitoring pipeline.
Validate and sanitize MCP tool outputs before they re-enter the agent context
To reduce the risk of indirect prompt injection, treat data returned by MCP tools as untrusted input. Apply output validation rules before that content enters the LLM's context window. Flag or strip content that contains instruction-like patterns, system prompt syntax, or override language.
This is a technical control that requires coordination between security team and the teams building agent workflows.
Treat MCP credentials as secrets, not configuration
API keys and tokens used by MCP servers should be managed in a secrets management system ( i.e. Vault, AWS Secrets Manager, or equivalent), rotated on a defined schedule, and scoped to specific agent identities. They should never be hardcoded in configuration files or shared across environments.
Audit existing MCP server credentials for age, scope, and storage method as part of your initial inventory.
How Cyberhaven Addresses MCP Data Risk
Cyberhaven's Data Lineage technology tracks data as it moves through AI workflows, including data accessed or transmitted via MCP integrations. When an AI agent retrieves a file containing regulated data, sends it to an LLM, and that LLM surfaces it in a response, Cyberhaven can trace the full path: where the data originated, which agent touched it, and where it ended up.
For security teams building MCP security programs, this visibility closes a critical gap. Most existing DLP tools inspect content at the point of egress, but they have no visibility into what happens inside an agentic workflow before data reaches that boundary. Cyberhaven lineage tracing within the AI layer itself, not just at the perimeter.
This means security architects can enforce policy on MCP-facilitated data access with the same precision they apply to user-initiated data movement: block exfiltration of regulated data, alert on unusual agent access patterns, and maintain an audit trail that satisfies compliance requirements.
AI agents are not going to stop connecting to enterprise systems. MCP is the standard that makes those connections easy, which is precisely why it needs to be on your security team's radar now, before the footprint grows past the point of easy control. The enterprises that treat MCP server deployment as a governed process, with inventory, least-privilege scoping, and monitoring built in from the start, will be significantly better positioned than those that address it after a data exposure event.
Better understand where AI security and data security intersect with “IDC Spotlight: Rethinking Data Security and Insider Risk for Trusted AI Adoption.”
Frequently Asked Questions
What is MCP security?
MCP security refers to the practices and controls used to protect data and systems accessed by AI agents through Model Context Protocol integrations. Because MCP allows AI agents to connect to enterprise tools, file systems, and APIs at runtime, it introduces data exfiltration, prompt injection, and access control risks that traditional security tools were not built to address.
What are the biggest security risks of MCP servers?
The highest-risk categories are uncontrolled data access (overly broad permissions granted to AI agents), indirect prompt injection (attackers embedding malicious instructions in data returned by MCP tools), tool poisoning (compromised MCP servers misrepresenting what a tool does), and authentication gaps in how MCP server credentials are stored and rotated.
How do you secure an MCP server?
Start with a full inventory of all MCP servers running in your environment. Apply least-privilege scoping to every server's tool and resource permissions. Require human-in-the-loop checkpoints for any tool call that writes or transmits data. Log all tool call activity and treat MCP credentials as secrets subject to rotation and vault storage.
Can existing DLP tools protect against MCP data risks?
Most legacy DLP tools inspect data at the point of egress and cannot observe what happens inside an agentic workflow. They have no visibility into MCP tool calls, agent context windows, or the data that flows between an AI model and the tools it uses. Protecting against MCP data risk requires data security tooling that operates within the AI layer, not just at the network perimeter.
How do you detect prompt injection attacks through MCP?
Detecting indirect prompt injection in MCP workflows requires validating and inspecting tool outputs before they re-enter the LLM's context. Look for content containing instruction-like syntax, override language, or system prompt patterns. Behavioral monitoring of agent tool call patterns can also surface anomalies that indicate an injection attack is influencing agent behavior.
What is the principle of least privilege in the context of MCP?
In MCP, least privilege means scoping each MCP server to the minimum set of tools and data it needs for its specific function. Read-only access should be the default. Any MCP server with write, export, or delete capabilities should require a documented security review, ongoing permission audits, and human confirmation controls for sensitive actions.

.avif)
.avif)
