
Managing Shadow AI: Best Practices For Enterprise Security

March 28, 2025



The rush to work faster with artificial intelligence (AI) can unintentionally encourage employees to put sensitive data at risk in ways security teams can't see, track, or stop. Consider a scenario that plays out regularly at organizations of every size: someone in procurement has a tight deadline and uploads a confidential contract into an AI tool to review a few redlines. It's often unclear whether the AI system retains that contract data, how long it's stored, or whether it could resurface in a response to another user's future prompt. There was no malicious intent, but there's also no visibility into what happened and no controls to prevent it from happening again.

This isn't an isolated problem with one department. It's happening throughout organizations, across every function, at a scale most security teams haven't fully reckoned with. Employees are using AI tools outside the boundaries of IT oversight, leaving companies with little control over their most sensitive data.

What Is Shadow AI?

Shadow AI refers to the unsanctioned use of AI tools by employees within an organization. These tools operate outside the visibility of security and IT teams, meaning they aren't vetted for compliance, security, or data privacy standards before employees start using them.

Shadow AI models typically operate outside established governance frameworks and documentation processes. They lack the access controls, audit logging, and contractual data protections that enterprise-grade tools are expected to provide. The term shadow AI, in enterprise environments, describes not just the tools themselves but the entire ecosystem of undocumented, unmonitored AI activity that can accumulate when adoption outpaces governance.

The scope is broader than most security leaders initially assume. Shadow AI often includes browser-based AI tools accessed through personal accounts, open-weight models run locally or through third-party wrappers, AI coding assistants embedded in development environments, custom AI agents connected to internal systems, and any generative AI product that employees adopt independently without IT approval. It could be a ChatGPT-powered email assistant, an AI-driven task management app, or a developer's homegrown script that calls a public model API against a production codebase.

What's Driving the Rise of Shadow AI?

The underlying driver of shadow AI is straightforward: employees believe AI makes them more productive, and they're right. The workforce has indicated clearly that it wants to keep up with what AI can do, and adoption has followed accordingly. According to McKinsey, 72% of organizations had adopted AI tools by 2024, up from roughly 50% in prior years.

But the nature of that adoption shifted meaningfully in 2025. The first wave of AI adoption was largely exploratory. Employees used consumer AI tools, such as ChatGPT, for drafting, research, and ideation. The second wave has quickly become operational and automated. AI coding assistants, browser-based agents, and custom AI agents saw rapid growth. These tools operate inside development environments, browsers, and workflows. They interact directly with sensitive data, proprietary code, and critical systems, often with limited oversight.

As AI becomes infrastructure rather than a standalone interface, the security implications intensify. Employees are no longer using AI only for ideation or research. They are inputting source code, financial data, customer information, and intellectual property across a fragmented and expanding ecosystem of tools. Much of this activity occurs outside traditional IT visibility. It spans personal accounts, open-weight models, and SaaS platforms that lack enterprise-grade security controls.

The challenge shadow AI represents for security teams is not simply one of policy. It's a structural visibility problem. When employees access AI through personal accounts or consumer-tier tools, their activity is invisible to the organization. There's no log of what data was submitted. No audit trail. No mechanism to enforce a policy that the security team didn't know was being violated.

Examples of Shadow AI in Practice

Shadow AI shows up differently depending on the department, but the underlying pattern is consistent: employees find an AI tool that helps them work more efficiently or complete a specific task, and they start using it, often before IT or security has any awareness it's in use.

Software Development

Developers use AI tools to spin up boilerplate APIs in minutes instead of hours, feed error logs to pinpoint bugs faster, and automate documentation that most engineers would rather skip. AI coding assistants like GitHub Copilot, Cursor, and a growing number of alternatives have become deeply embedded in how development teams work. The shadow AI risk here is compounded: developers are often submitting proprietary source code and internal API structures to third-party models, and the AI-generated code that comes back can carry subtle vulnerabilities or reproduce security flaws from the model's training data without any indication that something is wrong.

According to Cyberhaven Labs research, AI usage is highest in engineering departments, where more than 60% of employees use AI tools, nearly 20 percentage points higher than the next highest group, marketing.

Marketing

Marketing teams use AI to produce content quickly, rephrase messaging for different audiences, and accelerate campaign cycles. The data exposure risk here often comes from feeding AI tools customer information, proprietary positioning documents, or unreleased product details in search of better copy.

Customer Service

AI-powered tools have become common in customer-facing workflows, from chatbots answering basic account questions to tools that help support agents draft responses. These deployments frequently involve processing customer records and personal information in environments that weren't designed with enterprise data governance in mind.

Finance and Legal

Finance teams use AI to process contracts, summarize financial reports, and assist with analysis. Legal teams use it for document review and research. Both involve some of the most sensitive categories of data an organization handles, including M&A materials, litigation strategy, and regulatory filings. When these workflows run through unsanctioned tools, the data exposure can be significant and the compliance implications serious.

Loan Underwriting and Fraud Detection (Financial Services)

In banking and financial services specifically, AI tools are being used to evaluate loan applications, assess creditworthiness, and flag suspicious transactions. These use cases involve regulated data categories and create compliance risk when they occur through tools that haven't gone through a proper vendor review.

Is Shadow AI Relevant to Data Provisioning?

Yes, and this is an area that often gets overlooked in shadow AI discussions. When employees use AI tools outside sanctioned workflows, data provisioning controls break down. Sensitive datasets that were supposed to flow through governed pipelines end up submitted to external models through copy-paste, file upload, or API calls that bypass the controls entirely.

For organizations with structured data governance programs, this creates a parallel track where sensitive data is being provisioned to AI systems outside the oversight mechanisms that were built to manage it. Shadow AI isn't just a security problem: it's a data governance and compliance problem that touches every team responsible for managing how data is accessed, used, and protected.

The Personal Account Problem

One of the most significant and underappreciated dimensions of shadow AI is personal account usage. Not all employees access AI tools through governed corporate accounts, and the numbers are more striking than most security teams realize.

Cyberhaven's data shows that 32.3% of ChatGPT usage occurs through personal accounts, as does 24.9% of Gemini usage. Claude and Perplexity see even higher rates of personal account usage, at 58.2% and 60.9% respectively.

Personal account usage reduces control and oversight, increasing the likelihood of data policy violations, data leakage, and compliance risk. Employees may use personal accounts for convenience, through negligence, or to bypass corporate restrictions. But regardless of intent, these workarounds create measurable security exposure. The risk intensifies when weak governance exists at multiple layers simultaneously. Tools that lack enterprise-grade controls, including some open-weight and non-U.S. models, are often accessed through personal accounts, creating overlapping blind spots in both model governance and user access.

The implication for detection is important. A security team that monitors corporate ChatGPT usage may have no visibility into the personal account activity happening alongside it, even on corporate devices or networks.

What Are the Security Risks of Shadow AI?

  1. Sensitive Data Loss

Employees routinely use personal AI accounts to analyze company data, bypassing corporate safeguards entirely. Research from Cyberhaven has found that a concerning share of all interactions with AI tools, 39.7%, involve sensitive data, whether through prompts or copy-paste actions. By that measure, the average employee enters sensitive data into an AI tool roughly once every three days.

The data categories at risk are consequential: source code, financial projections, customer records, M&A documents, HR information, legal strategy, and proprietary research. Once submitted to an external AI tool, the organization typically has no way to know what happens to that data, how long it's retained, or whether it influences future model outputs for other users.

  2. Lack of Visibility and Enforcement

Organizations can't protect data they can't see leaving their environment. The challenge has two parts. First, without accurate, comprehensive data classification, security teams don't know what's sensitive enough to protect. Second, even when classification exists, most organizations lack the monitoring infrastructure to detect when classified data moves from a spreadsheet into an AI tool's input field. And even if they could detect it, enforcement capabilities often don't follow. The ability to block a data transfer or trigger a user warning in real time requires tools that most traditional DLP solutions weren't designed to provide.
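To make the classification half of the problem concrete, here's a minimal sketch of pattern-based screening applied to outbound text, assuming a sensor that can intercept a paste before it reaches an AI prompt field. The patterns and category names are illustrative assumptions, not a description of any particular product's detection logic:

```python
import re

# Illustrative patterns only; production classifiers combine content
# inspection with data lineage rather than relying on regex alone.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def classify(text: str) -> set:
    """Return the sensitive-data categories detected in outbound text."""
    return {name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)}

# Screen a clipboard payload before it reaches an AI prompt field.
payload = "Customer SSN 123-45-6789, card 4111 1111 1111 1111"
detected = classify(payload)
if detected:
    print(f"Flagged paste: {sorted(detected)}")  # ['credit_card', 'ssn']
```

Even a sketch like this shows why classification must precede enforcement: without knowing which categories matter, there's nothing to match against at the moment of transfer.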

  3. Backdoor Vulnerabilities in AI-Generated Code

Developers who use AI coding tools are, in effect, inviting the model's training data into their codebase. AI coding assistants don't understand code in the way a senior engineer does: they pattern-match against what they've seen before. When that training data contains insecure patterns, the model can reproduce software vulnerabilities and introduce security risks without the developer recognizing what's happened. AI-generated code can contain subtle flaws or, in adversarial cases, intentional backdoors that compromise system integrity.
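To illustrate, here's a contrived Python example of the kind of subtle flaw a coding assistant can reproduce from its training data, alongside the fix a human reviewer should insist on. The table and function names are hypothetical:

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # A pattern assistants frequently reproduce: string interpolation
    # builds the SQL, so input like "x' OR '1'='1" returns every row
    # (classic SQL injection).
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # The reviewer's fix: a parameterized query keeps user input out
    # of the SQL statement itself.
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchall()
```

Both functions look equally plausible in a diff, which is exactly why AI-generated code warrants dedicated review.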

  4. Compliance and Regulatory Risk

Many organizations operate under regulatory frameworks that impose specific requirements on how certain categories of data can be processed and where they can be transmitted. GDPR, HIPAA, SOC 2, PCI DSS, and sector-specific regulations in financial services all have implications for AI usage. When employees submit regulated data to unsanctioned AI tools, they can create compliance violations that the organization wasn't aware of until an audit or incident surfaces them.

  5. Model Governance Risk

Beyond the data exposure dimension, shadow AI creates model governance risks for organizations adopting certain tools. Organizations have no visibility into how the models their employees are using were trained, what data they were trained on, who controls them, or how they handle submitted data. Some open-weight and non-U.S. models present additional governance concerns that compound this risk. These models create potential regulatory and legal exposure, supply chain and governance opacity, and amplified leakage risks. When AI tools are used without review, organizations are implicitly trusting model providers they've never vetted.

To put it simply, AI tools often carry more data risk than almost any other application employees use, yet far fewer security controls.

How to Detect Shadow AI

Detecting shadow AI requires visibility at the data layer, not just at the network perimeter. Traditional security tools weren't designed to observe what happens when an employee opens a browser, navigates to an AI tool, and pastes in a portion of a confidential document. Closing that gap requires a different approach.

  • Browser-level monitoring is one of the most effective mechanisms for shadow AI detection. A browser extension that can observe which applications employees are accessing, which accounts they're using to access them (corporate versus personal), and what data is being transferred provides visibility that network-layer tools can't offer.
  • Data lineage tracking extends detection further. By tracking how data moves from its point of origin through user activity and out to external destinations, security teams can identify when sensitive files or records are being submitted to AI tools, even when the transfer method is copy-paste rather than a file upload that a DLP rule might catch.
  • Application inventory and classification is also foundational. Security teams need to know which AI tools are in use across the organization before they can govern them. This requires discovering tools that employees are using independently, not just the tools IT has provisioned. The gap between those two lists is the shadow AI surface area.
  • Usage analytics and anomaly detection can surface concerning patterns, such as an individual submitting unusually large volumes of data to AI tools, or activity that involves data categories that shouldn't be going to external systems (a minimal sketch of this kind of baseline check follows this list).
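As a concrete illustration of that last point, here's a minimal sketch of a per-user volume baseline, assuming the organization already logs each user's daily AI-bound transfer volume. The seven-day window and z-score threshold are arbitrary illustrative choices:

```python
from statistics import mean, stdev

def volume_is_anomalous(history_bytes: list, today_bytes: int,
                        z_threshold: float = 3.0) -> bool:
    """Flag today's AI-bound data volume when it deviates sharply
    from the user's own recent baseline."""
    if len(history_bytes) < 7:        # not enough history to baseline
        return False
    mu, sigma = mean(history_bytes), stdev(history_bytes)
    if sigma == 0:                    # flat history: flag large jumps
        return today_bytes > 2 * mu
    return (today_bytes - mu) / sigma > z_threshold

# A user who normally sends ~50 KB/day to AI tools suddenly sends 5 MB.
history = [48_000, 52_000, 50_500, 49_200, 51_800, 47_900, 50_100]
print(volume_is_anomalous(history, today_bytes=5_000_000))  # True
```

Baselining each user against their own history, rather than a global average, keeps a naturally heavy AI user from drowning out a genuinely anomalous spike elsewhere.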

How to Prevent and Remove Shadow AI Risk: A Governance Framework

The goal isn't to eliminate AI usage. Employees will continue finding and using tools that make them more productive, and that's a competitive advantage organizations should want. The goal is to bring AI usage within a governance framework that controls what data flows to which tools under what conditions.

Build Visibility First

You can't govern what you can't see. Before any policy can be enforced, security teams need an accurate picture of which AI tools are in use, how often, by which teams, and with what types of data. This discovery process often reveals a much larger and more varied shadow AI ecosystem than leadership assumed.

Qualify and Expand the Sanctioned Tool List

Rather than defaulting to restriction, security teams should work with business stakeholders to vet and approve AI tools that meet the organization's security and compliance requirements. The sanctioned list should be regularly updated as the AI tool landscape evolves. A fast-moving approval process reduces the pressure that drives employees toward unsanctioned alternatives.

Apply Risk-Based Controls to Data Flows

Not all AI interactions carry the same risk. The framework should differentiate by tool risk level, data sensitivity, and the nature of the interaction. Submitting publicly available market research to a sanctioned enterprise AI tool is a different risk profile than uploading source code to a consumer-tier coding assistant through a personal account. Controls should be proportionate: warn users for lower-risk interactions, block transfers of highly sensitive data to unsanctioned tools, and give security teams the visibility to investigate patterns over time.

The specific interaction types that matter here are uploads, copy-paste activity, form submissions, and, increasingly, prompt and response monitoring for tools where that capability is available. Catching sensitive data at the point of transfer, before it leaves the organization, is far more effective than attempting to recover from a data exposure event after the fact.
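One way to picture proportionate control is as a small policy table keyed on tool risk tier and data sensitivity. The tiers, labels, and default-deny behavior below are illustrative assumptions, not a prescribed schema:

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"    # show the user a warning, let them proceed
    BLOCK = "block"  # stop the transfer at the point of egress

# Illustrative policy: (tool risk tier, data sensitivity) -> action.
# A real engine would also weigh interaction type (upload, copy-paste,
# form submission) and whether the account is corporate or personal.
POLICY = {
    ("sanctioned",   "public"):       Action.ALLOW,
    ("sanctioned",   "internal"):     Action.ALLOW,
    ("sanctioned",   "confidential"): Action.WARN,
    ("unsanctioned", "public"):       Action.WARN,
    ("unsanctioned", "internal"):     Action.BLOCK,
    ("unsanctioned", "confidential"): Action.BLOCK,
}

def decide(tool_tier: str, sensitivity: str) -> Action:
    # Deny by default for combinations the policy doesn't enumerate.
    return POLICY.get((tool_tier, sensitivity), Action.BLOCK)

print(decide("unsanctioned", "confidential").value)  # block
```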

Enforce Governance at the Account Level

Because personal account usage represents such a large fraction of AI tool interactions, governance policies need to account for it explicitly. Requiring the use of corporate accounts for AI tool access is a necessary baseline. Detecting and alerting when personal accounts are in use, including for sanctioned tools, closes the blind spot that makes personal account usage so consequential.
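In principle, detecting personal account usage can be as simple as comparing the signed-in identity against known corporate domains, assuming a browser-level sensor can observe that identity (not every tool exposes it). The domain below is hypothetical:

```python
CORPORATE_DOMAINS = {"example.com"}  # hypothetical corporate domain

def is_personal_account(signed_in_email: str) -> bool:
    """True when the identity signed in to an AI tool is not on a
    corporate domain."""
    domain = signed_in_email.rsplit("@", 1)[-1].lower()
    return domain not in CORPORATE_DOMAINS

print(is_personal_account("jdoe@gmail.com"))    # True  -> alert
print(is_personal_account("jdoe@example.com"))  # False -> governed
```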

Govern AI Usage Through Role-Based Access Controls

Not every employee needs access to every AI capability. Role-based access controls (RBAC) aligned to the job function allow organizations to grant appropriate access while limiting exposure. Developers may have access to AI coding tools under specific usage policies. Finance teams may be approved for AI-assisted analysis within controlled environments. Tailoring access to job functions reduces the data exposure surface while preserving the productivity benefits of AI adoption.
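In its simplest form, this is a role-to-capability mapping with deny-by-default behavior. The roles and capability names below are illustrative assumptions, not a standard taxonomy:

```python
# Illustrative role-to-capability map for AI tool access.
ROLE_AI_ACCESS = {
    "developer": {"coding_assistant"},
    "finance":   {"document_analysis"},
    "marketing": {"content_generation"},
}

def can_use(role: str, capability: str) -> bool:
    # Roles without an entry get no AI access by default.
    return capability in ROLE_AI_ACCESS.get(role, set())

print(can_use("developer", "coding_assistant"))  # True
print(can_use("finance", "coding_assistant"))    # False
```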

Establish Review Processes for AI-Generated Code

Any organization with software development teams should have a review process specifically for AI-generated code. Because AI coding tools can reproduce vulnerabilities from their training data, human review before production deployment is an important control. This applies equally to internally built AI tools and to AI-generated components incorporated into existing systems.

Create Clear Policies and Communicate Them

Policy without communication is theater. Employees need to understand what tools are approved, what data can and can't be submitted to AI tools, how to request approval for a new tool, and what the consequences of policy violations are. Well-communicated policies, combined with user-facing warnings that create friction at the moment of a risky action, are more effective than enforcement mechanisms that operate invisibly and surface violations only after the fact.

Safely Adopting AI at Scale

The risk of shadow AI isn't going to diminish as AI becomes more capable and more embedded in how work gets done. The second wave of AI adoption, marked by agentic tools, coding assistants, and workflow-integrated AI, makes the governance challenge harder, not easier. These tools operate at a layer closer to sensitive systems and data than the conversational AI tools that defined the first wave.

But the answer isn't to block all AI. Organizations that succeed in governing AI will be the ones that build the infrastructure to understand what's happening, enforce controls at the data layer, and make the sanctioned path the easy path for employees who want to use AI responsibly.

Cyberhaven is the AI and data security platform that uncovers shadow AI everywhere it hides, rates its risk, and stops sensitive data from leaking to unsafe AI tools, so enterprises can safely adopt AI at scale. The platform monitors data flows at the point of transfer, including uploads, copy-paste activity, and form submissions, and enforces risk-based controls that give security teams visibility and enforcement capability without requiring them to choose between productivity and protection.

Eliminating shadow AI entirely isn't a realistic goal. Containing the risk it creates, and building the governance infrastructure to manage it over time, is. When security teams work with the business to enable AI rather than simply block it, they create an environment where teams can move faster while minimizing the security risks that come from ungoverned adoption.

Learn more about how organizations are adopting AI, and what that means for security, in the Cyberhaven 2026 AI report.

Hear from experts about why cybersecurity is critical in the AI era in this Harvard Business Review and Cyberhaven webinar.