5 Ways Enterprise Data Leaks Through AI Tools (And How to Stop It)

Enterprise AI data leaks are not random. They take a small number of repeatable shapes, most of them documented in the public record, and each has a specific class of control that catches it. The five below are the patterns we are building Themisto Labs to handle. For each, we'll describe what the leak looks like in practice, why it keeps happening, and the property a control needs to have to actually stop it.

1. The summarize-this-for-me leak

The shape. An employee drops a document into a chat interface and asks for a summary. The document contains a customer record, an unsigned partner contract, or an internal strategy deck. The summary is good. The document is in a vendor's retention log.

Why it keeps happening. Summarization is the highest-utility knowledge-work use case for LLMs today. Four decades of software have trained people that copy-paste is safe. The cognitive gap between pasting into a word processor and pasting into a third-party inference pipeline has not closed.

What a control needs to do. Classify content before egress, not after. By the time the request reaches the model provider, the data has left the perimeter. The classifier has to sit on the device, in the path of the request, redact sensitive fields inline where possible, and block with a specific explanation where not. The explanation matters: people route around silent blocks. They comply with clear ones.
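The allow, redact, or block decision described above can be sketched in a few lines. This is a minimal illustration, not a description of any product's classifier: the two field patterns, the `BLOCKING_MARKERS` list, and the names `Verdict` and `check_before_egress` are all hypothetical, and a real classifier would rely on much more than regular expressions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns for fields that can be masked inline.
REDACTABLE = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical markers for content that masking a field cannot fix.
BLOCKING_MARKERS = ("strictly confidential", "not for distribution")


@dataclass
class Verdict:
    action: str             # "allow", "redact", or "block"
    payload: Optional[str]  # text allowed to leave the device; None when blocked
    explanation: str        # shown to the user so the decision is never silent


def check_before_egress(prompt: str) -> Verdict:
    """Decide what happens to a prompt before it is sent anywhere."""
    lowered = prompt.lower()
    for marker in BLOCKING_MARKERS:
        if marker in lowered:
            return Verdict(
                "block",
                None,
                f"Blocked: the text is marked '{marker}'. "
                "Remove that material or use an approved internal tool.",
            )

    redacted = prompt
    found = []
    for label, pattern in REDACTABLE.items():
        redacted, count = pattern.subn(f"[{label.upper()} REDACTED]", redacted)
        if count:
            found.append(f"{count} {label}")

    if found:
        return Verdict("redact", redacted, "Redacted before sending: " + ", ".join(found) + ".")
    return Verdict("allow", prompt, "")
```

The point of the `explanation` field is the one made above: a block or redaction that tells the user what was caught and what to do next is one they are more likely to accept than to route around.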

2. The stack-trace paste

The shape. An engineer debugging a production issue pastes the full stack trace into an AI assistant. The trace contains an internal hostname, an environment variable value, a connection string with credentials, or the contents of the request that triggered the error, which in turn contains customer data.

Why it keeps happening. Debugging is adversarial. More context produces better answers. Nobody manually redacts 200 lines of stack trace at 3am during an outage.

What a control needs to do. Automatic secret detection at the process layer, on by default. Not a policy document that asks people not to paste credentials. Actual pattern detection for AWS access keys, GCP service account JSON, JWTs, connection strings, PEM blocks. Redact inline before the request leaves. This is the single highest-yield control for a developer organization. A credential pasted into an AI tool has to be treated as compromised and rotated, whether or not anyone ever misuses it.
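As a rough sketch of what inline secret redaction looks like, the snippet below covers four of the formats named above. The patterns are simplified assumptions for illustration and will miss variants. GCP service account JSON is not matched as a whole, although the PEM pattern would catch the private key embedded in one. The names `SECRET_PATTERNS` and `redact_secrets` are hypothetical.

```python
import re

# Simplified, illustrative patterns. Order matters: larger structures first.
SECRET_PATTERNS = [
    ("pem_private_key", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL)),
    # scheme://user:password@host... (only URLs that embed credentials)
    ("connection_string", re.compile(
        r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@/]+@[^\s'\"]+", re.IGNORECASE)),
    # Three base64url segments, the first starting with the encoding of '{"'
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    # AWS access key IDs: a known prefix followed by 16 uppercase alphanumerics
    ("aws_access_key_id", re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
]


def redact_secrets(text):
    """Return (redacted_text, findings), where findings is a list of (label, count)."""
    findings = []
    for label, pattern in SECRET_PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            findings.append((label, count))
    return text, findings


trace = (
    "ConnectionError: could not connect to "
    "postgres://svc_app:hunter2@db-internal-01:5432/orders\n"
    "  env AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
)
clean, findings = redact_secrets(trace)
print(clean)
print(findings)
```

The error type and surrounding lines survive, so the assistant still has the context it needs to help. Only the credential-bearing spans are replaced.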

3. The prototyping-against-real-data leak

The shape. A product manager asks an AI assistant to build a churn-analysis tool. To ground the prompt, they paste a representative sample of the customer table: schema and a thousand real rows. The output is a useful notebook. The customer data is now in the vendor's pipeline.

Why it keeps happening. Prototyping against real data is faster than prototyping against synthetic data. It also surfaces bugs that synthetic data hides. Analysts and PMs have learned this the hard way.

What a control needs to do. Structured-data awareness in the classifier. Regex on strings is not enough. The classifier has to recognize when a block of pasted text is a table of customer records, which is a schema-detection problem closer to data profiling than keyword matching. When it detects one, the right default is to prompt the user: "This looks like 1,247 customer records. Redact identifiers before sending?" The safe path becomes the default, and the risky path becomes a conscious choice.
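A crude version of that profiling step might look like the following. It assumes the paste is a delimited table with a header row, and it flags identifier columns using a hypothetical list of header hints plus a check on what share of a column's values look like email addresses. The function names, the hint list, and the 0.8 threshold are illustrative assumptions. Real profiling would cover far more formats and value types.

```python
import csv
import re

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")
# Hypothetical header hints; a real profiler would look mostly at values.
HEADER_HINTS = ("name", "email", "phone", "address", "ssn", "customer_id")


def parse_table(text, min_rows=5):
    """Return (header, rows) if the text parses as a rectangular table, else None."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) < min_rows + 1:  # header plus at least min_rows records
        return None
    for delimiter in (",", "\t", ";", "|"):
        rows = list(csv.reader(lines, delimiter=delimiter))
        widths = {len(r) for r in rows}
        # Every row has the same number of columns, and more than one column.
        if len(widths) == 1 and widths.pop() >= 2:
            return rows[0], rows[1:]
    return None


def identifier_columns(header, rows, threshold=0.8):
    """Flag columns whose header or values suggest personal identifiers."""
    flagged = []
    for i, name in enumerate(header):
        values = [r[i].strip() for r in rows]
        email_share = sum(bool(EMAIL.match(v)) for v in values) / len(values)
        if email_share >= threshold or any(h in name.lower() for h in HEADER_HINTS):
            flagged.append(name.strip())
    return flagged


def egress_prompt(text):
    """Return a question for the user, or None if the paste does not look like records."""
    parsed = parse_table(text)
    if parsed is None:
        return None
    header, rows = parsed
    flagged = identifier_columns(header, rows)
    if not flagged:
        return None
    return (f"This looks like {len(rows):,} records with identifier columns "
            f"({', '.join(flagged)}). Redact them before sending?")
```

The decision is made on the shape of the whole paste (row count, column consistency, per-column value profile) rather than on any single string, which is what separates it from keyword matching.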

4. The agentic tool-call leak

The shape. A developer wires up an agent that can query internal systems: read the CRM, look up a user, pull billing history. The agent was intended for operational questions. Someone in a different department asks it to "pull the full account history for customer X" through a Slack integration. The agent does. The response is now in Slack history, in a shared channel, indexed by whichever third-party integrations are connected.

Why it keeps happening. Agentic workflows erase the lines that were drawn between production systems and collaboration tools. Authorization models for function calls were built assuming a human reads the result in a controlled surface. Agents route the same result through whatever channel is wired in.

What a control needs to do. Policy that follows the data, not the tool. When a record leaves the CRM, it carries a classification tag that survives serialization. When the agent tries to write that tagged data into Slack, a ticket, an email, or another LLM request, the governance layer applies the same policy that would have applied at the source. This is genuinely hard engineering and is where most of the interesting work in enterprise AI security will be done in the next few years.
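One way to picture a tag that survives serialization is a small envelope that travels with the record, checked again at every sink. The sketch below is an assumption-heavy illustration: the three classification levels, the sink names, the per-sink ceilings, and the names `Tagged` and `may_write` are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical classification levels, ordered from least to most sensitive.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical ceiling per sink: the most sensitive level each may receive.
SINK_CEILING = {
    "crm_ui": "restricted",
    "slack_shared_channel": "internal",
    "external_llm": "public",
}


@dataclass
class Tagged:
    value: dict          # the record itself
    classification: str  # assigned at the source system
    source: str          # where the record came from

    def to_json(self) -> str:
        # The tag is serialized alongside the data, not held in a side table.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Tagged":
        return cls(**json.loads(raw))


def may_write(record: Tagged, sink: str) -> bool:
    """Apply the source's policy at the destination, whatever the destination is."""
    ceiling = SINK_CEILING.get(sink, "public")  # unknown sinks get the strictest ceiling
    return LEVELS[record.classification] <= LEVELS[ceiling]


record = Tagged({"customer": "X", "billing": [120, 80]}, "restricted", "crm")
wire = record.to_json()            # what the agent's tool call returns
received = Tagged.from_json(wire)  # what the governance layer sees at the sink
print(may_write(received, "crm_ui"))                # True
print(may_write(received, "slack_shared_channel"))  # False
```

The hard part this sketch skips is everything in between: keeping the tag attached when the agent summarizes, merges, or rephrases the record before writing it out.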

5. The wrong-vendor leak

The shape. A team has an enterprise agreement with one AI vendor, zero-retention terms, full DPA. A new hire uses the same prompt, verbatim, against a different vendor's free tier because sign-up was faster. The prompt is fine under the enterprise terms. Under the free-tier consumer terms, the data is retained.

Why it keeps happening. The switching cost between AI vendors is near zero from a user perspective. The switching cost from a compliance perspective is enormous and invisible.

What a control needs to do. Model-level policy awareness. The governance layer has to know which model a request is going to and apply different rules based on the answer: block unapproved providers, route approved prompts through the right endpoints, redact on egress regardless of destination for the cases that need it. This also means the approved-provider list has to stay fresh. New model options appear weekly.
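A minimal sketch of destination-aware policy, assuming the governance layer can see the request URL and already knows whether the payload is sensitive. The registry contents, the placeholder hostnames, and the names `ProviderPolicy` and `route` are hypothetical. In practice the registry would be the approved-provider list that has to stay fresh.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ProviderPolicy:
    approved: bool        # covered by an enterprise agreement
    zero_retention: bool  # contract terms say prompts are not retained


# Hypothetical registry keyed by API hostname; hostnames are placeholders.
REGISTRY = {
    "api.vendor-a.example": ProviderPolicy(approved=True, zero_retention=True),
    "api.vendor-b.example": ProviderPolicy(approved=True, zero_retention=False),
}


def route(url: str, contains_sensitive: bool):
    """Return (action, explanation) for a request bound for the given URL."""
    host = urlparse(url).hostname or ""
    policy = REGISTRY.get(host)
    if policy is None or not policy.approved:
        return "block", f"{host or 'This destination'} is not an approved AI provider."
    if contains_sensitive and not policy.zero_retention:
        return "redact", f"{host} retains prompts, so sensitive fields are removed before sending."
    return "allow", ""
```

The same prompt gets three different outcomes depending only on where it is going, which is the wrong-vendor case in miniature: allowed to the zero-retention endpoint, redacted for the approved vendor that retains data, and blocked for anything not on the list.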

The common structure

Look at the five patterns together and the same requirement shows up in every one: classify the payload before it leaves, at the layer closest to the process.

Network-level tools do not see the content. SaaS-level tools do not see unapproved SaaS. Endpoint DLP was built for files, not prompts. The control that catches all five has four properties:

On-device. It sees the request before TLS encrypts it for transport.
Content-aware. It classifies what is in the payload, not just where it is going.
Model-aware. It applies different policy based on destination model and its retention terms.
Usable. Transparent when the request is fine, firm and specific when it is not.

That is the minimum viable shape for enterprise AI governance. It is not a theoretical bar. It is the feature set the five patterns above each independently demand. Any product claiming to cover this space that is missing one of the four has a corresponding set of incidents it will not catch.

This is what we are building at Themisto Labs. If any of the five shapes above describes something your team is already dealing with, the demo is 30 minutes and shows the control behaviour directly.