AI Governance · Data Privacy

Your Employees Are Using AI. The Real Question Is: Which Version?

By Chris Brown · Twin Networks · May 2026 · 8 min read
Part 1 of 3 — The AI Security Series

There's a question I've started asking business owners and executives during first conversations:

"Do you have a policy for which AI tools your employees are allowed to use?"

Most of the time, there's a pause. Then: "We've been meaning to address that."

That pause matters. Because your employees are probably already using AI. Not maliciously. Not recklessly. Not because they are trying to create a compliance problem. They are using it because it works.

A billing manager wants to clean up an email. An HR director wants help drafting a sensitive notice. A portfolio analyst wants to summarize notes. A lawyer wants a better structure for a memo. An architect wants to condense project correspondence into something usable.

The tool is fast. The output is good. The pressure to move faster is real.

But here's the part most firms have not fully thought through: the risk is not simply that employees are using AI. The risk is that they may be using the wrong AI tier with the wrong data. And in regulated industries, that distinction matters.

A free consumer AI account is not the same thing as an enterprise AI agreement. A browser-based assistant is not the same thing as an API with Zero Data Retention. A marketing claim is not the same thing as a Data Processing Agreement. And "we don't train on your data" is not the same thing as "we don't retain your data."

That's where the real governance problem begins.

The AI Privacy Problem Is Really a Tier Problem

The public conversation around AI privacy is confusing because most AI companies use the same brand name across very different products.

There is usually a consumer product. There may be a business workspace. There may be an API. There may be an enterprise contract. Each can have different rules for data retention, logging, model training, abuse monitoring, user controls, legal commitments, and sub-processors.

That means two people can say they are "using the same AI tool" and be operating under completely different risk profiles. One employee may be using a free app through a personal account. Another company may be using the same provider through an enterprise contract with a signed DPA, administrative controls, SSO, audit logs, data retention controls, and contractual restrictions on training.

Those are not the same thing. But to most employees, they look the same. That is the danger.

Perplexity Is a Good Example

Perplexity is a useful example because it shows how much the answer depends on which product surface is being used.

Perplexity's official Sonar API documentation describes a strict Zero Data Retention policy. Perplexity states it does not retain data sent through the Sonar API, does not use customer data to train models, and only keeps billing and usage metadata — token counts, model used, timestamps, and API key identifiers.
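To make the tier difference concrete, this is roughly what the API path looks like from the firm's side. It is a minimal sketch, not production code: the endpoint, the "sonar" model name, and the request shape follow Perplexity's public Sonar API documentation at the time of writing and should be verified against the current docs before anything is built on them.

import os
import requests

# Server-side call to the Sonar API, the surface covered by Perplexity's
# documented Zero Data Retention policy. The API key belongs to the firm,
# not to an individual employee's personal account.
API_URL = "https://api.perplexity.ai/chat/completions"

def summarize(text: str) -> str:
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "sonar",
            "messages": [
                {"role": "system", "content": "Summarize the following notes."},
                {"role": "user", "content": text},
            ],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

The design point is not the code itself but who holds the key and which terms govern the request: this call runs under the API agreement, with the retention commitments described above.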

That is a very different risk profile from an employee using the public Perplexity app in a browser or on a phone. Perplexity's consumer help center says search queries and feedback reports may be used to improve the search experience unless the user turns off the AI Data Usage setting.

Perplexity also offers enterprise protections, including commitments around enterprise data not being used to train or fine-tune models. But that does not mean every Perplexity product surface operates under Zero Data Retention. The consumer app, enterprise workspace, and Sonar API are different environments with different controls.

The question is not: "Is Perplexity safe?" The better question is: "Which Perplexity product are we using, under what agreement, with what data, and with what retention controls?" That is the question every regulated firm needs to be asking about every AI platform.

OpenAI, Google, and Anthropic Follow the Same Pattern

OpenAI has different data handling depending on whether someone is using consumer ChatGPT, a business or enterprise version, or the API platform. Business data from ChatGPT Business, Enterprise, Edu, and the API platform is not used to train models by default — but API data controls and retention depend on the endpoint, settings, and whether Zero Data Retention has been specifically approved.

Google's Vertex AI and Gemini environments have configuration-dependent privacy controls. Some features and logging behaviors must be configured or actively disabled to achieve Zero Data Retention-style outcomes. This is not something leadership should assume is automatically enabled just because the tool is from Google.

Anthropic offers Zero Data Retention for eligible Claude API use cases and commercial arrangements. API ZDR means customer data is not stored at rest after the API response is returned, except where needed for legal or misuse-related reasons. But consumer Claude accounts have their own terms, settings, and retention rules.

Same brand. Different product. Different contract. Different risk. That is the part most organizations are missing.

"We Don't Train on Your Data" Is Not the Same as Zero Data Retention

This is where vendor language gets slippery. A vendor can say "we don't train on your data" — and that may be true — without the data being zero-retained. Data may still be stored for abuse monitoring, debugging, analytics, product improvement, legal compliance, user history, session resumption, or administrative review.

For a regulated firm, the questions that actually matter are:

The right questions to ask every AI vendor

Where is the data stored?
How long is it retained?
Who can access it?
Is it available to human reviewers?
Can it be subpoenaed?
Is it covered by a DPA?
Are sub-processors bound to the same obligations?
Can administrators control retention?
Can the firm prove what happened after the fact?

That is the difference between using AI casually and using AI responsibly.

What Zero Data Retention Actually Means

Zero Data Retention (ZDR) is simple in principle. The AI provider processes the request, generates the response, and does not write the prompt or response to persistent storage. No prompt logs. No retained conversation content. No training use. No stored transcript sitting in a third-party system for 30, 60, or 90 days.

There may still be limited metadata for billing, security, or operational purposes. But the substance of the prompt and response is not retained. For some use cases, that distinction is critical.

Think about what flows through AI tools in a typical professional services environment in a single week: client financial data, legal strategy, employment matters, acquisition plans, project specifications, board communications, regulatory evidence. That is not generic text. That is confidential business data — and much of it carries legal, regulatory, contractual, or fiduciary obligations.

If that data goes into the wrong AI tool, the firm may have created exposure without anyone intending to do anything wrong. That is the modern version of shadow IT — except now the shadow system can read, summarize, transform, and retain sensitive information at scale.

The Real Risk Is Not Hackers. It Is Default Settings.

Most firms think about AI security the way they think about traditional cybersecurity. They imagine a breach. A hacker. A compromised server. A dramatic incident. Those risks are real. But the more immediate risk is usually much quieter.

It is the employee using a personal AI account to save time. The default setting nobody changed. The free tool nobody approved. The sensitive document uploaded to summarize a meeting. The client email pasted into a prompt because someone needed a cleaner version.

No one meant to disclose confidential information. But disclosure does not always require bad intent. Sometimes it only requires a convenient tool, a busy employee, and no policy.

What Good Looks Like: Start With Classification

A mature AI policy does not start with fear. It starts with classification. Not all AI use is equally risky — asking a tool to draft a public LinkedIn post is not the same as uploading a client tax return. Summarizing public research is not the same as analyzing deposition transcripts. The first step is to classify the data, then classify the tools, then match the right tool to the right data.

A practical data classification framework

Public data — Information already available publicly or approved for external use. Lower-risk AI tools may be acceptable.

Internal business data — Company information not intended for the public but not highly sensitive. Calls for business-tier tools with appropriate terms.

Confidential client data — Client records, financial information, private communications, contracts, regulated records, or privileged material. Requires enterprise tools with contractual protections, DPAs, and retention controls.

Restricted data — Protected health information, legal privilege, sensitive HR data, credentials, acquisition activity, or anything governed by strict contractual obligations. AI use requires a designed, controlled workflow — not ad hoc consumer tools.
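One way to make a framework like this operational is to write the mapping down explicitly instead of leaving it to judgment in the moment. The sketch below is illustrative only; the tier and tool names are placeholders for whatever taxonomy and approved tool list the firm actually adopts.

from enum import Enum

class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL_CLIENT = "confidential_client"
    RESTRICTED = "restricted"

class ToolTier(Enum):
    CONSUMER = "consumer_app"           # personal or free accounts
    BUSINESS = "business_workspace"     # business terms, no training by default
    ENTERPRISE = "enterprise_contract"  # signed DPA, admin controls, audit logs
    CONTROLLED = "controlled_workflow"  # designed pipeline, e.g. a ZDR API

# The policy itself: which tool tiers are acceptable for each data class.
APPROVED_TIERS = {
    DataClass.PUBLIC: {ToolTier.CONSUMER, ToolTier.BUSINESS, ToolTier.ENTERPRISE, ToolTier.CONTROLLED},
    DataClass.INTERNAL: {ToolTier.BUSINESS, ToolTier.ENTERPRISE, ToolTier.CONTROLLED},
    DataClass.CONFIDENTIAL_CLIENT: {ToolTier.ENTERPRISE, ToolTier.CONTROLLED},
    DataClass.RESTRICTED: {ToolTier.CONTROLLED},
}

def is_permitted(data_class: DataClass, tool_tier: ToolTier) -> bool:
    """Return True if the policy allows this class of data in this tool tier."""
    return tool_tier in APPROVED_TIERS[data_class]

Even if nothing ever calls this in software, writing the table forces the firm to classify both its data and its tools, which is the point of the exercise.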

What Leaders Should Be Asking

For regulated and professional services firms, AI governance should become part of the operating system of the business. The leadership team should be able to answer:

The governance checklist

Which AI tools are in use across the firm, and through which accounts?
Which product tier and contract govern each one?
What data goes into them, and is it classified before it does?
How long is that data retained, and who can access it?
Who is accountable, and can the firm prove what happened after the fact?

If the answers are unclear, that uncertainty is itself a risk signal.

The Contract Matters More Than the Marketing

This is where many firms get caught. They read a privacy page and assume they are protected. But a privacy page is not a contract. A marketing claim is not a negotiated obligation. A toggle in a personal account is not an enterprise governance framework.

For sensitive or regulated work, the minimum standard should include: a signed Data Processing Agreement, clear retention terms, restrictions on model training, administrative controls, SSO and identity management, audit logs, defined sub-processor obligations, SOC 2 Type II reports, and BAA support where healthcare data is involved.
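That minimum standard is easier to hold vendors to when it is tracked per vendor and per product surface as a simple record rather than as institutional memory. The sketch below simply restates the list above; the field names are illustrative, not a compliance framework.

from dataclasses import dataclass, fields

@dataclass
class VendorAssessment:
    vendor: str
    product_surface: str            # consumer app, business workspace, or API
    dpa_signed: bool
    retention_terms_documented: bool
    training_restricted: bool
    admin_controls: bool
    sso_and_identity: bool
    audit_logs: bool
    subprocessor_obligations: bool
    soc2_type2_report: bool
    baa_available: bool             # relevant only where healthcare data is involved

def gaps(assessment: VendorAssessment) -> list[str]:
    """List the minimum-standard controls this vendor is missing."""
    return [
        f.name
        for f in fields(assessment)
        if isinstance(getattr(assessment, f.name), bool)
        and getattr(assessment, f.name) is False
    ]

Reviewed at renewal time, the output of gaps() turns a vague sense that "we should look into this" into a specific list of contractual asks.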

One more thing worth saying plainly: "your data is encrypted" is not a privacy guarantee. Encrypted data retained for 30 days is still retained for 30 days. The encryption protects against a breach — it does nothing about the retention itself.

"The firms that win will not be the ones that ban AI. They will be the ones that build the governance rails early enough to use it confidently."

Your employees are using AI. The question is whether your firm is managing that reality or hoping nothing goes wrong. For regulated firms, professional services organizations, and companies built on trust, the standard has to be higher than convenience.

It is not enough to say "we use AI." The better standard is: "We know which AI tools we use, what data goes into them, what contracts govern them, how long the data is retained, and who is accountable."

That is what responsible adoption looks like. That is what clients will expect. That is what regulators will care about. And that is what operationally mature firms should already be building.

Governance built into the architecture, not bolted on after.

AI governance is already part of the Operational OS. If you want to understand what your current exposure looks like, let's have a conversation.

Start the conversation →