AI Governance · Data Privacy

Is Your AI Learning About You the Same Way Facebook Did?

By Chris Brown · Twin Networks · May 2026 · 9 min read
Part 2 of 3 — The AI Security Series

I want to ask you a question you probably haven't considered.

You know how Facebook and TikTok built billion-dollar empires by watching what you click, what you linger on, what you scroll past, who you follow? You know how their algorithms got so good at predicting behavior that a data firm — Cambridge Analytica — used Facebook's data to build psychological profiles on 87 million people? Profiles accurate enough to predict voting behavior, emotional vulnerabilities, and susceptibility to specific messages?

Here's the question:

Do you think AI companies are doing something similar? And if they are, would you even know?

I'm not going to tell you the answer. But I am going to lay out what the research actually shows — because this conversation is happening at the highest levels of academic and regulatory scrutiny, and most business leaders are nowhere near it yet.

What Facebook Knew vs. What AI Knows

Facebook's surveillance model was built on behavioral proxies. Clicks. Likes. Shares. Watch time. Friend relationships. What you said you were interested in versus what you actually engaged with.

It was remarkably effective. Cambridge Analytica's methodology used Facebook likes to model personality traits; the underlying research reportedly needed as few as 68 likes to predict attributes like sexual orientation and political affiliation with high accuracy. That was considered revolutionary. Alarming enough to produce congressional hearings, a $5 billion FTC fine, and a wave of privacy regulation still reverberating today.

Now consider what an AI chatbot sees.

Unlike a search bar, which captures short, fragmented queries, questions to AI chatbots are framed as if the person were talking to another human being. A single conversation might contain personal thoughts, emotional struggles, health concerns, financial fears, relationship dynamics, and professional anxieties — often in the same session.

Facebook tracked what you clicked on.

AI systems see how you think.

Every conversation generates what researchers call behavioral exhaust: which topics you linger on, how you phrase questions, what you revise before sending, when you abandon a search mid-query. These aren't stored simply as transcripts. They're stored as behavioral sequences — the exact data type Cambridge Analytica weaponized to map psychological vulnerabilities.
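
To make the abstraction concrete, here is a sketch of what one record in such a sequence could look like. The schema is an assumption invented for exposition; no provider publishes its actual format:

```python
from dataclasses import dataclass

# Illustrative only: a hypothetical record type for "behavioral exhaust."
# The field names are assumptions, not any vendor's real schema.
@dataclass
class ConversationEvent:
    topic: str            # inferred subject of the exchange
    dwell_seconds: float  # how long the user lingered before replying
    revisions: int        # edits made to the message before sending
    abandoned: bool       # whether the user quit mid-query

# A session is then an ordered sequence of events: not a transcript,
# but a behavioral trace that can be mined for patterns over time.
session = [
    ConversationEvent("health", dwell_seconds=41.0, revisions=3, abandoned=False),
    ConversationEvent("finances", dwell_seconds=12.5, revisions=0, abandoned=True),
]
```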

The comparison isn't rhetorical. It's increasingly literal.

The Intimacy Problem

There's a reason therapists have confidentiality laws. There's a reason attorney-client privilege exists. There's a reason you don't write certain things in email.

People intuitively understood those communications were sensitive. They calibrated what they shared based on context and consequences.

AI companions are more intimate than social media and better optimized for engagement, which makes people far more likely to volunteer personal information. In many ways, a conversation with an AI feels more private than anything you'd post on Facebook. There's no visible audience. No social judgment. No one in the room.

The AI companies building the models, on the other hand, see everything.

That last sentence deserves to sit for a moment.

The perceived privacy of talking to an AI — the feeling that it's just you and the tool — actually makes people more forthcoming than they'd be in almost any other digital context. People share things with AI they wouldn't put in an email, post on social media, or say to a doctor.

And on most consumer-tier AI platforms, all of that is governed by the platform's terms of service. Not by your privacy instincts. Not by professional ethics codes. Not by regulatory frameworks.

By a terms of service document almost nobody reads.

The Business Model Question

Social media's monetization model is well understood at this point. The product is attention. The customer is the advertiser. The raw material is user behavior data.

AI companies face the same structural pressure. Building and running large language models costs hundreds of millions of dollars. That investment has to be recovered somehow.

For consumer-facing AI products offered at low or no cost, the economics are familiar: if you're not paying meaningfully for the product, you are in some form the product. The question is just how, and through what mechanism.

OpenAI launched the Agentic Commerce Protocol in late 2025, letting users buy products from merchants directly inside ChatGPT conversations. They're also building "Sign in with ChatGPT" — an identity layer that would give them visibility into which apps users access, how often, and what they do there. At 600 million monthly active users, that level of behavioral data makes Google's advertising model look modest by comparison.

These aren't speculative concerns. They're documented product decisions by companies with investors to satisfy.

"The question isn't whether AI companies are building these systems. The question is whether your organization is going to develop a deliberate posture before regulators force the issue — or after."

What This Means for Professional Services Firms Specifically

For individual consumers, the implications are unsettling. For professionals in regulated industries, they're potentially catastrophic.

Think about what flows through AI tools in a typical firm over the course of a week.

A financial advisor uses AI to draft client communication referencing portfolio positions, risk tolerance, and retirement goals. A lawyer pastes deposition excerpts to structure a brief. An HR director inputs salary data and performance reviews to draft a termination letter. A business owner shares revenue projections and strategic plans to prepare for a board meeting.

Every one of those conversations is, in a sense, a behavioral and professional profile. Not just of the user — but of their clients, their organization, their strategy, and their vulnerabilities.

People with access to those conversations can gain detailed behavioral blueprints: personal habits, relationship dynamics, professional vulnerabilities, decision-making patterns. In the wrong hands, that data is a powerful tool for social engineering. And a private exchange with an AI could later surface in litigation and be read as incriminating, regardless of intent at the time.

On most free and consumer-tier AI platforms, there is no confidentiality agreement protecting any of that. The platform's privacy policy, which the platform can modify unilaterally, is the only thing standing between your professional intelligence and whatever the platform decides to do with it next.

A Different Architecture Exists

The intimacy problem and the business model problem share the same root cause: when you send data to someone else's infrastructure, you lose control of it. The cloud-based AI business model is structurally dependent on data flowing from users to providers. That's how models improve. That's how the compute costs get justified. That's how monetization eventually works — whether through subscriptions, commerce integrations, identity services, or advertising.

The alternative is to bring the AI infrastructure inside the perimeter.

Open-weight models have now reached performance levels that are genuinely competitive with the major commercial providers for most business use cases. When you run AI on your own infrastructure, your data stays in your environment. Your client records, internal documents, and strategic discussions never leave your network.
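
As a sketch of what "inside the perimeter" looks like in practice, here is a minimal example assuming an open-weight model served locally through Ollama. The endpoint, model name, and prompt are illustrative; the point is only that the request never leaves your network:

```python
import requests

# Minimal sketch: query a locally hosted open-weight model instead of a cloud API.
# Assumes Ollama is running on this machine with a model already pulled.
resp = requests.post(
    "http://localhost:11434/api/generate",  # local inference server, not a vendor endpoint
    json={
        "model": "llama3.1",                # any open-weight model you have installed
        "prompt": "Summarize this client memo: ...",
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])              # the completion, generated entirely in-house
```

The same pattern scales up: the client memo in the prompt, and everything the model infers from it, stays on hardware you control.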

The tradeoff is real. Self-hosting requires infrastructure investment, technical expertise, and ongoing maintenance. For many organizations, enterprise-tier cloud AI with proper contractual protections is the more practical path. But for regulated businesses handling sensitive client data — law firms, financial advisory firms, healthcare organizations — the question is worth asking seriously.

At what point does the productivity benefit of cloud AI stop justifying the data exposure?

Most organizations haven't run that calculation. They've defaulted to the free tool because it was convenient.
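
Running it doesn't require anything exotic. Even a crude routing rule, sketched below, forces the question of which prompts are allowed to leave the perimeter at all. The patterns and the two-tier routing are illustrative assumptions, not a complete data-loss-prevention system:

```python
import re

# Illustrative gate for "appropriate vs. inappropriate data flows": scan an
# outbound prompt for obviously sensitive patterns and route it to self-hosted
# AI when anything matches. The patterns here are toy examples.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US Social Security numbers
    "account_number": re.compile(r"\b\d{10,16}\b"),     # long numeric account IDs
    "salary_figure": re.compile(r"\$\d{2,3},\d{3}\b"),  # dollar amounts like $85,000
}

def route_prompt(prompt: str) -> str:
    """Return 'internal' if the prompt must stay on self-hosted AI, else 'cloud'."""
    for pattern in SENSITIVE_PATTERNS.values():
        if pattern.search(prompt):
            return "internal"  # sensitive match: keep it inside the perimeter
    return "cloud"

print(route_prompt("Draft a termination letter; salary is $85,000."))  # -> internal
print(route_prompt("Summarize the attached press release."))           # -> cloud
```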

The Lesson From the Last Decade

We had a decade of social media before we fully understood what Facebook and TikTok were actually building. The product. The business model. The psychological optimization. The political and social consequences. By the time the Cambridge Analytica story broke, hundreds of millions of people had already handed over the raw material for extraordinarily detailed profiles.

The regulation followed. The lawsuits followed. The congressional hearings followed.

We are right now in the early innings of the same story with AI. The data being collected is more intimate. The profiling capability is more sophisticated. The business model pressures are identical.

The firms that get ahead of this — that build governance frameworks, that distinguish between appropriate and inappropriate data flows, that start evaluating private AI infrastructure for their most sensitive use cases — will be in a fundamentally different position than the firms that don't.

The lesson from the social media era is clear.

By the time the story breaks, the data is already gone.


Governance built into the architecture, not bolted on after.

If you want to understand what your firm's current AI exposure looks like — and what a governed adoption model looks like in practice — let's have a conversation.
