The Three Ways AI Quietly Fails Your Business

Most AI-risk writing fixates on science fiction. The risks that actually cost businesses money are smaller, quieter, and easier to prevent if you know what to ask. Three failure modes worth understanding before you ship.

March 7, 2026 · 5 min read

Most of what gets written about AI risk is some variation of "the robots are coming" or "the robots will take your job," neither of which is useful when you're trying to decide whether to ship a chatbot next quarter.

The real risks of running AI in your business are smaller, more boring, and a lot more likely to actually happen. They're also easier to prevent if you know what to look for. So let's skip the science fiction and talk about three failure modes that have actually cost real companies real money over the last couple of years.

1. Confidently wrong

This is the big one, and the one most people underestimate. AI doesn't know when it doesn't know. When it lacks the right answer, it will produce a confident, well-written wrong one — and it will sound exactly as plausible as a right one.

What it looks like in the wild: a chatbot tells a customer they're entitled to a refund that company policy doesn't actually offer. A virtual assistant references a feature your product doesn't have. An agent quotes a discount code that was never created. Lawyers have shown up in court citing cases their AI tool fabricated. (More than once. In case you thought it was a one-off.)

One airline was taken to a small-claims tribunal after its chatbot promised a customer a bereavement-fare refund that didn't match the airline's actual policy. The tribunal held the airline to what the chatbot said. That's the lesson worth understanding: your AI's promises are your promises.

How to spot whether someone has thought about this: Ask how they test the AI against known-right answers, what happens when it doesn't know something, and where it's not allowed to answer on its own. If the answer is some version of "we just trust the model," walk away.
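If you want a picture of what a specific answer to those questions might look like, here is a minimal Python sketch. Everything in it is hypothetical: the golden set, the restricted topics, the fallback wording, and the stub standing in for a real model call. It also assumes the model can signal "no answer" by returning `None`, which is the hard part in practice, since a real model will often guess instead.

```python
from typing import Callable, Optional

# Hypothetical golden set: each question is paired with a phrase
# that a correct answer must contain.
GOLDEN_SET = [
    ("What is the refund window?", "30 days"),
    ("How long is the warranty?", "12 months"),
]

# Hypothetical topics the assistant is never allowed to answer on its own.
RESTRICTED_TOPICS = ("refund exception", "discount code")

FALLBACK = "I'm not sure. Let me connect you with a person."

# A model here is any function from a question to an answer, or None.
Model = Callable[[str], Optional[str]]


def guarded_answer(question: str, model: Model) -> str:
    """Hand restricted topics and unknowns to a human instead of guessing."""
    lowered = question.lower()
    if any(topic in lowered for topic in RESTRICTED_TOPICS):
        return FALLBACK
    answer = model(question)
    return answer if answer else FALLBACK


def run_golden_set(model: Model) -> list:
    """Return the questions whose answers miss the expected phrase."""
    failures = []
    for question, expected in GOLDEN_SET:
        answer = guarded_answer(question, model)
        if expected.lower() not in answer.lower():
            failures.append(question)
    return failures


def stub_model(question: str) -> Optional[str]:
    """Stand-in for a real model call; knows one answer and nothing else."""
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
    }
    return canned.get(question)


failures = run_golden_set(stub_model)
```

With the stub above, `failures` contains the warranty question, because the stub has no answer for it and the fallback doesn't mention "12 months". That's the kind of gap you want a test to surface before a customer does.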

2. Helpful to the wrong person

The second failure mode is data leakage, and it usually doesn't look the way the news makes it sound. There's no hacker, no dramatic breach. There's just an AI assistant that was given access to too much, asked an innocent-sounding question, and helpfully pulled information into a response that the person asking shouldn't have seen.

What that looks like: a contractor asks an internal assistant a generic question and it cheerfully includes a snippet from the HR file with employee compensation. A customer-service bot wired into an internal knowledge base accidentally surfaces a private pricing memo. A "summarize this folder" tool summarizes the folder it should have been blocked from.

AI has no intuition about who should see what. If it has access to a piece of data, it will use it the moment it seems relevant. The fix is unglamorous: the AI's permissions have to mirror the user's permissions, every single time, with no exceptions.

How to spot whether someone has thought about this: Ask whether the AI can ever access data the requesting user couldn't access on their own. The right answer is "no, never." Ask whether there's a log of what it touched. The right answer is "yes, and we can show you on demand."
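For a sense of what "permissions mirror the user's" can mean in practice, here is a rough Python sketch. The `Document`, `User`, and `Retriever` names, the group labels, and the substring search are all made up for illustration; a real system would lean on whatever access-control system the business already has. The idea is simply that the permission check runs before the assistant ever reads the text, and that every document it touches gets logged.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to open this document


@dataclass
class User:
    name: str
    groups: frozenset


@dataclass
class Retriever:
    documents: list
    access_log: list = field(default_factory=list)

    def search(self, user: User, query: str) -> list:
        """Return matching documents the user could already open on their own."""
        results = []
        for doc in self.documents:
            # Permission check comes first, so blocked text is never read.
            if not (doc.allowed_groups & user.groups):
                continue
            if query.lower() in doc.text.lower():
                results.append(doc)
                # Record who touched what, so it can be shown on demand.
                self.access_log.append((user.name, doc.doc_id))
        return results


docs = [
    Document("handbook", "Vacation policy: 20 days per year.",
             frozenset({"staff", "contractors"})),
    Document("hr-comp", "Compensation policy: salary bands by level.",
             frozenset({"hr"})),
]
retriever = Retriever(docs)
contractor = User("sam", frozenset({"contractors"}))
hits = retriever.search(contractor, "policy")
```

Both documents match the word "policy", but the contractor only gets the handbook back, and the log shows exactly one access. The compensation file never reaches the assistant, so there's nothing for it to helpfully include.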

3. The model changed underneath you

This is the failure mode that gets the least attention and bites the hardest. You build something that works beautifully. You launch it. Six months later, the AI provider quietly updates the model — or your team upgrades because the new version is cheaper or faster — and the same inputs now produce subtly different outputs. Sometimes wrong ones. You don't notice until customers complain.

What that looks like: a tool that summarizes contracts starts missing the indemnification clause it used to catch. An automated email responder that used to sound like your brand starts sounding like a generic call center. A categorization system starts putting the wrong things in the wrong buckets.

The model is a moving target. The version of the AI you tested in March isn't necessarily the version answering questions in October. Without something watching for that drift, you'll only find out about it after it's already affected people.

How to spot whether someone has thought about this: Ask which version of which model the work runs on, whether it's pinned, and what the plan is for catching changes in behavior. If the answer is a long pause, that's your answer.
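A good answer here tends to have two parts: a pinned model version, and a small set of regression cases that get re-run whenever anything changes. A minimal Python sketch, under loud assumptions: the model identifier is invented, the two summarizers are stubs standing in for real model calls, and a single "must mention" phrase per case is a much cruder check than a real evaluation would use.

```python
from typing import Callable

# Hypothetical pinned identifier: a dated snapshot, not a floating "latest" alias.
PINNED_MODEL = "example-model-2026-03-01"

# Hypothetical regression cases: (case id, input text, phrase the output must mention).
REGRESSION_CASES = [
    ("contract_a",
     "The supplier shall indemnify the buyer against all claims.",
     "indemnif"),
]

# A summarizer takes a model identifier and a text, and returns a summary.
Summarizer = Callable[[str, str], str]


def find_drift(summarize: Summarizer, model: str = PINNED_MODEL) -> list:
    """Return the ids of cases whose output no longer mentions the required phrase."""
    drifted = []
    for case_id, text, must_mention in REGRESSION_CASES:
        summary = summarize(model, text)
        if must_mention.lower() not in summary.lower():
            drifted.append(case_id)
    return drifted


def march_summarizer(model: str, text: str) -> str:
    """Stub for the behavior that was tested at launch."""
    return "Includes an indemnification clause protecting the buyer."


def october_summarizer(model: str, text: str) -> str:
    """Stub for the behavior after a quiet model update."""
    return "A supply agreement between two parties."


before = find_drift(march_summarizer)
after = find_drift(october_summarizer)
```

Here `before` is empty and `after` flags `contract_a`: the same input, a different answer, and a check that notices before a customer does. Run on a schedule and on every upgrade, something like this turns the long pause into a short report.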

What "thinking about this" actually looks like

These aren't theoretical risks. They're the most common ways AI projects go from "look how cool this is" to "we're in a meeting with our lawyer." None of them are exotic, and none of them require a giant security team to prevent — but they do require somebody on the project who has either been burned before, or who has been paying close attention to the people who were.

That's most of what you're hiring for when you bring in a partner for an AI project. The cleverness is the cheap part now. The careful part is what keeps your business out of trouble.

If you're evaluating a partner and you want a quick test, ask them about the three things above. The right answers are specific and a little bit boring. That's how you know.