Most of the AI integration conversations we have start the same way: someone has decided the product needs AI. The brief arrives in the shape of a feature ("an AI assistant for our dashboard"), not a problem ("our users can't find anything"). That ordering is the source of more wasted engineering effort than any other pattern we see in AI integration work right now.
The right question isn't "can we add AI here?" It's "what does the failure mode cost?" AI is a probabilistic tool. Some problems tolerate that; some emphatically don't.
Where AI does pay off
Three patterns where AI integration tends to deliver real, durable value.
Narrowly scoped tasks with a tolerant audience
Summarization, classification, drafting, and search-over-unstructured-content are the consistent winners. They're narrow enough that you can evaluate quality, and the user is already doing the work in their head — AI just speeds it up. A 90% correct summary is better than no summary; a 90% correct database query is dangerous.
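That asymmetry can be made concrete with a confidence gate: in a tolerant task like ticket classification, the model's guess ships only when it clears a threshold, and everything else falls back to a human queue. A minimal sketch — `model_classify` is a hypothetical stand-in for whatever inference call you actually use, and the threshold is illustrative:

```python
# Confidence gating for a tolerant task. A shaky guess costs a human a few
# seconds of review; it never silently becomes the answer.
from dataclasses import dataclass

@dataclass
class Classification:
    label: str
    confidence: float

def model_classify(text: str) -> Classification:
    # Placeholder: imagine an LLM or classifier call here.
    return Classification(label="billing", confidence=0.92)

def route(text: str, threshold: float = 0.8) -> str:
    result = model_classify(text)
    if result.confidence >= threshold:
        return result.label   # good enough for a tolerant audience
    return "human_review"     # never auto-apply a low-confidence guess

print(route("My invoice is wrong"))  # → billing
```

The same gate pattern is what makes the "90% correct" math work: the 10% lands in a review queue instead of in front of the user.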
Augmenting an existing workflow
The integrations that get adopted are the ones that fit into a workflow people already do, not ones that ask them to learn a new way of working. A "draft a reply" button next to an inbox is used. A floating chat assistant that wants to reorganize the user's process is not.
Where the alternative is "do nothing"
A surprising amount of valuable AI work replaces no work happening at all. Tagging support tickets that nobody had time to tag. Drafting first-pass meeting notes that nobody was writing. Surfacing patterns in feedback nobody was reading. The bar for these is low because the comparison isn't a careful human — it's an empty queue.
Where it usually isn't the right tool
Three patterns where an AI integration is more likely to cost you than help you.
Replacing deterministic logic with probabilistic logic
If the existing system is rules-based — pricing, eligibility, routing, validation — replacing it with an LLM trades certainty for vibes. The cases where this is the right call are rare. Most teams who try this end up rebuilding the rules engine inside the prompt, which is the worst of both worlds: rule logic without the auditability of code, plus a per-call inference bill.
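The auditability point is easiest to see side by side. A hypothetical eligibility check as plain code is a handful of lines where every branch is testable, diffable, and returns the same answer every time; the same logic restated in a prompt gives up all three and adds a per-call inference bill. A sketch, with made-up rules:

```python
# A hypothetical eligibility check as deterministic code. The returned
# reason string IS the audit trail — something a prompt can't guarantee.
from dataclasses import dataclass

@dataclass
class Applicant:
    age: int
    country: str
    account_age_days: int

def eligible(a: Applicant) -> tuple[bool, str]:
    if a.age < 18:
        return False, "under minimum age"
    if a.country not in {"US", "CA", "GB"}:
        return False, "unsupported country"
    if a.account_age_days < 30:
        return False, "account too new"
    return True, "all rules passed"

decision, reason = eligible(Applicant(age=17, country="US", account_age_days=90))
print(decision, reason)  # → False under minimum age
```

Teams that move this into a prompt end up writing the same three rules as English sentences, then writing evals to check the model follows them — which is just a slower, flakier test suite for the code they deleted.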
Decisions where errors are expensive
Anything that affects money, identity, eligibility, medical or legal outcomes, or safety should not have an LLM in the decision path. Use AI to prepare information for the human or system making the decision, not to make the decision itself. The best AI integrations in regulated domains are the ones that look the most boring from the outside — they help humans read faster, not approve faster.
When the audit trail matters
If you'd struggle to explain why the system did what it did six months from now, you have a problem. LLM outputs change with model updates, prompt drift, and context window changes. For workflows that need a stable, defensible decision record — compliance, hiring, anything customer-facing with a dispute path — deterministic systems give you that for free. AI doesn't.
A simple decision filter
When a team brings us an AI integration brief, we usually walk through three questions:
- What does the failure mode cost? If the answer is "the user laughs and tries again", AI is probably fine. If it's "the user gets the wrong refund" or "the support agent misses a regulated disclosure", probably not.
- What's the alternative we're comparing against? "AI vs nothing" is a much easier bar than "AI vs a rules engine that already works".
- What does the human do with the output? If the human reviews, edits, and ships — AI is a force multiplier. If the human rubber-stamps — the human will eventually stop reviewing, and the AI is now in the decision path whether you intended that or not.
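The three questions above can be sketched as a blunt checklist — a sketch, not a scoring model, and every category name here is illustrative rather than a real taxonomy:

```python
# The triage filter as a checklist. Deliberately crude: any one red flag
# is enough to send the brief back for rethinking.
def ai_triage(failure_cost: str, alternative: str, human_role: str) -> bool:
    """Return True if AI looks like a reasonable tool for this brief.

    failure_cost: "annoyance" | "money" | "regulated"
    alternative:  "nothing" | "working_system"
    human_role:   "reviews_and_edits" | "rubber_stamps"
    """
    if failure_cost in {"money", "regulated"}:
        return False  # errors are expensive: keep the LLM out of the decision path
    if alternative == "working_system":
        return False  # "AI vs a rules engine that already works" is a hard bar
    if human_role == "rubber_stamps":
        return False  # the reviewer will stop reviewing; AI ends up deciding
    return True

# "Tag untagged tickets, a human edits the result" passes:
print(ai_triage("annoyance", "nothing", "reviews_and_edits"))  # → True
```

In practice the second question is a harder bar rather than an automatic no, but treating it as a no in the first pass is a useful forcing function.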
If you can answer those three honestly and AI still looks like the right tool, it probably is. The integrations that get to production and stay there are the ones where the team did this triage early, not the ones where someone decided to "add AI" and worked backwards from there.
What this looks like in practice
The work we end up doing in AI Integration tends to be smaller and more boring than the brief described. A focused RAG over a real document set, a classifier for an existing queue, a "draft this" affordance next to a workflow that already exists. The big agentic features that make for good demos rarely make for good product — and when they do, it's because someone earlier in the process did exactly this kind of triage and threw out the parts that didn't pass it.
If you're in the early stages of an AI feature and you're not sure which side of this line it falls on, that's the conversation worth having before the engineering starts — not after.

