Stop Trying to Automate the Whole Workflow

[Figure: AI embedded at a specific point in a business workflow]

If you’ve approved an AI initiative in the last two years, there’s a reasonable chance you signed off on something more ambitious than it needed to be. The aspiration probably wasn’t wrong. The conversation simply reached for orchestration and end-to-end automation before asking a simpler question. You’re not alone in this. The organizational forces that produce those decisions are predictable, and understanding them is more useful than relitigating the approval.

There is a simpler pattern that produces faster results, lower risk, and more durable improvement in the large and underserved class of cases where a working process already exists and has a clear expensive bottleneck. It gets skipped constantly, not because it doesn’t work, but because it doesn’t look impressive enough to justify the conversation.

The pattern is called the Embed. The idea is straightforward: find an existing business process that works, identify the one step where it is consistently slow, expensive, or unreliable, and add AI capability at that step. Everything else stays the same. The workflow structure doesn’t change. The roles don’t change. The accountability doesn’t change. The AI makes one step better, and the human still makes the consequential determination.

What the Embed Actually Does

To understand why this matters, it helps to be precise about what “embedded” means in practice.

An Embed does not wrap a workflow in AI. It does not replace a decision or a role. It improves the quality of information available at a specific point in a process before the human does their job. A contract review that took a lawyer three hours to summarize now takes thirty minutes because the AI has already extracted the key clauses, flagged the non-standard terms, and organized the open issues. The lawyer still reviews. The lawyer still signs off. The AI changed the cost of the input, not the structure of the accountability.

That specificity is the pattern’s power. The AI’s contribution can be verified before it affects any outcome. If the extraction missed something or flagged the wrong terms, the lawyer sees it before it matters. The feedback that follows — accepted, corrected, overridden — becomes the primary improvement signal over time.

Organizations that run the Embed well often find that those human corrections do more for the system’s quality than the next model upgrade. The loop is doing the work — but only if it’s maintained. Feedback loops have a structural blind spot: humans tend to scrutinize flagged or uncertain outputs carefully while rubber-stamping the ones the AI returns with high confidence. Over time, this creates systematic under-sampling of cases where the model is quietly wrong. Random sampling of accepted outputs, not just flagged ones, is what prevents the loop from drifting.
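One way to make that sampling concrete is sketched below. It is a minimal illustration, not a prescribed implementation: the `ReviewedOutput` record, the `sample_accepted_for_audit` function, and the 10% sample rate are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class ReviewedOutput:
    case_id: str
    flagged: bool         # True if the AI marked this output as uncertain
    reviewer_action: str  # "accepted", "corrected", or "overridden"


def sample_accepted_for_audit(outputs, sample_rate=0.05, seed=None):
    """Pick a random subset of unflagged, accepted outputs for a second look."""
    rng = random.Random(seed)
    quiet_accepts = [
        o for o in outputs
        if not o.flagged and o.reviewer_action == "accepted"
    ]
    if not quiet_accepts:
        return []
    # Always audit at least one case so the check never silently lapses.
    k = max(1, round(len(quiet_accepts) * sample_rate))
    return rng.sample(quiet_accepts, min(k, len(quiet_accepts)))


# Illustrative data: 10 flagged outputs and 90 unflagged ones that were accepted.
outputs = (
    [ReviewedOutput(f"f{i}", True, "corrected") for i in range(10)]
    + [ReviewedOutput(f"a{i}", False, "accepted") for i in range(90)]
)
audit_batch = sample_accepted_for_audit(outputs, sample_rate=0.10, seed=7)
print(len(audit_batch), "accepted outputs selected for a second review")
```

The point of the design is that the audit batch is drawn only from the outputs nobody questioned, which is exactly where a quietly wrong model would otherwise go unnoticed.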

Why Organizations Skip It

The pattern is well understood in the abstract. In practice, it consistently loses to more complex alternatives, and the reason is not technical.

Organizational incentives do not reward it. An AI specialist brought in to solve a business problem is looking for a solution that uses their expertise. A vice president approving a multi-quarter AI initiative wants a result that looks commensurate with the investment. A vendor presenting a solution has a revenue model that depends on scope. The Embed does not generate a satisfying slide deck. It looks like a partial answer.

So organizations approve architectures that are more ambitious. These proposals are often presented in the language of transformation: end-to-end automation, adaptive orchestration, autonomous processing of high volumes. The roadmap is compelling. The business case models significant efficiency gains.

The problem is that the more ambitious architecture arrives with assumptions built in. It assumes that the data is clean and representative. It assumes there are people on staff who can maintain an orchestration layer. It assumes the organization can move fast enough on integration to keep the system current. These assumptions often go unexamined during approval because the proposal is evaluated against an aspiration, not against the actual state of the organization’s data infrastructure and operating capacity.

The Failure Mode That Doesn’t Announce Itself

When the assumptions don’t hold — and they often don’t — the failure is slow and quiet rather than sudden and visible.

A financial services firm spent fourteen months and roughly $2.8 million building an automated underwriting orchestration system intended to handle the full case load. It was designed to ingest applications, pull external data, run credit assessments, and route decisions with minimal human review. Document review — the step where analysts read through income statements, employment records, and edge-case documentation — was the binding bottleneck. It was where analyst time concentrated, where throughput constraints accumulated, and where the process slowed most visibly. The system was designed to bypass that step entirely.

In testing, it performed well on clean cases. In production, the distribution shifted. Applications with unusual income structures, recent employment gaps, or edge-case documentation patterns created routing problems the system hadn’t been designed to handle gracefully. Part of this was a pattern-choice problem: a more complex architecture has more surfaces on which execution can fail, and more places where the absence of graceful degradation creates hard stops. Part of it was execution: no phased coverage targets, no fallback paths for cases outside the training distribution, and no production monitoring of how the live case mix differed from what the system had been tested on. Both failures are real. But the execution failures only became consequential because the architecture had no margin for them.
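To show what a fallback path and a basic case-mix monitor might look like, here is a rough sketch. Everything in it is an assumption made for illustration: the case-type labels, the `route`, `route_shares`, and `below_target` functions, and the 50% coverage target are hypothetical, and the split of the live mix into 40 and 32 non-standard cases is invented so the example has something to count.

```python
from collections import Counter

# Hypothetical label for the case types the system was actually tested on.
TESTED_CASE_TYPES = {"standard"}


def route(case_type, tested_types=TESTED_CASE_TYPES):
    """Send any case type outside the tested set to the existing manual process."""
    return "automated" if case_type in tested_types else "manual_review"


def route_shares(live_case_types, tested_types=TESTED_CASE_TYPES):
    """Return the share of live cases landing on each route."""
    counts = Counter(route(t, tested_types) for t in live_case_types)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def below_target(shares, target_automated_share):
    """True when automated coverage is under the phased target for this period."""
    return shares.get("automated", 0.0) < target_automated_share


# Illustrative live mix of 100 applications, 28 of them the tested type.
live_mix = ["standard"] * 28 + ["unusual_income"] * 40 + ["employment_gap"] * 32
shares = route_shares(live_mix)
print(shares, "below target:", below_target(shares, target_automated_share=0.5))
```

Nothing here is sophisticated. The useful property is that an unfamiliar case degrades to manual review instead of stalling, and the gap between tested and live case mix shows up as a number someone can watch.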

Nine months after launch, the system was handling about 28% of application volume. The remaining 72% required manual handling. The manual process that had been running before the system launched — slower, more expensive, but complete — had no fallback problem. It was the fallback. The organization had paid to replace a process that handled 100% of cases with a system that handled 28%, and was now running both in parallel.

The Embed they had declined to build, AI at the underwriting analyst’s document-review step, would have required a fraction of the investment, kept coverage at the 100% of cases the manual process already handled, and been operational within a quarter.

The Question Worth Asking First

I don’t think organizations overbuild because they are irrational. They overbuild because the corrective question usually doesn’t get asked early enough.

The question is: what would happen if you embedded AI at the most expensive bottleneck in this process and left everything else alone?

Start there and work outward. How much value does that generate? How quickly could it be operational? Who needs to be trained, and on what? What specifically would need to be true for the Embed to be insufficient — and can that limitation be named concretely? If the limitation is “this process needs to scale past what human review can handle,” that is a specific condition that may justify a more complex architecture. If the limitation is “we feel like this problem is bigger than that,” the Embed is probably the right answer, and the instinct toward complexity is worth examining rather than following.

The diagnostic also works in the other direction. When embedding AI at the most expensive step generates little value, that answer is equally useful — it usually means the cost is distributed across many steps rather than concentrated in one, and the honest response may be process redesign rather than AI. Or it means the bottleneck isn’t AI-addressable at all, which is information worth having before a fourteen-month build. The question is designed to discriminate, not to confirm.

To understand what “a concretely nameable constraint” looks like when it actually exists: a fulfillment operation processing 400,000 orders per day with value concentrated in anomaly detection — flagging the 0.3% of orders that signal fraud or supplier failure — has a constraint the Embed can’t satisfy. The volume is roughly 40 times what a human review team could cover, and the value is concentrated precisely in the cases no analyst would ever see. That’s not a process with an expensive bottleneck that AI can improve. That’s a signal-detection problem at a scale that requires a different architecture from the start. The diagnostic surfaces that distinction; it doesn’t produce the Embed as the answer to every question.
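The arithmetic behind that example is worth writing out. The sketch below uses only the figures stated above, plus one labeled assumption of my own: that a human team at full capacity would review orders chosen uniformly at random.

```python
ORDERS_PER_DAY = 400_000
ANOMALIES_PER_THOUSAND = 3      # 0.3% of orders signal fraud or supplier failure
VOLUME_TO_CAPACITY_RATIO = 40   # volume is roughly 40x what reviewers could cover

# Orders per day that actually carry the signal.
anomalies_per_day = ORDERS_PER_DAY * ANOMALIES_PER_THOUSAND // 1000

# Implied ceiling on human review, taken from the 40x ratio.
review_capacity_per_day = ORDERS_PER_DAY // VOLUME_TO_CAPACITY_RATIO

# Assumption: reviewers see a uniform random slice of orders, so they
# encounter anomalies at the same 0.3% base rate.
expected_anomalies_reviewed = review_capacity_per_day * ANOMALIES_PER_THOUSAND // 1000

share_of_anomalies_reviewed = expected_anomalies_reviewed / anomalies_per_day

print(anomalies_per_day, review_capacity_per_day, expected_anomalies_reviewed)
print(f"{share_of_anomalies_reviewed:.1%} of anomalous orders would reach a reviewer")
```

Under those assumptions, roughly 1,200 orders a day matter, a review team could look at about 10,000 orders, and only about 30 of the 1,200 would ever land in front of a person. That is the sense in which no step in the human process is available to embed into.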

The diagnostic works because it separates what the process actually needs from what the architecture aspires to deliver. Anything more ambitious than the Embed requires a real reason — not a vision, but a specific constraint that the Embed can’t satisfy.

The Architecture Question Underneath

Choosing the simpler architecture in a room full of people who want to build something ambitious is not the easy call.

The Embed requires a clear-eyed assessment of what a process actually needs, which is harder than approving an end-to-end automation proposal. It requires the organizational will to resist scope inflation, which is harder than expanding the brief. And it requires knowing when the bottleneck is solved and what comes next, which is harder than launching a system and moving on.

Most organizations I see are underinvesting in the simpler problem. They are building systems that assume capabilities they don’t have because the simpler path doesn’t feel like enough. The Embed is not the consolation architecture. In the large and underserved class of cases where a working process already exists and has a clear expensive bottleneck, it is almost always the right first move — the one that matches the actual maturity of the process and the actual readiness of the organization.

If you are thinking about where AI belongs in a working process, I’d start with the most expensive step, add AI there, and then ask what specific limitation would force a more complex answer. That question does more useful work than the architecture conversation that usually comes first.
