I Stopped Prompting AI. I Started Assigning Work.

I Stopped Prompting AI. I Started Assigning Work.

There’s a particular cognitive shift that happened after I had Compound running for a few weeks. My job with AI got smaller in one direction and larger in another. The part that got smaller was setup — figuring out how to explain what I needed and what standard I expected. The part that got larger was judgment — deciding whether what came back was actually good, pushing it further, redirecting when the framing was wrong. I hadn’t expected the ratio to flip that cleanly.

That ratio is what this post is about. It’s not really about agents or system design. It’s about a more basic question: in any given AI interaction, who is holding the context?

What prompting actually costs

The cost of prompting isn’t usually framed correctly. Most people talk about it as a skill — something to get better at through practice, better templates, smarter context windows. And there’s something to that. But the framing hides a structural problem.

When you prompt, you are the context-carrier. You hold the standards. You know what good output looks like, what domain knowledge the model needs, and what you’ll reject when it comes back wrong. The model is ready to work; you are the one doing the orientation. Every session starts with that transfer, and the transfer is entirely on you. No amount of prompt engineering eliminates it — it just makes you more efficient at a job that, structurally, shouldn’t be yours every time.

This is why prompt improvement has a ceiling. The bottleneck isn’t the quality of the instruction. It’s the location of the context. Until the context lives inside the system rather than inside your head, you’re reprinting the same document at the start of every task.

What an assignment actually is

When you assign work to a person, you don’t narrate every step. You give them the objective, the constraints, and the standard you expect — then you evaluate what comes back. You’re the judge of the output, not the operator of the process. The person doing the work holds the context about how to do it well. You hold the authority to say whether it met the bar.

Most AI use today has that relationship inverted. The human holds the context; the AI holds the processing power. The result is that each new task requires fresh orientation: re-explain the domain, re-establish the standard, re-specify what you actually want. You’re not delegating. You’re operating a responsive tool with no institutional memory.

An assignment model asks a different question from the start: what would it look like if the system already knew how this work should get done, and my job was to evaluate the result? That question points toward a different kind of design.

An assignment has properties that a prompt usually doesn’t. It carries context about who is doing the work and what standards they’re expected to meet — not as instructions to follow, but as built-in constraints on how the work gets approached. It produces a defined artifact rather than an open-ended response. And the human’s entry point is evaluation and direction, not reconstruction. In Compound, that entry point is a literal plan review step: before any agent runs, the orchestrator shows which specialists it will use and in what order, and you approve, redirect, or narrow scope before execution starts. The supervision happens at the beginning, not throughout.

The upfront cost of specificity

Here’s where the model breaks down a little in practice, and it’s worth being honest about it.

Getting to an assignment relationship requires work on the front end. Writing a good agent definition — or even a well-configured persistent instruction set — forces you to make your standards explicit in ways you probably haven’t before. “Be analytical” is not useful. “Lead with the downside scenario, separate facts from inferences, and flag when the evidence is thin” is something a system can act on consistently. Most professionals have standards like the second version living in their head. They’ve just never had to write them down because they’ve always been the ones doing the work.

That process of making standards explicit is genuinely costly. It requires knowing what you actually value in an output — not as a general preference, but as a specific constraint. And it takes iteration. The first version of any agent definition is usually too vague to be useful, and the gap only becomes visible when the output comes back wrong in ways you can’t quite articulate yet.

But here’s what that cost buys: you pay it once, and the system applies it repeatedly without re-explanation. The orientation work shifts from happening at the start of every task to happening once at the design stage. The relationship changes from transactional to something closer to a working standard — something the system can be held to rather than something you have to reinstate each time. In Compound, those standards live in two places: the agent definition, which encodes the role and approach, and the skills — reusable behavioral modules that agents reference declaratively. Improving a skill raises the baseline everywhere it’s used, which is where the compounding actually happens.

The counterargument worth taking seriously

The obvious objection is that most people don’t have a full AI operating system, and building one isn’t realistic for most workflows.

That’s fair. But the underlying principle doesn’t require the full system. The question to ask about any AI use is simpler: who is holding the context? If the answer is “me, every time I start a new task,” then you have a tool relationship — and you’ll keep bearing the setup burden. If the answer is “the system, because I’ve made the standards explicit and built them in,” then you’re operating closer to an assignment model, even if the implementation is a single GPT or Claude Project with persistent instructions and a specific task scope.

The gap between those two isn’t primarily technical. It’s a question of whether you’ve done the work of being specific. Most people haven’t — not because they’re lazy, but because the pressure to be specific only becomes visible once you’ve tried to delegate to something that can’t infer what you mean. Prompting lets you stay vague and correct on the back end. An assignment model requires clarity on the front end. That shift in where the precision lives is the real change.

The operating model frame

The deeper shift isn’t about efficiency. It’s about what kind of relationship you have with AI.

A tool relationship asks: what can I make this thing do? An operating model relationship asks: what work can I hand off, who is the right specialist for it, and how do I evaluate what comes back? The second question set is closer to how experienced managers think about delegation than to how users think about software.

When I think about where AI goes wrong in knowledge work, it’s usually not that the model is bad. It’s that the relationship is wrong. The model is ready. The allocation of cognitive work is the problem. The human is doing orientation and operation when they should be doing judgment and direction.

Compound is my attempt to build that operating model — specialists who hold their own context and standards, with my job focused on evaluating what comes back and deciding what happens next. Whether you have a full operating system or one well-configured assistant, the same structural question applies.

If you’re spending more time orienting your AI than you are evaluating what it produces, you’re still prompting. That’s worth changing — regardless of what system you use.

Subscribe to The Algorithm

Notes on building AI systems that actually work.