Key Takeaway: Context is the 95%. The prompt is the last mile, the last 5%.
Prompt engineering is a red herring: the prompt is the last 5%. The other 95% is context engineering -- deciding what information to assemble, retrieve, and inject before the model ever sees your question. I learned this by building a production system where context assembly is the entire product. And the reason nobody talks about it is that "context engineering" doesn't fit on a LinkedIn headline. It's plumbing. It's the unsexy work of building retrieval pipelines, managing embedding spaces, and deciding how to allocate a finite context window across competing information sources. But it's where all the value lives.
Ask the same model "what should I prioritize this week?" With no context, you get a generic productivity listicle. With the wrong context, you get a confidently irrelevant answer. With the right context, assembled from active projects, recent decisions, and relevant conversation history, you get something useful. Same model. Same prompt. The context did all the work.
Context is architecture, not prompt tweaking
Most teams treat context as a system prompt they wordsmith for three weeks. The actual discipline is building a retrieval and assembly pipeline -- software engineering that happens to involve AI.
I built a context registry with six declarative layers. Each query triggers a different assembly based on intent:
RAG -- semantic search across 18,000 chunks of emails, messages, transcripts, and documents. Binary needsRAG() check, fan-out across topic indices, distance cutoff of 0.70. Three iterations to get here. The complex tiered version was slower and worse.
Health-SQL -- structured queries against a medical database when the question involves vitals, labs, or medications. No embedding needed. SQL is the right retrieval for structured data.
Referenced conversations -- when a query references a prior discussion, pull the full conversation rather than semantic fragments. Context about context.
Injected identity -- seven source files that define personality, philosophy, and memory. These load before every interaction. The model reads its own behavioral instructions.
Voice style -- nine calibrated writing voices built from real writing samples. The layer auto-selects based on detected intent. Mention a client, get business voice. Mention family, get personal voice.
Web search -- real-time information when the question requires current data that RAG can't contain.
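The RAG layer above can be sketched in a few lines. This is a minimal illustration, not the production code: `needs_rag` stands in for whatever binary gate the system actually uses (here, a hypothetical keyword heuristic), and the topic indices are plain in-memory lists rather than a real vector store. The 0.70 distance cutoff is the one the article cites.

```python
import math

DISTANCE_CUTOFF = 0.70  # chunks farther than this from the query are dropped

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def needs_rag(query: str) -> bool:
    # Binary gate -- a hypothetical keyword heuristic standing in for
    # whatever classifier the real system uses.
    return any(w in query.lower() for w in ("project", "email", "meeting", "decision"))

def retrieve(query_vec, topic_indices):
    """Fan out across per-topic indices; keep only chunks under the cutoff.

    topic_indices: {topic_name: [(chunk_text, chunk_vec), ...]}
    Returns hits sorted by distance (closest first).
    """
    hits = []
    for topic, index in topic_indices.items():
        for chunk_text, chunk_vec in index:
            d = cosine_distance(query_vec, chunk_vec)
            if d <= DISTANCE_CUTOFF:
                hits.append((d, topic, chunk_text))
    return sorted(hits)
```

The flat fan-out with a single cutoff mirrors the article's point: the tiered version was slower and worse, and the simple filter won.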
The insight: each layer is a discrete engineering decision. You declare what context types a query needs, the registry assembles them, and the model sees a complete picture. This is architecture. Adding a seventh layer means writing one module, not rearchitecting the system.
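A declarative registry of this shape can be sketched as follows. The class and layer names here are hypothetical stand-ins; the point is the structure: each layer is a (predicate, provider) pair, and adding a seventh layer is one `register` call, not a rearchitecture.

```python
from typing import Callable, Dict, List, Tuple

Predicate = Callable[[str], bool]   # does this query need the layer?
Provider = Callable[[str], str]     # produce the layer's context text

class ContextRegistry:
    """Declarative context registry: layers are registered, not hard-wired."""

    def __init__(self) -> None:
        self._layers: List[Tuple[str, Predicate, Provider]] = []

    def register(self, name: str, applies: Predicate, provide: Provider) -> None:
        # Adding a new layer is one call -- one module in the real system.
        self._layers.append((name, applies, provide))

    def assemble(self, query: str) -> Dict[str, str]:
        # Run every layer whose predicate fires; the model sees the union.
        return {name: provide(query)
                for name, applies, provide in self._layers
                if applies(query)}

registry = ContextRegistry()
registry.register("identity", lambda q: True, lambda q: "identity files...")
registry.register("health_sql", lambda q: "labs" in q, lambda q: "SELECT * FROM labs ...")
```

A query mentioning labs assembles both layers; a generic query gets identity alone, which always loads.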
The evidence
Same query, same model, three conditions. Condition A: careful prompt, no RAG context -- generic answer. Condition B: simple prompt, full RAG context -- specific, accurate answer. Condition C: careful prompt plus full RAG context -- marginally better than B.
The context did nearly all the work; the prompt refinement was marginal. Context is the 95%. The prompt is the last mile, the last 5%. So instead of wordsmithing your system prompt, build the retrieval pipeline.
What this means for enterprise
Every company building AI features is optimizing prompts when they should be engineering context. The differentiation is in the pipeline: what data you have, how you retrieve it, how you assemble it, how you manage the budget against a finite context window. The model is a commodity. The context is the product.
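Managing the budget against a finite window is itself an engineering decision. One simple scheme, sketched here under assumed source names and priorities (none of this is from the article's system), is greedy allocation: fill the window in priority order until it runs out.

```python
def allocate_budget(sources, window_tokens):
    """Greedily allocate a finite context window across competing sources.

    sources: list of (name, priority, tokens_available); higher priority wins.
    Returns {name: tokens_granted}; the total never exceeds window_tokens.
    """
    grants = {}
    remaining = window_tokens
    for name, _priority, available in sorted(sources, key=lambda s: -s[1]):
        take = min(available, remaining)
        if take > 0:
            grants[name] = take
            remaining -= take
    return grants

# Hypothetical sources and sizes, purely for illustration.
grants = allocate_budget(
    [("identity", 3, 2000), ("rag", 2, 6000), ("web", 1, 4000)],
    window_tokens=8000,
)
```

With an 8,000-token window, identity and RAG fill it entirely and the lowest-priority source is dropped rather than truncating everything equally; a real system might instead guarantee each source a floor before distributing the remainder.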
The teams that figure this out build systems where output quality improves by adding data sources, not by tweaking instructions. Every new data source you connect makes every query better -- not linearly, but combinatorially, because the model can now cross-reference across domains it couldn't before. That's the compounding advantage. It's found money, hiding in data you already own but never assembled correctly. The irony is that most companies are sitting on exactly the context they need. It's in their CRM, their email archives, their support tickets, their internal wikis. Nobody built the pipeline to put it in front of the model at the right time.