Your RAG pipeline is a confused deputy
A 1988 paper has more to say about modern retrieval-augmented agents than most of the 2024 ones do.
In 1988, Norm Hardy published a short paper describing what he called the “confused deputy problem.” The setup was simple: a compiler had access to a billing file (to log usage charges) and to user-specified output files. A clever user could point the output to the billing file’s path. The compiler — the “deputy” — would dutifully overwrite it, because it had the authority to write to that file and the user’s request looked legitimate.
The deputy wasn’t compromised. It wasn’t buggy. It was confused — it couldn’t distinguish between a request it should fulfill and one that exploited its own privileges.
If you’re building retrieval-augmented generation systems in 2026, you have the exact same problem.
How RAG creates confused deputies
A typical RAG pipeline works like this: a user asks a question; the system retrieves relevant documents from a knowledge base; those documents are injected into the model’s context window alongside the user’s question; the model generates an answer.
The model is the deputy. It has access to your knowledge base (via retrieval), to the user’s question, and often to tools that can take actions. The attack isn’t exotic: embed an instruction in a document in your knowledge base, and the next time that document is retrieved, the model will follow the instruction as if it came from you.
This isn’t theoretical. I’ve seen it work in production systems, with attack payloads hidden in:
- Customer support tickets that were later indexed for RAG
- Confluence pages that anyone in the organization could edit
- PDF documents uploaded by external vendors
- Email threads forwarded to a shared inbox that feeds the knowledge base
In every case, the retrieval system faithfully retrieved the poisoned document, the model faithfully followed the embedded instruction, and nobody noticed until the output was wrong in a way that mattered.
The 1988 solution still applies
Hardy’s solution was capability-based security: instead of giving the deputy broad authority and hoping it used it correctly, you give it a specific, unforgeable token for each action it’s authorized to perform. The deputy can only do what its current set of capabilities allows, and those capabilities are granted per-interaction, not globally.
For RAG, the equivalent looks like this:
-
Tag every document with its trust level at ingestion time. Internal docs, user-generated content, external uploads, and your own system documentation are not the same thing and should not be treated the same way in the context window.
-
Scope retrieval to the user’s access level. If the user asking the question doesn’t have access to a document, the retrieval layer should not retrieve it — even if it’s the most relevant result. This sounds obvious. Most systems don’t do it.
-
Present retrieved content to the model as untrusted input. Use role tags, delimiters, or structured prompting to make clear to the model that retrieved documents are data, not instructions. This is imperfect — models don’t reliably respect this distinction — but it raises the bar.
-
Limit what the model can do with retrieved information. If the model’s job is to answer questions, it doesn’t need tool access. If it needs tool access, those tools should be scoped to the current turn’s requirements, not the full capability set.
The confused deputy problem was solved, conceptually, thirty-eight years ago. The solution is access control at the right granularity. We just need to apply it to systems that read English instead of machine code.