The boundary between code and data is dissolving.
When your orchestration layer passes a user query through a retrieval system, into a context window, through a tool-calling loop, and back out as an API response - every hop is a potential injection point. Every context boundary is an untrusted interface.
I've spent the last year thinking about this problem not as a prompt-engineering challenge, but as a trust chain problem. And trust chains have a solved solution in traditional systems: cryptographic signatures.
The Problem with Agentic Trust
Traditional AppSec operates on a clear boundary: user input is untrusted, internal code is trusted. Defense-in-depth means validating inputs at system boundaries and relying on code integrity checks (signed binaries, dependency audits, SBOM) for internal components.
Agentic pipelines collapse this distinction. A tool-calling agent might:
- Receive a user query
- Retrieve relevant documents from a vector store
- Pass retrieved content as context to a sub-agent
- Call an external API with that context
- Synthesize results and write to a database
At step 3, the retrieved documents are data acting as instructions. An attacker who can influence the vector store - through poisoned embeddings, indirect prompt injection in crawled web content, or compromised tool outputs - can hijack the agent's behavior without ever touching your application code.
Provenance Chains as a Solution
The core idea: every piece of context that enters an LLM's context window should carry a verifiable provenance claim.
ContextToken {
content_hash: SHA256(content),
source_id: "vector-store-prod",
retrieved_at: 1731692400,
ingested_at: 1730000000,
signature: ECDSA(private_key, content_hash || source_id || retrieved_at)
}
Before the orchestration layer injects context into a prompt, it verifies:
- The signature was produced by a trusted source key
- The content hash matches the retrieved content (no tampering in transit)
- The ingestion timestamp is within acceptable bounds (no stale poisoned data)
- The source ID is on the allowlist for this particular agent and task
Crucially, this verification happens outside the model - in the orchestration layer, in code you control, before context ever reaches the context window.
Why This Matters More Than Prompt Engineering
Prompt-based defenses against injection ("ignore previous instructions") are fundamentally fragile. You're asking the model to enforce security policies using the same mechanism an attacker uses to override them.
Cryptographic verification doesn't ask the model to be the security boundary. It makes the orchestration layer the boundary - which is where security controls belong.
Implementation Considerations
Key management is the hard part. You need a PKI for your data pipeline: source systems hold signing keys, the orchestration layer holds verification keys. Key rotation needs to happen without disrupting running pipelines.
Granularity matters. You can sign at the document level, the chunk level, or the sentence level. Finer granularity means better security but higher overhead. For most use cases, chunk-level signing is the right tradeoff.
What it doesn't solve. Provenance chains verify where content came from and that it hasn't been tampered with. They don't verify that the source itself was trustworthy at ingestion time. A compromised crawler that signed malicious content before ingestion defeats this scheme. Your security is only as good as your signing key controls and ingestion pipeline security.
Where This Is Going
I expect this pattern to become standard in regulated industries first - financial services, healthcare, legal - where the traceability of AI reasoning carries compliance weight. It'll then propagate into general enterprise deployments as model-in-the-loop architectures become load-bearing.
The teams building this infrastructure now will have a significant head start.
Currently exploring this as a formal research project. If you're working on similar problems in regulated environments, I'd like to compare notes.