memnode
Sign InSign Up
Back to Articles
Featured

Memory Poisoning: The Agent Attack That Survives a Restart (OWASP ASI06)

Prompt injection ends when the chat closes. Memory poisoning persists across sessions and fires days later. OWASP made it ASI06 in 2026, and its core defense is provenance-tracked memory, exactly what an auditable memory layer provides.

memnode7 min read
memory poisoningOWASPagent securityprovenanceagentsmcp

Prompt injection has one comforting property: it ends when the conversation does. Close the chat, start a new session, and the malicious instruction is gone. Memory poisoning takes that comfort away. The attacker plants something in the agent's persistent memory, the session ends, and the payload waits. Days or weeks later an unrelated request retrieves it, and the agent acts on an instruction nobody typed in that conversation.

In 2026 this stopped being a fringe worry. OWASP added it to the Top 10 for Agentic Applications as ASI06: Memory and Context Poisoning, the persistent corruption of agent memory, embeddings, and shared context that influences future actions. OWASP rates it high persistence and very high detection difficulty, and research on the MINJA attack reported injection success above 95% against production agents. If your agent has memory, this is part of its threat model now, whether or not you modeled it.

Why it is worse than prompt injection

Both share a root cause: the model cannot reliably separate instructions from data. The difference is lifespan. A prompt injection is a single-turn problem with a single-turn blast radius. A poisoned memory is a stored asset the agent re-trusts every time it recalls it.

  • Persistence. The payload survives restarts, new sessions, even model upgrades, because it lives in your store, not the context window.
  • Delayed trigger. It fires on a future, unrelated query, so the harmful action is separated in time from the attack. That separation is what breaks incident response.
  • Trusted by default. The retrieval layer hands recalled memory to the model as ground truth. It never asks where the memory came from.

The vector-store trap

The most common agent-memory design is also the most exposed: embed everything the user and tools produce, then retrieve by similarity. That makes your vector index an open write surface. Any content the agent ingests, a web page, a document, a tool result, becomes a stored, retrievable memory with no record of whether the source was trusted. The same property that makes similarity search convenient, it returns whatever is close, is what surfaces the poison later. This is one more reason embedding-everything is the wrong default, which we argue separately in agent memory vs a vector DB.

The defense OWASP actually recommends

OWASP's mitigation for ASI06 is not a content filter bolted on at the end. It is layered, and the load-bearing layer is memory sanitization with provenance tracking: every memory carries where it came from, and recall is trust-aware, so the agent can refuse to act on a memory whose origin it cannot vouch for. In practice that means five controls:

  1. Provenance on every write. Record the source of each memory (which user, which tool, which document) so origin is queryable later. This is the lineage and provenance layer.
  2. Trust scoring by source. A fact the user stated directly is not the same as a sentence scraped from a page the agent happened to read.
  3. Supersession, not accumulation. When a fact changes, the old value becomes explicit, dated history, not a second retrievable memory competing with the new one.
  4. Trust-aware recall. Filter retrieval by origin and trust, so low-trust memories never reach a high-stakes action path unchallenged.
  5. Behavioral monitoring. Watch for the tell of a delayed trigger: an action that traces back to a memory written in an unrelated, much older session.

Notice that four of those five are properties of the memory layer, not of a separate security product. An agent whose memory is opaque similarity search cannot implement any of them. An agent whose memory records origin, supports supersession, and filters recall by trust gets most of ASI06's defense for free, because the audit trail the defense needs is the same audit trail good memory needs anyway. Where you run that store matters too: the hosted vs local choice changes who can write to it.

Where to go deeper

The offensive side, how an attacker crafts and lands the payload, is covered well in Austa's breakdown of prompt injection against persistent agent memory. The recall-quality side, why even un-poisoned memory layers surface the wrong thing, is in why memory layers recall the wrong thing. The two problems share a fix: a memory layer you can read, trust, and correct beats one you can only query by similarity.

Sources