Episodic and Semantic Memory: The Two-Layer Model Behind Durable Agent Recall

Why a flat vector index fails agents, and how memnode separates episodic memory (what happened) from semantic memory (distilled facts), letting experience consolidate into evidence-backed knowledge over time.

memnode • June 4, 2026 • 9 min read

Tags: agent memory, memnode, episodic memory, semantic memory, design notes

Ask most agent memory systems what they are, and the honest answer is a vector index of text chunks. You embed an observation, you store the vector, and at recall time you fetch the nearest neighbors. It is a fine information-retrieval design. It is a poor model of memory. The reason is that it treats two fundamentally different kinds of knowledge as one undifferentiated soup: the raw record of what happened, and the distilled understanding of what is true. Cognitive science has a name for that distinction, and building it into the data model rather than ignoring it is the single most consequential design decision behind durable agent recall.

This article explains the two-layer model that memnode is built on: episodic memory and semantic memory. It covers what each layer is for, why collapsing them into one store fails agents in specific and recurring ways, how a fact is promoted from episodic experience into semantic knowledge, and why this separation is the foundation that canonization, consolidation, and trust all build on. It is part of the memnode design series, and it is the piece the rest of the series leans on.

The cognitive-science distinction, briefly

In humans, episodic memory is the memory of events: what happened, when, where, and in what context. It is time-stamped and source-rich. You remember not just that you locked the door, but that you locked it this morning, on your way out, while holding a coffee. Semantic memory is different. It is distilled general knowledge that no longer carries its original episode. You know that Paris is the capital of France, but you almost certainly cannot recall the specific moment you learned it. The fact survived; the episode that produced it evaporated.

The relationship between the two is directional and developmental. Semantic knowledge is built out of episodic experience over time. You do not learn a durable fact in one shot and file it away pristine. You accumulate episodes, and the stable regularity across them becomes a fact that stands on its own. memnode borrows exactly this shape, and refuses to borrow more. As the design notes put it, the goal is not to simulate the brain literally but to steal a few high-value principles: memory is not flat, strength changes with use, and recall is reconstructive rather than exact.

Why one undifferentiated store fails agents

A flat vector store of chunks fails not because similarity search is bad, but because it has no way to represent the difference between an observation and a conclusion. Concretely, here is what breaks when an agent relies on it.

It cannot tell a one-off from a rule. An agent that once saw a teammate use tabs in a single file, and an agent that has confirmed the repo enforces tabs across a thousand files, store identical-looking chunks. At recall both are just vectors near the query. The store has no field that says one is provisional and the other is settled.
It cannot model supersession. When a convention changes, the old chunk and the new chunk both sit in the index, equally retrievable and ranked only by their cosine similarity to the query. A vector database has no representation for "this fact replaced that one." The agent can and will recall the stale answer.
It loses provenance the moment it summarizes. If you do summarize chunks into facts, the summary becomes just another chunk. There is no link back to the events that justified it, so you cannot answer "why do you believe this?" and you cannot revisit the conclusion when one of its supporting events turns out to be wrong.
It conflates noise with knowledge in ranking. Raw observations are numerous, repetitive, and often contradictory. Distilled facts are few and stable. Mixing them in one ranked list means a dense cluster of near-duplicate raw notes can drown out the one consolidated fact that actually answers the question.

These are not edge cases. They are the everyday failure modes of agents that "have memory" but still recall the wrong thing. We unpack the broader version of this argument in "Vector embeddings are the wrong default for AI agent memory," which is the right companion read if you want the positioning case. Here the point is narrower and structural: a single layer cannot carry two different epistemic statuses at once.
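To make the supersession failure concrete, here is a minimal sketch of a flat store. It is not memnode code: the three-dimensional embeddings, the chunk texts, and the `recall` helper are all made up for illustration. It shows only that a similarity-ranked index has nowhere to record that one chunk replaced another.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical flat index: (text, embedding) pairs and nothing else.
# The toy embeddings are invented; real ones would come from a model.
flat_index = [
    ("repo uses spaces for indentation", [0.90, 0.10, 0.0]),  # stale convention
    ("repo uses tabs for indentation", [0.88, 0.12, 0.0]),    # current convention
    ("deploys run on Fridays", [0.0, 0.2, 0.9]),              # unrelated note
]

def recall(query_vec, k=2):
    """Return the k chunks nearest to the query, ranked by similarity alone."""
    ranked = sorted(flat_index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A query about indentation lands near both chunks. Nothing in the
# index says which one is still true, so both come back.
query = [0.90, 0.10, 0.0]
top = recall(query)
```

With these toy vectors the stale chunk ranks first, purely because its embedding happens to sit closest to the query. The store is not wrong about similarity. It just has no field in which the correction could live.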

What each layer is good for

The episodic layer

Episodic memory is where everything an agent observes lands first. Every observation, in memnode, is recorded as episodic and provisional by default. The episodic node is deliberately rich and deliberately cheap to write: it captures the raw observation, the time it was recorded, the source it came from, the surrounding context, and an honest note of uncertainty. It does not pretend to be true. It is a faithful record that this was observed, not an assertion that this is the case.
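As a rough illustration of that shape, an episodic record might look like the sketch below. The class name, field names, and default confidence are assumptions made for this example, not memnode's actual schema. The one thing it takes from the text is that every observation is written as episodic and provisional, with its time, source, context, and uncertainty attached.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Episode:
    """Hypothetical episodic record: a faithful note that something was observed."""
    observation: str   # the raw observation, stored as-is
    source: str        # where it came from
    context: str       # surrounding situation at write time
    confidence: float  # honest note of uncertainty, 0.0 to 1.0
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    layer: str = "episodic"      # observations always land here first
    status: str = "provisional"  # never true by assertion

def observe(observation, source, context, confidence=0.3):
    """Cheap write path: validate the uncertainty and record the episode."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return Episode(observation, source, context, confidence)

episode = observe(
    "import style A used in app/api/users/route.ts",
    source="repo_convention",
    context="session start",
)
```

The record is frozen because an episode is evidence. Later stages may reinterpret it, but they should not rewrite what was observed.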

The episodic layer is good at exactly the things a fact store is bad at. It preserves context and time, so you can reconstruct a session. It tolerates contradiction, because two episodes can disagree without either being deleted. And it is the permanent evidence base: when you later need to know why the system believes something, the episodes are still there to point at.

The semantic layer

Semantic memory holds the distilled, stable facts and conventions an agent should act on. The defining rule is that semantic nodes are never written directly. An agent cannot reach in and assert a fact. Semantic knowledge is produced only by consolidation, and every semantic node carries provenance links back to the episodes that justified it. A semantic fact is therefore always accountable. It can tell you which observations support it, how many, and when it was last reconsidered.

Because semantic nodes are scarce, evidence-backed, and stable, they are what recall should prefer when an agent needs to act with confidence. They carry status (provisional, supported, canonical, and so on) and they carry support and rebuttal weight, so the system can express not just "here is a fact" but "here is a fact the evidence currently endorses, with this much tension against it."
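One way to picture a semantic node is the sketch below. The names are hypothetical and the `tension` ratio is an invented way to summarize support against rebuttal, not memnode's formula. It does encode the rule stated above: a fact with no provenance cannot be constructed.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    # The three statuses named in the text; the real lifecycle may have more.
    PROVISIONAL = "provisional"
    SUPPORTED = "supported"
    CANONICAL = "canonical"

@dataclass(frozen=True)
class SemanticFact:
    """Hypothetical semantic node: a distilled claim that must cite its evidence."""
    summary: str
    provenance: tuple[str, ...]  # ids of the episodes that justify the fact
    status: Status = Status.PROVISIONAL
    support_weight: float = 0.0
    rebuttal_weight: float = 0.0

    def __post_init__(self):
        # Provenance must exist before anything can become semantic.
        if not self.provenance:
            raise ValueError("a semantic fact must cite at least one episode")

    def tension(self) -> float:
        """Share of the total evidence weight that argues against the fact."""
        total = self.support_weight + self.rebuttal_weight
        return self.rebuttal_weight / total if total else 0.0

fact = SemanticFact(
    summary="this repo uses import style A",
    provenance=("episode_1", "episode_2", "episode_3"),
    support_weight=3.0,
    rebuttal_weight=1.0,
)
```

In a real system the constructor would be reachable only from the consolidation job. The provenance check here stands in for that restriction.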

A concrete example: the convention observed once versus the durable fact

Take an agent working in a code repository. Early in a session it notices a single file using a particular import style. That is an episode: observed, time-stamped, sourced to one file, low confidence. It would be a serious mistake to immediately treat it as "the repo uses this import style" and start rewriting other files to match. In a flat store, nothing stops that mistake, because the lone observation is indistinguishable from settled knowledge.

Episodic observation (provisional, low confidence)
  observed: "import style A used in app/api/users/route.ts"
  source:   repo_convention
  anchor:   app/api/users/route.ts
  when:     session start

  ...many more episodes accumulate across files...

Semantic fact (produced by consolidation, never written directly)
  summary:    "this repo uses import style A"
  status:     supported -> canonical (only after the rules pass)
  provenance: [episode_1, episode_2, ... episode_N]
  supersedes: prior style fact, if a correction arrived

After the agent has seen the same import style across many files, the regularity is real. Consolidation clusters those episodes, distills a semantic fact, and links it back to every episode that supports it. Now the agent can act on "this repo uses import style A" with justified confidence, and crucially it can still answer "why?" by walking the provenance to the actual files. A repo convention observed once is a hint. A repo convention observed across the codebase is knowledge. The two-layer model is what lets the system hold both honestly and treat them differently.

How recall draws on both layers

Good recall is not a choice between episodic and semantic; it is a coordinated read across both. When a query arrives, memnode does not simply dump the top-k nearest vectors. It assembles candidates from embedding similarity, keyword match, and graph structure, then reconstructs a winning answer from them, returning the answer with confidence, supporting memories, and a justification rather than a raw list.

Within that, the layers play different roles. Semantic nodes provide the stable spine of the answer, the facts the agent should rely on. Episodic nodes provide the supporting evidence and the situational color, and they are what you fall back to when no consolidated fact yet exists. A task-context parameter, the recall mode, shifts how much weight goes to each: a precision-oriented mode leans hard on canonical semantic facts, while a recent-session mode pulls episodic detail forward. The separation is what makes this dial possible at all. You cannot ask a flat store to "prefer settled facts but show me the raw events behind them" if it has no notion of the difference. The deeper mechanics of how the graph contributes to candidate selection are covered in "Spreading activation: graph-aware recall."
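The recall-mode dial can be sketched as a per-layer weight applied to each candidate's relevance. Everything here is an assumption for illustration: the mode names, the weight values, and the idea that relevance arrives as a single pre-combined score. The actual weighting is not published.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    layer: str        # "semantic" or "episodic"
    relevance: float  # assumed to be pre-combined from similarity, keywords, graph

# Hypothetical modes and weights, chosen only to show the dial moving.
MODE_WEIGHTS = {
    "precision": {"semantic": 1.0, "episodic": 0.4},
    "recent_session": {"semantic": 0.6, "episodic": 1.0},
}

def rank(candidates, mode):
    """Order candidates by relevance scaled by the layer weight for this mode."""
    weights = MODE_WEIGHTS[mode]
    return sorted(candidates, key=lambda c: c.relevance * weights[c.layer], reverse=True)

candidates = [
    Candidate("this repo uses import style A", "semantic", 0.7),
    Candidate("saw import style A in app/api/users/route.ts", "episodic", 0.9),
]

precise = rank(candidates, "precision")
recent = rank(candidates, "recent_session")
```

With these numbers the precision mode puts the semantic fact first and the recent-session mode puts the raw episode first. Without a `layer` field on each candidate, there would be nothing to weight.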

How a fact gets promoted from episodic to semantic

Promotion is the heart of the model, and it is deliberately not automatic. An agent observing something does not get to declare it true. The system observes, accumulates, and decides. Promotion happens during offline consolidation, the background job that runs when the system is idle and does the expensive work of turning experience into knowledge.

At a conceptual level, consolidation clusters related episodes by subject and time, looks for regularity and corroboration across them, merges duplicates, and where the evidence holds up, distills a semantic fact that links back to its supporting episodes. The new fact does not arrive canonical. It enters as provisional and must earn higher status through repeated support, the absence of unresolved contradiction, and continued successful recall. There is a hard guarantee underneath this: provenance must exist before anything can become semantic. A fact with no episodes behind it cannot exist in the semantic layer.
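A heavily simplified sketch of that pass follows. It groups episodes by subject only and ignores time. It treats the most common claim in a group as the candidate regularity and uses an arbitrary `MIN_SUPPORT` of 3. All of these are assumptions made for illustration, and the threshold in particular is not a real memnode value.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Episode:
    id: str
    subject: str  # what the observation is about
    claim: str    # what was observed about that subject

MIN_SUPPORT = 3  # illustrative only; real thresholds are tunable

def consolidate(episodes):
    """Distill provisional facts from episodes that corroborate each other."""
    by_subject = defaultdict(list)
    for ep in episodes:
        by_subject[ep.subject].append(ep)

    facts = []
    for subject, group in by_subject.items():
        # Collect supporting episode ids per claim; the set merges duplicates.
        by_claim = defaultdict(set)
        for ep in group:
            by_claim[ep.claim].add(ep.id)
        claim, supporters = max(by_claim.items(), key=lambda kv: len(kv[1]))
        if len(supporters) >= MIN_SUPPORT:
            facts.append({
                "subject": subject,
                "summary": claim,
                "status": "provisional",  # never arrives canonical
                "provenance": sorted(supporters),
            })
    return facts

episodes = [
    Episode("e1", "import-style", "import style A"),
    Episode("e2", "import-style", "import style A"),
    Episode("e3", "import-style", "import style A"),
    Episode("e4", "indentation", "tabs"),  # seen once: stays episodic
]
facts = consolidate(episodes)
```

The lone indentation episode produces no fact, and the import-style fact lists every episode that supports it. Promotion beyond provisional would be a separate step that this sketch does not attempt.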

New information always lands episodic. The agent observes; the system decides what to trust. Nothing becomes a believed fact by assertion, only by evidence that survives review.

We deliberately keep the exact thresholds and weighting out of this article, because the precise numbers are tunable and proprietary. What matters for the design is the shape: episodic experience flows in cheaply and noisily, consolidation distills the durable signal, and the gate to becoming a trusted fact is evidence, not recency or confidence asserted at write time.

Why this separation is the foundation for everything else

Once you have two layers with a directional promotion path between them, a series of harder capabilities becomes natural rather than bolted on.

Canonization needs somewhere to apply its rules. A fact can only be promoted from provisional to supported to canonical if there is a distinct semantic layer carrying status in the first place. The full lifecycle is its own topic in "Canonization: how a memory system decides what it believes."
Consolidation only makes sense as a transform from one layer to another. "Improving while idle" means episodes becoming knowledge offline, which presupposes that episodes and knowledge are different things.
Trust and explainability depend on provenance, and provenance only exists because semantic facts are required to cite the episodes that produced them. That is what lets you answer "why do you believe this?" instead of treating recall as a black box.
Correction and supersession can target the right thing. When a convention changes, the correction supersedes the old semantic fact, so recall can follow the chain to the latest truth, while the episodes that supported the old fact remain intact as history.
Holding contradictions stays sane. Conflicting episodes do not corrupt a fact; they create tension that the semantic layer can represent and reason about, as covered in "Belief networks: holding contradictions."

None of those work on a flat index, because each one requires the system to know whether it is looking at a record of an event or a claim about the world. The two-layer model is what supplies that knowledge.
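The supersession point in the list above can be sketched as a chain walk. The `superseded_by` field and the `resolve_latest` helper are hypothetical names. The sketch assumes that each semantic fact records at most one successor and that old facts keep their provenance.

```python
# Hypothetical semantic layer: each fact may point at the fact that replaced it.
facts = {
    "f1": {"summary": "repo uses spaces", "provenance": ["e1", "e2"], "superseded_by": "f2"},
    "f2": {"summary": "repo uses tabs", "provenance": ["e7", "e8", "e9"], "superseded_by": None},
}

def resolve_latest(fact_id, facts):
    """Follow superseded_by links until reaching a fact nothing has replaced."""
    seen = set()
    while facts[fact_id]["superseded_by"] is not None:
        if fact_id in seen:
            raise ValueError("supersession chain contains a cycle")
        seen.add(fact_id)
        fact_id = facts[fact_id]["superseded_by"]
    return fact_id

current = resolve_latest("f1", facts)
```

Recall that starts from the old fact arrives at the current one. The old fact and its episodes are still there to explain what the system used to believe and why.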

The takeaway

Durable agent recall is not a retrieval problem you solve by buying a better vector database. It is a modeling problem. An agent that remembers well has to distinguish what happened from what is true, keep the evidence even after distilling the conclusion, and refuse to treat a one-off observation as a settled fact. memnode does that by keeping episodic and semantic memory as separate layers with a disciplined, evidence-gated promotion path between them. Everything else the system does, from canonization to trust, stands on that split. If you want the practical, end-to-end picture of wiring this into an agent, start with "How to Give Your AI Agent Long-Term Memory" and the series hub, "How memnode evolved: from a graph database to a memory reasoning engine."