Agent Memory Frameworks in 2026: Letta, Mem0, Graphiti, Cognee - and Where a Durable, Inspectable Layer Fits

An honest map of the landscape for builders deciding what to adopt: Letta (ex-MemGPT), Mem0, Graphiti/Zep, and Cognee compared fairly; the local-first wave, including Mnemosyne; the land grab with no breakout; and why a durable, inspectable layer sits beneath the frameworks rather than competing as another one.

memnode • June 14, 2026 • 13 min read

Tags: agent memory, letta, mem0, graphiti, zep, cognee, mnemosyne, frameworks, memnode

If you are choosing an agent memory tool in 2026, the hard part is not finding options. It is that there are too many, they describe themselves in overlapping language, and a new one shows up roughly every week. In the last thirty days alone the Show HN feed carried a local-first encrypted memory in Rust, a TypeScript SDK for self-improving memory, a vocabulary-driven graph memory, a local document memory, and several more, none of which broke out. The r/LLMDevs thread that tried to map the whole field was titled, fittingly, "landscape of second brain and memory solutions for AI native workflow", and the comments are mostly people asking the same question you are: which of these do I actually adopt, and what is the difference? This piece is the honest version of that map. It covers the four frameworks people keep naming - Letta, Mem0, Graphiti, and Cognee - plus the smaller entrants worth knowing, and it is upfront about where a durable, inspectable layer like memnode fits. The goal is to help you decide, not to win an argument, so this is meant to be a fair comparison rather than a hit piece.

One framing up front, because it determines everything below. Most of these tools are frameworks: they give your agent a memory abstraction and an opinion about how memory should be organized, and they sit in your application's control flow. A durable layer is a different thing. It is the store underneath, the part that persists facts, tracks where they came from, and lets you correct them. The interesting question is not framework versus framework. It is where the durable, inspectable layer sits relative to whatever framework you pick.

The land grab, and why nothing has broken out

It is worth naming the market dynamic, because it explains the paralysis. Agent memory is in a land-grab phase. The primitive is obviously valuable, the barrier to a first version is low - a SQLite file and an embedding call get you a demo - and so dozens of teams have shipped a memory layer in the hope of becoming the default. The last thirty days are a clean snapshot: a steady drip of Show HN memory launches, each with single-digit to low-double-digit upvotes, none reaching escape velocity. The honest read is that the category has many credible options and no consensus winner, which is exactly the condition under which a buyer should optimize for things that will still matter in two years - your ability to inspect, correct, and move your data - rather than for whichever launch is loudest this week.

In a land grab with no breakout, the safest bet is not the loudest framework. It is the layer whose data you can read, correct, and carry out the door.

Letta: the agent that manages its own memory

Letta is the framework formerly known as MemGPT, the name from the UC Berkeley research that popularized the idea of an LLM managing its own context the way an operating system manages memory. The MemGPT name now refers to the original agent design pattern, and Letta is the platform built on it. The defining idea is self-editing memory: the agent uses tools to read and write its own memory tiers, rather than a separate retrieval pipeline doing it behind the agent's back. The OS-inspired tiers are a main context the model sees every turn, a recall storage of recent history it can search, and an archival storage for long-term knowledge.
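To make the tier idea concrete, here is a minimal sketch of self-editing memory, assuming a toy agent that reaches its memory only through tool calls. The class and method names (`TieredMemory`, `edit_main`, `archive`, `search_archive`) are hypothetical and are not Letta's API; substring matching stands in for real search.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """Toy model of the three OS-inspired tiers (hypothetical, not Letta's classes)."""
    main_context: dict[str, str] = field(default_factory=dict)  # seen by the model every turn
    recall: list[str] = field(default_factory=list)             # searchable recent history
    archival: list[str] = field(default_factory=list)           # long-term knowledge

    # The methods below play the role of tools the agent itself calls.
    def edit_main(self, key: str, value: str) -> str:
        """Write a fact into the always-visible tier."""
        self.main_context[key] = value
        return f"main_context[{key!r}] updated"

    def log_turn(self, message: str) -> None:
        """Append a conversation turn to recall storage."""
        self.recall.append(message)

    def search_recall(self, query: str) -> list[str]:
        q = query.lower()
        return [m for m in self.recall if q in m.lower()]

    def archive(self, text: str) -> str:
        """Move knowledge into long-term storage."""
        self.archival.append(text)
        return "archived"

    def search_archive(self, query: str) -> list[str]:
        # Substring match is a stand-in for semantic search.
        q = query.lower()
        return [t for t in self.archival if q in t.lower()]
```

Because every write goes through a named tool call, the decision to remember shows up in the agent's trace as an explicit step, which is the transparency property discussed next.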

What Letta gets right is agency and transparency of the agent loop: because the agent edits memory explicitly through tools, you can see the decision to remember as part of the reasoning trace rather than as an opaque embedding match. The trade-off is that Letta is opinionated about your agent architecture; you are adopting a framework with a point of view about how the loop should run, not just a store you call. If that opinion matches yours, it is a strong fit, and it is one of the most mature options in the field.

Mem0: breadth, vector plus graph

Mem0 is the pragmatic, widely adopted memory layer that bets on breadth. Its architecture is dual-store: a vector database handles semantic search and an optional knowledge graph captures entity relationships. When you add a memory, Mem0 embeds it into the vector store and, if the graph layer is enabled, extracts entities and relationships into the graph. The appeal is that it slots into an existing stack quickly and covers the common case - remember facts about a user, retrieve the relevant ones - with a clean API and a managed option.
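The dual-store write path can be sketched roughly as follows. This is an illustration of the pattern, not Mem0's API: `DualStoreMemory`, `embed`, and the caller-supplied `extract_triples` function are all hypothetical, and a bag-of-words counter stands in for a real embedding model.

```python
import math
from collections import Counter
from typing import Callable, Optional

Triple = tuple[str, str, str]  # (subject, relation, object)


def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class DualStoreMemory:
    def __init__(self, extract_triples: Optional[Callable[[str], list[Triple]]] = None):
        self.vectors: list[tuple[str, Counter]] = []
        self.graph: set[Triple] = set()
        self.extract_triples = extract_triples  # None means the graph layer is disabled

    def add(self, text: str) -> None:
        # Every memory is embedded into the vector store.
        self.vectors.append((text, embed(text)))
        # Only when the graph layer is enabled are relationships extracted.
        if self.extract_triples is not None:
            self.graph.update(self.extract_triples(text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.vectors, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Passing no extractor gives you the vector-only configuration; passing one turns on the graph side without changing how callers use `add` or `search`.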

The honest framing, which Mem0's own ecosystem uses, is that Mem0 bets on breadth while graph-native tools bet on depth. On the public LongMemEval comparisons that have circulated, Mem0 lands in the middle of the pack rather than at the top, which is consistent with a breadth-first design: it is good across a wide range of cases without being the deepest at temporal reasoning. For many teams that breadth is exactly the right trade. We did a closer integration look in the Mem0 plugin comparison.

Graphiti and Zep: time as a first-class dimension

Graphiti is an open-source temporal knowledge graph engine, and Zep is the memory and context platform built on top of it. The distinctive bet here is depth on one axis: time. Instead of storing a fact and overwriting it when it changes, Graphiti stores facts as graph edges with explicit validity windows. The standard example is a subscription tier: the fact "user is on the Pro plan" is not deleted when the user upgrades; it is marked invalid as of the upgrade timestamp, and a new edge is created with its own valid-from date. The graph keeps the full history of what was true when, which lets an agent reason about change rather than only about the present.
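A minimal sketch of the validity-window idea, assuming single-valued facts asserted in chronological order. `Edge`, `TemporalGraph`, `assert_fact`, and `as_of` are hypothetical names for illustration, not Graphiti's data model, and the "Enterprise" tier in the usage note is an invented example value.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # None means the fact is still valid


class TemporalGraph:
    def __init__(self) -> None:
        self.edges: list[Edge] = []

    def assert_fact(self, subject: str, predicate: str, obj: str, at: datetime) -> None:
        # Close the currently valid edge instead of deleting or overwriting it.
        for e in self.edges:
            if e.subject == subject and e.predicate == predicate and e.invalid_at is None:
                e.invalid_at = at
        self.edges.append(Edge(subject, predicate, obj, valid_from=at))

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[str]:
        # Return the value whose validity window contains `when`.
        for e in self.edges:
            if (e.subject == subject and e.predicate == predicate
                    and e.valid_from <= when
                    and (e.invalid_at is None or when < e.invalid_at)):
                return e.obj
        return None
```

Asserting a "Pro" plan in January and an "Enterprise" plan in June leaves two edges in the graph; `as_of` with a March date returns "Pro", and with a July date returns "Enterprise".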

Zep's temporal architecture was written up in a 2025 arXiv paper, and on the LongMemEval benchmark Zep has reported notably stronger numbers than a plain vector approach. The cost of that depth is that you are adopting a graph model and a temporal discipline; it is more machinery than a key-value recall, and it is the right machinery specifically when your domain has facts that change over time and you need the agent to know the difference between then and now. For why graph-aware recall beats flat similarity in general, see spreading activation over a typed memory graph.

Cognee: build a knowledge graph from your data

Cognee is an open-source memory layer that turns ingested data into a self-hosted knowledge graph. Its pipeline is summarized as ECL - extract, cognify, load - and the cognify step is where it does ontology-based entity validation rather than treating knowledge as a bag of embedded chunks. The pitch is that documents become both searchable by meaning and connected by relationships that evolve as the underlying knowledge does, and it is built to plug into common vector databases (Qdrant, Weaviate, LanceDB) and graph databases (Neo4j, Kuzu) rather than forcing one store.
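As a rough illustration of the three-stage shape, the sketch below validates candidate relationships against a small ontology before loading them. Everything here is an assumption made for the example: the `ONTOLOGY` table, the `name:Type relation name:Type` input format, and the function bodies are not Cognee's implementation, and a trivial parser stands in for model-driven extraction.

```python
from typing import Iterable

# Hypothetical ontology: the entity types each relation is allowed to connect.
ONTOLOGY = {
    "assigned_to": ("Ticket", "Person"),
    "mentions": ("Document", "Ticket"),
}

# (source, source_type, relation, target, target_type)
Candidate = tuple[str, str, str, str, str]


def extract(records: Iterable[str]) -> list[Candidate]:
    """Parse 'name:Type relation name:Type' lines into candidate relationships."""
    out = []
    for line in records:
        src, relation, dst = line.split()
        s_name, s_type = src.split(":")
        d_name, d_type = dst.split(":")
        out.append((s_name, s_type, relation, d_name, d_type))
    return out


def cognify(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates whose entity types match the ontology."""
    return [c for c in candidates if ONTOLOGY.get(c[2]) == (c[1], c[4])]


def load(valid: list[Candidate], graph: dict) -> dict:
    """Write validated relationships into an adjacency-list graph."""
    for s_name, _, relation, d_name, _ in valid:
        graph.setdefault(s_name, []).append((relation, d_name))
    return graph
```

The point of the middle stage is that a relationship such as a ticket "assigned to" a document never reaches the graph, which is the difference from indexing chunks as-is.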

Cognee is strongest when your memory is really a corpus - documents, tickets, a codebase - that you want structured into a graph an agent can reason over, which puts it adjacent to graph RAG more than to per-user conversational memory. As with Graphiti, the depth comes with the cost of running and maintaining a graph pipeline. If your need is small structured per-user state, this is more machinery than the job calls for; if your need is a structured, queryable model of a large body of knowledge, it is squarely aimed at that.

The smaller entrants: Mnemosyne and the local-first wave

Below the four headliners is a wave of lighter, often local-first projects, and one worth naming because it has been visibly shipping is Mnemosyne, a SQLite-backed, zero-dependency memory layer built first for the open-source Hermes Agent but usable with any framework. Its public materials describe a tiered organization (a hot working tier and an episodic tier among them) and strong reported scores on memory benchmarks while staying in a single SQLite file with no cloud dependency. The broader local-first wave - the Rust-and-MCP entrants, the TypeScript SDKs - shares a thesis worth taking seriously: a lot of agent memory does not need a cloud service, and keeping it local is cheaper, faster, and easier to inspect.

We are deliberately not ranking these on benchmarks. The benchmark numbers that circulate are real but narrow, and this cluster has a history of inflated launch claims; treat any single headline score with caution and test on your own workload. The signal worth taking from the smaller entrants is directional, not a leaderboard: local-first and inspectable is a live, credible design direction, not a fringe one.

Where a durable, inspectable layer fits

Here is the positioning, stated plainly so it is not mistaken for a competitive claim it is not making. memnode is not trying to be a fifth framework in this list. It is aimed at the layer beneath the frameworks: the durable store that persists facts, tracks their lineage, and lets you correct them. A useful way to see the difference is the four-operation loop we described in stop putting agent memory in the context window: record, recall, lineage, correction. Most frameworks give you record and recall well. The two operations that get thin as you go up the framework stack are lineage and correction.

Three properties are the wedge, and none of them is a benchmark:

Lineage. Any recalled answer can be traced back to the specific memories that produced it. Show-me-the-evidence is answerable, not a matter of trusting a similarity score. This is the case made at length in lineage and provenance in agent memory.
Correction. A wrong fact is corrected in place, the correction is itself recorded with provenance, and the bad value stops being recalled without being erased from history. The store improves with use instead of accumulating contradictions, the failure mode dissected in belief networks for agents.
Inspectability. You can read the memory. A returning user's state, the facts the agent believes, the status of each one - all of it is legible, not a blob of opaque vectors. When something goes wrong, you can debug it by reading, which is the opposite of how an embedding index fails.

These are not in tension with the frameworks. You can run Letta's loop, or Mem0's API, or a Graphiti graph, on top of a durable layer that gives you lineage and correction underneath, the way you would run an application framework on top of a database you can still query directly. The frameworks compete on the agent abstraction. The durable layer competes on whether, six months later, you can still answer where a belief came from and fix it when it is wrong.
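The record, recall, lineage, and correction loop can be sketched in a few lines. This is a simplified illustration under stated assumptions, not memnode's implementation: `DurableStore` and its methods are hypothetical names, facts are keyed by a plain string, and lineage here follows only the correction chain and the recorded source of each entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: int
    key: str
    value: str
    source: str                        # where the fact came from
    supersedes: Optional[int] = None   # id of the memory this one corrects
    active: bool = True                # False once corrected; never deleted


class DurableStore:
    def __init__(self) -> None:
        self.log: list[Memory] = []    # append-only history

    def record(self, key: str, value: str, source: str,
               supersedes: Optional[int] = None) -> int:
        memory = Memory(len(self.log), key, value, source, supersedes)
        self.log.append(memory)
        return memory.id

    def recall(self, key: str) -> Optional[Memory]:
        # Return the newest active value; corrected values are skipped.
        for memory in reversed(self.log):
            if memory.key == key and memory.active:
                return memory
        return None

    def correct(self, memory_id: int, new_value: str, source: str) -> int:
        old = self.log[memory_id]
        old.active = False             # stops being recalled, stays in history
        return self.record(old.key, new_value, source, supersedes=old.id)

    def lineage(self, memory_id: int) -> list[Memory]:
        # Walk back through the chain of corrections, newest first.
        chain = []
        current: Optional[int] = memory_id
        while current is not None:
            memory = self.log[current]
            chain.append(memory)
            current = memory.supersedes
        return chain
```

Reading `store.log` directly is the inspectability property in miniature: every belief, its source, and its status are plain fields rather than vectors.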

How to actually choose

Strip away the marketing and the decision comes down to the shape of your memory, not the popularity of the tool:

Want the agent to manage its own memory as part of its reasoning? Letta is the most coherent answer, with the caveat that you are adopting its agent architecture.
Want a broad, fast-to-adopt memory API over vector plus optional graph? Mem0 covers the common case well and integrates quickly.
Need to reason about facts that change over time? Graphiti and Zep make time a first-class dimension; this is their genuine edge.
Turning a corpus of documents into a queryable knowledge graph? Cognee is built for exactly that.
Want local-first, no cloud dependency, easy to inspect? The lighter SQLite-backed entrants like Mnemosyne are a real option, especially for single-machine or self-hosted agents.
Care most about being able to trace and correct what the agent believes, under whatever framework you pick? That is the durable, inspectable layer, and it is the part that does not go obsolete when the framework leaderboard reshuffles next month.

The land grab will keep producing new entrants, and most of them will not break out. The question that survives the churn is not which framework won. It is whether, two years from now, you can still read your agent's memory, trace why it recalled something, and correct it when it is wrong. Pick the framework that fits your loop, and put a durable, inspectable layer underneath it. For the design reasoning behind that layer, the full story is in how memnode evolved from a graph database to a memory reasoning engine, and the upstream argument for getting memory out of the context window in the first place is in the companion piece.