Comparison

Tenure vs. Agentmemory

Agentmemory is a zero-database memory runtime for coding agents. Triple-stream retrieval (BM25 + vector + knowledge graph). Zero active retrieval passes. Drift score: 0.81.

TL;DR

  • Triple-stream retrieval sounds impressive. It achieves 0.17 precision and zero active passes.
  • Recall of 0.97 confirms the pattern: everything comes back. Retrieval is not selectivity.
  • Drift score 0.81. Off-topic turns contaminate subsequent retrieval heavily.
  • MCP-only interface: the model decides when to fetch memory. Tenure injects automatically on every request.
  • No scope isolation. No supersession handling. No per-turn audit trail.
Tenure precision
1.00
43/43 active passes
Agentmemory precision
0.17
0/43 active passes
Tenure latency
9.77ms
p50 retrieval
Agentmemory latency
82.28ms
p50 retrieval

More retrieval paths does not mean better precision

Agentmemory combines BM25, vector search, and a knowledge graph into a triple-stream retrieval pipeline with on-device reranking. The architecture adds complexity at every stage. The result: 0.17 mean precision with zero active passes on 43 precision-assertion cases.

The three streams all share the same fundamental limitation: they cannot structurally exclude beliefs that are semantically proximate but irrelevant. Adding more retrieval paths with the same limitation compounds noise rather than improving precision.

Tenure uses a single retrieval path (alias-weighted BM25 with hard scope isolation) and achieves 1.00 precision. The architectural difference is not "more paths" vs. "fewer paths." It is that Tenure's primary signal is orthogonal to semantic similarity: term matching over indexed aliases where the user coined the vocabulary. That signal is precise by construction in bounded vocabulary contexts.

MCP vs. proxy: when memory fires

Agentmemory exposes memory through MCP tools (memory_recall, memory_smart_search). The model must decide to call them. In practice, models frequently do not recognize when they need prior context, especially on questions where the user assumes shared history.

Tenure operates as a proxy between your client and your provider. Memory is injected into every request automatically, before the model sees the prompt. No tool call to trigger. No prompt engineering. No silent failures where memory was available but never fetched.

Full comparison

Property Tenure Agentmemory
Mean retrieval precision 1.00 0.17
Active retrieval passes 43/43 0/43
Total passes (77 non-session) 77/77 7/77
Retrieval latency (p50) 9.77ms 82.28ms
Session latency (p50) 47.79ms 98.49ms
Drift score 0.00 0.81
Memory delivery Automatic (proxy) MCP tool call (model decides)
Scope isolation Hard filter Project tagging (soft)
Supersession Chain with audit trail Decay scoring
Per-turn injection audit Yes No
Works across every client Any OpenAI-compatible MCP-capable agents only
Runs locally Always Always
External databases None (single container) None (single process)
License MIT Apache-2.0

Full methodology: arXiv:2605.11325. Dataset: HuggingFace. Live leaderboard: HuggingFace Spaces.

Precise by construction. Not by luck.

No MCP. No tool calls. Memory in context on every request. Zero config. Thirty seconds.