Comparison

Tenure vs. GBrain

Name: precisionMemBench
Creator: Tenure
License: https://opensource.org/licenses/MIT

Precision 1.0 vs 0.14. Default behavior, no flags.

GBrain is Garry Tan's opinionated agent brain for OpenClaw and Hermes deployments. A full knowledge management system. Different scope than Tenure, but overlapping retrieval claims.

9.77ms Retrieval latency

543.84ms Retrieval latency

TL;DR

GBrain is a knowledge management system (146K+ pages, entity graph, synthesis layer). Tenure is a memory proxy for LLM sessions.
Where they overlap (retrieval precision for memory): GBrain scores 0.14 mean precision, 5/43 active passes.
GBrain's uncapped hybrid retrieval (~20 results default) produces the noise floor. Adaptive-return mode helps but fails on migrated/noisy cases.
GBrain requires an agent platform (OpenClaw/Hermes) and significant setup. Tenure: one container, thirty seconds.
GBrain excels at synthesis and gap analysis across large corpora. Tenure excels at precise, low-latency belief injection for coding sessions.

Tenure precision

1.00

43/43 active passes

GBrain precision

0.14

5/43 active passes

Tenure latency

9.77ms

p50 retrieval

GBrain latency

543.84ms

p50 retrieval

GBrain's own assessment

GBrain ran PrecisionMemBench independently using Tenure's MIT-licensed scorer, vendored byte-for-byte. Their published results confirm the default precision problem and identify Tenure's architecture as the solution.

From GBrain's own benchmark report: "gbrain's default retrieval scores 0.076 — right in the mem0/zep/vector cluster. Top-K hybrid returns ~20 results, recall is 0.99, and precision collapses. That is the benchmark working as designed: it punishes returning a pile and letting a downstream model sort it out."

GBrain built an opt-in adaptive return-sizing feature directly from PrecisionMemBench results. With aggressive settings (entity-max=1, other-max=1), it reaches 0.582 precision and 29/43 active passes, the strongest competitor result on the benchmark.

Tenure

1.00

43/43 · default behavior

GBrain adaptive (tight)

0.58

29/43 · opt-in, default OFF

GBrain default

0.08

0/43 · out of box

The adaptive feature is default-off. The 0.076 default is what users get without knowing about the flag. GBrain's own internal design document proposes adopting a "tiered belief-state contract (tenure)" as the structural fix, calling it the "biggest single precision lever, mostly architecture not model."

Different tools, overlapping claims

GBrain and Tenure solve different primary problems. GBrain is a full knowledge management system: 146K+ pages, entity graph, synthesis with citations, gap analysis, autonomous enrichment cron jobs. It is designed to be an institutional brain operated by an AI agent.

Tenure is a memory proxy for LLM sessions. It learns from your conversations and injects what it knows on every request. One container. Sub-15ms. The narrower scope is intentional: do one thing with precision 1.0 rather than many things at 0.14.

Where they overlap is retrieval precision for structured beliefs. GBrain's hybrid retrieval (vector + BM25 + RRF + reranker) returns ~20 results by default. The uncapped return set is the precision killer. GBrain's own internal analysis identifies this problem and proposes adopting a tenure-style tiered belief-state contract as the solution.

The uncapped retrieval problem

GBrain's default retrieval returns approximately 20 results per query. This is the "query your database vs. filter in application code" distinction. When retrieval returns 20 candidates for a query that has 1 correct answer, precision is 0.05 before the synthesis layer even runs.

GBrain's --adaptive-return mode (v0.41.33) tightens the result set, but tight only helps if the top result is correct. On noisy or migrated cases, it makes a wrong answer more confident by hiding the alternatives.

Tenure does not have this problem because retrieval is structurally precise: alias-weighted BM25 against indexed short fields with hard scope filtering. The correct belief is the only candidate. There is no ranking to get wrong.

When to use which

Use case	Better fit	Why
Cross-session memory for coding	Tenure	Precision 1.0, sub-15ms, zero config
Large knowledge base with synthesis	GBrain	146K+ pages, gap analysis, citations
Multi-client shared memory	Tenure	Proxy layer, any OpenAI-compatible client
Autonomous agent with cron jobs	GBrain	Dream cycle, enrichment, 43 skills
Privacy-first, fully local	Tenure	Single container, no cloud, no agent platform
Team/company institutional memory	GBrain	Multi-user scoping, schema packs, federation
Mobile aha-moment capture	Tenure	OpenClaw/WhatsApp into same belief store
Meeting prep and entity research	GBrain	Graph traversal, find_trajectory, synthesis

The "narrow-win" counterargument

GBrain's benchmark report makes a fair point: PrecisionMemBench measures one property (return the exact right belief ID and nothing else) against a 35-belief store. Their position is that general-purpose agentic memory needs recall, temporal reasoning, contradiction handling, and synthesis across thousands of pages. A system tuned to top this test would be "actively worse at most of that."

This is correct, and it is why Tenure and GBrain solve different problems. Tenure is not a knowledge management system. It is a session memory proxy. For that use case, the 35-belief scale is representative: a typical developer accumulates 30-80 active beliefs per project. Precision 1.0 at that scale means zero noise in your coding session. The "narrow win" is the entire product.

If you need a 146K-page institutional brain with synthesis, graph traversal, and autonomous enrichment, GBrain is designed for that. If you need your IDE to know what you decided yesterday without re-explaining it, and you need that memory to be precise enough that it never injects the wrong context, that is Tenure.

Full retrieval comparison

Property	Tenure	GBrain
Mean retrieval precision	1.00	0.14
Active retrieval passes	43/43	5/43
Total passes (77 non-session)	77/77	34/77
Retrieval latency (p50)	9.77ms	543.84ms
Ingestion (35 beliefs)	1.0s	28.6s
Memory delivery	Automatic (proxy)	MCP / agent-initiated
Setup time	30 seconds	~30 minutes (with agent)
Requires agent platform	No	Yes (OpenClaw/Hermes/Claude Code)
Scope isolation	Hard filter	Source-based separation
Supersession	Chain with audit	None native
Knowledge graph	Not applicable (belief store)	Self-wiring entity graph
Synthesis layer	Not applicable (injection only)	Cited prose + gap analysis
License	MIT	MIT

GBrain results from PrecisionMemBench evaluation. Full methodology: arXiv:2605.11325. Dataset: HuggingFace. GBrain source: github.com/garrytan/gbrain.