Precision 1.0 vs 0.14. Default behavior, no flags.
GBrain is Garry Tan's opinionated agent brain for OpenClaw and Hermes deployments. A full knowledge management system. Different scope than Tenure, but overlapping retrieval claims.
GBrain ran PrecisionMemBench independently using Tenure's MIT-licensed scorer, vendored byte-for-byte. Their published results confirm the default precision problem and identify Tenure's architecture as the solution.
From GBrain's own benchmark report: "gbrain's default retrieval scores 0.076 — right in the mem0/zep/vector cluster. Top-K hybrid returns ~20 results, recall is 0.99, and precision collapses. That is the benchmark working as designed: it punishes returning a pile and letting a downstream model sort it out."
GBrain built an opt-in adaptive return-sizing feature directly from PrecisionMemBench results. With aggressive settings (entity-max=1, other-max=1), it reaches 0.582 precision and 29/43 active passes, the strongest competitor result on the benchmark.
The adaptive feature is default-off. The 0.076 default is what users get without knowing about the flag. GBrain's own internal design document proposes adopting a "tiered belief-state contract (tenure)" as the structural fix, calling it the "biggest single precision lever, mostly architecture not model."
GBrain and Tenure solve different primary problems. GBrain is a full knowledge management system: 146K+ pages, entity graph, synthesis with citations, gap analysis, autonomous enrichment cron jobs. It is designed to be an institutional brain operated by an AI agent.
Tenure is a memory proxy for LLM sessions. It learns from your conversations and injects what it knows on every request. One container. Sub-15ms. The narrower scope is intentional: do one thing with precision 1.0 rather than many things at 0.14.
Where they overlap is retrieval precision for structured beliefs. GBrain's hybrid retrieval (vector + BM25 + RRF + reranker) returns ~20 results by default. The uncapped return set is the precision killer. GBrain's own internal analysis identifies this problem and proposes adopting a Tenure-style tiered belief-state contract as the solution.
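For readers unfamiliar with RRF, here is a minimal sketch of reciprocal rank fusion. It uses the textbook formula with the conventional constant k=60 and is not GBrain's implementation; the function name `rrf_fuse` and the document IDs are made up for illustration. The point to notice is that fusion returns the union of its input lists, so the fused set grows unless something caps it.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with reciprocal rank fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a vector retriever and a BM25 retriever.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]

fused = rrf_fuse([vector_hits, bm25_hits])
print(fused)       # IDs found by both retrievers rise to the top
print(len(fused))  # the union of both lists: nothing is dropped
```

Two three-item lists fuse into four results here. Fusion improves ordering; it does not decide how many results to return.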
GBrain's default retrieval returns approximately 20 results per query and leaves the narrowing to a downstream layer. This is the "query your database vs. filter in application code" distinction applied to retrieval. When retrieval returns 20 candidates for a query that has 1 correct answer, precision is 1/20 = 0.05 before the synthesis layer even runs.
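The arithmetic can be checked with a few lines of Python. This sketch uses the standard set-based definitions of precision and recall; it is not the PrecisionMemBench scorer, and the belief IDs are placeholders.

```python
def precision_recall(returned_ids, relevant_ids):
    """Set-based precision and recall for one query."""
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Twenty returned candidates, exactly one of which is the right belief.
top_20 = [f"belief-{i}" for i in range(20)]
precision, recall = precision_recall(top_20, ["belief-3"])
print(precision, recall)  # 0.05 1.0

# Returning only the right belief.
exact_precision, exact_recall = precision_recall(["belief-3"], ["belief-3"])
print(exact_precision, exact_recall)  # 1.0 1.0
```

Recall is perfect in both cases. Only precision separates "returned the answer" from "returned a pile that contains the answer."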
GBrain's --adaptive-return mode (v0.41.33) tightens the result set, but a tighter set only helps if the top result is correct. On noisy or migrated cases, it makes a wrong answer look more confident by hiding the alternatives.
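A small sketch of why capping alone is not enough. The rankings, belief IDs, and the `evaluate_with_cap` helper are hypothetical; this illustrates generic top-k truncation, not GBrain's adaptive-return logic.

```python
def evaluate_with_cap(ranked_ids, relevant_ids, cap):
    """Truncate a ranked list to `cap` items and report (precision, found)."""
    kept = ranked_ids[:cap]
    hits = sum(1 for belief_id in kept if belief_id in relevant_ids)
    precision = hits / len(kept) if kept else 0.0
    return precision, hits > 0


relevant = {"b-current"}

# Clean case: the current belief is ranked first.
clean_ranking = ["b-current", "b-superseded", "b-unrelated"]
# Noisy case: a superseded belief outranks the current one.
noisy_ranking = ["b-superseded", "b-current", "b-unrelated"]

clean_capped = evaluate_with_cap(clean_ranking, relevant, cap=1)
noisy_capped = evaluate_with_cap(noisy_ranking, relevant, cap=1)
noisy_uncapped = evaluate_with_cap(noisy_ranking, relevant, cap=3)

print(clean_capped)    # (1.0, True): the cap removes the noise
print(noisy_capped)    # (0.0, False): the cap removes the answer
print(noisy_uncapped)  # low precision, but the answer is still present
```

With a cap of 1, the outcome is all-or-nothing and depends entirely on the ranker. In the noisy case, the downstream model sees a single wrong belief and no signal that an alternative existed.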
Tenure does not have this problem because retrieval is structurally precise: alias-weighted BM25 against indexed short fields with hard scope filtering. The correct belief is the only candidate. There is no ranking to get wrong.
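To make "hard scope filtering" concrete, here is an illustrative sketch. It is not Tenure's code: the `Belief` fields, the `retrieve` function, and the 2x alias weight are assumptions, and plain term overlap stands in for BM25 to keep the example short. What it shows is the structural idea: scope is a filter applied before scoring, not a score boost, so an out-of-scope belief can never be a candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    # Hypothetical schema, for illustration only.
    belief_id: str
    scope: str
    aliases: tuple
    text: str


def retrieve(beliefs, query, scope, alias_weight=2.0):
    """Return IDs of in-scope beliefs matching the query, best first."""
    terms = set(query.lower().split())
    # Hard filter: beliefs from other scopes are dropped before scoring.
    in_scope = [b for b in beliefs if b.scope == scope]
    scored = []
    for b in in_scope:
        alias_terms = {t for alias in b.aliases for t in alias.lower().split()}
        text_terms = set(b.text.lower().split())
        # Simple term overlap (a stand-in for BM25), with aliases weighted higher.
        score = alias_weight * len(terms & alias_terms) + len(terms & text_terms)
        if score > 0:  # zero-score beliefs are not returned at all
            scored.append((score, b.belief_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [belief_id for _, belief_id in scored]


store = [
    Belief("b1", "project-a", ("package manager",), "use pnpm for installs"),
    Belief("b2", "project-b", ("package manager",), "use yarn for installs"),
    Belief("b3", "project-a", ("test runner",), "use vitest for tests"),
]

result = retrieve(store, "which package manager", scope="project-a")
print(result)  # ['b1']
```

`b2` matches the query just as well as `b1`, but it belongs to another project and is filtered out before any scoring happens. `b3` is in scope but shares no terms with the query, so it is not returned either.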
| Use case | Better fit | Why |
|---|---|---|
| Cross-session memory for coding | Tenure | Precision 1.0, sub-15ms, zero config |
| Large knowledge base with synthesis | GBrain | 146K+ pages, gap analysis, citations |
| Multi-client shared memory | Tenure | Proxy layer, any OpenAI-compatible client |
| Autonomous agent with cron jobs | GBrain | Dream cycle, enrichment, 43 skills |
| Privacy-first, fully local | Tenure | Single container, no cloud, no agent platform |
| Team/company institutional memory | GBrain | Multi-user scoping, schema packs, federation |
| Mobile aha-moment capture | Tenure | OpenClaw/WhatsApp into same belief store |
| Meeting prep and entity research | GBrain | Graph traversal, find_trajectory, synthesis |
GBrain's benchmark report makes a fair point: PrecisionMemBench measures one property (return the exact right belief ID and nothing else) against a 35-belief store. Their position is that general-purpose agentic memory needs recall, temporal reasoning, contradiction handling, and synthesis across thousands of pages. A system tuned to top this test would be "actively worse at most of that."
This is correct, and it is why Tenure and GBrain solve different problems. Tenure is not a knowledge management system. It is a session memory proxy. For that use case, the 35-belief scale is representative: a typical developer accumulates 30-80 active beliefs per project. Precision 1.0 at that scale means zero noise in your coding session. The "narrow win" is the entire product.
If you need a 146K-page institutional brain with synthesis, graph traversal, and autonomous enrichment, GBrain is designed for that. If you need your IDE to know what you decided yesterday without re-explaining it, and you need that memory to be precise enough that it never injects the wrong context, that is Tenure.
| Property | Tenure | GBrain |
|---|---|---|
| Mean retrieval precision | 1.00 | 0.14 |
| Active retrieval passes | 43/43 | 5/43 |
| Total passes (77 non-session) | 77/77 | 34/77 |
| Retrieval latency (p50) | 9.77ms | 543.84ms |
| Ingestion (35 beliefs) | 1.0s | 28.6s |
| Memory delivery | Automatic (proxy) | MCP / agent-initiated |
| Setup time | 30 seconds | ~30 minutes (with agent) |
| Requires agent platform | No | Yes (OpenClaw/Hermes/Claude Code) |
| Scope isolation | Hard filter | Source-based separation |
| Supersession | Chain with audit | None native |
| Knowledge graph | Not applicable (belief store) | Self-wiring entity graph |
| Synthesis layer | Not applicable (injection only) | Cited prose + gap analysis |
| License | MIT | MIT |
GBrain results from PrecisionMemBench evaluation. Full methodology: arXiv:2605.11325. Dataset: HuggingFace. GBrain source: github.com/garrytan/gbrain.
One container. Sub-15ms. Precision 1.0. No agent platform required. Thirty seconds to install.