MRAgent Cuts Long-Memory Agent Queries To 118k Tokens In Benchmark Tests

By SendTech Times AI & Enterprise Desk | Newsroom-edited, source-reviewed coverage | Source: VentureBeat

Newsroom brief

National University of Singapore researchers built MRAgent to reconstruct memory through a Cue-Tag-Content graph, with VentureBeat citing LongMemEval prompt use of 118k tokens per sample versus 632k for A-Mem and 3.26 million for LangMem.

Verified against source material | Edited by SendTech Times AI & Enterprise Desk

MRAgent Rebuilds Memory During Reasoning

Researchers at the National University of Singapore developed MRAgent, a memory framework for AI agents that replaces static retrieve-then-reason pipelines with dynamic memory reconstruction.

The framework lets an agent develop its memory path as it gathers evidence, rather than loading broad retrieval results into the model context at the start.

Classic vector-search and graph-traversal retrieval can fail on long-horizon tasks because the system cannot revise its search strategy during reasoning.

If the agent finds a missing cue, such as a date, person or place, a passive retrieval pipeline has no way to issue a new query based on that discovery.

Fixed similarity scores can also return surface-level matches that fill the context window with irrelevant material.
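The failure mode described above can be illustrated with a toy one-shot retriever. This is a minimal sketch, not code from the MRAgent paper: the memories, the word-overlap score and the function names are hypothetical, and a production system would more likely use embedding similarity than Jaccard overlap.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    """Fixed similarity score: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve_once(query: str, memories: list[str], k: int) -> list[str]:
    """Static retrieve-then-reason: rank once, return top-k, never re-query."""
    q = tokens(query)
    ranked = sorted(memories, key=lambda m: jaccard(q, tokens(m)), reverse=True)
    return ranked[:k]

# Hypothetical memories, invented for illustration only.
MEMORIES = [
    "Nate watched a video game tournament stream",
    "Nate won first place at the regional finals",
    "The prize money went toward a new graphics card",
]

QUERY = "What did Nate do with the video game tournament prize money"
top = retrieve_once(QUERY, MEMORIES, k=1)
print(top)
```

In this toy setup the top-ranked memory is the one that shares the most surface words with the query, not the one that answers it, and the pipeline has no later step at which it could search again with a better cue.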

Cue-Tag-Content Narrows The Search Path

MRAgent treats memory as an interactive environment.

The backbone model explores candidate retrieval paths across a structured memory graph, evaluates intermediate evidence, infers new constraints and prunes branches that do not help answer the query.

The framework organizes memory through a Cue-Tag-Content mechanism.

Cues are fine-grained keywords or contextual attributes, Content stores the memory units, and Tags summarize relationships between cues and content.

The model can judge short relational summaries before spending tokens on heavier memory contents.
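One way to picture the three layers is as a small data structure. The sketch below is an assumption about how such a graph could be laid out, not the authors' schema: the class names, fields and the `tags_for` and `load` methods are hypothetical, and the sample entries are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Content:
    """A full memory unit; the expensive thing to place in a prompt."""
    content_id: str
    text: str

@dataclass
class Tag:
    """Short summary relating a set of cues to one content unit."""
    summary: str
    cues: frozenset[str]
    content_id: str

@dataclass
class MemoryGraph:
    contents: dict[str, Content] = field(default_factory=dict)
    tags: list[Tag] = field(default_factory=list)

    def add(self, content: Content, summary: str, cues: set[str]) -> None:
        """Store a content unit together with the tag that describes it."""
        self.contents[content.content_id] = content
        self.tags.append(Tag(summary, frozenset(cues), content.content_id))

    def tags_for(self, cues: set[str]) -> list[Tag]:
        """Cheap step: return tags that share at least one cue."""
        return [t for t in self.tags if t.cues & cues]

    def load(self, tag: Tag) -> Content:
        """Costly step: fetch full content only for a tag judged relevant."""
        return self.contents[tag.content_id]

# Invented sample entries.
graph = MemoryGraph()
graph.add(Content("m1", "Full log of the session in which Nate won the tournament."),
          "Nate wins a video game tournament", {"nate", "tournament", "win"})
graph.add(Content("m2", "Full log of a session in which Nate watched a tournament."),
          "Nate watches a tournament", {"nate", "tournament"})

candidates = graph.tags_for({"win"})
print([t.summary for t in candidates])
```

The split between `tags_for` and `load` mirrors the point made above: the short summaries can be inspected first, and the longer content is fetched only for tags that look useful.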

The authors illustrate the retrieval flow with a prompt about how Nate used prize money after winning a video game tournament.

The query starts with cues such as Nate, tournament and win.

MRAgent follows the victory-related tag, discards less relevant tournament memories, adds tournament earnings as a new cue and keeps searching until it has enough evidence to answer.
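The Nate example can be traced with a toy loop. This is a rough sketch under stated assumptions, not MRAgent's algorithm: a cue-overlap count stands in for the backbone model's judgment, the loop stops when no candidate tags remain rather than when the model decides it has enough evidence, and the memory entries (including the placeholder for how the money was spent) are invented.

```python
# Hypothetical memory entries. "reveals" lists new cues found in the content.
MEMORY = [
    {"tag": "Nate wins a video game tournament",
     "cues": {"nate", "tournament", "win"},
     "content": "Nate won the tournament and received prize money.",
     "reveals": {"tournament earnings"}},
    {"tag": "Nate watches a tournament as a spectator",
     "cues": {"nate", "tournament"},
     "content": "Nate watched the finals from the stands.",
     "reveals": set()},
    {"tag": "Nate spends his tournament earnings",
     "cues": {"tournament earnings"},
     "content": "PLACEHOLDER: how Nate spent the money (not given in the article).",
     "reveals": set()},
]

def reconstruct(initial_cues, memory, max_steps=5):
    """Follow the best-matching tag, prune the rest, add newly revealed cues."""
    cues = set(initial_cues)
    visited, evidence, pruned = set(), [], []
    for _ in range(max_steps):
        # Candidate tags: unvisited entries sharing at least one active cue.
        candidates = [(len(m["cues"] & cues), i) for i, m in enumerate(memory)
                      if i not in visited and m["cues"] & cues]
        if not candidates:
            break
        # Stand-in for the model's judgment: highest cue overlap wins.
        _, best = max(candidates, key=lambda c: (c[0], -c[1]))
        for _, i in candidates:
            if i != best:
                pruned.append(memory[i]["tag"])  # discard weaker branches
                visited.add(i)
        visited.add(best)
        evidence.append(memory[best]["content"])
        cues |= memory[best]["reveals"]  # e.g. "tournament earnings"
    return evidence, pruned

evidence, pruned = reconstruct({"nate", "tournament", "win"}, MEMORY)
print(evidence)
print(pruned)
```

The third entry is unreachable from the starting cues and only becomes a candidate after the first step adds "tournament earnings", which is the behavior a one-shot retriever cannot reproduce.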

LongMemEval Shows 118k Token Prompt Use

The researchers tested MRAgent on LoCoMo and LongMemEval against standard RAG, A-Mem, MemoryOS, LangMem and Mem0.

According to the paper's benchmarks as cited by VentureBeat, MRAgent outperformed every baseline across both backbone models tested and all question types.

In the LongMemEval tests cited by VentureBeat, MRAgent used 118k prompt tokens per sample.

A-Mem consumed 632k tokens per sample, while LangMem used 3.26 million.

MRAgent's runtime was 586 seconds, compared with 1,122 seconds for A-Mem.
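The relative savings implied by those figures can be worked out directly. The short calculation below uses only the rounded numbers reported above and reads the runtime figures as 1,122 seconds for A-Mem and 586 seconds for MRAgent, so the results are approximate; the variable names are illustrative.

```python
# Rounded figures as reported in the article (LongMemEval, per sample).
TOKENS_PER_SAMPLE = {"MRAgent": 118_000, "A-Mem": 632_000, "LangMem": 3_260_000}
RUNTIME_SECONDS = {"MRAgent": 586, "A-Mem": 1_122}

def ratio(baseline: float, candidate: float) -> float:
    """How many times larger the baseline is than the candidate."""
    return baseline / candidate

def reduction_pct(baseline: float, candidate: float) -> float:
    """Percentage saved by the candidate relative to the baseline."""
    return (1 - candidate / baseline) * 100

mr = TOKENS_PER_SAMPLE["MRAgent"]
amem_ratio = ratio(TOKENS_PER_SAMPLE["A-Mem"], mr)
langmem_ratio = ratio(TOKENS_PER_SAMPLE["LangMem"], mr)
runtime_ratio = ratio(RUNTIME_SECONDS["A-Mem"], RUNTIME_SECONDS["MRAgent"])

print(f"A-Mem tokens:   {amem_ratio:.1f}x, "
      f"{reduction_pct(TOKENS_PER_SAMPLE['A-Mem'], mr):.1f}% fewer")
print(f"LangMem tokens: {langmem_ratio:.1f}x, "
      f"{reduction_pct(TOKENS_PER_SAMPLE['LangMem'], mr):.1f}% fewer")
print(f"A-Mem runtime:  {runtime_ratio:.1f}x, "
      f"{reduction_pct(RUNTIME_SECONDS['A-Mem'], RUNTIME_SECONDS['MRAgent']):.1f}% less")
```

On these rounded inputs, MRAgent's prompt use is roughly 5.4 times smaller than A-Mem's and roughly 27.6 times smaller than LangMem's, and its runtime is a little under half of A-Mem's.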

Memory Construction Remains The Deployment Work

The framework still requires the Cue-Tag-Content structure to be prepared before query time.

Developers must build an ingestion pipeline that processes raw interaction histories, extracts metadata and stores the result in a graph database.

The authors designed that construction phase to use LLMs for automated distillation rather than manual labeling.

In practice, that means setting up background jobs, prompt templates and graph storage before the agent can serve its first query.
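The shape of such an ingestion job can be sketched as follows. This is an illustration under loud assumptions, not the released code: `distill_stub` is a hypothetical placeholder for the LLM distillation call (here it simply treats capitalized words as cues), a plain dictionary stands in for the graph database, and the two history lines are invented.

```python
from dataclasses import dataclass

@dataclass
class Distilled:
    cues: set[str]
    tag: str

def distill_stub(turn: str) -> Distilled:
    """Hypothetical placeholder for an LLM distillation call.

    Capitalized words become cues; the first eight words become the tag.
    """
    words = turn.replace(".", " ").replace(",", " ").split()
    cues = {w.lower() for w in words if w[0].isupper()}
    return Distilled(cues=cues, tag=" ".join(words[:8]))

def ingest(history: list[str], distill=distill_stub) -> dict:
    """Process raw turns into content, tags and a cue index before query time."""
    store = {"content": {}, "tags": {}, "cue_index": {}}
    for i, turn in enumerate(history):
        cid = f"c{i}"
        d = distill(turn)
        store["content"][cid] = turn          # raw memory unit
        store["tags"][cid] = d.tag            # short relational summary
        for cue in d.cues:                    # cue -> content ids
            store["cue_index"].setdefault(cue, set()).add(cid)
    return store

# Invented interaction history.
HISTORY = [
    "Nate entered a tournament in Spring.",
    "Nate won the tournament final.",
]
store = ingest(HISTORY)
print(sorted(store["cue_index"]))
```

Passing the distillation step in as a function keeps the storage logic testable without a model call; a real deployment would swap in an LLM-backed function and a persistent graph store.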

The authors released code on GitHub.

Named production deployments, maintenance costs and customer validation remain undisclosed.
