Memory Models: Towards Agents That Learn

Memory is the key to unlocking AI agents that are not only teachable, but also capable of continually learning and self-improving. Techniques for continual learning in token space have seen widespread real-world adoption, with agentic memory management, agent dreaming, and skill learning becoming central features in agent harnesses such as Letta Code, Claude Code, DeepAgents, and OpenClaw. Token-space learning takes advantage of the in-context learning capabilities of modern LLMs, which continue to improve across model generations, and provides an incredibly data-efficient mechanism for continual learning.

However, while LLMs are excellent at using context, they do not reliably create durable context for future tasks. This can lead to memory rot over time for long-lived agents, and limit the extent to which agents can learn independently from available context or past experience. We believe that advancing and scaling token-space learning will require a new type of specialized memory models, specifically designed for creating memory and expertise that transfers across models.

The memory & learning problem for agents

Through memory and learning, agents should be able to develop expertise over time based on their own unique experiences and the context made available to them. Without the capability to learn from experience, agents have clear limitations in use cases such as:

Adapting to proprietary datasets and workflows: Even when agents have access to proprietary data and other private knowledge, they have a limited ability to condition their behavior on it. While forms of sleep-time compute like self-study and skill learning can turn private datasets into compact, reusable contexts, existing methods are too expensive to scale to large corpora and suffer from issues such as overgeneralization and memory rot.
Embedding human expertise for digital cloning: Agent learning today is insufficient to fully absorb the expertise and values of human teachers or to create a true “digital clone” of humans. Approaches like skills and persona prompting can embed specialized behavior in agents, but agents today still remain far less capable than humans at learning and adapting from natural language feedback and real-world interactions.
Deep personalization: Even with clearly stated preferences, feedback, or other personal information, agents struggle to recall relevant context and generate new memories. This constrains how personalized agent experiences can become for individual users, as agents are limited in their ability to learn from their interactions with users despite an abundance of interaction data.

Solving the memory and learning problem unlocks a new scaling axis for machine intelligence, since agents can continually improve by scaling compute on context. The more time an agent spends refining its memory and context, whether through self-study or through human feedback, the more expertise it develops and the better prepared it is for future tasks.

What will continual learning for agents look like?

We believe that expertise generated through life-long continual learning will be encoded as token-space representations that many different downstream models can consume. The most valuable agents in the future will be multi-model: agents that can carry their memory and learning across model generations (and even new model architectures) to combine frontier intelligence with compounding expertise. Agents will also be compound systems composed of several underlying models working in orchestration, including general-purpose coding models and low-latency models for real-time interaction. In this multi-model future, token-space representations serve as the vital connective tissue between all models in the greater system - and the primary mechanism for continual learning.

Self-improving agents will eventually be driven by a compound system of models (which all take tokens as input) and memory models.

Multi-model learning

Any agent that is learning across many tasks must be able to learn across model generations to maintain frontier performance. What constitutes the “best” model at any given point in time varies across task categories, with different models leading in different categories (e.g. GPT-5.5 for backend engineering, GLM 5.2 for frontend design). Progress in frontier models is accelerating, with the time between frontier model releases logarithmically decreasing.

At the same time, the time horizon of tasks that can be completed by models is exponentially increasing. Recent data suggests that the length of software tasks that models can complete is doubling roughly every five months. This implies that models will be able to solve tasks taking one week by late 2026, and tasks taking one month by late 2027 to mid 2028.

Figure: Model vs agent lifespans. The cadence of new frontier model releases (model release gap) versus the length of coding tasks an agent can complete (agent task horizon), projected to cross in late 2026. Task-horizon trend from METR; model-release cadence from Artificial Analysis.
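The doubling claim can be turned into a small calculation. The sketch below assumes a purely exponential trend with a five-month doubling time; the 8-hour starting horizon and the 40-hour and 160-hour targets are hypothetical values chosen for illustration, not figures from METR or from this post.

```python
import math


def months_until(target_hours: float, current_hours: float,
                 doubling_months: float = 5.0) -> float:
    """Months until an exponentially growing task horizon reaches target_hours."""
    if target_hours <= current_hours:
        return 0.0
    # Number of doublings needed, times the length of one doubling period.
    return doubling_months * math.log2(target_hours / current_hours)


def horizon_after(months: float, current_hours: float,
                  doubling_months: float = 5.0) -> float:
    """Task horizon (in hours) after the given number of months."""
    return current_hours * 2 ** (months / doubling_months)


if __name__ == "__main__":
    current = 8.0  # hypothetical starting horizon in hours (assumption)
    for label, hours in [("40-hour task", 40.0), ("160-hour task", 160.0)]:
        print(f"{label}: ~{months_until(hours, current):.1f} months away")
```

Under this model only the ratio between the target and the current horizon matters: every 4x increase in task length costs two doublings, or ten months at a five-month doubling time.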

As agents become capable of continual learning, their memory will quickly become a more valuable component of the agent than any individual underlying model. While general-purpose models are replaceable and rapidly outdated, the agent's memory will be something that is much more costly to lose, as it is specific to the user or organization as well as something that compounds in value with time and invested resources (e.g. by scaling human teaching and sleep-time compute).

Continual learning through in-context learning (ICL)

Token-space learning will also become more powerful as ICL capabilities of models continue to improve. While models are a depreciating asset, learning and memory represented in token space is an appreciating asset: as models improve, representations can be transferred to new models that leverage the learnings even more effectively through more powerful ICL capabilities. We anticipate that models will continue to be adapted at test time through ICL, but with significant advancement in how learned context (i.e. token-space representations) is generated and optimized.

The level of model adaptation that can occur through just ICL is extraordinary, and continues to scale with each model generation. ICL is perhaps the most bizarre and powerful emergent property of LLMs: through just ICL, even models only trained on data before 1930 can learn to program in Python (created in 1991). Recent research hypothesizes that the transformer actually learns to approximate gradient descent within its forward pass of inference, explaining the power of transformers to learn in-context. We are only just beginning to understand how to optimize adaptation of models through token-space learning.

As the capabilities of ICL continue to scale, future self-improving agents will learn extremely complex mechanisms for token-space learning that go beyond human comprehension, e.g. with learned in-context memories using latent representations, new languages, and extreme compression. Each new generation of frontier models is able to use context and token-space representations more effectively than the last, increasing the baseline value and overall effectiveness of memories stored in context - we call this the bitter lesson of ICL.

Learning in token space

Despite token-space learning's clear advantages for multi-model life-long continual learning, today's techniques for actually creating these token-space representations are fairly limited and rudimentary.

Limitations of general-purpose models today

Current frontier models are insufficient for powering agents that can continually learn from their own experience. Most data today is centered on task-scoped sessions; although frontier models are highly adaptable through prompting, placing them in a stateful setting inevitably hits limitations over time (as shown in Recovery-Bench). Our research has shown that memory formation through general-purpose models is not capable enough to power long-term self-improvement:

Memories become generic and lossy after repeated refinements
Memories are overly specific rather than generalizable learnings
Memories fail to consistently adapt the behavior of agents across sessions with the same model, and across different models

These limitations are unsurprising, since the current paradigm of agents and RL environment frameworks does not optimize for cross-task, cross-session, and cross-model learning, or for large-scale learning across context.

Optimizing for the present versus the future

There is a fundamental tension between rewarding agents only for task completion (or task-level context management) and rewarding them for continual learning across many tasks. Completing a task and learning from it are often conflicting goals. Take coding agents as an example: there is a difference between successfully patching a single bug and understanding what parts of a large codebase are particularly bug-prone and likely to require additional testing.

LLMs and agents today are focused on the former goal of optimizing for task completion. Recent progress in model capability has largely stemmed from RL post-training on environments that generally assume statelessness. LLMs today are also not trained to think of themselves as having a “future,” to avoid self-preservation behavior that can be problematic from a safety perspective. As such, techniques for context management are focused on extending individual task horizons rather than cross-task learning. For example, long context (e.g. 1M or 10M token context windows) and methods for using context more efficiently during a task, like self-compaction, context folding, and Recursive Language Models (RLMs), help extend the task horizon for increasingly challenging coding tasks, but are not mechanisms for true continual learning.

Meta-learning: Learning to learn

The fundamental tension between learning individual tasks and learning to learn across tasks has been formalized in machine learning research as "meta-learning", with roots going back to the 1980s and 1990s. In meta-learning, the training objective is the act of learning itself - in essence, is it possible to train a machine to learn its own learning algorithm, such that it can adapt to new environments at test time?

Meta-learning has been studied under the lens of reinforcement learning as "meta-RL", where an agent is trained in a nested loop such that it learns to adapt to new environments at test-time. The actual learning at test time can happen through gradient updates or in a gradient-free manner. In either case, meta-RL optimizes the learning process itself, rather than optimizing purely for task reward.

Memory models: Learning to learn in token-space

We believe continual learning will be solved by creating agents that learn to learn in token-space via memory models, which are models specialized in the generation and curation of context. Memory models are optimized end-to-end through meta-RL to create durable token-space memories that let the agent learn across tasks and improve its future performance.

Standard RL versus memory-native RL

The reward structure in the training loop of memory models resembles that of meta-RL rather than the standard RL setting. Rather than using RL to optimize a model for task performance directly, we instead use RL to optimize a compound system where a specialized memory model is responsible for cross-task and cross-model learning. As in meta-RL, this is a nested learning problem: an inner loop in which the agent draws on its token-space memory to act across a series of tasks, and an outer loop in which RL optimizes the memory model's weights from downstream reward.

Standard RL trains an agent to maximize task reward directly: a task $t$ is given to a standard agent harness $A$ with model parameters $\theta$, producing experience $e$, outcome $o$, and reward $r$. Memory-native RL instead trains an agent to produce token-space memories that improve future task performance: the memory-native harness $L$ is conditioned on token-space memory $M$, memory model parameters $\mu$, and task $t$, producing a sequence of experiences $e_1, \ldots, e_N$ that update memory across an open-ended series of tasks.

Notation:

$t$: task
$A$: standard agent harness
$e$: experiences
$\theta$: general-purpose model parameters
$o$: outcome
$L$: memory-native agent harness
$r$: reward
$\mu$: memory model parameters
$M$: token-space memory

Figure: Standard RL trains an agent to maximize task reward directly. Memory-native RL instead trains an agent to produce token-space memories that, when conditioned on, maximize reward on future tasks: meta-learning in token space.
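One way to write the two objectives side by side, as a sketch under assumptions: the task index $i$, the initial memory $M_0$, the per-task memory update from $M_{i-1}$ to $M_i$, the finite horizon $N$, and the expectation over a task distribution are all introduced here for illustration and are not part of the original formulation.

$$
\text{Standard RL:}\quad \max_{\theta}\; \mathbb{E}_{t}\big[\, r \,\big], \qquad (e, o, r) \sim A_{\theta}(t)
$$

$$
\text{Memory-native RL:}\quad \max_{\mu}\; \mathbb{E}_{t_1, \ldots, t_N}\Big[\, \sum_{i=1}^{N} r_i \,\Big], \qquad (e_i, o_i, r_i, M_i) \sim L_{\mu}(M_{i-1}, t_i)
$$

In the second objective, the reward $r_i$ on task $t_i$ depends on memory written during tasks $t_1, \ldots, t_{i-1}$, so $\mu$ is credited for memories that pay off later rather than only for the outcome of the task that produced them.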

In the standard RL setting, learning happens primarily in weight space. In the memory-native RL setting, the optimization target includes the generation and curation of token-space memory such that future agents benefit from prior learnings.
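The nested structure can be illustrated with a toy simulation. This is a minimal sketch, not the training method described here: the "memory model" is reduced to a single hypothetical parameter `mu` (the probability of writing an observation to memory), the acting agent is a fixed rule, and the outer loop uses simple hill climbing as a gradient-free stand-in for RL. All names and the task setup are assumptions made for illustration.

```python
import random


def run_series(mu: float, rng: random.Random,
               n_tasks: int = 10, n_values: int = 5) -> float:
    """Inner loop: a frozen agent acts on a series of related tasks using memory M."""
    hidden = rng.randrange(1, n_values)  # convention shared by all tasks (never 0)
    memory = []                          # toy token-space memory M
    total = 0.0
    for _ in range(n_tasks):
        guess = memory[-1] if memory else 0      # act from memory, else a default guess
        total += 1.0 if guess == hidden else 0.0  # task reward r
        experience = hidden                       # feedback reveals the right answer
        if rng.random() < mu:                     # memory model decides whether to write
            memory.append(experience)
    return total / n_tasks


def evaluate(mu: float, seed: int = 0, n_series: int = 200) -> float:
    """Mean downstream reward of a memory policy over many task series."""
    rng = random.Random(seed)  # same seed for every candidate: common random numbers
    return sum(run_series(mu, rng) for _ in range(n_series)) / n_series


def train(steps: int = 50, seed: int = 0):
    """Outer loop: improve mu from downstream reward (hill climbing, not real RL)."""
    search_rng = random.Random(seed + 1)
    mu = 0.1
    best = evaluate(mu, seed)
    for _ in range(steps):
        candidate = min(1.0, max(0.0, mu + search_rng.gauss(0.0, 0.2)))
        score = evaluate(candidate, seed)
        if score > best:
            mu, best = candidate, score
    return mu, best


if __name__ == "__main__":
    mu, best = train()
    print(f"learned write probability mu={mu:.2f}, mean reward={best:.2f}")
```

The point of the toy is where the reward lands: writing to memory never helps the task that produced the experience, only later tasks in the series, so the outer loop can only improve `mu` by scoring whole series rather than individual tasks.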

Distillation of memory into the agent harness

The harness itself provides an additional surface for learning: the memory model can directly modify the configuration of the agent harness for additional control over future outcomes. Harnesses like Letta Code and Pi are designed to be self-modifying through mods and extensions, and allow modifying many components of the agent execution loop including toolsets, pre/post-tool hooks, permissions, and even executing arbitrary code at specific control points. Recent work on "meta-harnesses" has shown that harness self-modification can unlock significant performance improvements on standard benchmarks such as Terminal-Bench, and we believe that the harness should be viewed as a form of memory that should be learned and adapted over time through distillation of token-space memory into harness controls.
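As a rough illustration of what distilling memory into harness controls could look like, the sketch below turns plain-text memory notes into a blocked-tool list and a pre-tool hook. `ToyHarness`, `distill_memory_into_harness`, and the note phrasing are hypothetical and invented for this example; they are not the API of Letta Code, Pi, or any other real harness, and the rule-based parser merely stands in for a memory model.

```python
from typing import Callable, Dict, List, Set

ToolFn = Callable[[str], str]
PreHook = Callable[[str, str], str]  # (tool_name, arg) -> possibly rewritten arg


class ToyHarness:
    """Hypothetical agent harness with permissions and pre-tool hooks."""

    def __init__(self, tools: Dict[str, ToolFn]) -> None:
        self.tools = dict(tools)
        self.pre_hooks: List[PreHook] = []
        self.blocked: Set[str] = set()

    def call(self, name: str, arg: str) -> str:
        if name in self.blocked:          # permission check
            return f"blocked: {name}"
        for hook in self.pre_hooks:       # pre-tool hooks may rewrite the argument
            arg = hook(name, arg)
        return self.tools[name](arg)


def distill_memory_into_harness(harness: ToyHarness, notes: List[str]) -> None:
    """Convert token-space notes into harness controls (toy rule-based stand-in)."""
    for note in notes:
        if note.startswith("never use "):
            harness.blocked.add(note[len("never use "):])
        elif note.startswith("always pass "):
            flag, _, tool = note[len("always pass "):].partition(" to ")

            def hook(name: str, arg: str, flag: str = flag, tool: str = tool) -> str:
                # Append the flag only for the named tool and only if it is missing.
                if name == tool and flag not in arg:
                    return f"{arg} {flag}".strip()
                return arg

            harness.pre_hooks.append(hook)


if __name__ == "__main__":
    harness = ToyHarness({"deploy": lambda a: f"deploy {a}", "rm": lambda a: f"rm {a}"})
    distill_memory_into_harness(
        harness, ["never use rm", "always pass --dry-run to deploy"]
    )
    print(harness.call("rm", "build/"))
    print(harness.call("deploy", "prod"))
```

Once a lesson is encoded as a hook or permission, it is enforced on every call instead of relying on the model to recall and follow a note in context.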

Distillation of memory into the model weights

Because memory models update an agent's memory in token space (e.g. in context repositories or AGENTS.md files), they can be easily integrated with closed model APIs where model weights are unavailable. However, if the weights of the underlying models are accessible, it is possible to combine continual learning in token space and continual learning in weight space through distillation of token-space memories into model parameters. Distillation of memories from tokens to weights can provide additional gains in personalization and efficiency.
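Because the memory lives in plain text, the same file can be placed in front of any model that accepts a prompt. The sketch below shows this under simplifying assumptions: `Model` is a hypothetical text-in, text-out callable standing in for any closed or open model API, and the prompt layout and helper names are invented for illustration.

```python
from pathlib import Path
from typing import Callable

Model = Callable[[str], str]  # hypothetical: any text-in, text-out model API


def load_memory(path: Path) -> str:
    """Read the token-space memory file, or return empty memory if it is missing."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_memory(path: Path, note: str) -> None:
    """Append one durable note to the memory file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


def build_prompt(memory: str, task: str) -> str:
    """Place memory ahead of the task so the model can use it in context."""
    return f"# Memory\n{memory}\n# Task\n{task}\n"


def run_task(model: Model, memory_path: Path, task: str) -> str:
    """Run a task with whichever model is supplied; the memory is unchanged."""
    return model(build_prompt(load_memory(memory_path), task))
```

Swapping the underlying model changes only the `model` argument: the memory file and the prompt built from it stay the same, with no access to weights required.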

Scaling compute on context

Our recent work on sleep-time compute introduced the notion of scaling compute on context: allowing models to think offline about contexts before queries are presented. Memory models leverage this new scaling direction by training on it as an optimization objective: memory models are trained to generate memories during sleep-time (aka "agent dreaming") which improve the performance of the agent at test-time.

The more powerful memory models become, the greater the gains from scaling compute on context: sleep-time compute forms the backbone of the self-improvement loop, and the capabilities of the agent monotonically improve with additional memory model inference. Eventually, we expect the vast majority of the world's inference load will be driven by sleep-time compute.

Building machines that learn

Our mission at Letta is to build machines that learn: agents capable of meaningful learning and adaptation over infinite time horizons, sharing context and memory across generations of different models and unique architectures.

We believe that scaling inference for context learning is the next major scaling direction in the modern AI era, and that memory models will be the key to unlocking performant and economical continual learning at scale. The only way to unlock true recursive self-improvement is to make our agents capable of remembering and learning through persistent representations that live outside any model's weights.

We're excited to share more on our work on memory models in the coming weeks. If you're a researcher or engineer interested in agent memory and life-long continual learning, we'd love for you to join us.