How Bilt Built a Million-Agent Recommendation System with Letta

What Letta allowed us to do is rapidly iterate on this system. We changed our design multiple times throughout the process without having to throw everything away with each new iteration.
Personalizing with AI at Scale
From basic scoring algorithms to millions of AI agents serving specific, personalized recommendations to end users, Bilt transformed their neighborhood commerce platform by building an intelligent recommendation system with Letta. Their journey reveals how memory-augmented agents can deliver truly personalized experiences at unprecedented scale.

The Challenge: From Simple Scores to True Personalization
"We wanted to use generative AI for this. The problem was figuring out how to take all the transaction data we have and feed that into a large language model in a way that would be cost effective and performant."
Kosta Krauth, CTO at Bilt, faced a challenge that became familiar to many companies in 2024: how to leverage the power of generative AI for personalization without breaking the bank or sacrificing performance.
Bilt, which describes itself as a "housing and neighborhood commerce network," has built something unique in the market. Starting with a credit card product that rewards users for paying rent, it evolved into a comprehensive ecosystem connecting residents with local merchants, property managers, and curated neighborhood experiences.
"We realized that if we connect partners in the Bilt network—buildings across the U.S., local merchants, and renters—we can combine them into a really rich and local neighborhood-first program," Krauth explained.
But with such a rich ecosystem came a natural problem: personalization. As their network of merchants and offers grew, they needed to figure out what was relevant to whom. Their initial approach used a sophisticated scoring system based on user engagement patterns. "We had a model that we built ourselves that was based around awareness, engagement, and interest," Krauth said. "We analyzed user engagement with the platform and calculated scores based on things that you saw, you interacted with or actually engaged with."
While this system provided a solid foundation, the team recognized an opportunity to leverage the creative capabilities of generative AI for even more nuanced personalization. Andrew Fitz, lead engineer for the AI personalization project at Bilt, said, "Looking at the advancements and cost reductions in LLMs, we recognized the opportunity to deliver personalized recommendations at scale."
The team wanted to move beyond simple scoring to truly creative, contextualized recommendations that could understand the nuances of how users interact with their platform and neighborhood merchants.
Exploring Fine-Tuning: Learning What Doesn't Work
Before discovering Letta, the Bilt team thoroughly explored fine-tuning as a potential solution for contextualizing models with their data.
"We spent several months fine-tuning a model with the knowledge of our company, the types of benefits we offer, and campaigns," Krauth explained. "[We wanted to] create something that would be very capable at contextualizing different offers around our company knowledge."
However, this approach revealed important limitations. "The fine-tuned model was very matter-of-fact and lost its creativity," Krauth said. "We wanted deeper inferences and more creative takes on how these benefits are all related to each other."
This experience clarified what they actually needed: a system that could intelligently process user transaction patterns and platform engagement while maintaining the creative capabilities that make LLMs so powerful for personalization.
Building with Memory: The Letta Solution
The breakthrough came when Krauth remembered reading about MemGPT on Hacker News around the time Letta was launching as a company. "It just popped into my head while we were discussing handling long context windows and how to get them under control. I was like, 'Oh, I heard about this thing; let's have a look.'"
Letta's approach to memory management helped solve their core challenge: how to feed rich transaction and engagement data into language models cost-effectively while maintaining performance.
"Memory blocks are at the core of what drives the system and what gets the most value out of Letta," Fitz explained. "Because we need to optimize cost and speed, we use Anthropic's batch capability on memory agents to take relatively unstructured data and write that into shared memory blocks."

Their architecture became elegantly simple yet powerful: use strong models asynchronously in batch mode to create rich memory summaries, then use these memory blocks with faster, cheaper models for real-time inference.
"The system extracts the bits and pieces and finds trends over time and how those shift," Fitz said. "Then we're able to use that memory block on other agents with faster models, cheaper models to do real-time inference."
This approach allowed them to get the best of both worlds: the deep understanding of powerful models and the speed and cost-effectiveness needed for real-time recommendations.
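The two-tier pattern can be sketched in a few lines of Python. This is a minimal, illustrative sketch, not Bilt's actual code or the Letta SDK: the two model functions are stubs standing in for Anthropic's batch API (offline) and a faster model (real-time), and the `MemoryBlock` class mimics a shared memory block.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the two model tiers. In Bilt's system the
# offline step runs on Anthropic's batch API; here both are stubbed.
def strong_model_summarize(transactions: list[str]) -> str:
    # Offline: a capable model distills raw, unstructured transaction
    # data into a compact summary of trends over time.
    return "Frequent weekday fast-casual lunches; weekend fine dining."

def fast_model_recommend(prompt: str) -> str:
    # Online: a cheaper, faster model answers using the memory block
    # already present in its context window.
    return "Recommended: a casual lunch spot near the office."

@dataclass
class MemoryBlock:
    label: str
    value: str = ""

def batch_refresh(block: MemoryBlock, transactions: list[str]) -> None:
    """Asynchronous batch step: write a rich summary into shared memory."""
    block.value = strong_model_summarize(transactions)

def recommend(block: MemoryBlock, request: str) -> str:
    """Real-time step: a cheap model reads the shared memory block."""
    prompt = f"[{block.label}]\n{block.value}\n\nUser request: {request}"
    return fast_model_recommend(prompt)

dining = MemoryBlock(label="dining_preferences")
batch_refresh(dining, ["$14 burrito 12:10pm", "$180 omakase Sat 8pm"])
print(recommend(dining, "lunch ideas"))
```

The key design point is the decoupling: the expensive summarization runs asynchronously and amortizes across many cheap real-time calls that all read the same block.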
The Power of Contextual Understanding
What set their Letta-powered system apart from traditional recommendation engines was its ability to understand context and nuance in ways that simple collaborative filtering couldn't match.
"[An LLM can understand] contextualized transactions better than traditional approaches," Fitz explained. "It can notice patterns and understand that I get certain fast casual lunches a lot. But it also recognizes that this is likely a matter of convenience at certain times of the day."
This contextual understanding enabled them to make sophisticated distinctions. "That's what allows us to power recommendations where we can choose what type of recommendations we want to show. Whether that's date night recommendations, casual lunch recommendations, we're able to leverage that data to surface the most relevant options for each placement."
The system could also provide explanations for its recommendations—something that proved valuable both internally and for users. "We do give people descriptions of why we're recommending this particular restaurant. These descriptions can be quite specific. We're trying to make sure that they're not overly personalized and get that balance of explaining why you would really like this restaurant, but without going into a needless amount of detail."


Iterating at Scale: The Development Journey
One of Letta's most valuable features for Bilt turned out to be the flexibility to rapidly iterate on their architecture. "[We started with] a single agent per benefit per user. Then we scaled to multiple agents per user, which added up to tens of millions of agents across our user base," Krauth explained. "Then we tried a supervisor approach, then moved to category-based agents—a dining agent and a supervisor and a re-ranker."
This kind of rapid architectural experimentation would have been prohibitive with other systems. "What Letta allowed us to do is rapidly iterate on this system. We didn't have to always rebuild everything from the ground up. We changed our design multiple times throughout the process without having to throw everything away with each new iteration."
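The final shape Krauth describes—category agents feeding a supervisor and a re-ranker—can be sketched as a simple routing pipeline. All names below are illustrative stand-ins, not the Letta API; in practice each function would be a Letta agent reading the user's shared memory blocks.

```python
from typing import Callable

# Each category agent reads the user's memory and proposes candidates
# for its domain (stubbed here with fixed lists).
def dining_agent(user_memory: str) -> list[str]:
    return ["Taqueria Norte", "Omakase Ten"]

def travel_agent(user_memory: str) -> list[str]:
    return ["Weekend rail pass"]

CATEGORY_AGENTS: dict[str, Callable[[str], list[str]]] = {
    "dining": dining_agent,
    "travel": travel_agent,
}

def supervisor(category: str, user_memory: str) -> list[str]:
    """Route the request to the appropriate category agent."""
    return CATEGORY_AGENTS[category](user_memory)

def rerank(candidates: list[str], user_memory: str) -> list[str]:
    # Stub: a real re-ranker would score candidates against the memory;
    # here we just sort alphabetically for determinism.
    return sorted(candidates)

memory = "Prefers reasonably priced omakase; weekday casual lunches."
print(rerank(supervisor("dining", memory), memory))
```

Because each stage is a separate agent, swapping a single-agent design for this supervisor design means rewiring the routing layer, not rebuilding the agents or their memory.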
The team also found value in Letta's accessibility to non-technical team members through the ADE (Agent Development Environment). "We had our product managers, our team leads, general managers, business people—they were all in the system, able to update the behavior of agents," Krauth said. "We didn’t need an engineer to tweak weights or modify code to change the behaviors. Everyone was in the Letta system, including our CEO, managing the prompts to get the desired behaviors."
Results: Real Users, Real Recommendations
When Bilt rolled out their new recommendation system, the improvement was immediately noticeable to internal testers. "We [got] a lot of feedback [saying], 'Wow, yeah, that's me' [about] what was in the memory blocks for travel preferences, especially from our loyalty team," Andrew Fitz recalled.

Krauth had his own positive experience: "The system recommended an omakase place that was reasonably priced, a rarity in NYC. Normally my wife and I would have been suspicious, but because of the recommendation we gave it a shot and we loved it!"
From a business perspective, the new system positions Bilt to better serve all parties in their ecosystem. Users get more relevant recommendations, merchants can create more effective campaigns, and property managers can offer more appealing lease incentives using Bilt's platform instead of simple (and often more costly) cash incentives.
Scale and Future Vision
Perhaps most impressively, Bilt has achieved this level of personalization at scale. "We already have over a million agents each tailored to different aspects of neighborhood benefits," Fitz revealed.
Despite the scale and complexity of the agents, Letta added minimal overhead beyond LLM inference itself. "All the latency in the end-to-end system is 99+ percent just inference time," Fitz explained. "Scaling from the perspective of agents is no longer a concern."
Looking ahead, Bilt plans to expand their AI capabilities to help merchants create better campaigns and assist property managers in creating personalized lease incentive bundles. They're also exploring how to combine their LLM-based approach with traditional recommendation systems to handle edge cases and new users more effectively.
As Fitz put it, "The models just keep getting better and cheaper. Even now, we're using completely different models than we started with a month ago. The flexibility of Letta means it takes me one minute to update everything and now all of a sudden all of our prod users are using a completely different model."
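The fleet-wide swap Fitz describes works because each agent carries a model handle separate from its prompts and memory. The sketch below is hypothetical—the `Agent` record and `swap_model` helper are illustrative, not the Letta SDK—but it shows why the update is a one-line bulk operation rather than a redeploy.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    agent_id: str
    model: str  # model handle, e.g. "provider/model-name"

# A tiny stand-in fleet; Bilt's spans over a million agents.
fleet = [Agent(f"user-{i}-dining", "old-provider/old-model") for i in range(3)]

def swap_model(agents: list[Agent], new_model: str) -> int:
    """Point every agent at a new model; prompts and memory are untouched."""
    for agent in agents:
        agent.model = new_model
    return len(agents)

updated = swap_model(fleet, "new-provider/new-model")
```

Because memory blocks and prompts live outside the model handle, the new model immediately inherits all accumulated personalization.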
Curious to learn more?
Try Letta today, or request a demo for your team.