The brownie recipe problem
Instacart CTO Anirban Kundu has a name for his company's LLM challenge: the brownie recipe problem. It's not enough for a model to understand "I want to make brownies." The system needs to know what's in stock at the user's local market, whether they prefer organic eggs, what substitutes work if the first choice is unavailable, and whether ice cream will melt before delivery arrives.
All of this context must be processed in under a second. Any slower and users bail.
Chunking beats context overload
Instacart's solution: split processing into stages. A large foundation model handles intent and product categorization. Then specialized small language models tackle catalog context (which products work together, what substitutes make sense) and semantic understanding (what counts as a healthy snack for an 8-year-old).
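A minimal sketch of that staged split, with the model calls stubbed out. The function names and toy catalog are illustrative, not Instacart's internals:

```python
# Illustrative staging: one big model for intent, small models downstream.

FAKE_CATALOG = {
    "eggs": [
        {"name": "organic eggs", "in_stock": True, "organic": True},
        {"name": "standard eggs", "in_stock": True, "organic": False},
    ],
    "cocoa": [{"name": "cocoa powder", "in_stock": False, "organic": False}],
}

def classify_intent(query: str) -> list[str]:
    """Stage 1: a large foundation model parses intent into needed items."""
    return ["eggs", "cocoa"]  # stubbed result for the brownie query

def catalog_context(items: list[str]) -> dict[str, list[dict]]:
    """Stage 2a: a small catalog model finds in-stock options per item."""
    return {i: [p for p in FAKE_CATALOG.get(i, []) if p["in_stock"]]
            for i in items}

def semantic_rank(options: dict[str, list[dict]], prefs: dict) -> dict:
    """Stage 2b: a small semantic model applies user preferences."""
    def score(p: dict) -> bool:
        return p["organic"] == prefs.get("organic", False)
    return {i: max(opts, key=score) if opts else None
            for i, opts in options.items()}

items = classify_intent("I want to make brownies")
picks = semantic_rank(catalog_context(items), {"organic": True})
print(picks)  # eggs -> organic eggs; cocoa -> None (needs a substitute)
```

Each stage sees only the context it needs, which is what keeps the small models fast.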
This matters because loading a user's entire purchase history into a single reasoning model creates unmanageable bloat. The chunking approach keeps each model focused and fast.
The catalog context layer handles the "over double digit" percentage of cases where a requested product isn't available locally. The system must understand substitutions at multiple levels of detail, then factor in logistics like delivery time for items that spoil quickly.
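A sketch of that substitution logic; the fields, weights, and the transit-time cutoff are invented for illustration:

```python
# Hypothetical substitution scoring: match at several levels of detail,
# then veto anything that can't survive the delivery window.

def substitute_score(candidate: dict, requested: dict, delivery_min: int) -> float:
    score = 0.0
    # Multiple detail levels: brand > product line > category.
    if candidate["brand"] == requested["brand"]:
        score += 3.0
    if candidate["product_line"] == requested["product_line"]:
        score += 2.0
    if candidate["category"] == requested["category"]:
        score += 1.0
    # Logistics: a long delivery window rules out quick-spoiling items.
    if candidate["perishable"] and delivery_min > candidate["max_transit_min"]:
        return float("-inf")  # ice cream can't survive the trip
    return score

def best_substitute(requested: dict, candidates: list[dict], delivery_min: int):
    if not candidates:
        return None
    score, best = max(
        (substitute_score(c, requested, delivery_min), c) for c in candidates
    )
    return best if score > float("-inf") else None
```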
Microagents over monoliths
Instacart is experimenting with AI agents but found that a single agent handling multiple tasks becomes unwieldy. Instead, they're deploying microagents, each focused on specific tasks like payment systems or integrations with third-party point-of-sale platforms.
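What that looks like in rough terms, assuming a simple task-keyed router; the agent names and dispatch scheme are illustrative:

```python
# Task-scoped microagents behind a minimal router. Each agent owns exactly
# one job, so no single agent accumulates unrelated responsibilities.

from typing import Callable

AGENTS: dict[str, Callable[[dict], dict]] = {}

def microagent(task: str):
    """Register a handler that owns exactly one task."""
    def register(fn: Callable[[dict], dict]):
        AGENTS[task] = fn
        return fn
    return register

@microagent("payment")
def payment_agent(request: dict) -> dict:
    # Only payment logic lives here; no catalog or POS concerns.
    return {"status": "authorized", "amount": request["amount"]}

@microagent("pos_sync")
def pos_agent(request: dict) -> dict:
    # Only third-party point-of-sale integration lives here.
    return {"status": "synced", "store": request["store_id"]}

def dispatch(task: str, request: dict) -> dict:
    return AGENTS[task](request)  # unknown tasks fail loudly, by design
```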
The company has integrated Anthropic's Model Context Protocol and Google's Universal Commerce Protocol to standardize connections between AI models and merchant systems. The Unix philosophy applies: smaller, focused tools beat monolithic systems.
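The point of these protocols is that a merchant describes a capability once, in a machine-readable manifest, and any model can consume it. A simplified, MCP-flavored tool descriptor gives the flavor; this is not the exact wire format, and the capability shown is hypothetical:

```python
# Simplified tool descriptor in the spirit of MCP (see the spec for the
# real schema). "lookup_inventory" is an invented merchant capability.

lookup_inventory = {
    "name": "lookup_inventory",
    "description": "Check stock for a product at one store.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "store_id": {"type": "string"},
            "product_id": {"type": "string"},
        },
        "required": ["store_id", "product_id"],
    },
}
```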
The real work isn't integration. It's handling failure modes and latency. Different merchant systems behave differently, update at different intervals, and have varying reliability. Kundu's team spends two-thirds of their time fixing error cases, not building features.
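A sketch of the defensive wrapper that work implies, with invented timeout and backoff numbers:

```python
import time

def call_merchant(fetch, timeout_s: float, retries: int = 2):
    """Call an unreliable merchant endpoint, retrying with capped backoff."""
    for attempt in range(retries + 1):
        try:
            return fetch(timeout=timeout_s)
        except (TimeoutError, ConnectionError):
            time.sleep(min(0.05 * 2 ** attempt, 0.2))  # capped backoff
    return None  # caller falls back to cached data: stale beats late
```

The interesting choice is the last line: in a sub-second system, returning stale cached data on time beats returning fresh data late.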
What this means in practice
For enterprise teams building real-time AI systems, Instacart's architecture offers a blueprint: use foundation models for intent, specialized SLMs for domain context, and microagents for integration. The trade-off between context richness and response time is real. Loading more context improves accuracy but kills latency.
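One way to picture that trade-off, with invented numbers: rank context by relevance and trim until it fits the budget, so the model only sees what pays its way.

```python
# Invented example: purchase-history chunks scored against the current
# query, trimmed to a budget (token counts approximated by word counts).

def trim_context(chunks: list[tuple[float, str]], max_tokens: int) -> list[str]:
    """Keep the most relevant chunks that fit the budget; drop the rest."""
    kept, used = [], 0
    for score, text in sorted(chunks, reverse=True):
        cost = len(text.split())
        if used + cost > max_tokens:
            continue
        kept.append(text)
        used += cost
    return kept

history = [(0.9, "bought organic eggs weekly"),
           (0.2, "bought charcoal in June"),
           (0.7, "prefers dark chocolate")]
print(trim_context(history, max_tokens=8))  # charcoal doesn't make the cut
```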
The brownie recipe problem isn't unique to grocery delivery. Any system juggling personalization, real-time inventory, and sub-second responses faces similar constraints. History suggests the answer isn't bigger models with infinite context windows. It's better chunking strategies and focused agents that know their lanes.