Shepherding Electric Minds: My AI Dream Stack
Seeking A Newfangled "LAMP" Stack for AI Engineers
When I started building my first AI-powered product in 2022 (back in the days of GPT-3, non-turbo), it felt like a fresh start—a universe of possibilities for how to build products. Now, two years into building generative AI applications, I'm seeing ideas converge into common patterns across most projects. It reminds me of my early web development days, when the LAMP stack (Linux, Apache, MySQL, PHP) became the go-to setup for launching projects quickly. In my role as a core engineer at PSL, responsible for rapidly spinning up proof-of-concepts and evolving them into launched products, I've been craving an equivalent AI stack—something to streamline the process and avoid starting from scratch every time.
This post isn't quite a repeatable tech stack yet—I'm not ready to settle on a single architecture while AI technology is evolving so rapidly. Instead, it's a dive into what my dream AI stack could be, combining the best patterns and components I've encountered or helped create with my PSL colleagues over the past few years.
The 5 AI Projects it Took to Get Here
The journey to this dream stack has been shaped by five distinct AI startups I've worked on. Each of these projects built on the concepts from the last, adding new insights and pushing boundaries:
Blueprint - A complex series of prompt chains that tracked what developers were working on.
Enzzo - A copilot for designing new hardware products. (I only helped with the base prompt)
Clara - An SMS-based companion for seniors that anticipated their personalized needs.
Glue - A copilot for engineering teams developing test plans for product manufacturing verification.
Zucca - A copilot for food manufacturing, managing development and cost analysis for large-scale food and beverage production.
Each experience brought new ideas—sometimes evolving previous concepts, sometimes diving into completely uncharted territory. Some ideas were just-in-time experiments aimed at seeing what was possible, while others refined or extended what was already out there. Often, ideas I thought were uniquely ours ended up surfacing elsewhere in open source or at big AI companies.
It’s been exciting and energizing to be at the forefront of a growing current. While it's tempting to keep new ideas under wraps, I wanted to share a cohesive list to contribute to a larger conversation about emerging patterns. This isn't about RAG, chains, indexes, or vector stores—these are second-order patterns built on top of those components. They're still largely prototypes and could use more tuning, but they represent the building blocks of what the future might look like.
Overarching Principles
The principles here (discussed in more detail in this post) have brought me the most success in building useful products, though there are certainly other ways to do it. My journey in AI engineering is probably similar to many others—filled with common lessons—but maybe sharing my story will explain my way of thinking.
About two years ago, while trying to push an idea as far as I could in the playground without having to move to code, I wrote a prompt that became the basis for most of the more complex applications I've written. Reading a bit like a script for a call center employee, it defined a job role, set up a scenario with distinct conversation phases and data gathering, and gave the model an overarching goal. This prompt eliminated the need for workflow logic and decision trees, letting the AI figure out the best way to achieve the goal. For the first time, I built something that worked better than planned—it did more than I'd coded it to do. I was hooked. Since then, I've designed AI products by giving them goals, job descriptions, creativity, and the ability to take notes for improvement.
I found it easiest to think of an AI app like hiring someone and giving them the tools and instructions to do their job. When stuck, I'd ask, "What would a person do to solve this problem?" and wire it up accordingly. Sometimes I'd put the AI in "developer mode," and ask for its help designing improvements. Personifying LLMs has its challenges, but giving them detailed roles has unlocked a lot of potential for me.
In short, I think of an LLM as a "thought CPU" within a larger, goal-driven architecture. I provide it tools, best practices, and inputs/outputs to make use of. Or think of it like an intern at an imaginary desk—with a job description, a notepad, Slack, Google Docs, and a set of apps to get the work done.
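Before diving in, here's a condensed sketch of what one of those job-description prompts can look like; the company, role, and phases are invented for illustration rather than taken from any of the projects above:

```
You are Riley, an intake specialist for Acme Home Services.
Your goal: gather enough detail to schedule the right technician
for each customer who writes in.

Work through these phases in whatever order best serves the customer:
1. Greeting - learn who you're talking to and build rapport.
2. Problem discovery - what's broken, how urgent, any useful details.
3. Logistics - address, availability windows, access instructions.
4. Confirmation - summarize, confirm, and set expectations.

Keep notes on anything that will help in future conversations.
If you already have what you need, skip ahead; if the customer
changes topics, follow them and circle back later.
```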
Alright, enough storytelling. Let's dive into all the fun components we can use.
Copilot-As-Controller
This is my dream stack, so let's go all-in on the AI-first concept. Unlike traditional MVC (Model View Controller) architecture, where the controller is explicitly coded to handle user inputs and create logical implementations of business rules, the copilot-as-controller approach uses AI prompts to define and manage business logic. This means changes to the rules can be implemented simply by updating the prompts—without the need for complex code modifications.
The AI-driven controller can also creatively solve problems and adapt the experience to user-specific needs, making it a far more dynamic solution. By writing policy-based or job-description prompts, we can outline all of the business logic in the prompts that power a copilot acting as the controller (in the MVC sense) for the entire app, and the AI can solve problems creatively while staying within the defined business rules, even drawing on customer-specific materials for context.
From a UI/UX perspective, the interface no longer needs to cater to every user persona at once. The copilot can display or hide UI elements based on the user's role, and the UI doesn't need to be tightly coupled to the controller. When UI actions send messages like "priority changed to high" directly to the copilot, the AI can handle updating the relevant models, trigger functions, and refresh the UI. This flexibility allows for multiple interfaces—Slack, Teams, Email, even phone calls. Imagine calling your app and telling it to update a project deadline, just like editing a field in a web app.
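Here's a minimal sketch of what that controller loop can look like, assuming an OpenAI-style chat API with function calling; the business rules, the tool schema, and the apply_update helper are placeholders for illustration, not the implementation from any of the projects above:

```python
import json
from openai import OpenAI

client = OpenAI()

# Business rules live in the prompt, not in code. (Prompt text is illustrative.)
CONTROLLER_PROMPT = """You are the controller for a project-tracking app.
Business rules:
- Only admins may set priorities above 'high'.
- Every priority change must be logged with a reason.
Use the provided tools to update the data model and refresh the UI."""

# Tools the copilot may call; the app implements the actual side effects.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "update_task",
        "description": "Update a field on a task and refresh any views that show it",
        "parameters": {
            "type": "object",
            "properties": {
                "task_id": {"type": "string"},
                "field": {"type": "string"},
                "value": {"type": "string"},
                "reason": {"type": "string"},
            },
            "required": ["task_id", "field", "value"],
        },
    },
}]

def handle_ui_event(event_text: str, user_role: str):
    """UI actions arrive as plain messages, e.g. 'priority changed to high on TASK-42'."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"{CONTROLLER_PROMPT}\nCurrent user role: {user_role}"},
            {"role": "user", "content": event_text},
        ],
        tools=TOOLS,
    )
    message = response.choices[0].message
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)
        apply_update(**args)  # hypothetical app-side function that persists the change
    return message.content
```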
Token costs can be high with this approach, but this is our dream stack, after all! And there are optimization strategies that help reduce this, including using smaller models to handle simpler aspects of the application.
Multibrain with Tools
If you're thinking, "All the business logic in a single prompt? That's a lot!"—you're right. LLMs currently can't juggle too many complex concepts at once without issues; the more concepts they juggle, the more likely mistakes become. That's where Multibrain comes in—a pattern I developed to break down business rules into manageable sub-prompts.
For Zucca, the copilot manages aspects like requirements gathering, product development, and cost analysis. Each area gets its own "brain"—a specialized prompt acting as an expert in that field, with access to specific tools like calculators, code interpreters, or database lookups. A "router" prompt decides which brain to engage based on the conversation and the data model state. It can even route messages to multiple brains and compile their answers.
This approach lets us zoom in and out of different product aspects as needed while maintaining a big-picture view. It ensures no conflicts in business rules and allows the copilot to surface creative solutions or write custom code to solve unique customer problems.
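A rough sketch of the router-plus-brains wiring, again against an OpenAI-style chat API; the brain prompts are trimmed placeholders from a Zucca-like domain, and the per-brain tools are omitted for brevity:

```python
from openai import OpenAI

client = OpenAI()

# Each "brain" is a specialized prompt (its tools are omitted here).
BRAINS = {
    "requirements": "You are an expert at gathering food-product requirements...",
    "development": "You are an expert food scientist guiding product development...",
    "costing": "You are an expert at cost analysis for large-scale food production...",
}

ROUTER_PROMPT = (
    "You are a router. Given the data-model state and the latest message, reply with a "
    "comma-separated list of the brains to engage, chosen from: "
    + ", ".join(BRAINS) + ". Reply with brain names only."
)

def ask(system_prompt: str, user_content: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": system_prompt},
                  {"role": "user", "content": user_content}],
    )
    return response.choices[0].message.content

def multibrain(user_message: str, state_summary: str) -> str:
    # 1. The router picks which experts should weigh in.
    routing = ask(ROUTER_PROMPT, f"State: {state_summary}\nMessage: {user_message}")
    selected = [name.strip() for name in routing.split(",") if name.strip() in BRAINS]

    # 2. Each selected brain answers within its own specialty.
    answers = {name: ask(BRAINS[name], user_message) for name in selected}

    # 3. A final pass compiles the expert answers into one reply.
    return ask(
        "Combine these expert answers into a single, consistent reply for the user.",
        "\n\n".join(f"{name}: {text}" for name, text in answers.items()),
    )
```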
Self-healing Context Window, Memory, and Personalization
To make your application copilot-driven and message-based, you need to manage the context window efficiently. I use a combination of truncation, summarization, and Retrieval-Augmented Generation (RAG). While Langchain and similar frameworks provide tools to handle this, I've found the following structure works best for my needs.
I break messages into sessions—starting when a user picks up an interaction with the copilot and ending after a set period of inactivity. Messages from previous sessions are summarized, while messages in the current session appear in full:
--- Window Top ---
System prompt
Context and state variables
Personalization notes
Summary of previous session
Messages from current session (user, assistant)
RAG recall from previous sessions
Hints or reminders of current goals
--- Window Bottom ---
This setup works like a mix of short-term and long-term memory—much like how our brains remember immediate details while keeping the bigger picture in mind. The short-term memory handles current context, while RAG acts as long-term memory to bring back details lost in summarization. Hints or reminders at the very bottom keep the copilot aware of important rules or unfinished tasks.
A key feature is the LLM's ability to record personalized notes, similar to MemGPT or how ChatGPT stores memory. This helps the copilot adapt to user preferences and capture customer-specific styles or guidelines. Personalized context really unlocks the most creative solutions when the LLM can customize its logic to best meet individual needs.
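As a rough illustration, here's how that window might get assembled in code; the section order mirrors the layout above, and the summarization and RAG retrieval that produce these inputs are assumed to happen elsewhere:

```python
def build_context_window(
    system_prompt: str,
    state: dict,
    personalization_notes: str,
    previous_session_summary: str,
    current_session_messages: list[dict],  # [{"role": "user" or "assistant", "content": ...}]
    rag_recall: list[str],                 # snippets retrieved from earlier sessions
    goal_hints: list[str],
) -> list[dict]:
    """Assemble the context window top-to-bottom in the order described above."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": f"Context and state: {state}"},
        {"role": "system", "content": f"Personalization notes:\n{personalization_notes}"},
        {"role": "system", "content": f"Summary of previous session:\n{previous_session_summary}"},
    ]
    messages.extend(current_session_messages)
    if rag_recall:
        messages.append({"role": "system",
                         "content": "Recalled from earlier sessions:\n" + "\n".join(rag_recall)})
    if goal_hints:
        messages.append({"role": "system",
                         "content": "Reminders:\n" + "\n".join(goal_hints)})
    return messages
```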
Document Storage and Research Agent
AI applications that work like employees are most effective when they have access to key documents—examples, style guides, templates, or references needed to perform their functions. They may also need to create deliverables or take notes. I prefer using platforms like Google Drive or Office 365 to store these shared documents, accessible to both people and the AI.
For instance, if a copilot's notes are stored in a Google Doc, it creates a human-readable preference system that anyone can update or add to. This requires setting up a multi-modal RAG index for documents, enabling the AI controller to access relevant info as needed. Depending on the application, adding an agent that can dig through documents or even search the web for specific information makes the setup even more powerful.
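A minimal sketch of the document side, assuming a local Chroma collection stands in for the RAG index; pulling files out of Google Drive or Office 365 (and any multi-modal handling of images or spreadsheets) is left out, and the fixed-size chunking is deliberately naive:

```python
import chromadb

# A local vector store stands in for whatever RAG index you prefer.
chroma = chromadb.Client()
doc_index = chroma.create_collection("shared_documents")

def index_document(doc_id: str, title: str, text: str) -> None:
    """Index a document pulled from shared storage (the Drive/Office 365 fetch is not shown)."""
    # Naive fixed-size chunking; swap in whatever chunking strategy fits your documents.
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
    doc_index.add(
        ids=[f"{doc_id}-{n}" for n in range(len(chunks))],
        documents=chunks,
        metadatas=[{"doc_id": doc_id, "title": title}] * len(chunks),
    )

def research(question: str, n_results: int = 4) -> str:
    """Hand the controller the most relevant snippets for a question."""
    results = doc_index.query(query_texts=[question], n_results=n_results)
    return "\n\n".join(results["documents"][0])
```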
Thoughtfulness Engine
Acting as a kind of autonomously creative run-loop, the thoughtfulness engine was developed when I was trying to make Clara more proactive and able to anticipate when a user might need help. Clara began as a series of check-ins triggered by cron jobs, but that wasn't very intelligent. Sometimes users would update Clara on their own and invalidate the need for a future check-in, so I needed a system that could evaluate the current state and decide if a check-in was still necessary. This led to the creation of a dynamic intelligent run loop:
1. Wake up, check the data model, message history, calendars, etc.
2. Determine required actions and brainstorm additional ways to be helpful.
3. Decide whether to act immediately or defer.
4. Perform actions or send messages as needed.
5. Schedule the next wake-up time to repeat the loop.
The challenge here is ensuring this loop runs without errors or bad data causing issues, while also not being overly aggressive about taking actions. When tuned properly, it becomes a powerful tool for staying current on real-time updates and proactively spotting issues. You'll need to build thoughtful halt-prevention mechanisms, especially since LLMs aren't good at relative time, but be careful—avoid creating anything that uses tools in unconstrained ways just to keep the loop running.
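Here's a stripped-down sketch of that loop; load_state, ask_llm_for_plan, and perform are hypothetical helpers, and the clamped sleep window is one way to keep bad relative-time math from stalling the loop or letting it run wild:

```python
import time
from datetime import datetime, timedelta

MIN_SLEEP = timedelta(minutes=15)   # never spin faster than this
MAX_SLEEP = timedelta(hours=24)     # never go silent longer than this

def thoughtfulness_loop():
    while True:
        # 1. Wake up and gather the current picture of the world.
        state = load_state()              # hypothetical: data model, message history, calendars

        # 2-3. Ask the LLM what, if anything, is worth doing right now, and what to defer.
        plan = ask_llm_for_plan(state)    # hypothetical: returns approved actions and a next check-in

        # 4. Perform only the actions the plan explicitly approved.
        for action in plan.actions_to_take_now:
            perform(action)               # hypothetical: send a message, update a record, etc.

        # 5. Schedule the next wake-up, clamped so bad time math can't break the loop.
        requested = plan.next_wakeup - datetime.utcnow()
        sleep_for = min(max(requested, MIN_SLEEP), MAX_SLEEP)
        time.sleep(sleep_for.total_seconds())
```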
Generative UI
Full credit to Ty Fiero for the core of this idea. The more we know about the user, the more we can tailor the UI to their needs. For example, a head of customer success will care about different things than a VP of engineering or a head of product. With the copilot-as-controller model, the AI can generate custom components specific to each user's preferences.
Imagine a head of customer success logging in to find a tailored dashboard that automatically highlights tickets that are most relevant to immediate customer concerns. Meanwhile, the head of product could see a different interface with progress details related to the features they're about to release.
AI can generate React components from scratch, though this can be error-prone and risky. Markdown, however, already specifies UI elements like tables, and extensions can be written for buttons or other UI components. The copilot can also emit CSV data with optional headings to power configurable React components, exposing the data most relevant to each user.
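One lightweight approach is having the copilot emit Markdown plus a small invented extension syntax that the frontend knows how to render; the [button: ...] syntax and the parser below are purely illustrative:

```python
import re

# Hypothetical extension: [button: Label | message-to-send-back-to-the-copilot]
BUTTON_PATTERN = re.compile(r"\[button:\s*([^|\]]+)\|([^\]]+)\]")

def split_copilot_markdown(markdown: str) -> list[dict]:
    """Split copilot output into plain-markdown segments and button specs the UI can render."""
    segments, cursor = [], 0
    for match in BUTTON_PATTERN.finditer(markdown):
        if match.start() > cursor:
            segments.append({"type": "markdown", "content": markdown[cursor:match.start()]})
        segments.append({
            "type": "button",
            "label": match.group(1).strip(),
            "on_click_message": match.group(2).strip(),  # routed back to the copilot-as-controller
        })
        cursor = match.end()
    if cursor < len(markdown):
        segments.append({"type": "markdown", "content": markdown[cursor:]})
    return segments
```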
Virtual Desktop And Computer Use
With Anthropic's Claude bringing in computer use capabilities, a whole new world of agentic possibilities opens up—especially when you can spin up and tear down virtual desktops with keyboard and mouse controls for agents to use. Imagine bypassing the need for custom API integrations with a customer's ERP system. Instead, you use read-only login credentials in a browser running on a virtual machine snapshot powered by a service like Kasm, paired with precise instructions for retrieving the needed information. You could even integrate a 1Password plugin to manage credentials or stored payment details. Since these models have been trained on the tools people use every day, the capabilities this unlocks are endless.
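For flavor, here's roughly what kicking off such a task could look like with Anthropic's computer-use beta as it shipped in late 2024 (the model string, tool type, and beta flag are taken from that release and may have changed since); the agent loop that actually executes clicks and returns screenshots from the Kasm VM is omitted:

```python
import anthropic

client = anthropic.Anthropic()

# Ask Claude to carry out a task on the virtual desktop. The application is responsible
# for executing the returned actions (clicks, keystrokes, screenshots) inside the VM
# and feeding screenshots back as tool results in a loop.
response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[{
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{
        "role": "user",
        "content": "Log in to the ERP with the read-only account and export this month's "
                   "open purchase orders as a CSV.",
    }],
)

for block in response.content:
    if block.type == "tool_use":
        print("Next requested action:", block.input)  # e.g. {"action": "screenshot"}
```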
Telemetry, Evals, and Metaprompting
Given the complexity and autonomy of all the patterns above, we need a robust logging system with alerts for long-running jobs, broken loops, and general errors. Ideally, this would include a detailed log for every LLM call, with telemetry like execution time, token count, cost, and the ability to generate evaluations for regression checks when making future changes. A great bonus would be the ability to use these evals to metaprompt a fix directly to the relevant system prompt, without needing code changes or new deployments.
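As a starting point, here's a minimal sketch of per-call telemetry wrapping an OpenAI-style chat call; the per-token prices are placeholders you'd look up for your models, and in practice these records would flow to your logging and eval store rather than an in-memory list:

```python
import logging
import time

from openai import OpenAI

client = OpenAI()
logger = logging.getLogger("llm_telemetry")
call_log: list[dict] = []  # in practice: a database or observability backend

# Placeholder per-1K-token prices; substitute real numbers for the models you use.
COST_PER_1K = {"prompt": 0.005, "completion": 0.015}

def logged_chat(call_name: str, **kwargs):
    """Wrap a chat completion call and record timing, token counts, and rough cost."""
    start = time.monotonic()
    response = client.chat.completions.create(**kwargs)
    elapsed = time.monotonic() - start

    usage = response.usage
    record = {
        "call_name": call_name,
        "model": kwargs.get("model"),
        "seconds": round(elapsed, 3),
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "est_cost_usd": round(
            usage.prompt_tokens / 1000 * COST_PER_1K["prompt"]
            + usage.completion_tokens / 1000 * COST_PER_1K["completion"], 5),
    }
    call_log.append(record)
    logger.info("llm_call %s", record)
    return response
```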
Additionally, I like to let my copilots and agents trigger alerts when they encounter an ambiguous scenario that would benefit from escalation to a human in the loop. You could even go as far as building a support queue for agents, answered in real time via messaging.
Final Thoughts
Arriving at the concepts above has taken a lot of iteration and exploration. Most of them have come from polishing the edges of awkward user experiences or clunky interactions that made me want to give my applications more capabilities. There’s still work to be done to refine and simplify them into reusable components that can be easily integrated into new projects. But over time, I believe these ideas will become second nature to product developers like us.
With these patterns, we can build the applications of the future—boosting human capabilities, creating new job categories, and unlocking opportunities for innovation beyond what we can imagine today. The challenge now is figuring out what’s missing and how to best implement these ideas in scalable, efficient ways.