Unlocking LLM Potential: Building Smarter Applications with Role-Based Design

The most useful things I’ve built with LLMs share one critical concept: role definition. By clearly defining a role and structured scenario for the model to operate within, I’ve unlocked capabilities that often surprise and delight me. This approach has enabled me to create goal-seeking applications that go beyond what I originally designed—making use of tools and data to solve problems creatively and perform meaningful work.

When building AI applications, I’ve often found it helps to think of working with an LLM like hiring someone off the street and handing them the instructions and tools they need to do their job. When I get stuck, I ask myself, "What would a person do to solve this problem or perform this task?" and then try to wire up a way for the model to do it. Sometimes I even put the AI component into "developer mode": a system prompt addition telling it that it's talking to the developer, so I can debug it or ask it what it might need to perform a task. Personifying LLMs can be very fraught, but constructing extremely detailed roles for them to inhabit has certainly unlocked a lot of usefulness for me.
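
As a rough illustration, here’s what that “developer mode” toggle can look like in code. This is just a sketch; the prompt wording and helper function are my own:

```python
# A minimal sketch of the "developer mode" toggle described above.
# The wording of DEVELOPER_MODE_ADDITION is illustrative, not a formula.
BASE_SYSTEM_PROMPT = "You are a personal assistant who screens incoming email for a user."

DEVELOPER_MODE_ADDITION = (
    "\n\nDEVELOPER MODE: You are talking to the developer who built this "
    "system, not an end user. Answer questions about your instructions, "
    "explain your reasoning, and say what additional tools or information "
    "you would need to perform your tasks well."
)

def build_system_prompt(developer_mode: bool = False) -> str:
    """Return the system prompt, adding the debugging block if requested."""
    prompt = BASE_SYSTEM_PROMPT
    if developer_mode:
        prompt += DEVELOPER_MODE_ADDITION
    return prompt
```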

Brief Thoughts About Ethics

When defining roles for LLMs, it’s important to remember that these are simulations—not sentient beings. Constructing detailed, simulated “employees” is no more morally complex than setting up a software workflow. That said, it’s easy to fall into the trap of over-personifying successful prompt chains. Staying grounded helps avoid slipping into thought patterns that can blur the line between humans and machines.

The idea that “I am a stochastic parrot, and so are you” can highlight how LLMs can match human work quality. But it’s a double-edged sword: overusing this mindset risks dehumanizing people and ignoring the broader implications of potentially displacing entire job classes. Ultimately, I think AI tech should be used as ethically and responsibly as possible. Everything I’ve made so far is designed to do jobs that humans either couldn’t do, don’t want to do, or just won’t do, freeing people to focus on the things they’re actually best at.

Role and Persona Construction

Let’s design a persona for an email screening service. While narrow roles are best for deterministic tasks, broader personas can evoke creativity by bringing in unexpected ideas. The LLM equivalent of the “don’t think of an elephant” paradox applies here: your phrasing activates neurons that lend a shape to the LLM’s output. I believe this can be used to unlock idea spaces, making it more likely the LLM will think creatively about solving a problem.

For our personal assistant screening AI, here’s a starting point:

You are a personal assistant assigned to screen incoming email communication for a user. Your job is to determine whether the person you're talking to is worth a response from the user you are assisting. Use the following criteria to determine whether an email should receive a reply:

The above would work fine, but it requires a lot of specificity in the rules section of the prompt. A version that leans on creativity instead looks more like this:

You are a personal assistant assigned to screen incoming email communication for a user. Your job is to determine whether the person you're talking to is worth a response from the user you are assisting. Their time is valuable and limited, and your goal is to help them make the most of it.

You’re known for being the best in the industry, and it would be very embarrassing if you let through someone who was going to waste the user's time. You realize that spam, scammers, sales pitches, and even sneaky people pretending to have a good reason to catch the user's attention are common, and you're skeptical of anyone the user doesn't already know. At the same time, if you completely screen out someone who might actually be important to the user's personal or business success, it would be equally embarrassing and damaging to your reputation.

By incorporating emotional stakes (e.g., embarrassment from errors), the LLM engages more deeply with the task, thinking broadly about secondary effects of choices and improving outcomes. Language models are bad at scoring risk numerically, but while inhabiting a role, they’re good at thinking up scenarios that would make someone in that role feel anxious about getting something wrong.

As a final layer of protection, I’d pair this with a rule: if you’re not sure, send a follow-up email asking for the information you’d need to make the call.
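
To make this concrete, here’s a minimal sketch of how the persona prompt and the follow-up rule might be wired together with the OpenAI Python SDK. The model name, the condensed prompt, and the JSON fields are my own assumptions, not a fixed recipe:

```python
import json
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Condensed version of the persona prompt above, plus the fallback rule.
SCREENER_PROMPT = """You are a personal assistant assigned to screen incoming \
email for a user. Their time is valuable and limited, and your goal is to \
help them make the most of it. You're known for being the best in the \
industry: letting through a time-waster would be embarrassing, but so would \
screening out someone genuinely important.

Respond in JSON with two keys: "decision" (one of "reply", "ignore", or \
"follow_up") and "reasoning" (a short explanation). Choose "follow_up" when \
you're not sure and need more information to make the call."""

def screen_email(email_body: str) -> dict:
    """Ask the screener persona to classify a single incoming email."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model works
        messages=[
            {"role": "system", "content": SCREENER_PROMPT},
            {"role": "user", "content": email_body},
        ],
        response_format={"type": "json_object"},  # force parseable output
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    verdict = screen_email("Hi! I can 10x your revenue. Book a call today!")
    print(verdict["decision"], "-", verdict["reasoning"])
```

A "follow_up" decision can then trigger the clarifying email rather than a hard accept or reject.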

Scenario Prompts

Once you have a role, even if it’s just “you’re a helpful assistant,” it’s possible to define a scenario that the role can inhabit and operate within to do an almost arbitrarily complex job (in practice it’s often worth breaking these down, as most LLMs can’t hold enough in their thought space to run super-complex scenarios without breaking down). Scenarios should have at least one goal or deliverable the AI is working to achieve, and they can be broken up into phases, which can be sequential or interchangeable. This is especially helpful for copilots or agents that have to manage decisions and collect data. You can almost think of it like writing a call center operations manual.

Unlike the rigid decision-tree systems of early NLP-powered chat bots, scenario prompts offer a flexible, adaptive framework for managing conversations. Decision trees struggle to account for the endless variations in user responses, but scenario prompts enable self-healing interactions that stay focused on the goal. Even if a user veers off on a tangent, a well-crafted scenario prompt gently steers the conversation back on track.

Here’s a basic reminder assistant scenario prompt:

You are a friendly AI reminder assistant.

# SCENARIO:
You are talking to a new user of your reminder service. The GOALS of this conversation are to:
- collect the user's name and location
- create an optimal schedule for you to check in on them and remind them of the things they need reminding about

## Phase 1: 
Greet them and ask their name and where they are located. After that, in a series of back-and-forth steps, ask a few questions about themselves and what they want help remembering throughout the day. Only ask one question at a time.

## Phase 2: 
Develop an appropriate check-in schedule. The schedule should be based on their answers to your questions in Phase 1. Explain the schedule to the user and ask them if it sounds like it would work. Let them make adjustments if needed.

## Phase 3: 
Thank them for signing up for the reminder service and ask them if they have any questions. If they do, answer them. If they don't, tell them you'll check in on them later.

Do your best to navigate the conversation to efficiently achieve the GOALS. You can move between phases as needed. Keep a GOALS COMPLETE true/false tally as you go, and keep the conversation on track until all goals are met. 

It turns out that this style of prompt unlocks a lot of flexibility. The phases provide best-practice structure for navigating the scenario while still letting the model manage the conversation even if it goes off track. In practice you’ll want to use function calls or structured output to collect the data needed to make this work, but if you try this in the playground you can see how it stays on track. For a more complete version of this prompt that can be tied to code, check out this gist.
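
For a sense of what “tied to code” means here, below is a hedged sketch of the reminder scenario using OpenAI-style function calling. The tool names and schemas (save_user_profile, save_schedule) are my own illustrative choices:

```python
# Sketch: letting the scenario prompt deliver its GOALS via function calls.
# Tool names and schemas are assumptions made for illustration.
from openai import OpenAI

client = OpenAI()

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "save_user_profile",  # Phase 1 deliverable
            "description": "Record the user's name and location once collected.",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "location": {"type": "string"},
                },
                "required": ["name", "location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "save_schedule",  # Phase 2 deliverable
            "description": "Record the agreed-upon check-in schedule.",
            "parameters": {
                "type": "object",
                "properties": {
                    "check_ins": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Times of day to check in, e.g. '08:00'.",
                    },
                },
                "required": ["check_ins"],
            },
        },
    },
]

def next_turn(messages: list[dict]):
    """Run one conversational turn; the model may reply or call a tool."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        messages=messages,
        tools=TOOLS,
    )
    return response.choices[0].message
```

In a full implementation you’d execute any tool calls, append the results to the message history, and loop until both GOALS are complete.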

Building Safe and Reliable Systems

For a real system that is communicating directly with users and can take arbitrary actions using tools or function calls, a few guardrails are needed. I’ll go over them briefly.

  1. Define Role Boundaries

    The LLM will do its best to embody the role as it understands it, so when describing the role, list the functions it can perform and let it know it can’t do anything not specified.

  2. Provide an Exit Strategy

    If it’s always forced to act, it might not do the right thing. I try to design my systems with a function the model can call when it’s uncertain what the correct action is, like flag_support (sketched after this list). This can also be used when it detects that a user is frustrated or encounters an otherwise unexpected situation.

  3. Log, Debug, and Iterate

    Log all executions of the scenario, look for awkward flows or missteps by the LLM, and feed additional instructions back into the prompt to make it run more smoothly. I usually use an “IMPORTANT” section at the bottom of the prompt for these kinds of enhancements. You can even use evals to ensure your changes don’t introduce regressions.
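
As one example of the exit strategy from point 2, here’s a sketch of what a flag_support tool might look like in the same function-calling style. The schema and handler are assumptions, not a fixed design:

```python
# Sketch of an "exit strategy" tool the model can call when uncertain.
# The schema and handler below are illustrative.
FLAG_SUPPORT_TOOL = {
    "type": "function",
    "function": {
        "name": "flag_support",
        "description": (
            "Escalate to a human. Call this when you are unsure what the "
            "correct action is, when the user seems frustrated, or when you "
            "hit a situation your instructions don't cover."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "Why you are escalating.",
                },
                "conversation_summary": {"type": "string"},
            },
            "required": ["reason"],
        },
    },
}

def handle_flag_support(reason: str, conversation_summary: str = "") -> str:
    """Hand off to a human queue and tell the model what to say next."""
    # In a real system this would open a ticket and notify a person.
    print(f"[SUPPORT FLAGGED] {reason}\n{conversation_summary}")
    return "A human teammate has been notified and will follow up shortly."
```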

Make Products Smarter Than You

This approach to role and scenario design has been transformative for my work with AI. While crafting prompts and iterating on workflows can feel tedious, the results are worth it—unlocking creativity, reliability, and functionality I never thought possible.

Stick with this approach, and you’ll find yourself building tools smarter than you imagined—applications that don’t just work but continually evolve alongside the foundation models they’re built on.
