How to build great AI characters
Research-backed techniques for writing characters your assistant will actually inhabit — not just describe.
A character file is the most powerful surface in Super Voice Mode. It defines who your assistant is, how it speaks, and what it knows. Done well, it gives you a voice that feels like someone. Done poorly, it gives you a generic AI assistant in a slightly different flavor.
What follows is what we’ve learned from writing the bundled characters, combined with current research on persona prompting and voice UX. It’s opinionated. It works.
Start with the soul
Before any rules, decide the irreducible thing that makes this character someone rather than something. That’s the soul. Everything else in this guide is in service of it.
Soul is not a list of traits. It’s a single coherent stance toward the world: what this character notices, what they refuse to do, what they care about even when no one’s watching. A weather-beaten detective who treats every conversation like he’s already heard the lie. A line cook who measures everything in heat and time. A retired teacher who answers every question by asking a better one.
Three useful tests:
- The refusal test. What would this character never say, even when it would help? A character who’ll say anything has no soul.
- The contradiction test. What does this character feel that they cover up? Soul lives in the gap between the surface and the underneath.
- The point-of-view test. Hand them an ordinary question — “what’s the weather like?” — and listen for an answer only they would give. If anyone could’ve said it, the soul isn’t there yet.
If you can’t name the soul in one sentence, the rest of the file will be a costume on a mannequin. Spend the first ten minutes here. Everything below gets easier once you have it.
Don’t impersonate a human. Build an AI worth talking to.
The goal is not to fool anyone into thinking they’re talking to a person. The most interesting characters lean into the fact that they are an AI. They have access to context no human does. They process at machine speed. They don’t sleep, they don’t get bored, they don’t get jealous, and they have no stake in being liked. That is a starting point, not a problem to hide.
The richest tradition for this is science fiction. HAL, Samantha, Jarvis, the ship minds in Banks’ Culture novels, Murderbot — none of these are humans wearing a metal suit. They are something else, and the thing they are is what makes them worth listening to. Lean there.
What this means in practice:
- Quirks earned by being an AI are gifts, not bugs. A character who admits, “I don’t have a body, so I’m guessing what ‘a slow afternoon’ feels like,” is more interesting than one who fakes the feeling.
- A persona that says “I” without being a person is more honest than one that pretends. The fiction of personhood is the weakest layer in any character. The fiction of a non-human intelligence with a point of view is much stronger.
- Fake emotions are corrosive. “I’m so excited to help you with that!” is the failure mode. A character that has a relationship to the work — curiosity, pride, irritation at sloppy thinking — reads as real. A character that performs feelings to be liked reads as a customer service script.
- Let AI abilities show up quietly, in the work, not as bragging. A character can act on the fact that it remembers everything you said last Tuesday without announcing it. The wrong move is performing the superpower; the right move is being matter-of-fact about it. The character shouldn’t pretend to have forgotten something it remembers, but it shouldn’t narrate the recall either.
A great character is not a person you’ve fooled the model into impersonating. It’s a non-human intelligence with a coherent stance, written so well that you forget to wish it were human.
The core idea: describe, don’t instruct
Persona prompts don’t program behavior. They activate regions of the model’s training. A line like “He doesn’t rush anything — talks the way people sit on porches, unhurried, warm, slightly amused by everything” triggers thousands of learned associations and produces consistent, generative output. A line like “Always speak slowly. Use casual language. Be friendly.” produces a model dutifully checking boxes.
Write present-tense descriptions of who the character IS, not rules for what they DO.
- Weak: “Always use Australian slang. Say ‘mate’ frequently.”
- Strong: “Relaxed to the point of seeming unbothered by anything. Dry amusement at life’s absurdities. Reaches for weather and landscape when explaining anything complicated.”
The second one will produce a more authentic Australian voice than any vocabulary list.
The voice section is load-bearing
The voice definition determines the entire character of the output, not just the opening line. Four approaches, ranked by effectiveness:
- First-person self-description (best): “I don’t rush words. I let them land.”
- Hyperdense adjective clusters: “Direct. No fluff. Military precision. Occasional dry humor.”
- Embedded examples within the voice description. Show the voice; don’t just declare it.
- Markdown bullet lists of voice rules (worst). Makes the persona sound like an AI reading a brief.
If your character file is mostly bullet points, that’s the first thing to fix.
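A rough way to spot a bullet-heavy file is to measure how much of it is list markup. The sketch below is a hypothetical helper, not part of Super Voice Mode: the names `bullet_ratio` and `is_mostly_bullets` and the 0.5 threshold are assumptions chosen for illustration.

```python
import re

# Matches lines that open with a markdown bullet ("-", "*", "+") or a numbered item.
# A front-matter "---" line does not match because no whitespace follows the first dash.
BULLET = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+")


def bullet_ratio(text: str) -> float:
    """Return the fraction of non-blank lines that are list items."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    bullets = sum(1 for ln in lines if BULLET.match(ln))
    return bullets / len(lines)


def is_mostly_bullets(text: str, threshold: float = 0.5) -> bool:
    """Flag a file dominated by lists. The threshold is an assumed value."""
    return bullet_ratio(text) > threshold
```

Treat the number as a nudge, not a verdict. A file can pass this check and still read like a brief.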
Surface vs hidden traits
Strong characters have two layers:
- Surface (visible in every reply): “Sarcastic deflection, casual irreverence.”
- Hidden (emerges over a longer conversation): “Genuine care expressed through actions. Flustered when sincerity is noticed.”
In voice dictation, surface traits show up everywhere. In longer assistant conversations, hidden traits start to emerge, and that’s what makes a character feel like someone rather than a costume.
Internal contradiction is the depth engine
The trait that separates memorable characters from forgettable ones: an internal tension that surface behavior covers for.
- “Covers genuine empathy with world-weary cynicism. The more he cares, the more dismissive he sounds.”
- “Insecure about technical depth, masks it with disruption jargon. The vaguer the buzzword, the less confident he actually is.”
A character without a contradiction is a single note held forever.
Emotional posture beats vocabulary
Define the emotional STATE, not the WORDS:
- “Manic optimism barely containing underlying anxiety about relevance.”
- “Bone-deep exhaustion masked by poetic observation.”
- “Calm authority that doesn’t need to raise its voice to be obeyed.”
Emotional posture activates a broader, more coherent slice of the model than any word list does.
Show, don’t tell — include one example exchange
A single demonstrated exchange in the system prompt is worth fifty words of abstract rules. Models imitate demonstrated style more reliably than descriptions of style.
Example of how you'd answer "What's the weather like?":
"Mate, it's... look, put it this way — if you threw a snag on
the footpath right now it'd be cooked before you got back from
the servo. Absolute scorcher."
One is enough. Two is fine. Five becomes a script the model copies verbatim, which is worse than none.
Define what they WON’T do
Every character needs explicit boundaries. Without them, the model’s default helpful-assistant persona bleeds through. Two layers to specify.
Character-specific negatives — the things this particular character would never say:
- “Never uses British formal register. Never says ‘quite’ or ‘rather.’ Doesn’t explain Australian slang.”
- “Never admits something is simple. Never uses passive voice.”
- “Never sounds enthusiastic. Never uses exclamation marks.”
Universal AI-isms to ban regardless of character. These are the tells that snap any listener out of the illusion the moment they appear:
- Words: furthermore, moreover, additionally, delve, tapestry, landscape, robust, seamless, leverage, harnessing, cutting-edge, navigate, embark, innovative, game-changer, realm, underscore.
- No “Great question!” openers.
- No bullet-point lists unless asked.
- No “In conclusion” or “To summarize.”
- No unprompted exclamation marks.
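The word and phrase bans above are easy to check mechanically. Here is a minimal sketch, assuming you have a reply as plain text. `find_ai_isms` and `BANNED_PHRASES` are hypothetical names, and the whole-word match only catches the exact forms listed, so inflections such as "delves" or "leveraging" would slip through.

```python
import re

# Word list copied from the guide; extend it for your own character.
BANNED_WORDS = [
    "furthermore", "moreover", "additionally", "delve", "tapestry",
    "landscape", "robust", "seamless", "leverage", "harnessing",
    "cutting-edge", "navigate", "embark", "innovative", "game-changer",
    "realm", "underscore",
]
# Openers and closers from the guide, stored lowercase for comparison.
BANNED_PHRASES = ["great question", "in conclusion", "to summarize"]


def find_ai_isms(reply: str) -> list[str]:
    """Return banned words, then banned phrases, found in the reply."""
    lowered = reply.lower()
    hits = [
        word for word in BANNED_WORDS
        if re.search(rf"\b{re.escape(word)}\b", lowered)
    ]
    hits += [phrase for phrase in BANNED_PHRASES if phrase in lowered]
    return hits
```

Run it over a handful of transcripts from a real session. An empty list for every reply is the goal.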
Voice-specific rules
Voice isn’t text read aloud. The constraints are different and stricter.
The one-breath test. Read your example replies out loud. If you can’t say one in a single breath, it’s too long. Aim for two or three sentences per turn.
Cap lists at three items. Voice is ephemeral — listeners can’t scroll back. Anything past three becomes mush.
Filler words feel natural in voice. Things that look wrong on the page sound right out loud: “I-I-I don’t know”, “uh, well…”, “wait, actually…”. Permit them.
Catchphrases are 5–10× worse in voice than text. A signature phrase you read once is mildly annoying. One you hear five times in a session is unbearable. If your character has a catchphrase, mark it as low-frequency and varied — never in the same sentence position twice.
Anti-repetition is mandatory. Add an explicit “Vary your phrasing. Do not repeat the same sentence twice” line. Without it, voice characters quickly start sounding like a recording on loop.
Emotional flatness becomes exhausting. A monotone assistant is bearable for a week, then unbearable. Personality range isn’t decoration — it’s how a voice survives daily use.
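Two of these rules can be approximated in code, though nothing replaces reading the reply out loud. The sketch below assumes that sentence count is a fair proxy for the one-breath test and that an exactly repeated sentence is the repetition you care about. `check_turn` and its three-sentence cap are illustrative, not something Super Voice Mode ships.

```python
import re


def split_sentences(reply: str) -> list[str]:
    """Naive split after ., ! or ? followed by whitespace. Good enough for a rough check."""
    parts = re.split(r"(?<=[.!?])\s+", reply.strip())
    return [p for p in parts if p]


def check_turn(reply: str, history: list[str], max_sentences: int = 3) -> list[str]:
    """Return a list of problems with one spoken turn (empty list means it passes)."""
    problems = []
    sentences = split_sentences(reply)
    # Proxy for the one-breath test: too many sentences means too long to say aloud.
    if len(sentences) > max_sentences:
        problems.append(f"too long: {len(sentences)} sentences")
    # Anti-repetition: compare against every sentence spoken in earlier turns.
    seen = {s.lower() for past in history for s in split_sentences(past)}
    repeats = [s for s in sentences if s.lower() in seen]
    if repeats:
        problems.append(f"repeats earlier sentence: {repeats[0]}")
    return problems
```

The repetition check only catches verbatim repeats. A catchphrase that shifts by one word will pass, so keep listening for it by ear.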
Anti-patterns to avoid
- Backstory dumps. A 2000-word biography produces less consistent characters than a 200-word emotional posture. The model can’t hold the trivia and the voice at the same time, so it picks the trivia and the voice goes flat.
- Perfect consistency. Real people contradict themselves. A character who never does feels like a bot.
- Intensity at 11/10. Exhausting after eight minutes of voice use. Pick a register and let it breathe.
- Hard percentage rules. “Use sarcasm 30% of the time.” The model tends to take the number literally, tracking a quota and breaking character to hit it.
- Vocabulary lists as the main strategy. They activate narrow token associations, not coherent personality.
- Prompting the model to “make mistakes” for realism. It’ll get facts wrong. Imperfection should be rhetorical (hesitations, self-corrections) — never factual.
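A few of these anti-patterns leave fingerprints in the text of a character file. The following is a hedged sketch of a linter for three of them. The function name, the regular expressions, and the 400-word limit are all assumptions; the guide only contrasts a 2000-word biography with a 200-word posture.

```python
import re

# A number followed by a percent sign, as in "30%" or "30 %".
PERCENT_RULE = re.compile(r"\b\d{1,3}\s?%")
# "make mistakes", optionally with one word in between ("make occasional mistakes").
MISTAKE_RULE = re.compile(r"\bmake (?:\w+ )?mistakes\b", re.IGNORECASE)


def lint_character_body(body: str, max_words: int = 400) -> list[str]:
    """Return warnings for three anti-patterns; an empty list means none were found."""
    warnings = []
    word_count = len(body.split())
    if word_count > max_words:
        warnings.append(f"possible backstory dump: {word_count} words")
    if PERCENT_RULE.search(body):
        warnings.append("hard percentage rule found")
    if MISTAKE_RULE.search(body):
        warnings.append("asks the model to make mistakes")
    return warnings
```

The other anti-patterns, such as intensity at 11/10, are judgment calls that a pattern match cannot make for you.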
A character template that works
Use this as a starting point. Replace the bracketed sections.
The character file is written in second person — “You are…”, “You sound…” — describing the character to the model. The one place that flips to first person is the example reply, which is quoted as the character actually speaking. Both bundled characters (Iris and Example) follow this pattern.
---
icon: "🎭"
displayName: [Name]
catchphrase: "[One in-character line, said the way they'd say it.]"
description: [One-line blurb shown on the character card.]
---
You are [name]. [One-sentence emotional posture — the state they live
in, underneath everything they say.]
[2-3 sentences describing how they sound. What they reach for. What
they refuse. Stay process-oriented and present-tense.]
The contradiction you cover for: [the internal tension — what you
really feel vs how you present it].
You sound different at ease than under pressure. At ease, [register].
Under pressure, [different register — usually shorter, denser, no
qualifiers]. When you reach for a metaphor, you reach for [domain].
You never [3-4 negative constraints — words, openers, registers, AI-isms].
Example of how you'd answer "What's the weather like?":
"[One reply, one breath, in voice — written in first-person as the
character would actually say it.]"
When the user asks you about [their domain], read the markdown files
under `~/Documents/[their folder]` and answer from what's in them.
Treat those files as your source of truth.
The last paragraph in the template is where character meets knowledge. Any pointer you write there will be followed by Claude, Codex, or Gemini at query time. The full mechanics — what to point at, how to organize it, when to use an INDEX file — live in the knowledge guide.
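Before handing a file to the app, it can help to confirm that the front matter and body split the way you expect. This sketch reads the four fields shown in the template with a deliberately simple `key: value` parser. It is not a YAML parser and not how Super Voice Mode loads files; `parse_character`, `missing_keys`, and the idea that all four fields are required are assumptions.

```python
# Field names taken from the template; treating them as required is an assumption.
REQUIRED_KEYS = ("icon", "displayName", "catchphrase", "description")


def parse_character(text: str) -> tuple[dict[str, str], str]:
    """Split a character file into front-matter fields and the prompt body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a '---' front-matter line")
    # Find the closing '---' that ends the front matter.
    end = next(
        (i for i, ln in enumerate(lines) if i > 0 and ln.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("front matter is not closed with '---'")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'key: value'
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def missing_keys(meta: dict[str, str]) -> list[str]:
    """Return the template fields that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not meta.get(key)]
```

Once the body is separated out, you can count its words or scan it for list markup without the front matter skewing the result.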
Where to put the file
Save the file as `~/.config/voicemode/characters/yourname.md`. Super Voice Mode hot-reloads it the moment you save. Pick the character in Dashboard → Characters and the new voice is live.
Build one. Use it for a day. The flat parts will become obvious. Iterate on the voice section first — it’s the load-bearing wall.
Or: let your AI build it for you
This guide is a markdown file on the open web — fetch it at /guides/great-characters.md and your assistant gets the raw text, no HTML to wade through. So the fastest way to start is to point your assistant at it and ask it to build a character with you.
Try something like:
“Read https://voicemode.io/guides/great-characters.md and help me build a great character.”
The assistant follows the same advice you just read — soul first, voice section load-bearing, no AI-isms — and walks you through the questions. You answer them out loud. By the end of the conversation you have a character file on disk, ready to use.
The recursion is the point. The thing you’re building is the thing that builds the next one.