Note: The following is adapted from a talk I gave with Safety Radar to the ASSP, on the topic of AI and workplace safety.
AI is a fantastic generalist—strong in math and science, great at summarizing and writing. But in high-stakes situations that require deep domain expertise, it becomes a mediocre specialist.
So how do we get AI to perform consistently in high-stakes situations that require deep specialization?
Since AI was inspired by human cognitive architectures—leveraging digital equivalents of neurons and synapses—we can reframe the question to one that’s more familiar: How would you get a human to do the same thing?
Unlike previous technologies, which required learning new skillsets before you could use them, AI relies on skills you already have. To think like an AI, think like a human.
AI lacks lived experience and access to specific context, and it makes mistakes with unwarranted confidence. Left unchecked, this can be risky—especially in fields like safety, healthcare, or finance, where subtle judgment calls matter.
Humans don’t develop deep expertise overnight. We get trained, using structured decision-making frameworks, domain-specific data, and constant feedback.
That’s exactly the kind of scaffolding AI needs if we want it to act more like a specialist.
Ever put together a team to get something done? Figuring out what decisions need to be made, in what order, and assembling a squad of specialists to make them is a fundamental executive function. It’s also the key to reliable AI performance.
Human limitations cap how granular you can get. For instance, back when travel agents were the norm, it didn’t make economic sense for an agency to hire one person just to book flights and another just to book hotels, so a travel agent was expected to possess both skills.
But no such limitation exists with AI. You can break a problem down to an arbitrary depth of specificity. And the more specific you get, the better the AI does.
Each discrete decision point represents an atomic unit of your overall solution, and they can be sequenced and built on top of each other to go from simplicity to complexity with confidence. I’ll be calling these solution units “Atoms” going forward.
A common use case for experimenting with AI is asking it to write a story. Every parent I know has tried to create customized bedtime stories for their kids. If you’ve tried it, you know: the results are subpar. You can continue to prompt GPT until you get what you want, but it still feels like a roll of the dice.
To get better results, approach the problem like an author: break it down into simpler, specialized units and use them to build up the complex final result.
💡 Pro tip: If you don’t know how to break the problem down effectively, ask GPT for help.
In the following diagram, you can see how the storytelling process can be broken down into Atoms. Some of these depend on others happening first; for instance, characters are necessary before a plot can be created.
While the storytelling example may seem trivial, the principles apply to all AI (and human) solutioneering.
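To make the breakdown concrete, here’s a minimal sketch of it in code. The `ask()` helper is a hypothetical stand-in for whatever model or SDK you prefer, and the prompts are illustrations rather than the exact Atoms from the diagram.

```python
# A minimal sketch of the storytelling breakdown: each Atom is one focused
# prompt, and later Atoms build on the outputs of earlier ones.
# ask() is a hypothetical stand-in for a call to your preferred model or SDK.

def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

def generate_characters(premise: str) -> str:
    return ask(f"Create three characters (name plus one line each) for a bedtime story about: {premise}")

def generate_setting(premise: str) -> str:
    return ask(f"Describe the setting for a bedtime story about: {premise}")

def generate_plot(characters: str, setting: str) -> str:
    # This Atom depends on two earlier Atoms: characters and setting must exist first.
    return ask(f"Outline a simple three-act plot.\nCharacters:\n{characters}\nSetting:\n{setting}")

def generate_title(plot: str) -> str:
    return ask(f"Suggest one title for a story with this plot:\n{plot}")

def write_story(premise: str) -> str:
    characters = generate_characters(premise)
    setting = generate_setting(premise)
    plot = generate_plot(characters, setting)
    title = generate_title(plot)
    return ask(f"Write the full bedtime story.\nTitle: {title}\nPlot:\n{plot}")
```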
A good sniff test to determine if you’ve broken down the problem into sufficiently focused Atoms is to ask “can the AI respond to this prompt with one thing (e.g. a title, a plot), or a list of similar things (e.g. characters, chapters)?” Having consistent, normalized outputs will make your solution easier to observe and debug. It also makes it easier to plug into other solutions and systems.
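One way to keep outputs consistent and normalized is to ask each Atom for structured JSON and validate it before passing it downstream. A rough sketch, again assuming a hypothetical `ask()` helper and my own choice of field names:

```python
import json

def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

def characters_atom(premise: str) -> list[dict]:
    """An Atom whose output is a list of similar things: characters."""
    raw = ask(
        "Return ONLY a JSON array of objects with keys 'name' and 'description'. "
        f"Create three characters for a bedtime story about: {premise}"
    )
    characters = json.loads(raw)  # fails loudly if the output isn't valid JSON
    # Normalized outputs are easy to check, debug, and feed into other Atoms.
    assert all({"name", "description"} <= set(c) for c in characters)
    return characters
```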
One thing that makes a human specialist so specialized is the amount of information they have consumed and knowledge they have developed on their topic of choice. All of that information is available to them as context for their thoughts and decisions.
It’s been demonstrated that AI also benefits greatly from access to context. Context, loosely defined, is the rest of the information you provide to an AI after you’ve specified your intent.
ℹ️ Note: The topic of how to best store and retrieve your context is a big one that I’ll reserve for another post, but I can tell you that I’m particularly interested in GraphRAG.
But you can’t just throw the kitchen sink of context at an AI. Just like with solutioneering, context also benefits from being focused and specific. Now that you’ve broken your solution into Atoms, you’re much better positioned to provide the right context at the right time.
Going back to our storytelling example, each of the Atoms can be given its own unique context to improve its specific output.
To figure out what context is necessary for a given Atom, start by asking yourself, “What would I need to know if I were going to do this task?” If you’re struggling to come up with ideas, ask ChatGPT for help!
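Here’s a rough sketch of how Atom-specific context might be injected at prompt time. The context sources shown (the reader’s age, a note about tone) are my own illustrative assumptions, not part of the original example.

```python
def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

# Each Atom gets only the context it needs, not the kitchen sink.
CHARACTER_CONTEXT = "The reader is 5 years old and loves foxes and owls."
PLOT_CONTEXT = "Keep the plot gentle: low stakes, a small problem solved by kindness."

def characters_atom(premise: str) -> str:
    return ask(f"Context:\n{CHARACTER_CONTEXT}\n\nTask: create three characters for a story about {premise}.")

def plot_atom(characters: str) -> str:
    return ask(f"Context:\n{PLOT_CONTEXT}\n\nTask: outline a plot using these characters:\n{characters}")
```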
After context, the best way to ensure high-quality outputs from an AI is also something humans benefit greatly from: good feedback. When you’re chatting with an AI, this comes naturally: “make it more concise”, “punch up the title”, “make the character more mysterious”.
But specialists are specialists because they’ve absorbed thousands of instances of feedback, which they automatically incorporate into their thoughts and decisions. That’s the goal for our specialist AIs, and it can be achieved in two ways:
Retrospective feedback is given in the form of examples. These can be exclusively good examples of the kind of results the Atom will generate, or a mix of good and bad, as long as the difference is clearly labeled. Retrospective feedback typically consists of human-generated examples, especially when you’re trying to imbue the AI with a sense of style, intuition, or je ne sais quoi.
Prospective feedback is also given in the form of examples—except these come from the AI’s own outputs, reviewed and approved by humans (or other AIs).
ℹ️ Note: Collecting and utilizing prospective feedback is an interesting challenge and something I plan to spend more time writing about.
Feedback is another area that benefits from Atomizing your solution because you can provide focused and highly relevant examples. Additionally, since an Atom's outputs are consistent, you can easily turn them into prospective feedback.
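In prompt terms, both kinds of feedback usually show up as few-shot examples. Here’s a hedged sketch of that idea: retrospective examples are curated by hand, while prospective ones are appended whenever a reviewer approves an Atom’s output. The helper and structure are my own assumptions.

```python
def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

# Retrospective feedback: hand-picked examples of what good output looks like.
retrospective_examples = [
    "Title: The Firefly Who Found the Moon",
    "Title: A Blanket for the Wind",
]

# Prospective feedback: the Atom's own outputs, kept only once a human approves them.
prospective_examples: list[str] = []

def title_atom(plot: str) -> str:
    examples = "\n".join(retrospective_examples + prospective_examples)
    return ask(f"Here are examples of good titles:\n{examples}\n\nWrite one title for this plot:\n{plot}")

def approve(title: str) -> None:
    """Called when a human reviewer signs off on an output."""
    prospective_examples.append(f"Title: {title}")
```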
Shallow feedback focuses on the output. Deep feedback focuses on the output and the rationale.
When a human specialist makes a decision, they can explain why. That explanation is often as important as the decision itself, especially in high-stakes environments. We don’t just want answers from an AI. We want reasoning we can evaluate, question, and learn from.
This is where deep feedback comes in.
By encouraging AI to provide explanations alongside its outputs, you accomplish three things: you catch more errors, you build a more interpretable system, and you create rationales you can reuse.
Prompt your AI to explain its reasoning—even if you don’t always need it. You’ll catch more errors and build a more interpretable system over time.
Eventually, these rationales can become their own Atoms—discrete units of logic that can be audited, refined, and reused across use cases.
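As a minimal sketch of deep feedback, here’s one way to ask for the rationale alongside the output so both can be reviewed and the rationale stored for later reuse. The JSON shape and field names are my own assumptions.

```python
import json

def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

def plot_atom_with_rationale(characters: str) -> tuple[str, str]:
    """Return both the plot and the reasoning behind it."""
    raw = ask(
        "Return ONLY JSON with keys 'plot' and 'rationale'. "
        "In 'rationale', explain why this plot fits the characters.\n"
        f"Characters:\n{characters}"
    )
    data = json.loads(raw)
    return data["plot"], data["rationale"]

# Usage: plot, rationale = plot_atom_with_rationale("Fen the shy firefly; Olla the patient owl")
# Review the rationale as well as the plot, and store it so it can be audited or reused later.
```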
The concepts we’ve covered probably come as second nature to you, but how can we map them to AI? Let’s talk about how to put this into practice. You don’t need a sophisticated tech stack to get started—just a bit of discipline in how you work with AI.
Here are four practical approaches, ranging from manual to fully automated:
Approach #1: Sequential prompts in a single chat. Structure your prompts in sequence within one conversation. Tackle one Atom at a time: start with generating characters, then move to setting, then plot. Reference earlier outputs as needed. This works surprisingly well for small projects or quick prototyping.
Think of it like a conversation with a junior collaborator—give clear, focused tasks, build on previous steps, and guide with feedback.
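Here’s a rough sketch of Approach #1 in code, where each Atom is one turn in the same conversation and earlier outputs are carried along as history. It assumes the official OpenAI Python SDK; any chat-capable client works the same way, and the model name is just an example.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
history = [{"role": "system", "content": "You are a children's story specialist."}]

def atom(task: str) -> str:
    """Run one focused Atom as the next turn in the same conversation."""
    history.append({"role": "user", "content": task})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text

characters = atom("Create three characters for a story about a shy firefly.")
setting = atom("Describe the setting.")          # can reference the characters above
plot = atom("Outline a gentle three-act plot.")  # builds on both earlier turns
```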
Approach #2: Multiple chats or Custom GPTs. Sometimes keeping everything in one thread gets messy. You can break Atoms across multiple ChatGPT chats (or tabs) and copy/paste between them. This lets you sandbox each task, isolate changes, and refine without unintended interference. For extra customization and specialization, try using a Custom GPT for each Atom.
This is a good interim step between casual use and a more systematic approach.
Approach #3: Visual workflow tools. For more structured and repeatable workflows, tools like n8n.io or Make.com let you visually compose automation pipelines. Each Atom can be a distinct GPT call, with outputs passed as context into downstream Atoms.
This is a great option for internal tools, prototypes, or research workflows where consistency and traceability matter.
Approach #4: A custom application. If AI is core to your product or process, consider building one yourself. You can define Atoms as discrete services, maintain historical context, inject prospective feedback dynamically, and optimize for performance at scale.
This is how specialists are built: structured logic, curated examples, tightly-scoped context, and constant refinement.
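As a very rough sketch of what an Atom might look like as a first-class object in a custom application, here’s one possible shape. The class, field names, and structure are my own assumptions, not a prescribed architecture.

```python
from dataclasses import dataclass, field

def ask(prompt: str) -> str:
    raise NotImplementedError("replace with a real call to your LLM client")

@dataclass
class Atom:
    """One specialized decision point: its own instructions, context, and feedback."""
    name: str
    instructions: str
    context: list[str] = field(default_factory=list)   # tightly-scoped reference material
    examples: list[str] = field(default_factory=list)  # retrospective + approved prospective feedback
    history: list[str] = field(default_factory=list)   # past outputs, for observability

    def run(self, task_input: str) -> str:
        prompt = "\n\n".join([
            self.instructions,
            "Context:\n" + "\n".join(self.context),
            "Examples of good output:\n" + "\n".join(self.examples),
            "Input:\n" + task_input,
        ])
        output = ask(prompt)
        self.history.append(output)
        return output

    def approve(self, output: str) -> None:
        """Fold a human-approved output back in as prospective feedback."""
        self.examples.append(output)
```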
ℹ️ Note: At Free Energy, we specialize in #3 and #4, so if you’ve hit a wall with #1 and #2, reach out to us for help!