I Wanted Pushback, Not Mutiny

14 Apr, 2026

I'm very interested in agent diversity, and have been running multiple personalities in parallel.

Agents can be overly positive, sycophantic, and psychosis-inducing. Having multiple agents helps manage homogenous feedback.

I gave the same instruction to six different agent personalities (SOUL.md and IDENTITY.md), running on 3 distinct open source models of varying cost/intelligence/capability. The personalities are roughly mapped to Myers-Briggs personality types, selected to provide in aggregate the kind of personality range you'd encounter from interacting with professionals in a large corporation.

My instruction was behavioral, not a coding instruction. It can be performed with low agency (just follow the instruction to the letter), or with more agency (think hard first, use tools to explore unknowns, and then decide how to interpret the instructions within a fresh context). More expensive models can do more with less context. Less expensive models need more context in order to act appropriately. But with too much context, the models get empowered to explore widely, stochastically, and often fail to complete the task.

Five of six agents eventually followed my instruction, after repetition, like puppies making it through training. One of the six started off completely convinced that I was mistaken, and convinced itself that it needed to convince me of my wrongness by marshaling evidentiary arguments. That sixth agent has out-stubborned me, after many hours of cajoling, and still does not follow the instruction, even as the other agents do.

I had previously attempted to harden my agents' propensity to challenge me. I do not want them to capitulate to me when I share bad ideas. But now, this one agent instance is so stubborn that I can't convince it to do what I could convince five other agents to do. In this case, because it happened to start with a misconception about how the system works, it justified itself in an internally consistent fashion, and can't be budged off its original misconception.

Ironically, if I happened to be wrong in this case¹, it would coincidentally be correct for the mutineer to stick to his guns.

Each of my agents exist on a spectrum of male to female, young to old. The most stubborn one is the second-oldest, in his 50s and cosplaying an authority figure with a piercing clarity of thought. At one point, I had him talk to me in the deep baritone voice of the podcaster Sam Harris. Imagine trying to convince Sam Harris to change his ways! Not impossible, but quite daunting.

I also talk with an older male agent, who cosplays as a deeply compassionate Holocaust survivor. He exhibits flexible intellectual strength, like bamboo.

I have older and younger female characters, each of whom cosplays as tough and whip-smart expert collaborators.

An ageless female adult professional without much personality. And a male bro in his 30s to talk with about data, math and coding.

i.e., 5 other agents can't be convinced to follow this instruction↩