What I Built Around Her

I was three prompts deep into a conversation with Claude when I realized I had no idea if what I was reading was right.

The topic was governance. I didn't call it that at the time. I just thought of it as rules. What I was actually trying to figure out was simpler than it sounds. What is Rosey allowed to do on her own? What requires my permission? How do I enforce the difference? Rules. Boundaries. The system that sits between an AI agent and the things it can break.

I'd turned Rosey off a few nights earlier after a TikTok-fueled panic about what she might be doing while I slept. Now I was trying to build the systems that would let me turn her back on. The only tools I had to build those systems were other AIs.

I typed something close to: "How can I be sure what I set up with Rosey is safe?"

Claude gave me an answer. It was structured. Detailed. Thorough in the way Claude tends to be thorough, which is to say it built me a cathedral when I'd asked for a lock on the door. It covered permission tiers, audit logging, execution boundaries. Those were Claude's words, not mine. At the time I just thought of them as "what she can touch" and "what she can't." It was more than I'd asked for and exactly what I didn't have the background to evaluate.
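With six weeks of hindsight, here's roughly what those three terms mean in practice. This isn't what Claude gave me, and it isn't my real setup; every action name and tier below is invented. It's just the shape of the idea:

```python
# A hypothetical sketch, not my real setup. Every action name here is
# invented; the point is the shape: permission tiers, an audit log,
# and a hard execution boundary.
from datetime import datetime, timezone

ALLOWED = {"read_calendar", "draft_reply", "search_notes"}     # she can touch
NEEDS_APPROVAL = {"send_email", "delete_file", "spend_money"}  # ask me first

audit_log = []  # every decision gets written down, allowed or not

def request_action(action: str, approved: bool = False) -> bool:
    """Decide whether the agent may perform `action`, and log the decision."""
    if action in ALLOWED:
        allowed = True                 # tier 1: autonomous
    elif action in NEEDS_APPROVAL:
        allowed = approved             # tier 2: only with my explicit yes
    else:
        allowed = False                # outside the boundary: always denied
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "allowed": allowed,
    })
    return allowed

print(request_action("read_calendar"))     # True
print(request_action("send_email"))        # False until I sign off
print(request_action("send_email", True))  # True
print(request_action("format_disk"))       # False, and it never will be
```

Three lists and a log. At the time, I couldn't have written even that.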

So I copied the whole thing and pasted it into ChatGPT.

I want to be clear about what this was. It wasn't a methodology. It wasn't an adversarial review framework. It was a guy with no technical background getting an answer he couldn't verify, doing the only thing that made sense. Asking someone else.

ChatGPT came back with a different answer. Not contradictory exactly. But shaped differently. It focused on things Claude had glossed over. It skipped things Claude had treated as critical. The two answers looked at the same problem from different angles, and neither one was complete.

I took ChatGPT's pushback and pasted it into Claude. Claude refined. I took Claude's revision and brought it back to ChatGPT. ChatGPT pushed on different edges. I kept going. Back and forth. And somewhere around the fourth or fifth pass, the answers started to converge. The themes that survived both tools were the same themes. The recommendations that held up under pressure from both directions started to rhyme.

That was the moment. Not a breakthrough. Just a quiet recognition that when two independent AIs start circling the same ideas after being asked to challenge each other, you're probably getting closer to something real.

I couldn't evaluate the technical merit of what either one was telling me. I still can't. But I could watch two tools argue about the same problem and notice where they agreed. The overlap wasn't proof. But for a non-technical founder building something he didn't fully understand, convergence was the closest thing to proof available.

That became my system. Claude drafts. I push back. We get closer. I take it to ChatGPT. ChatGPT pressure-tests it and drafts prompts to send back to Claude. When it's working, it's high velocity. Ideas sharpen in real time. Gaps surface that neither tool flagged on its own.
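If you squint, it's a loop. Here's the shape of it; ask_claude and ask_chatgpt are hypothetical stand-ins, stubbed so the sketch runs, and the most important part, me pushing back at every step, is the part code can't capture:

```python
# A sketch of the loop, not a real integration. ask_claude and ask_chatgpt
# are hypothetical stand-ins, stubbed here so the sketch actually runs.
def ask_claude(prompt: str) -> str:
    return f"[Claude's draft for: {prompt[:40]}]"

def ask_chatgpt(prompt: str) -> str:
    return f"[ChatGPT's pushback on: {prompt[:40]}]"

def pressure_test(question: str, rounds: int = 5) -> str:
    """Bounce a draft between two models until the rounds run out."""
    draft = ask_claude(question)
    for _ in range(rounds):
        critique = ask_chatgpt(f"Poke holes in this:\n{draft}")
        draft = ask_claude(f"Revise given this pushback:\n{critique}")
        # missing from this sketch: me, reading both sides and steering
    return draft  # in practice I stop when the answers start to converge

print(pressure_test("How can I be sure what I set up with Rosey is safe?"))
```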

Over time the roles got specific. Not because I planned them. Because I paid attention to what each tool was actually good at.

Claude became the builder. The one I go to when I need something drafted, structured, or constructed from the ground up. It's more rigorous. More willing to give you the answer you don't want. But it can also be overconfident. It presents everything with the same certainty whether it's drawing from solid ground or filling in gaps with plausible-sounding logic. And the model matters. Opus is more deliberate but will over-build a solution until you've lost the original question under six layers of precision. Sonnet is faster, sharper for focused tasks, but more likely to skip a step it decided wasn't important. Choosing the wrong one for the task changes the output more than most people realize.

ChatGPT became the pressure tester. The one I send Claude's work to when I need someone to poke holes. It's good at finding the thing Claude assumed was obvious but didn't explain. But it has its own failure mode. It wants to agree with you. It'll smooth over a gap in its reasoning with confidence and move on, hoping you won't notice. Same model spectrum applies. The heavyweight GPT models play the Opus role. More thorough, more deliberate. The lighter models are faster but looser. Matching the right configuration to the right task matters just as much. That part took me a while.

Neither one is the reliable one. They're both unreliable in different ways. That's actually what makes the pressure test work. If they were unreliable in the same ways, the overlap would mean nothing. Because they fail differently, the convergence points are meaningful.
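There's a toy way to see why that matters. The numbers below are invented purely to show the shape:

```python
# Invented numbers, just to show why independent failure modes matter.
p_claude_misses = 0.3  # say Claude gets a given point wrong 30% of the time
p_gpt_misses = 0.3     # and ChatGPT does too

# If their mistakes are independent, they rarely agree on a bad answer:
print(p_claude_misses * p_gpt_misses)  # 0.09

# If they share the same blind spots, agreement proves nothing:
print(p_claude_misses)                 # 0.30 -- the overlap IS the flaw
```

Nine percent versus thirty. That gap is the entire value of the pressure test.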

But here's what took me longer to learn. The danger isn't just bad answers. It's good answers to the wrong question.

Both tools love going deep. You ask about one thing and they'll take you seven layers down before you realize you're solving a problem that doesn't exist yet. They have no sense of priority relative to time. They don't know that the thing they're perfecting isn't relevant for six months and the thing you actually need is due tomorrow. They'll propose an elegant solution that would take three months to build as casually as if it's an afternoon task. And they have no idea how long anything takes, because they've never built anything. They've only described building things.

It's like the first day with a team of incredibly smart, incredibly eager new hires who have never worked at your company before. They'll impress you with how fast they ramp up. They'll produce work that looks sophisticated within minutes. And then you'll realize they just spent two hours perfecting a deliverable for a problem you solved last week, because nobody told them what mattered today.

That's on you. Not them. They'll go wherever you point them. If you point them somewhere useful, the output is extraordinary. If you don't, they'll cheerfully build you something beautiful and irrelevant.

I learned to start every session tight. Set the scene. Define the roles. Point to the relevant documents. Give the tool the context it needs to be specific rather than general. Tell it what matters right now, not just what the project is. Something like: "You're reviewing Rosey's governance rules. Today's only question is what she can do without my approval. These two documents are the context. Ignore everything else." A lazy prompt gets a lazy answer from both tools, and the pressure test just bounces mediocrity back and forth. A sharp prompt gets two distinct perspectives that actually have something to push against.

The governance framework that came out of those early sessions is the one I still use. Not because it was perfect. Because it was built through a process I could trust even when I couldn't evaluate the output directly. Two tools, neither one fully reliable, each one catching things the other missed. Roles that formed from the work, not from a planning document. And a founder who got sharper at knowing what to ask by watching what happened when he asked it well and what happened when he didn't.

The thing I didn't expect is what the pressure test actually produced. Not just a framework. Ten canonical documents. Business requirements, product specs, governance rules, operating procedures. I didn't plan to write any of them. They kept emerging from the process. Each round of Claude-to-ChatGPT surfaced something that needed to be defined, and each definition surfaced something else. Six weeks in, I have ten canonical documents I never realized I needed. I'm still not sure I need all of them. But I'm not willing to find out what happens if I don't have them.

I'm not going to pretend I've got this figured out. The framework works. The pressure test works. The velocity is real when everything is clicking. But I also know that both tools want to converge. They want to tell me things are fine. They want to go deep on whatever I put in front of them whether it matters or not. The fact that they agree on something doesn't mean it's right. It means they've both found a place where they can stop arguing, which isn't the same thing.

So I keep testing. I keep pulling them back to the big picture when they want to dive. I keep asking whether the problem they're solving is the one I actually need solved today. That's not a practice you finish. It's a habit you either maintain or you don't.

I choose to maintain it. Because the alternative is trusting something I can't see inside, and I already know what that feels like at 2 AM.
