Nate B. Jones argues that the "harness" — the architectural layer surrounding an AI model — matters far more than the model itself for real-world productivity. He compares Claude Code and Codex as two fundamentally different philosophies of how humans and AI should collaborate, and explains why choosing between them is a strategic commitment with compounding lock-in.
When you use an AI coding tool, you interact with two things: the model (the intelligence) and the harness (everything else — where it runs, what it remembers, what tools it can access, how it coordinates tasks). The harness is a performance multiplier that determines whether a model's intelligence translates into useful work.
At the AI Engineer Summit (January 2026), the same Claude model scored 78% inside Claude Code's harness but only 42% inside SWE-agent's harness on the CORE benchmark. Same brain, different body, nearly double the performance.
Claude Code: Bash + Unix pipes keep context lean. GitHub CLI via bash replaces 38 MCP tools (15,000 tokens of descriptions).
Codex: Purpose-built tools like Chrome DevTools and ephemeral observability stacks. More specialized but requires cloud environment.
Claude Code: Agent remembers via structured files (claude.md, progress logs, JSON task lists) + git history. Investment in these files compounds over time.
Codex: Codebase remembers via encoded documentation, golden principles, and automated cleanup processes. Background tasks scan for "AI slop" and open refactoring PRs.
Claude Code: Compacts context windows, delegates to sub-agents, stores skills as files loaded just-in-time. Better for deep single-task understanding.
Codex: Each task runs in a clean sandbox. Tasks don't compete for context space. Better for many independent parallel tasks.
Claude Code: Built around MCP from the start. Skills are markdown files with lazy-loaded descriptions (only first 50–100 tokens visible until invoked).
Codex: Bidirectional JSON-RPC harness exposes tools as endpoints. Deep integration but assumes server-mediated cloud environment.
Claude Code: Orchestrated collaboration — sub-agents share task lists, message each other, use fast models for exploration. Human stays in the loop as strategic overseer.
Codex: Isolated sandbox per task. Coordination through git branches. Inherently safer — agents can't interfere with each other or cascade failures.
Calvin (who helped launch the Codex web product) uses both tools together:
Teams unconsciously build processes, verification steps, automation layers, and integration plumbing around whichever harness they choose. Calvin's workflow evolved through six layers of custom automation, each built on Claude Code's specific architecture. Switching harnesses means rebuilding all of that from scratch.
This is analogous to the early cloud wars (AWS vs Azure circa 2010) — the platforms looked similar on the surface but embedded fundamentally different architectural assumptions that determined what was possible years later.
"The model determines how smart your AI is, but the harness determines how usefully it fits into your work."