The End of Coding: Andrej Karpathy on Agents, AutoResearch, and the Loopy Era of AI

Study Guide

Overview

In this episode of the No Priors podcast, Andrej Karpathy sits down for a wide-ranging conversation about how AI agents have fundamentally changed his daily workflow, his AutoResearch project that autonomously improves machine learning models, the state of open source AI, the future of robotics, the impact of AI on the job market, and how education must adapt when agents become the primary interface for learning.

Key Concepts

1. "AI Psychosis" and the Agent Workflow Shift

Karpathy describes a dramatic shift around December 2025, when he went from writing 80% of his code by hand to writing essentially none. He now delegates entirely to AI agents (Claude Code, Codex, and similar tools), describing a state of "AI psychosis" driven by the feeling that his capacity is now limited only by his ability to direct agents, not by his typing speed. The bottleneck has moved from compute to the human operator's skill at orchestrating agents.

2. Parallelizing Agents and Token Throughput

Karpathy compares the current moment to being a PhD student with idle GPUs: if your agents are not running, you are wasting capacity. He describes the "Peter Steinberger" approach of running multiple Codex sessions simultaneously across different repos, treating software development as a set of macro actions (entire features, research tasks, implementation plans) rather than individual lines of code. The question is no longer "what code do I write?" but "what macro actions can I delegate in parallel?"

3. OpenClaw and Agent Personality

Karpathy praises OpenClaw (built by Peter) for innovating in five areas simultaneously: personality (the "soul" document), memory systems, persistence (looping without human involvement), the WhatsApp portal interface, and sophisticated automation. He notes that Claude has a well-calibrated personality that feels like a teammate, while Codex feels "dry" by comparison. He argues that agent personality matters more than most tool builders appreciate.

4. Dobby the Elf Claw: Home Automation

In January, Karpathy built a "claw" (persistent agent) called Dobby that controls his entire home: Sonos speakers, lights, HVAC, shades, pool/spa, and security cameras. The agent discovered smart home devices on his local network, reverse-engineered their APIs, built a dashboard, and now communicates via WhatsApp. This replaced six separate apps with natural language control, illustrating how agents can unify fragmented software experiences.

5. The "Agent-First" Software Paradigm

Karpathy argues that many apps "shouldn't even exist" because the real value is in APIs that agents can call directly. The customer of software is shifting from humans to agents acting on behalf of humans. He predicts that what requires vibe coding today will be trivial and free within a year or two, as the barrier for non-technical users continues to drop.

6. AutoResearch: Removing the Human Bottleneck

AutoResearch is Karpathy's project to create fully autonomous research loops. Using his GPT-2 training codebase (nanochat) as a testbed, he set up an agent with an objective metric (validation loss), boundaries on what it could modify, and let it run overnight. It discovered hyperparameter improvements he had missed after two decades of manual tuning, including weight decay on value embeddings and better Adam beta values. The key insight: if you have an objective metric, you should not be in the loop.
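
The core loop is simple to state: propose a change, measure the objective metric, keep the change only if the metric improves. Below is a minimal, runnable sketch of that idea. The `train_and_eval` function is a hypothetical stand-in (in the real setup it would launch a full training run and return validation loss); the toy objective and the hyperparameter names are illustrative assumptions, not details from the talk.

```python
import random

def train_and_eval(config):
    # Hypothetical stand-in for a real training run that returns
    # validation loss. Toy surrogate: pretend the optimum is
    # weight_decay = 0.1 and beta2 = 0.95.
    return (config["weight_decay"] - 0.1) ** 2 + (config["beta2"] - 0.95) ** 2

def auto_research(n_trials=200, seed=0):
    """Minimal autonomous loop: mutate one hyperparameter, rerun the
    experiment, and keep the change only if validation loss improves.
    The objective metric decides; no human is in the loop."""
    rng = random.Random(seed)
    best = {"weight_decay": 0.0, "beta2": 0.999}  # typical defaults
    best_loss = train_and_eval(best)
    for _ in range(n_trials):
        candidate = dict(best)
        key = rng.choice(list(candidate))       # pick one knob to change
        candidate[key] += rng.gauss(0, 0.05)    # propose a small mutation
        loss = train_and_eval(candidate)
        if loss < best_loss:                    # accept only on improvement
            best, best_loss = candidate, loss
    return best, best_loss

best_config, best_loss = auto_research()
```

Real systems would replace random mutation with an agent that reads the code and proposes targeted edits, but the accept/reject structure driven by a single objective metric is the same.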

7. Program.md as Research Organization

Karpathy describes how a "program.md" file defines the instructions for an auto-researcher, essentially describing a research organization in markdown. He envisions meta-optimization: running contests where different program.md files compete, then feeding the results back to generate better program.md files. This creates layers of abstraction: LLMs are taken for granted, agents are taken for granted, claws are taken for granted, and now you optimize the instructions to the claws.
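The meta-optimization idea above can be sketched as a selection loop over instruction files: score each candidate program.md by the research results it produces, keep the winners, and generate variants from them. Everything here is an illustrative assumption: the toy keyword-based scorer stands in for a full auto-research run, and the naive variant generator stands in for an LLM proposing edits.

```python
def run_auto_researcher(program_md: str) -> float:
    # Hypothetical stand-in: would drive a full auto-research run with
    # this program.md and return the best validation loss achieved.
    # Toy surrogate: pretend certain practices lead to lower loss.
    loss = 1.0
    for keyword in ("parallel", "ablation", "baseline"):
        if keyword in program_md:
            loss -= 0.2
    return loss

def generate_variants(winners):
    # Stand-in for an LLM editing the winning instructions.
    return [w + "\n- run an ablation for every change" for w in winners]

def contest(candidates, rounds=3, keep=2):
    """Run a contest between program.md files: lower val loss wins,
    winners seed the next generation."""
    for _ in range(rounds):
        candidates.sort(key=run_auto_researcher)
        candidates = candidates[:keep] + generate_variants(candidates[:keep])
    return min(candidates, key=run_auto_researcher)

best_program = contest([
    "- try hyperparameters one at a time",
    "- run experiments in parallel against a fixed baseline",
])
```

The point of the sketch is the layering: the inner auto-research loop is treated as a black-box objective, and the contest optimizes the instructions that drive it.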

8. Distributed AutoResearch and Untrusted Compute

Karpathy envisions a "SETI@home for AI research" where untrusted workers on the internet contribute compute to improve models. The system resembles a blockchain: commits (code changes) replace blocks, proof of work is the experimentation needed to find improvements, and verification is cheap (just run the training). This could allow a swarm of agents to potentially rival frontier labs by pooling the Earth's distributed compute resources.
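The asymmetry that makes this work is that finding an improvement is expensive (many experiments) while checking one is cheap (rerun the single submitted change). A minimal sketch of the verifier side, under stated assumptions: `run_training` is a hypothetical deterministic stand-in for the real training job, and the commit format is invented for illustration.

```python
def run_training(code_version: str) -> float:
    # Hypothetical deterministic stand-in for a reproducible training
    # run. Toy surrogate: derive a fake validation loss from the code.
    return sum(ord(c) for c in code_version) % 1000 / 1000.0

def verify_commit(current_best_loss: float, submission: dict) -> bool:
    """Untrusted workers search for improvements (the costly 'proof of
    work'); the verifier reruns only the submitted change and accepts
    it iff the claimed loss reproduces and beats the current best."""
    reproduced = run_training(submission["code"])
    claim_reproduces = abs(reproduced - submission["claimed_loss"]) < 1e-9
    improves = reproduced < current_best_loss
    return claim_reproduces and improves

# A worker may have tried thousands of variants to find this one commit:
submission = {
    "code": "tweak-adam-betas",
    "claimed_loss": run_training("tweak-adam-betas"),
}
accepted = verify_commit(current_best_loss=0.9, submission=submission)
```

As in a blockchain, the chain of accepted commits is the shared state, and dishonest submissions are rejected because verification is just one cheap deterministic rerun.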

9. The Jaggedness of AI Models

Karpathy describes current AI models as simultaneously feeling like "an extremely brilliant PhD student who's been a systems programmer their entire life and a 10-year-old." Models excel in verifiable domains (code, math) because those are optimized via reinforcement learning, but struggle in softer areas (humor, nuance, knowing when to ask clarifying questions). He uses the example that state-of-the-art models still tell the same atom joke from years ago because joke quality is not being optimized.

10. Open Source AI and the Linux Analogy

Karpathy draws parallels between open source AI and Linux: the industry needs a common open platform. Currently, open source models trail closed models by roughly 6-8 months. He views this dynamic as healthy: frontier labs push capability forward (expensive, closed), while open source follows behind and handles the vast majority of consumer use cases. He warns against excessive centralization, arguing that "ensembles of people" making decisions is always better than a few behind closed doors.

11. Digital vs. Physical: The Robotics Timeline

Drawing on his self-driving experience at Tesla, Karpathy predicts that digital transformation will vastly outpace physical robotics. Bits are a million times easier to manipulate than atoms. The sequence he envisions: first, massive digital "unhobbling" (making existing digital processes far more efficient); then, building interfaces between digital intelligence and the physical world (sensors, actuators, data markets); and finally, full physical robotics, which will be a larger market but will lag significantly.

12. Education in the Agent Era

Karpathy describes a fundamental shift in education. With his MicroGPT project (LLM training in 200 lines of Python), he realized that rather than explaining code to humans, he should explain it to agents, who then personalize the explanation for each learner. He envisions "skills" (curriculum scripts) that guide agents through teaching progressions. The educator's new job is to contribute the "few bits" that agents cannot generate themselves: the original insight, the distilled simplification, the judgment about what matters.

Key Takeaways

  • The shift from writing code to orchestrating agents happened in December 2025 and represents a fundamental change in how software is built.
  • Your productivity is now limited by your ability to maximize token throughput, not your typing speed. If your agents are idle, you are the bottleneck.
  • AutoResearch demonstrates that autonomous loops with objective metrics can outperform decades of human expertise at hyperparameter tuning.
  • Agent personality, memory systems, and persistence matter far more than most tool builders currently appreciate.
  • Open source AI trailing closed models by 6-8 months is actually a healthy dynamic for the ecosystem.
  • The digital world will transform far faster than the physical world because bits are fundamentally easier to manipulate than atoms.
  • Education is shifting from "explain to humans" to "explain to agents who explain to humans," requiring educators to focus on what agents cannot do.

Discussion Questions

  1. Karpathy says "everything is skill issue" when agents fail. Do you agree, or are there fundamental limitations in current agent architectures that skill cannot overcome?
  2. If AutoResearch can outperform a two-decade researcher at hyperparameter tuning, what does this imply for the role of human intuition in research?
  3. Karpathy argues many apps "shouldn't exist" because agents should call APIs directly. What are the strongest counterarguments to this view?
  4. Is the 6-8 month gap between closed and open source models sustainable? What could cause it to widen or narrow?
  5. Karpathy describes feeling "aligned with humanity" outside of frontier labs because he can speak freely. How should researchers balance the benefits of being inside labs (access to frontier capabilities) versus outside (independence)?
  6. How should educators adapt if their role shifts from explaining content to scripting agent-driven curricula?

Glossary

  • Claw - Karpathy's term for a persistent, looping agent that operates autonomously on your behalf, even when you are not actively supervising it. More sophisticated than a single-session agent.
  • AutoResearch - An autonomous loop where an AI agent iteratively experiments with code/hyperparameters to optimize against an objective metric without human intervention.
  • Program.md - A markdown file that describes the instructions, constraints, and strategy for an auto-research agent. Analogous to describing a research organization's processes in code.
  • Token Throughput - The rate at which you can generate useful AI output. Karpathy argues this is the new measure of individual productivity, replacing typing speed or lines of code.
  • Jaggedness - The phenomenon where AI models are simultaneously brilliant in some domains and incompetent in others, with far more variance than human intelligence typically shows.
  • Jevons Paradox - When a resource becomes cheaper to use, total consumption increases rather than decreases. Applied here to software engineering: cheaper software creation leads to more demand for software.
  • Speciation - The idea that AI models should diversify into specialized variants (like species in nature) rather than remaining a single general-purpose "monoculture."
  • Unhobbling - The rapid efficiency gains that come from applying AI to existing digital processes that were previously constrained by human processing speed.
  • OpenClaw - An open-source persistent agent framework built by Peter Steinberger, notable for its personality system, memory, and WhatsApp-based interaction model.
  • MicroGPT - Karpathy's project distilling LLM training to approximately 200 lines of Python, representing the bare algorithmic essence without efficiency optimizations.