In this MLST (Machine Learning Street Talk) interview, host Tim Scarfe sits down with Jeremy Howard — deep learning pioneer, Kaggle Grandmaster, fast.ai co-founder, and creator of ULMFiT — at his home in Queensland, Australia. The conversation spans Howard's foundational work on transfer learning, the current state of AI-assisted coding, the psychology of "vibe coding," why software engineering and coding are fundamentally different skills, the case for interactive exploratory programming, and broader questions about AI risk and power centralization.
Howard recounts creating ULMFiT, the 2018 paper (co-authored with Sebastian Ruder) that established the now-standard three-stage training pipeline: pre-training on a general corpus, mid-training (domain adaptation), and fine-tuning on a downstream task. Key innovations included training on a general-purpose Wikipedia corpus (not task-specific data), using discriminative learning rates (different learning rates per layer), gradual unfreezing of layers, and fine-tuning all normalization layers. The entire pre-training ran overnight on a single gaming GPU, and the fine-tuned model beat specialized PhD-level results on sentiment analysis in minutes.
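The two fine-tuning tricks mentioned above are easy to show mechanically. Below is a minimal PyTorch sketch, not ULMFiT itself: a hypothetical three-layer stack stands in for the AWD-LSTM, each layer gets a learning rate scaled down by the 2.6 factor suggested in the ULMFiT paper, and layers are unfrozen one per epoch from the top down.

```python
import torch
from torch import nn

# Toy stand-in for a pretrained model: three "layers", where
# earlier layers hold more general features and the last is the task head.
model = nn.Sequential(
    nn.Linear(16, 16),  # layer 0: most general
    nn.Linear(16, 16),  # layer 1
    nn.Linear(16, 4),   # layer 2: task head, most task-specific
)

# Discriminative learning rates: each layer below the head gets
# lr = base_lr / 2.6**depth (2.6 is the decay factor from the paper).
base_lr = 1e-3
param_groups = [
    {"params": layer.parameters(),
     "lr": base_lr / (2.6 ** (len(model) - 1 - i))}
    for i, layer in enumerate(model)
]
opt = torch.optim.SGD(param_groups)

# Gradual unfreezing: freeze everything, then unfreeze one more
# layer (from the top) at the start of each epoch.
for p in model.parameters():
    p.requires_grad = False

def unfreeze_last(n):
    for layer in list(model)[-n:]:
        for p in layer.parameters():
            p.requires_grad = True

for epoch in range(len(model)):
    unfreeze_last(epoch + 1)
    # ... one epoch of fine-tuning on the target task would run here ...
```

The intuition: low layers encode general language structure you mostly want to preserve, so they get tiny updates and are unfrozen last; the task head changes fastest.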
Howard's original hypothesis for ULMFiT was that predicting the next word of Wikipedia text would force a model to build a hierarchy of abstractions — from the concept of objects, to people, to social hierarchies, to specific historical facts. He argues this is how LLMs develop what looks like understanding: by compressing statistical relationships into layered representations. However, he emphasizes this "understanding" is bounded by the training distribution and breaks down sharply outside it.
Howard argues LLMs are capable of combinatorial creativity — recombining elements from their training data in novel ways — but cannot perform transformative creativity that moves outside the training distribution. He uses the example of Anthropic's AI-generated C compiler: while impressive, it is fundamentally a style transfer from existing compiler source code (much of it based on LLVM) into Rust, not a genuinely novel design. Chris Lattner, the creator of LLVM and Clang, confirmed that the AI replicated specific design decisions (including ones Lattner considers mistakes) that only existed in LLVM.
Howard draws a sharp distinction between coding (translating specifications into syntax) and software engineering (designing the right abstractions, finding the right-sized pieces, and composing them into larger systems). He references Fred Brooks' "No Silver Bullet" essay, which argued decades ago that typing code was never the bottleneck — the hard part is the design work. While LLMs now write ~90% of Howard's code, he reports only modest productivity gains because the typing was never the slow part. LLMs are good at coding but empirically bad at software engineering, and there is no evidence this gap is closing.
Howard and his wife Rachel Thomas identified that AI-assisted coding shares the psychological hallmarks of gambling addiction: an illusion of control (crafting prompts and configurations), stochastic rewards (occasionally getting a working feature), losses disguised as wins (producing code that looks functional but no one understands), and self-deception about skill (believing productivity went up when studies show it barely changed). Howard describes 14-hour Claude Code marathon sessions that left him drained, contrasting them with the energized feeling of interactive exploratory programming.
Howard warns that delegating cognitive tasks to LLMs erodes knowledge within organizations. He cites the concept of desirable difficulty from learning science: memories and skills only form through effortful practice, and removing friction removes learning. An Anthropic study confirmed that most users of AI coding tools showed minimal learning. Howard tells his own team that personal capability growth matters more than feature output, invoking John Ousterhout's principle: "A little bit of slope makes up for a lot of intercept." Companies betting on AI replacing software engineers are making a speculative gamble that could destroy their ability to maintain or evolve their products.
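Ousterhout's quip can be made concrete with a toy linear projection (the numbers below are purely illustrative, not from the interview): model each person's capability as intercept + slope × time, and see when the faster learner overtakes the currently more productive one.

```python
# "A little bit of slope makes up for a lot of intercept", as arithmetic.
# capability(t) = intercept + slope * t  (illustrative numbers only)
def capability(intercept, slope, t):
    return intercept + slope * t

a = dict(intercept=10, slope=1)  # more productive today, learning slowly
b = dict(intercept=4, slope=2)   # less productive today, learning fast

# First week at which B's capability exceeds A's.
crossover = next(t for t in range(100)
                 if capability(**b, t=t) > capability(**a, t=t))
# At t=7: B has 4 + 2*7 = 18, A has 10 + 1*7 = 17, so B is ahead.
```

This is the sense in which optimizing for feature output today (intercept) at the expense of learning (slope) is a losing trade over any reasonable horizon.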
Howard describes a telling case study: using AI to fix a major version upgrade of ipykernel (the engine behind Jupyter notebooks). Over two weeks, alternating between Codex and GPT models, he produced a working implementation — but one that nobody fully understands. He now faces an unprecedented software engineering question: should he bet his company's product on code that works but has no human comprehension behind it? There is no established practice for managing this situation.
Howard advocates for notebook-based, REPL-driven development where humans and AI work together in a rich, stateful environment. He traces this lineage from Smalltalk and APL through Mathematica to his own tools (nbdev, Solveit). The core principle, inspired by Bret Victor's work: humans build understanding by directly manipulating objects in real time. Traditional software development — editing dead text files through a terminal — removes this connection. Howard considers it "inhumane" and argues it produces worse outcomes for both humans and AI.
Howard co-authored a rebuttal (with Arvind Narayanan) to the AI existential risk statement signed by Hinton, Hassabis, and others. His argument: regardless of how powerful AI becomes, centralizing control of it in the hands of a few companies or governments is the actual danger. History shows that powerful technologies (writing, printing, voting) were always subject to monopolization attempts by those in power. The solution is broad distribution, not concentration. His current primary concern is not autonomous AI but competence erosion — people and organizations losing their ability to grow and adapt.