Overview
This video demonstrates a creative real-world application of the Claude Agent SDK: a voice-controlled AI pool (billiards) trainer. The system uses an ElevenLabs voice clone of legendary pool player Earl Strickland, combined with computer vision, a table-mounted projector, and Claude's Agent SDK to create an interactive coaching experience where you can request new features by voice and have them coded and deployed in real time, without ever leaving the pool table.
Key Concepts
The AI Pool Trainer Setup
The project combines several technologies into one integrated system:
- ElevenLabs voice clone of Earl Strickland (one of the greatest pool players ever) that listens and responds conversationally
- Computer vision that detects ball positions on the table in real time
- A projector mounted above the pool table that displays shot route projections, agent output, and visual overlays directly onto the playing surface
- Claude Agent SDK running in the background as a coding agent that can modify the Python-based application on the fly
- Smart studio controls (e.g., lighting) also connected to the voice agent
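The video does not show the project's source, but the integration loop the components imply can be sketched as follows. Every function and value here is an illustrative assumption standing in for the real vision, analysis, and projector code:

```python
# Hypothetical sketch of the camera -> vision -> analysis -> projector loop.
# The video does not show the actual code; all names and values here are
# illustrative stand-ins.

def detect_balls(frame):
    """Stub for the computer-vision step: ball name -> (x, y) table coords."""
    return {"cue": (0.2, 0.5), "nine": (0.7, 0.3)}

def plan_route(balls):
    """Stub for Earl's shot analysis: a list of waypoints to draw."""
    return [balls["cue"], balls["nine"]]

def project_overlay(route):
    """Stub for the projector output: draw the route on the table surface."""
    print(f"projecting route: {route}")

def main_loop(get_frame, iterations=1):
    """One pass of the integration loop described above."""
    route = None
    for _ in range(iterations):
        frame = get_frame()
        balls = detect_balls(frame)
        route = plan_route(balls)
        project_overlay(route)
    return route

main_loop(lambda: None)
```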
How the Agent SDK Integration Works
The architecture follows a multi-agent pattern with two distinct agents that communicate with each other:
- The "Earl" voice agent handles natural language conversation, listens to the user, and provides pool coaching advice (shot routes, cue ball positioning, cut angles)
- The coding agent (Agent SDK) runs in the background and can modify the Python codebase when Earl forwards a feature request to it
When the user asks Earl for a new capability the system does not yet have, Earl recognizes it as a feature request and dispatches it to the coding agent via the Agent SDK. The coding agent then modifies the code, and the results are projected onto the pool table so the user can monitor progress without going to a computer.
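The dispatch step above can be sketched in a few lines. This is a guess at the shape of the logic, not the project's actual code: the trigger phrases and prompt template are assumptions, and in the real system the built prompt would be handed to the coding agent through the Agent SDK rather than returned:

```python
# Hypothetical sketch of Earl's feature-request dispatch. Trigger phrases
# and the prompt template are illustrative assumptions.

FEATURE_TRIGGERS = ("feature request", "add a feature", "can you add", "make it so")

def is_feature_request(utterance: str) -> bool:
    """Crude intent check: does the utterance ask for a new capability?"""
    text = utterance.lower()
    return any(trigger in text for trigger in FEATURE_TRIGGERS)

def build_coding_prompt(utterance: str) -> str:
    """Wrap the spoken request in a prompt for the background coding agent."""
    return (
        "You are the coding agent for a Python pool-trainer app.\n"
        f"User feature request (via voice): {utterance}\n"
        "Modify the codebase to implement this, then report what changed."
    )

def handle_utterance(utterance: str) -> str:
    if is_feature_request(utterance):
        # In the real system this prompt is forwarded to the coding agent
        # via the Agent SDK; here we simply return it.
        return build_coding_prompt(utterance)
    return "coach"  # fall through to normal pool-coaching conversation
```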
Live Feature Request Demo
The video walks through a complete feature request cycle:
- Earl analyzes ball positions and provides two different shot routes (a conservative stun shot and an aggressive follow shot)
- The user realizes there is no way to recall a previous route projection after Earl has moved on to a new analysis
- The user asks Earl to submit a feature request to the coding agent so that "go back to the first route" will recall the earlier projection
- Earl confirms the request and sends it to the Agent SDK
- The coding agent works on the code while its output streams onto the table projector
- The agent reports back that the feature is ready and that projector.py needs a restart
- After restarting, the user tests the new feature by asking to see the first route projection, and it works
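The requested feature amounts to keeping a history of projected routes and recalling one by index. A minimal sketch of that idea, with all class and method names invented for illustration (the demo does not show the generated code):

```python
# Minimal sketch of the "go back to the first route" feature from the demo:
# a history of projected routes with recall by index. Names are assumptions.

class RouteHistory:
    def __init__(self):
        self._routes = []

    def record(self, route):
        """Store each route as it is projected onto the table."""
        self._routes.append(route)

    def recall(self, index=0):
        """Return an earlier route (index 0 = the first one projected)."""
        if not self._routes:
            raise LookupError("no routes projected yet")
        return self._routes[index]

history = RouteHistory()
history.record("stun shot: cue -> 9 ball, hold position")
history.record("follow shot: cue -> 9 ball, run forward")
first = history.recall(0)  # "go back to the first route"
```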
Technical Considerations
Cost and Model Selection
The Agent SDK uses the Claude API (pay-per-token), which is more expensive than a flat-rate Claude subscription (like Claude Max). The demo uses Claude Sonnet as a deliberate tradeoff: it is faster and cheaper than Opus while still capable enough for the coding tasks required.
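In the Agent SDK's Python interface, the model choice is a single option on the agent configuration. A config sketch, assuming current option names; the exact model ID below is an assumption and should be checked against the live model list:

```python
# Config sketch: pointing the coding agent at Sonnet instead of Opus.
# The model ID and path are illustrative assumptions.
from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    model="claude-sonnet-4-5",      # faster and cheaper than Opus
    permission_mode="acceptEdits",  # let the agent edit files unprompted
    cwd="/path/to/pool-trainer",    # the Python codebase it may modify
)
```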
Architecture Pattern
This demo illustrates a powerful pattern for agentic applications:
- Voice-to-agent pipeline: Natural speech triggers structured tool calls behind the scenes
- Agent-to-agent communication: The conversational agent delegates specialized tasks (code changes) to a purpose-built coding agent
- Physical-world feedback loop: Results are projected back into the physical environment (the pool table), closing the loop between digital and physical
- Iterative development without context switching: The user never has to leave the activity to interact with a terminal, IDE, or browser
Additional Capabilities Shown
Beyond pool coaching and live coding, the Earl agent also controls studio hardware. A quick voice command ("turn the lights red") changes the studio lighting instantly, showing that the same agent framework can be extended to control IoT devices and other physical systems.
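The hardware control fits the same voice-to-tool-call pattern. A hedged sketch of that dispatch, where the handler and color are illustrative assumptions rather than the video's implementation:

```python
# Hypothetical sketch: routing a structured tool call (produced from a
# voice command) to a hardware handler. Names are illustrative assumptions.

def set_lights(color: str) -> str:
    """Stub for the smart-lighting integration (e.g. a Hue/LIFX API call)."""
    return f"lights set to {color}"

TOOLS = {"set_lights": set_lights}

def dispatch(tool_name: str, **kwargs) -> str:
    """Dispatch a tool call emitted by the voice agent."""
    if tool_name not in TOOLS:
        raise KeyError(f"unknown tool: {tool_name}")
    return TOOLS[tool_name](**kwargs)

# "turn the lights red" -> tool call ("set_lights", color="red")
result = dispatch("set_lights", color="red")
```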
Key Takeaways
- The Claude Agent SDK enables multi-agent architectures where specialized agents handle different responsibilities (conversation, coding, hardware control)
- Voice interfaces combined with agentic coding create a hands-free development experience suited to physical or immersive environments
- Projecting agent output into the physical workspace eliminates the need to context-switch to a screen
- Using Sonnet instead of Opus is a practical cost-optimization strategy when the coding tasks are well-scoped
- The same agent framework that powers the pool trainer can extend to any domain where real-time voice interaction, code generation, and physical feedback converge