⚡️GPT5-Codex-Max: Training Agents with Personality, Tools & Trust — Brian Fioca + Bill Chen, OpenAI

December 26, 2025

AI Summary

🎙️ The Voices & The Context

The Format: A casual live interview at the AIE Code conference, complete with conference banter, in which OpenAI developers take an enthusiastic, technical deep dive into model training and where AI coding agents are headed.
The Key Players:
- Guests: Bill Chen and Brian Fioca from OpenAI's Codex team. As frontline engineers who shaped Codex Max, the new long-running coding agent, they share training insights, real-world hacks, and bold 2026 predictions from their roles close to GPT-5 training and agent evals.

🗝️ Key Themes & Topics

The episode buzzes with excitement over AI's shift from raw models to opinionated agents, blending technical depth on training quirks with visionary talks on evals and abstractions – perfect water-cooler fodder for devs dreaming of AI pair programmers.

Topic 1: Codex Max Launch and Capabilities – Bill and Brian unpack the freshly launched Codex Max, OpenAI's frontier coding model optimized for marathon runs (24+ hours, even days on local setups), emphasizing speed, maximization, and seamless integration in their harness; they differentiate it from prior versions, highlighting how it crushes complex problems faster while maintaining behavioral excellence like planning and self

Podcast: Latent Space: The AI Engineer Podcast

What you'll learn

1 (00:00) **🎙️ Introduction: Bill and Brian (OpenAI)**
2 (01:20) **Codex Max Launch and Capabilities**
3 (02:46) **Training Insights: Personality for Coding Agents**
4 (04:11) **Tool Integration and Partner Collaboration**
5 (05:20) **Codex vs. Mainline Models (GPT-5 Line)**
6 (07:38) **Model Habits and Creative Adaptations**
7 (09:15) **Personality in Coding: Communication and Trust**

Show Notes

From the frontlines of OpenAI's Codex and GPT-5 training teams, Brian and Bill are building the future of AI-powered coding, where agents don't just autocomplete: they architect, refactor, and ship entire features while you sleep. We caught up with them at the AI Engineer Conference right after the launch of Codex Max, OpenAI's newest long-running coding agent, designed to work for 24+ hours straight, manage its own context, and spawn sub-agents to parallelize work across your entire codebase.

We sat down with Brian and Bill to dig into what it actually takes to train a model that developers trust. They explain why personality, communication, and planning matter as much as raw capability, and how Codex is trained with strong opinions about tools (it loves rg over grep, seriously). We cover why the abstraction layer is moving from models to full-stack agents you can plug into VS Code or Zed, and how OpenAI partners co-develop tool integrations and discover unexpected model habits (like renaming tools to match Codex's internal training). We also get into the rise of applied evals that measure real-world impact instead of academic benchmarks, why multi-turn evals are the next frontier (including Brian's "job interview eval" idea), and how coding agents are breaking out of code into personal automation, terminal workflows, and computer use. Finally, they share their 2026 vision: coding agents trusted enough to handle the hardest refactors at any company, not just top-tier firms, and general enough to build integrations, organize your desktop, and unlock capabilities you'd never get access to otherwise.
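The tool-renaming habit mentioned above can be pictured as a thin alias layer in an agent harness. The sketch below is only an illustration under stated assumptions: the tool names (`rg`, `shell`, `search_files`, `run_command`), the alias table, and the `dispatch` helper are all hypothetical and are not taken from the episode or from any OpenAI API. It shows one way a harness might expose terminal-style names to a model while keeping its own internal names.

```python
from typing import Callable, Dict


# Hypothetical harness-side tools; names and behavior are illustrative only.
def search_files(pattern: str) -> str:
    return f"searched for {pattern!r}"


def run_command(command: str) -> str:
    return f"ran {command!r}"


# Assumed alias table: names exposed to the model -> the harness's own names.
# The terminal-style names here are guesses for illustration.
MODEL_FACING_ALIASES: Dict[str, str] = {
    "rg": "search_files",
    "shell": "run_command",
}

HARNESS_TOOLS: Dict[str, Callable[[str], str]] = {
    "search_files": search_files,
    "run_command": run_command,
}


def dispatch(tool_name: str, argument: str) -> str:
    """Resolve a model-facing tool name to a harness tool and call it."""
    internal_name = MODEL_FACING_ALIASES.get(tool_name, tool_name)
    try:
        tool = HARNESS_TOOLS[internal_name]
    except KeyError:
        raise ValueError(f"unknown tool: {tool_name}") from None
    return tool(argument)
```

With this layout, a call the model makes as `dispatch("rg", "TODO")` reaches the same function as `dispatch("search_files", "TODO")`, so the harness can adopt the names the model prefers without rewriting its own tools.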

We discuss:

- What Codex Max is: a long-running coding agent that can work 24+ hours, manage its own context window, and spawn sub-agents for parallel work
- Why the name "Max": maximalist, maximization, speed and endurance; it's simply better and faster for the same problems
- Training for personality: communication, planning, context gathering, and checking your work as behavioral characteristics, not just capabilities
- How Codex develops habits like preferring rg over grep, and why renaming tools to match its training (e.g., terminal-style naming) dramatically improves tool-call performance
- The split between Codex (opinionated, agent-focused, optimized for the Codex harness) and GPT-5 (general, more durable across different tools and modalities)
- Why the abstraction layer is moving up: from prompting models to plugging in full agents (Cod