AI Summary
5 min read
Claire Vaux, host of How I AI and a product leader, shares her hands-on testing of OpenAI's newly released GPT 5.5 and GPT 5.5 Pro models. She emphasizes their ability to handle complex coding problems that stumped prior models like GPT-5.0 and Claude, particularly in the Codex environment, where they deliver higher intelligence and efficiency for software engineers.
Model Details and Initial Impressions
GPT 5.5 and the more powerful GPT 5.5 Pro are now available in Codex and ChatGPT, though not yet via the API. OpenAI claims improved capacity for complex work and better token efficiency, which Vaux confirms after weeks of testing. Pricing is high: $5 per million input tokens and $30 per million output tokens on GPT 5.5, and $30 input and $180 output on Pro. Vaux finds the results justify the cost for ambitious tasks. Where earlier AI tools mainly delivered speed gains, GPT 5.5's raw intelligence and reduced need for babysitting make it possible to tackle problems that were previously out of reach. It also maintains context better during long sessions, allowing more autonomous progress. Vaux notes an "intelligence overhang": everyday users may struggle to find problems worthy of it, but developers will benefit most.
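To make the pricing concrete, here is a minimal sketch of the per-run arithmetic at the rates quoted above. The model labels and the token counts are hypothetical placeholders chosen for illustration (they are not API identifiers, and the episode does not give token counts for any run).

```python
# USD per one million tokens, as quoted in the summary above.
# The dictionary keys are illustrative labels, not official model identifiers.
PRICES = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "gpt-5.5-pro": {"input": 30.00, "output": 180.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run at the quoted per-million-token rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical session: 2M input tokens and 500k output tokens.
standard = estimate_cost("gpt-5.5", 2_000_000, 500_000)
pro = estimate_cost("gpt-5.5-pro", 2_000_000, 500_000)
print(f"GPT 5.5: ${standard:.2f}, GPT 5.5 Pro: ${pro:.2f}")
```

At these rates the same hypothetical session costs six times as much on Pro, which is the trade-off against engineering time that the episode weighs.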
What you'll learn
- 1 (00:00) **Introduction to GPT-5.5**
- 2 (03:04) **GPT-5.5 in ChatGPT for Simple Tasks**
- 3 (07:14) **GPT-5.5 Pro in Codex: General Impressions**
- 4 (08:20) **Codex Example 1: Security Issue Remediation**
- 5 (09:46) **Codex Example 2: Data Migration Tech Debt**
- 6 (15:57) **Ultimate Eval: Hacking the Divoom MiniToo Device**
- 7 (21:48) **Final Thoughts on GPT-5.5**
Show Notes
In this mini episode, I break down OpenAI’s new GPT 5.5 and GPT 5.5 Pro after weeks of early testing. I walk through three real jobs I threw at the model: building an app for me to teach my second grader more advanced subtraction concepts, tackling a tech debt problem in the ChatPRD codebase, and hacking into a proprietary Bluetooth pixel display that every other model had failed me on. My verdict: higher intelligence, better efficiency, and genuinely autonomous long-running loops that change what I think is worth tackling.
What you’ll learn:
- How I think about GPT 5.5 Pro’s pricing vs engineering time, and when I believe the “intelligence tax” is worth paying
- Why I treat GPT 5.5 as a developer model first, and why I couldn’t find a consumer use case that justified its intelligence
- The exact prompt pattern I use to unlock a long-running autonomous subagent loop
- How I got a near-six-hour autonomous run to one-shot 98% of edge cases in a migration over millions of chat threads and drop my Sentry error rate to the floor
- Why I’m now throwing GPT 5.5 at tech debt, flaky tests, and security backlogs first
- How I combined a Bluetooth packet sniffer and GPT 5.5 to reverse-engineer a proprietary pixel speaker after Claude Code and GPT 5.4 both gave up
- How I use the /personality command inside Codex to swap the default “baked potato” tone for something I actually enjoy working with
—
In this episode, I cover:
(00:00) Introduction to GPT 5.5 testing
(00:40) What is GPT 5.5 and how much does it cost?
(03:23) Testing GPT 5.5 in ChatGPT: the intelligence overhang problem
(07:12) Moving to Codex: where GPT 5.5 really shines
(16:01) Hacking a Chinese Bluetooth speaker
(21:47) Final thoughts on GPT 5.5’s intelligence and efficiency
—
Tools referenced:
• GPT 5.5 and GPT 5.5 Pro: https://openai.com/index/introducing-gpt-5-5/
• Codex: https://openai.com/codex/
• ChatGPT: https://chat.openai.com/
• Claude Code: https://claude.ai/code
• Sentry: https://sentry.io/
• Divoom MiniToo: https://divoom.com/products/minitoo
—
Other references:
• OpenAI Codex Security: h