Cheeky Pint
The 20-year journey to fully autonomous cars with Dmitri Dolgov of Waymo

March 24, 2026

AI Summary

Dmitri Dolgov, co-CEO of Waymo, recounts his roughly 20-year path in self-driving cars, including joining Google's self-driving project in 2009 as one of its first engineers. Waymo now delivers over 500,000 fully autonomous rides a week across 11 U.S. cities with about 3,000 vehicles, logging more than 4 million autonomous miles per week. Drawing on his Soviet-era upbringing and his physics and math training in Russia and the U.S., Dolgov explains the technical foundations, their evolution, and the path to scaling.

Sensing and Real-Time Driving

A Waymo vehicle uses three complementary sensor types, each providing 360-degree coverage: LiDAR (high-resolution 3D mapping via laser pulses), radar (lower resolution but robust in fog, snow, or rain), and cameras (strong in clear conditions). Microphones add audio cues. All of this feeds on-vehicle computers that run AI inference in real time, with no cloud dependency for driving. Encoders turn raw sensor data into a world model (objects such as cars, pedestrians, roads, and signs), and a decoder generates actions such as steering or braking, which reach the car through a vehicle interface. The setup handles multi-agent interactions and predicts behavior in complex scenarios, such as nudging around a bus while detecting a pedestrian hidden by it from faint LiDAR reflections underneath the bus.
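
As a rough illustration of the sensors, encoder, world model, decoder flow described above, here is a toy sketch in Python. It is not Waymo's implementation: the real system uses learned neural networks, while this sketch uses hand-written rules. Every name in it (`Detection`, `SensorFrame`, `WorldModel`, `Action`, `encode`, `decode`, `drive_step`, `stop_distance_m`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    kind: str          # e.g. "pedestrian", "car", "bus" (hypothetical labels)
    distance_m: float  # distance ahead of the vehicle, in metres


@dataclass
class SensorFrame:
    # One snapshot from each of the three sensor types named in the summary.
    lidar: List[Detection] = field(default_factory=list)
    radar: List[Detection] = field(default_factory=list)
    camera: List[Detection] = field(default_factory=list)


@dataclass
class WorldModel:
    objects: List[Detection]


@dataclass
class Action:
    steering: float  # -1.0 (full left) to 1.0 (full right)
    brake: float     # 0.0 (none) to 1.0 (full)


def encode(frame: SensorFrame) -> WorldModel:
    """Toy stand-in for the encoder: pool all sensors into one scene."""
    return WorldModel(objects=frame.lidar + frame.radar + frame.camera)


def decode(world: WorldModel, stop_distance_m: float = 10.0) -> Action:
    """Toy stand-in for the decoder: brake harder as a pedestrian gets closer.

    stop_distance_m is an assumed tuning parameter, not a real Waymo value.
    """
    pedestrians = [o.distance_m for o in world.objects if o.kind == "pedestrian"]
    if not pedestrians:
        return Action(steering=0.0, brake=0.0)
    nearest = min(pedestrians)
    brake = max(0.0, min(1.0, 1.0 - nearest / stop_distance_m))
    return Action(steering=0.0, brake=brake)


def drive_step(frame: SensorFrame) -> Action:
    """One onboard step: sensors in, action out, no cloud call involved."""
    return decode(encode(frame))


# The camera sees only the bus; LiDAR alone picks up the pedestrian.
frame = SensorFrame(
    lidar=[Detection("pedestrian", 5.0)],
    camera=[Detection("bus", 8.0)],
)
action = drive_step(frame)
```

Because the sensors are pooled before any decision is made, a pedestrian seen by only one sensor still affects the action, which is the point of the bus example in the summary.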

What you'll learn

  • 1 (00:01) **Dmitri Dolgov's Background** - Introduces Waymo co-CEO's journey from Google's 2009 self-driving project to leading 500k+ weekly autonomous rides
  • 2 (03:06) **Waymo Ride Technical Architecture** - Describes real-time onboard processing from sensors to vehicle actuation
  • 3 (05:36) **Cloud vs Onboard Processing** - Explains non-real-time cloud uses like post-ride mess detection and lost item alerts
  • 4 (06:32) **Debates on Self-Driving Approaches** - Treats the end-to-end vs modular and cameras-only vs multi-sensor debates as settled but nuanced
  • 5 (09:59) **20-Year Tech Evolution** - Traces progress from early dead-ends to AI breakthroughs like transformers
  • 6 (11:54) **World Encoding and End-to-End Learning** - Details sensor fusion into learned embeddings, augmented with structured representations (objects, roads)
  • 7 (19:46) **Optimization Objectives** - Outlines safety, smoothness, predictability, social fit in driving behavior

Show Notes

Waymo is now doing 500,000 rides a week across 11 cities. Co-CEO Dmitri Dolgov came to the pub to discuss how the team went from scientific research to global scale. He gives a masterclass on the sensor stack (and why you still need Lidar), how they use "Teacher" and "Critic" models to train the AI, and why he believes cars that require human supervision will never naturally evolve into robotaxis. They also cover the new custom-built vehicle that feels like a living room, the economics of ride-hailing in rural Alaska, and the "Russian math nerd" diaspora that seems to run the UK tech scene.

Timestamps

(00:00:22) Russia

(00:02:51) Waymo architecture

(00:09:59) Why now?

(00:19:46) Driving nuance

(00:29:37) Stripe Agentic Commerce Suite

(00:30:17) Hardware

(00:40:20) Emergent behavior

(00:46:36) Scaling

(00:57:56) Google


Article:

EMMA: End-to-End Multimodal Model for Autonomous Driving – Waymo Research: https://waymo.com/research/emma/
