The First Mechanistic Interpretability Frontier Lab — Myra Deng & Mark Bissell of Goodfire AI

February 6, 2026

AI Summary

🎙️ The Voices & The Context

The Format: Casual tech podcast chat with live demos, blending interview-style Q&A, technical deep dives, and real-time product showcases.
The Key Players:
- Guests: Mark Bissell (Member of Technical Staff at Goodfire, ex-Palantir healthcare) and Myra Deng (Head of Product, ex-Two Sigma). Goodfire is an AI interpretability lab announcing a $150M Series B at a $1.25B valuation (unicorn status).
- Hosts: Vivek (main interviewer, AI podcaster), with co-hosts Mochi (human?) and Mochi the Doggo; banter-heavy chemistry on AI hype, demos, and "what is interpretability?"
The Vibe: Educational yet exciting—nerdy mech interp breakdowns mixed with "wow" demo moments, optimistic futurism, light humor on AI quirks like Gen Z slang.

🗝️ Key Themes & Topics

The episode unpacks AI interpretability (mech interp) as the "next frontier" for safe, controllable AI, bridging research to production via Goodfire's tools like SAEs, probes, and steering.

Topic 1: Goodfire's Mission & Fundraise
Applied interp lab shipping APIs (Ember) for real-world use; focuses on understanding model internals for editing behaviors, data curation, and training. Big news: $150M Series B amid rapid growth from 10 to 40+ employees.

What you'll learn

1 (00:00) **🎙️ Introduction: Mark and Myra from Goodfire**
2 (02:50) **Defining Interpretability**
3 (07:00) **Post-Training Applications and Challenges**
4 (13:30) **Research Workflow and Priorities**
5 (18:00) **Production Use Case: Rakuten PII Detection**
6 (21:00) **Live Steering Demo on Kimi K2 (1T params)**
7 (25:00) **Finding and Interpreting SAE Features**

Show Notes

Tickets for AIE Miami and AIE Europe are on sale now!

From Palantir and Two Sigma to building Goodfire into the poster-child for actionable mechanistic interpretability, Mark Bissell (Member of Technical Staff) and Myra Deng (Head of Product) are trying to turn “peeking inside the model” into a repeatable production workflow by shipping APIs, landing real enterprise deployments, and now scaling the bet with a recent $150M Series B funding round at a $1.25B valuation.

In this episode, we go far beyond the usual “SAEs are cool” take. We talk about Goodfire’s core bet: that the AI lifecycle is still fundamentally broken because the only reliable control we have is data. We post-train, apply RLHF, and fine-tune by “slurping supervision through a straw,” hoping the model picks up the right behaviors while it quietly absorbs the wrong ones. Goodfire’s answer is to build a bi-directional interface between humans and models: read what’s happening inside, edit it surgically, and eventually use interpretability during training so customization isn’t just brute-force guesswork.
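To make the "read, then edit" idea concrete, here is a minimal sketch using NumPy on toy data. It is not Goodfire's API or method: the function names `read_feature` and `steer`, the random "hidden states", and the feature direction are all assumptions made for illustration. Reading is modeled as projecting hidden states onto a feature direction, and editing as adding a scaled copy of that direction back in.

```python
import numpy as np

def read_feature(hidden, direction):
    """Read: project each token's hidden state onto a unit feature direction."""
    unit = direction / np.linalg.norm(direction)
    return hidden @ unit

def steer(hidden, direction, strength):
    """Edit: add a scaled unit feature direction to every token's hidden state."""
    unit = direction / np.linalg.norm(direction)
    return hidden + strength * unit

# Toy stand-ins (assumptions): 4 tokens, hidden size 8, a random feature direction.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
direction = rng.normal(size=8)

before = read_feature(hidden, direction)
after = read_feature(steer(hidden, direction, strength=3.0), direction)

# Each token's reading along the feature moves by the steering strength.
shift = after - before
```

In a real model the hidden states would come from a chosen layer and the direction from a learned feature (for example an SAE decoder vector), but the arithmetic of the intervention would look similar under these assumptions.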

Mark and Myra walk through what that looks like when you stop treating interpretability like a lab demo and start treating it like infrastructure: lightweight probes that add near-zero latency, token-level safety filters that can run at inference time, and interpretability workflows that survive messy constraints (multilingual inputs, synthetic→real transfer, regulated domains, no access to sensitive data). We also get a live window into what “frontier-scale interp” means operationally (i.e. steering a trillion-parameter model in real time by targeting internal features) plus why the same tooling generalizes cleanly from language models to genomics, medical imaging, and “pixel-space” world models.
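As a rough illustration of why a probe can be so cheap at inference time, the sketch below trains a linear (logistic-regression) probe on per-token activations and then uses it as a token-level filter. Everything here is an assumption for illustration: the data is synthetic, `train_probe` and `flag_tokens` are hypothetical helper names, and this is not the pipeline described in the episode. The point is only that scoring a token costs one dot product over activations the model has already computed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_probe(acts, labels, lr=0.5, steps=500):
    """Fit a logistic-regression probe (w, b) with plain gradient descent."""
    w = np.zeros(acts.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(acts @ w + b) - labels   # dLoss/dLogit per token
        w -= lr * acts.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

def flag_tokens(acts, w, b, threshold=0.5):
    """Token-level filter: one dot product per token, then a threshold."""
    return sigmoid(acts @ w + b) > threshold

# Synthetic activations (assumption): the labeled concept shifts dimension 0.
rng = np.random.default_rng(0)
d = 16
concept = np.zeros(d)
concept[0] = 1.0
labels = rng.integers(0, 2, size=400).astype(float)
acts = rng.normal(size=(400, d)) + 4.0 * labels[:, None] * concept

w, b = train_probe(acts, labels)
flags = flag_tokens(acts, w, b)
accuracy = (flags == labels.astype(bool)).mean()
```

A design note: because the probe is a single weight vector per concept, it can be applied to every token as it is generated, which is what would make a token-level filter feasible at inference time under these assumptions.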

We discuss:

* Mark + Myra’s paths: Mark from Palantir (health systems, forward-deployed engineering) → Goodfire early team; Myra from Two Sigma → Head of Product, translating frontier interpretability research into a platform and real-world deployments

* What “interpretability” actually means in practice: not just post-hoc poking, but a broader “science of deep learning” approach across the full AI lifecycle (data curation → post-training → internal representations → model design)

* Why post-training is t

More from this podcast
Latent Space: The AI Engineer Podcast →