Lenny's Podcast: Product | Career | Growth

The coming AI security crisis (and what to do about it) | Sander Schulhoff

December 21, 2025

AI Summary

5 min read

🎙️ The Voices & The Context

  • The Format: This is a gripping interview between host Lenny and guest expert Sander Schulhoff unpacking the hidden vulnerabilities in AI systems, delivered in an urgent, no-holds-barred tone that builds tension like a cybersecurity thriller.
  • The Key Players:
    • Guest: Sander Schulhoff – Leading AI researcher specializing in adversarial robustness, creator of the first (now largest) AI red teaming competition, author of the top EMNLP 2023 paper on prompt injections, and educator via his Maven course; his insider view exposes industry flaws with unfiltered candor.

🗝️ Key Themes & Topics

The episode dives into the precarious state of AI security, warning that current defenses are illusory amid rising agentic AI risks, blending real-world examples, critiques, and pragmatic fixes.

  • Topic 1: Jailbreaking vs. Prompt Injection – Sander demystifies jailbreaking (tricking standalone models like ChatGPT into forbidden outputs) versus prompt injection (hijacking developer-built apps to ignore instructions), stressing their inevitability due to infinite prompt variations.
  • Topic 2: Failures of AI Guardrails and Red Teaming – Automated red teaming tools always expose vulnerabilities (working "too well"), while guardrails (input/output classifiers) fail spectacularly against adaptive human attacks, with claims of

What you'll learn

  • 1 `(00:00)` **🎙️ Introduction: Sander Schulhoff**
  • 2 `(05:17)` **Guest Background and Episode Overview**
  • 3 `(07:29)` **Defining Jailbreaking vs. Prompt Injection**
  • 4 `(10:52)` **Real-World Attack Examples**
  • 5 `(18:01)` **Escalating Risks with Agents and Robots**
  • 6 `(19:45)` **AI Security Industry Breakdown**
  • 7 `(23:50)` **Key Concepts: Adversarial Robustness and ASR**

Show Notes

Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming. He wrote the first comprehensive guide on prompt engineering and ran the first-ever prompt injection competition, working with top AI labs and companies. His dataset is now used by Fortune 500 companies to benchmark the security of their AI systems. He’s spent more time than anyone alive studying how attackers break AI systems, and what he’s found isn’t reassuring: the guardrails companies are buying don’t actually work, and the only reason we haven’t seen more harm so far is that AI agents aren’t yet capable enough to do real damage.

We discuss:

1. The difference between jailbreaking and prompt injection attacks on AI systems

2. Why AI guardrails don’t work

3. Why we haven’t seen major AI security incidents yet (but soon will)

4. Why AI browser agents are vulnerable to hidden attacks embedded in webpages

5. The practical steps organizations should take instead of buying ineffective security tools

6. Why solving this requires merging classical cybersecurity expertise with AI knowledge

Brought to you by:

Datadog—Now home to Eppo, the leading experimentation and feature flagging platform: https://www.datadoghq.com/lenny

Metronome—Monetization infrastructure for modern software companies: https://metronome.com/

GoFundMe Giving Funds—Make year-end giving easy: http://gofundme.com/lenny

Transcript: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation

Where to find Sander Schulhoff:

• X: https://x.com/sanderschulhoff

• LinkedIn: https://www.linkedin.com/in/sander-schulhoff

• Website: https://sanderschulhoff.com

• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

Where
