[Latent Space LIVE @ NeurIPS] State of AI Startups 2025 — with Sarah Catanzaro, Amplify Partners

December 30, 2025

AI Summary

🎙️ The Voices & The Context

The Format: This is a candid interview between a data/AI-savvy host and a prominent VC, weaving personal history in data infrastructure with sharp analysis of AI trends, mergers, and frothy funding, delivered in a technical yet conversational tone that feels like shop talk among insiders.
The Key Players:
- Guest: Sarah Catanzaro of Amplify Partners, a VC partner with deep roots in data engineering (early career in symbolic AI) and investments in hits like DBT; she's fascinating for her oscillation between data and AI, candid admissions of past prediction misses (e.g., data catalogs), and prescient views on AI infrastructure's symbiosis with frontier labs.

What you'll learn

1 (00:00) **🎙️ Introduction: Sarah Catanzaro**
2 (00:56) **Modern Data Stack: DBT-Fivetran Merger**
3 (03:56) **Data Workloads: Analytics vs. AI Training**
4 (08:17) **AI Labs' Data Stacks & Infrastructure**
5 (10:14) **2025 Funding Environment: Crazy Raises**
6 (17:16) **Hot AI Research Themes: World Models**
7 (18:56) **Memory Management & Continual Learning**

Show Notes

From investing through the modern data stack era (DBT, Fivetran, and the analytics explosion) to investing at the frontier of AI infrastructure and applications at Amplify Partners, Sarah Catanzaro has spent years at the intersection of data, compute, and intelligence, watching categories emerge, merge, and occasionally disappoint. We caught up with Sarah live at NeurIPS 2025 to dig into the state of AI startups heading into 2026: why $100M+ seed rounds with no near-term roadmap are now the norm (and why that terrifies her), what the DBT-Fivetran merger really signals about the modern data stack (spoiler: it's not dead, just ready for IPO), how frontier labs are using DBT and Fivetran to manage training data and agent analytics at scale, why data catalogs failed as standalone products but might succeed as metadata services for agents, the consumerization of AI and why personalization (memory, continual learning, K-factor) is the 2026 unlock for retention and growth, why she thinks RL environments are a fad and real-world logs beat synthetic clones every time, and her thesis for the most exciting AI startups: companies that marry hard research problems (RAG, rule-following, continual learning) with killer applications that were simply impossible before.

We discuss:

The DBT-Fivetran merger: not the death of the modern data stack, but a path to IPO scale (targeting $600M+ combined revenue) and a signal that both companies were already winning their categories
How frontier labs use data infrastructure: DBT and Fivetran for training data curation, agent analytics, and managing increasingly complex interactions—plus the rise of transactional databases (RocksDB) and efficient data loading (Vortex) for GPU-bound workloads
Why data catalogs failed: built for humans when they should have been built for machines, focused on discoverability when the real opportunity was governance, and ultimately subsumed as features inside Snowflake, DBT, and Fivetran
The $100M+ seed phenomenon: raising massive rounds at billion-dollar valuations with no 6-month roadmap, seven-day decision windows, and founders optimizing for signal ("we're a unicorn") over partnership or dilution discipline
Why world models are overhyped but underspecified: three competing definitions, unclear generalization across use cases (video games ≠ robotics ≠ autonomous driving), and a research problem masquerading as a product category
The 2026 theme: