[State of Evals] LMArena's $100M Vision — Anastasios Angelopoulos, LMArena
December 31, 2025
AI Summary
5 min read
🎙️ The Voices & The Context
- The Format: This casual interview dives into the rapid evolution of Arena (formerly LMArena), blending insider anecdotes on AI benchmarking with startup hustle, delivered in an enthusiastic, technical tone that buzzes with industry optimism and light-hearted banter.
- The Key Players:
  - Anastasios Angelopoulos: Co-founder and CEO of Arena, a Berkeley alum who spun out LMSYS Arena into a $100M-funded company; known for building the world's largest organic AI model evaluation platform, with millions of real-user votes.
  - Hosts: Unnamed but AI-savvy podcasters with sharp questions on funding, drama, and scaling; their chemistry sparks lively back-and-forth, riffing on AI hype like "nano banana" while probing business realities.
What you'll learn
- 1 `(00:00)` **🎙️ Introduction: Anastasios from Arena**
- 2 `(00:44)` **Company Origins and Spin-Out Story**
- 3 `(03:36)` **Fundraising and Resource Allocation**
- 4 `(08:38)` **Technical Evolution: Moving Off Gradio**
- 5 `(06:02)` **Competitors and Differentiation**
- 6 `(10:09)` **Leaderboard Illusion Controversy with Cohere**
- 7 `(12:31)` **Pre-Release Hits: Nano Banana and Impact**
Show Notes
From building LMArena in a Berkeley basement to raising $100M and becoming the de facto leaderboard for frontier AI, Anastasios Angelopoulos returns to Latent Space to recap 2025 at one of the most influential platforms in AI, trusted by millions of users, every major lab, and the entire industry to answer one question: which model is actually best for real-world use cases?

We caught up with Anastasios live at NeurIPS 2025 to dig into:
- The origin story (spoiler: it started as an academic project incubated by Anjney Midha at a16z, who formed an entity and gave grants before they even committed to starting a company)
- Why they decided to spin out instead of staying academic or nonprofit (the only way to scale was to build a company)
- How they're spending that $100M (inference costs, React migration off Gradio, and hiring world-class talent across ML, product, and go-to-market)
- The Leaderboard Illusion controversy and why their response demolished the paper's claims (factual errors, misrepresentation of open vs. closed source sampling, and ignoring the transparency of preview testing that the community loves)
- Why platform integrity comes first (the public leaderboard is a charity, not a pay-to-play system: models can't pay to get on, can't pay to get off, and scores reflect millions of real votes)
- How they're expanding into occupational verticals (medicine, legal, finance, creative marketing) and multimodal arenas (video coming soon)
- Why consumer retention is earned every single day (sign-in and persistent history were the unlock, but users are fickle and can leave at any moment)
- The Gemini Nano Banana moment that changed Google's market share overnight (and why multimodal models are becoming economically critical for marketing, design, and AI-for-science)
- How they're thinking about agents and harnesses (Code Arena evaluates models, but maybe it should evaluate full agents like Devin)
- His vision for Arena as the central evaluation platform that provides the North Star for the industry: constantly fresh, immune to overfitting, and grounded in millions of real-world conversations from real users
We discuss:
The $100M raise: use of funds is primarily inference costs (funding free usage for tens of millions of monthly conversations), React migration off Gradio (custom loading icons, better developer hiring, more flexibility), and hiring world-class talent
The scale: 250M+ conversations on the platform, tens of millions per month, 25% of users do software for a living, and ha