The invited lineup covers the full agent-failure pipeline: mechanisms, closed-loop diagnostics, security, evaluation, and practical fixes.

Université de Montréal, LawZero, and Mila
AI safety, frontier model governance, and deep learning foundations
Yoshua Bengio is a Full Professor of Computer Science at Université de Montréal, Founder of Mila (the Quebec AI Institute), and Co-President of LawZero. He received the 2018 Turing Award alongside Geoffrey Hinton and Yann LeCun, and his research spans deep learning foundations and, increasingly, AI safety and the governance of frontier systems.
GFlowNets · International AI Safety Report

Stanford University and CZ Biohub
Tool-use failures and automated red-teaming
James Zou is an Associate Professor of Biomedical Data Science, Computer Science, and Electrical Engineering at Stanford. His work focuses on making AI more reliable, human-compatible, and statistically rigorous, with major applications in health and biomedicine.
AvaTaR · AutoRedTeamer

University of Illinois Urbana-Champaign and Virtue AI
Agent attack surfaces and tool-chain exploits
Bo Li is the Wexler AI Scholar and an Associate Professor at UIUC, where she works on trustworthy machine learning, AI safety, security, privacy, and robustness. She is also the founder and CEO of Virtue AI.
ShieldAgent · AutoRedTeamer

TU Darmstadt and MBZUAI
Long-horizon drift and error localization
Iryna Gurevych is a Professor of Computer Science at TU Darmstadt and founder of the UKP Lab. Her research spans natural language processing, trustworthy AI, and machine learning methods for robust and interpretable language systems.
OpenFactCheck · Error localization for long-form QA

New York University
Verification gaps and grounded checking
Greg Durrett is an Associate Professor at NYU whose research centers on natural language processing, factuality, verification, and reasoning with language models in knowledge-intensive settings.
MiniCheck · Molecular Facts

Apple and EPFL
Reasoning brittleness and evaluation artifacts
Samy Bengio is a longtime AI research leader whose work spans large-scale machine learning, reasoning, and evaluation. He has held senior research leadership roles at Apple and Google, as well as leadership positions at major ML conferences.
The Illusion of Thinking · Reasoning's Razor

Allen Institute for AI
Faithfulness failures and feedback grounding
Nouha Dziri is a researcher at AI2 working on large language models with a focus on reasoning limits, faithfulness, post-training, and safety-oriented evaluation.
FaithDial · BEGIN

Carnegie Mellon University and AI2
Social failure modes and human-agent risk
Maarten Sap is an Assistant Professor at Carnegie Mellon University and a part-time researcher at AI2. His work studies social intelligence, human-centered language systems, safety risks, and how people interact with language agents.
SOTOPIA · SOTOPIA-π

NVIDIA Research
Tool and runtime reliability, plus system regressions
Yi Dong is a principal research scientist at NVIDIA working on reasoning models and virtual agents. His research spans model reliability, prolonged reinforcement learning, and practical agent deployment for enterprise settings.
ProRL · ProRLv2

The Ohio State University and industry
Web-agent failures and online evaluation gaps
Yu Su is an Associate Professor at The Ohio State University whose work studies grounded language understanding, web agents, interactive systems, and evaluation in realistic environments.
Online-Mind2Web · SEEACT

Stanford HAI
Frontier AI governance and deployment implications
Rishi Bommasani is a Senior Research Scholar at Stanford HAI whose work examines the societal, economic, and governance implications of frontier AI; his talk will focus on the economics and governance of frontier AI.
Foundation Models · HELM