Time:
Room:
Why Did It Say That?! A Guide to LLM Observability with OpenTelemetry
According to a 2025 IBM study of 2,000 CEOs globally, only 1 in 4 AI initiatives has delivered the expected ROI, and only 16% have scaled enterprise-wide. A big reason? Teams are still treating AI as a black box. When your LLM-based application hallucinates, when costs spike, or when latency tanks, you're left asking "why?" with no good answers. Traditional debugging doesn't cut it for non-deterministic, multi-step AI workflows.
Here's the thing: LLM observability follows the same principles as distributed tracing, and you already know more than you think. In this talk, I'll share honest lessons on instrumentation: the questions that come up again and again, the issues teams hit, and what actually works. I'll cut through the overwhelming landscape of AI observability tools and show you a practical path forward using OpenTelemetry instrumentation.
You'll learn to capture the signals that matter, break latency down step by step, evaluate quality before shipping, and finally answer "why did it say that?" in minutes, not days. Debug your LLM-based application like you debug everything else: with confidence and clarity.
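To give a flavor of what that instrumentation looks like, here is a minimal sketch using the OpenTelemetry Python SDK: a single span wrapped around a model call, with the model name, prompt size, and token usage attached as attributes. The span name, attribute keys, and the call_llm helper are illustrative assumptions, not the specific setup shown in the talk.

# Minimal sketch: wrap an LLM call in an OpenTelemetry span so the model,
# token usage, and per-step latency show up in your traces.
# Requires the opentelemetry-sdk package; call_llm is a stand-in client.
from dataclasses import dataclass

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export spans to the console for demo purposes.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("llm-demo")

@dataclass
class LLMResponse:
    # Stand-in for whatever your model client returns.
    text: str
    output_tokens: int

def call_llm(prompt: str) -> LLMResponse:
    # Replace with a real client call (OpenAI, Bedrock, a local model, ...).
    return LLMResponse(text="stubbed answer", output_tokens=7)

def generate_answer(prompt: str) -> str:
    # One span per model call; nest further spans for retrieval, tool use, etc.
    with tracer.start_as_current_span("llm.chat") as span:
        span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
        span.set_attribute("llm.prompt_chars", len(prompt))
        response = call_llm(prompt)
        span.set_attribute("gen_ai.usage.output_tokens", response.output_tokens)
        return response.text

print(generate_answer("Why did it say that?"))

Running this prints the span to the console; point the exporter at an OTLP endpoint instead and the same span lands in whatever backend you already use for distributed tracing.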

Niki Anderson
Senior Sales Engineer
Datadog
Niki is a Senior Sales Engineer at Datadog, partnering with customers to architect observability strategies for cloud-native and distributed systems. After 25 years of building and breaking things, she's learned that great systems aren't just about the tech; they're about the culture and clarity that help teams thrive. When not debugging, she's rock climbing, dancing, or finding balance on a yoga mat.