All Deep Dives For Infosec Conference Talks Covering LLM Evaluation. Talks analyzed in full.
Learn how NVIDIA's Project Marinade uses LLM coding agents to inject realistic, tunable vulnerabilities into real codebases, giving you ground-truth benchmarks to evaluate your security tools.
Learn how Stripe built and deployed two production AI security agents with a multi-agent architecture, LLM-as-judge eval pipelines, and a phased rollout.
Learn how Adobe built a RAG-powered security guidance platform that delivers org-specific recommendations across Jira, Slack, and the IDE at scale.
Learn why precision and recall fail for autonomous AI security agents, and how rubric-based LLM-judge evaluation gives your team a reliable deployment bar.