Agentic Evals by Shishir Patil

Media Summary: Introducing the Agent Arena by Gorilla X LMSYS Chatbot Arena How do different agents stack up in tasks like search, ... As agents evolve from text conversations to autonomous agents capable of multi-step reasoning, tool use, and real-world task ... This week, Michelle Jackson and Edgar Berrios, internal audit leaders at MIT, join the show. Drawing from their work at MIT, they ...

Agentic Evals by Shishir Patil

LLM Agent Arena (agent-arena.com)

Introducing the Agent Arena by Gorilla X LMSYS Chatbot Arena How do different agents stack up in tasks like search, ...

Agentic Evaluations Workshop - Deep Dive on the Future on Evals for Agents.

As agents evolve from text conversations to autonomous agents capable of multi-step reasoning, tool use, and real-world task ...

Ep 284: How MIT Leaders Are Building AI That Works w/ Michelle Jackson and Edgar Berrio (MIT)

This week, Michelle Jackson and Edgar Berrios, internal audit leaders at MIT, join the show. Drawing from their work at MIT, they ...

Team AI Directives: From Agentic SDLC Methodology to Production Evals Webinar 25/5/26 (Hebrew)

On the Agenda: Team AI Directives as Part of the

Agentic AI in the Enterprise 2026

Generative AI vs AI Agents vs Agentic AI Explained with Examples

Connect and grow with me on other social media platforms: LinkedIn: https://www.linkedin.com/in/balaji-chippada-0317/ ...

Ep 7: DJ Patil – How Agentic AI Breaks Data Platforms

The Lakehouse made big data accessible. But it did not come with the management layer needed for what comes next.

Teaching LLMs to Use Tools at Scale - Shishir Patil | Stanford MLSys #98

Episode 98 of the Stanford MLSys Seminar Series! Teaching LLMs to Use Tools at Scale Speaker:

Shishir Patil: Teaching AI to Use APIs with Gorilla LLM | Humans of AI Podcast #7

Lessons from the Trenches: Building LLM Evals That Work IRL: Aparna Dhinakaran

With nearly two-thirds of enterprise developers planning production deployments of large language models this year, LLM ...

How to set Evaluation for AI Agents & Scale them

Join Mahesh Yadav, top Maven instructor and former AI PM leader at Google, Meta, and Microsoft. In this session, Mahesh breaks ...

Complete Beginner's Course on AI Evaluations in 50 Minutes (2025) | Aman Khan

Today, I want to share a new episode with Aman Khan. The best way to learn about AI