Articles published in Towards Data Science:

- Choosing Between LLM Agent Frameworks — The tradeoffs between building bespoke code-based agents and the major agent frameworks. (Sep 21)
- Navigating the New Types of LLM Agents and Architectures — The failure of ReAct agents gives way to a new generation of agents, and possibilities. (Aug 30)
- Evaluating SQL Generation with LLM as a Judge — Results point to a promising approach. (Jul 31)
- Large Language Model Performance in Time Series Analysis — How do major LLMs stack up at detecting anomalies or movements in the data when given a large set of time series data within the context… (May 12)
- Tips for Getting the Generation Part Right in Retrieval Augmented Generation — Results from experiments to evaluate and compare GPT-4, Claude 2.1, and Claude 3.0 Opus. (Apr 6)
- Model Evaluations Versus Task Evaluations — Understanding the difference for LLM applications. (Mar 26)
- Why You Should Not Use Numeric Evals For LLM As a Judge — Testing major LLMs on how well they conduct numeric evaluations. (Mar 8)
- The Needle In a Haystack Test — Evaluating the performance of RAG systems. (Feb 15)
- LLM Evals: Setup and the Metrics That Matter — How to build and run LLM evals, and why you should use precision and recall when benchmarking your LLM prompt template. (Oct 13, 2023)
- Safeguarding LLMs with Guardrails — A pragmatic guide to implementing guardrails, covering both Guardrails AI and NVIDIA’s NeMo Guardrails. (Sep 1, 2023)