Skip to content
agentscamp

Evaluation Guides

A curated collection of 2 evaluation guides for building with AI coding agents.

Guide

Best LLM & RAG Evaluation Tools in 2026: DeepEval vs RAGAS vs LangSmith vs Phoenix vs promptfoo

A decision guide to the LLM eval landscape — code-first frameworks vs. eval-and-observability platforms, open-source vs. hosted, and which fits your stack.

3m read· AgentsCamp
Guide

Write Evals for an LLM App: From Zero to a CI Gate

How to evaluate an LLM feature — build a dataset, choose metrics, set a baseline, score offline, add an LLM judge, and gate CI so quality changes are measured.

3m read· AgentsCamp