Guides
Long-form tutorials and deep-dives on building with AI coding agents. From first principles to advanced workflows, learn how to design agents, write skills, wire up MCP servers, and ship faster with Claude Code.
Building an MCP Server
An accurate introduction to the Model Context Protocol: server anatomy, transports, and connecting a tool to Claude Code.
Building Multi-Step Agent Workflows
Patterns for decomposing big tasks and coordinating multiple agents.
LLM Cost and Latency Engineering: Caching, Right-Sizing, and p95 Budgets
A practical playbook for cutting LLM cost and tail latency — caching, model right-sizing, prompt trimming, and enforced p95 budgets — without losing quality.
LLM Gateways Compared: Portkey vs Helicone vs LiteLLM for Caching & Cost Control
How Portkey, Helicone, and LiteLLM compare for caching, cost control, and observability, where each one stands in 2026, and which fits self-hosted vs. hosted deployments.
Multi-Agent Orchestration
Four patterns for coordinating multiple agents — fan-out, pipeline, orchestrator-worker, and verify/critic — and when each earns its overhead.
Defending Against Prompt Injection: A Practical Guide for LLM Apps
Prompt injection can't be solved at the model layer — so you defend in depth: trust boundaries, least privilege, human approval, guardrails, and red-teaming.
Securing AI Agents: The OWASP Agentic Top 10 in Practice
Agents add risks LLM-app security misses — autonomy, tools, memory, multi-agent trust. The key OWASP agentic threats and how to mitigate each in practice.
Which Agent Framework in 2026? LangGraph vs CrewAI vs AutoGen vs OpenAI Agents SDK vs Claude Agent SDK
A decision guide to the major AI agent frameworks — control vs. abstraction, multi-agent models, state and durability, and which fits your project.
Agent Memory Architecture: Short-Term, Long-Term, and When to Use Each
How AI agents remember — working memory vs. persistent long-term memory, what to store, how to retrieve it, and how to keep context small.
Calling Any Model: Unified LLM Gateways & SDKs in 2026
Why teams put a unified layer in front of LLM providers — and how LiteLLM, OpenRouter, and the Vercel AI SDK compare for fallback and cost control.
Choosing Embeddings in 2026: OpenAI vs Cohere vs Voyage vs Open-Source
A decision guide for picking an embedding model for retrieval — accuracy, dimensions, cost, multilingual and domain fit, self-hosting, and lock-in.
How RAG Actually Works: Ingestion, Chunking, Retrieval & Reranking
A clear, practical walkthrough of the retrieval-augmented generation pipeline — what each stage does, where it fails, and how the pieces fit together.
Hybrid Search & Reranking: From Top-50 Recall to Top-5 Precision
How production RAG combines dense and sparse search, fuses with RRF, and reranks — turning a wide candidate set into the few passages that actually answer.
Production Tool & Function Calling: Feed Errors Back as Observations
How agents use tools — the call/observe/retry loop, why errors must return to the model, and the schemas, idempotency, and limits that keep it reliable.
Structured Output vs JSON Mode vs Function Calling: Which to Use in 2026
The reliable ways to get typed data out of an LLM — what JSON mode, function calling, and native structured outputs each guarantee, and when to use which.
CLAUDE.md Best Practices
How to write a CLAUDE.md that actually helps — what to include, what to leave out, and how to keep it current.
Best Vector Database in 2026: pgvector vs Pinecone vs Qdrant vs Weaviate vs Milvus vs Chroma vs LanceDB
A decision guide to vector databases — embedded, server, or managed; whether you already run Postgres; and which fits your scale, filtering, and RAG needs.
Indexing Postgres at Scale: B-Tree vs GIN vs BRIN and the Hidden Cost of Over-Indexing
A practical guide to choosing Postgres index types — B-Tree, GIN, BRIN, partial, and covering — and why every index you add taxes every write.
Zero-Downtime Postgres Migrations: The Expand-Contract Playbook for 2026
How to change a live Postgres schema without downtime or broken deploys — the expand-contract pattern, safe column changes, batched backfills, and CONCURRENTLY.
Best LLM & RAG Evaluation Tools in 2026: DeepEval vs RAGAS vs LangSmith vs Phoenix vs promptfoo
A decision guide to the LLM eval landscape — code-first frameworks vs. eval-and-observability platforms, open-source vs. hosted, and which fits your stack.
Write Evals for an LLM App: From Zero to a CI Gate
How to evaluate an LLM feature — build a dataset, choose metrics, set a baseline, score offline, add an LLM judge, and gate CI so quality changes are measured.
Choosing the Right Model: Haiku vs Sonnet vs Opus
How to pick the right Claude model tier for an agent or task.
Getting Started with Claude Code Agents
What Claude Code subagents are, why they help, and how to add your first one.
Installing Claude Code
Install Claude Code, authenticate, start a session in a real project, and add a minimal CLAUDE.md.
What Is Claude Code?
A grounded explanation of Claude Code: an agentic command-line coding tool that reads files, runs commands, and works in a loop toward a goal.
Writing Your First Custom Agent
A step-by-step guide to authoring a focused, effective custom subagent.
Deploying a Remote MCP Server: Stateless, Streamable HTTP, and Horizontal Scaling
Take an MCP server from local stdio to a remote, multi-user HTTP service — Streamable HTTP, stateless vs. stateful sessions, OAuth, and horizontal scaling.
Connecting and Governing MCP Servers: Registries, Gateways, and Tool Sprawl
As MCP servers multiply, discovery, trust, and tool sprawl become the problem. How registries, gateways, and curation keep a growing fleet secure and usable.
Preparing a Fine-Tuning Dataset: Cleaning, Synthetic Data, and Eval Splits
The dataset is the model. How to build a fine-tuning dataset that works — format, curation, cleaning, synthetic augmentation, and a leak-free eval split.
Fine-Tune vs RAG vs Prompt vs Distill: The 2026 Decision Tree
When to reach for prompt engineering, RAG, fine-tuning, or distillation — what each actually changes, where each fails, and how to combine them.
Self-Host vs API: When Does Running Your Own LLM Actually Pay Off?
The real economics of self-hosting an LLM vs. calling a hosted API — GPU utilization, privacy, latency, and the hidden ops costs that decide the crossover.
AI Coding Agents in 2026: The Open-Source & CLI Edition
Cursor and Windsurf vs the open-source agents — Cline, Aider, Codex CLI, Roo Code, and more. Who should bring their own model, and when to stay in the terminal.
Context Engineering
Treating the context window as a finite budget — what to load, what to leave out, and when to reset.
Cursor vs Claude Code vs GitHub Copilot vs Windsurf in 2026
A practical, opinionated comparison of the four mainstream AI coding tools — form factor, agentic depth, model choice, and who each one is for.
Programmatic Prompt Optimization with DSPy: Stop Hand-Tuning Prompts
Hand-tuning prompts doesn't scale. DSPy treats prompting as programming — declare tasks as typed signatures and let an optimizer compile the prompts for you.
Effective Tool Use: Scoping an Agent's Toolset
How to scope tools and permissions so an agent reaches for the right one and can't do damage.
Prompt Patterns for Coding Agents
Practical prompting patterns: chaining, few-shot, context management, tool use, and output structuring.
Few-Shot vs Chain-of-Thought vs Structured Prompting: What to Use When (2026)
When to reach for few-shot examples, chain-of-thought reasoning, or structured/output-constrained prompting — a 2026 decision guide to the core techniques.
Skills vs Agents vs Commands
How Claude Code's two extension mechanisms — subagents and skills — differ across three invocation patterns, with a decision table for choosing the right one.
Writing Your First Skill
A step-by-step guide to packaging a reusable procedure as a Claude Code skill that loads exactly when it's needed.
Using Vision-Language Models for OCR, Documents, and Video Understanding
How to use vision-language models for OCR, documents, and video: how they differ from traditional OCR, their failure modes, and getting reliable output.
How to Build a Voice Agent: The STT → LLM → TTS Pipeline
How to build a real-time voice agent: the STT → LLM → TTS pipeline, the latency budget that makes or breaks it, and how to wire each stage.
Frequently asked questions
- Who are these guides for?
  - Developers building with AI coding agents, whether you're just starting with Claude Code or designing production agent workflows. Each guide is self-contained and practical.
- Are the guides free to read?
  - Yes. Every guide on AgentsCamp is free to read, with no paywall and no signup.
- Where should I start?
  - If you're new, begin with the getting-started guides on agents and skills, then move into MCP, prompting, and workflow deep-dives as you grow more comfortable.
- How often are the guides updated?
  - Guides are revised as Claude Code and the surrounding tooling evolve. Each guide shows its publish and last-updated dates.