# Production Tool & Function Calling: Feed Errors Back as Observations

> How agents use tools — the call/observe/retry loop, why errors must return to the model, and the schemas, idempotency, and limits that keep it reliable.

Tool calling is a loop: the model proposes a call, your code runs it, and the result — success OR error — goes back to the model as an observation it reasons about. The reliability comes from the engineering around that loop: schemas the model can't misuse, errors returned (never swallowed), bounded retries, idempotent side effects, and human gates on irreversible actions.

An agent is a language model in a loop with tools. The model can't do anything in the world by itself — it can only emit text, including a structured request to call a tool. Everything an agent *does* — search, query a database, send an email, run code — happens because your code executed a tool call and handed the result back. Getting that loop right is most of what makes an agent reliable.

## The loop

Tool calling is a cycle, not a one-shot:

1. You give the model a set of **tools** with schemas.
2. The model **proposes a call** — a tool name and arguments — when it decides one is needed.
3. Your code **validates and executes** it.
4. You **return the result as an observation** to the model.
5. The model reads the observation and either calls another tool or answers.

Repeat until the task is done. The model is the planner; your tool layer is the hands — and the safety system.

## The one rule: errors are observations, not exceptions

The single most important — and most violated — principle: **when a tool fails, return the error to the model as an observation.** Not a swallowed exception, not a crash, not nothing. An agent that receives `"404: invoice not found"` can adapt: fix the ID, try another tool, or tell the user. An agent that receives *nothing* assumes the call worked and proceeds on a result that doesn't exist — the classic "silent failure, then confidently wrong action."

> [!WARNING]
> Swallowing tool errors is the most common and most damaging agent bug. A failed payment that the agent thinks succeeded, a missing record it hallucinates around — these come from errors that never made it back to the model.

## What makes it production-grade

The loop is simple; the reliability is in the engineering around it:

- **Schemas the model can't misuse.** Tool definitions are prompt surface — precise types, enums, honest required fields, and model-facing descriptions prevent most bad calls before they happen (the [tool-definition-generator](/skills/api/tool-definition-generator) skill builds these). See also [Effective Tool Use](/guides/prompting/effective-tool-use) on scoping the toolset.
- **Bounded retries.** Retry transient failures (timeouts, rate limits) with backoff and a hard cap; don't retry non-retryable ones (bad request, auth) — that just burns budget.
- **Idempotent side effects.** For tools that change state, use idempotency keys or pre-checks so a retry or re-run can't double-charge or duplicate.
- **Human gates on irreversible actions.** Payments, deletions, deploys, outbound messages — gate behind approval enforced at the tool layer, not requested in the prompt ([human-in-the-loop-gate](/skills/workflow/human-in-the-loop-gate)).
- **Termination.** Always cap steps and budget so the loop can't run forever.
- **Safe parallelism.** Run independent calls concurrently for latency, but keep dependent or state-mutating calls ordered.

Most agent frameworks ([the comparison](/guides/concepts/agent-frameworks-2026)) implement the loop for you — but the schema quality, error handling, idempotency, and gates are still yours to get right. The [agent-tool-integration-engineer](/agents/data-ai/agent-tool-integration-engineer) builds this layer, and the [agent-reliability-reviewer](/agents/meta-orchestration/agent-reliability-reviewer) audits it before you ship.

---

_Source: https://agentscamp.com/guides/concepts/production-tool-calling — Guide on AgentsCamp._
