LiteLLM
Call 100+ LLM APIs with one OpenAI-format interface — as a Python library or a self-hosted gateway/proxy.
LiteLLM lets you call 100+ LLMs (OpenAI, Anthropic, Google, Bedrock, local, and more) through one OpenAI-compatible interface. Use it as a Python library, or run its proxy as a self-hosted gateway with central keys, fallbacks, retries, caching, cost tracking, and rate limits.
LiteLLM gives you one interface to call virtually any LLM. Write your code once against the OpenAI format and LiteLLM translates to 100+ providers — Anthropic, Google, Azure, AWS Bedrock, local models, and more — so switching or mixing models is a config change, not a rewrite. It comes in two forms: a Python library for in-process calls, and a proxy server you run as a centralized gateway.
It is aimed at teams who don't want to be locked to one provider's SDK, and at platform teams who want a single control point for all LLM traffic. The proxy is where it becomes infrastructure: central API-key management, fallbacks across providers, retries, caching, cost tracking, and rate limits for every app behind it.
Highlights
- One format, many providers — OpenAI-compatible calls to 100+ models; swap models via config.
- Gateway/proxy — self-hosted control point with key management, budgets, and per-team rate limits.
- Fallbacks & retries — automatically route around a failing or rate-limited provider.
- Caching & cost tracking — cut spend and latency, and attribute cost per key/team.
- Library or server — embed in code or run centrally for the whole org.
In an AI-assisted workflow
from litellm import completion
# same call, any provider — just change the model string
completion(model="anthropic/claude", messages=[...])
completion(model="gpt-5", messages=[...])Run the proxy and point every app at it to centralize keys, fallbacks, and cost.
TIP
Use the library for simple multi-provider code; run the proxy when you want one place to manage keys, budgets, fallbacks, and cost across many apps — the gateway pattern in Calling Any Model.
Good to know
LiteLLM is open source (MIT) and free to self-host; an enterprise edition adds advanced gateway features and support. As a hosted-key gateway it's infrastructure you operate — plan for its availability. Compare the fully-hosted OpenRouter if you'd rather not run a proxy.
Related
- OpenRouterA hosted unified API to hundreds of models from many providers, with one key, one bill, and automatic fallbacks.
- Calling Any Model: Unified LLM Gateways & SDKs in 2026Why teams put a unified layer in front of LLM providers — and how LiteLLM, OpenRouter, and the Vercel AI SDK compare for fallback and cost control.
- LLM Gateways Compared: Portkey vs Helicone vs LiteLLM for Caching & Cost ControlHow Portkey, Helicone, and LiteLLM compare for caching, cost control, and observability — each one's 2026 status and which fits self-hosted vs. hosted.
- Provider Fallback WrapperWrap LLM calls so a provider outage, rate limit, or timeout degrades gracefully — with multi-provider fallback, bounded retries with backoff, and timeouts. Use when an app depends on a single model/provider and needs production resilience.
- Vercel AI SDKAn open-source TypeScript toolkit for building AI apps — unified model API, streaming, structured output, tool calling, and UI hooks.
- HeliconeOpen-source LLM observability and AI gateway with one-line integration — logging, tracing, caching, and cost/latency tracking across providers.
- PortkeyAn AI gateway and LLMOps platform: route to many LLMs through one API with caching, retries, fallbacks, load balancing, guardrails, and full observability.