LiteLLM

LiteLLM lets you call 100+ LLMs (OpenAI, Anthropic, Google, Bedrock, local, and more) through one OpenAI-compatible interface. Use it as a Python library, or run its proxy as a self-hosted gateway with central keys, fallbacks, retries, caching, cost tracking, and rate limits.

LiteLLM gives you one interface to call virtually any LLM. Write your code once against the OpenAI format and LiteLLM translates to 100+ providers — Anthropic, Google, Azure, AWS Bedrock, local models, and more — so switching or mixing models is a config change, not a rewrite. It comes in two forms: a Python library for in-process calls, and a proxy server you run as a centralized gateway.

It is aimed at teams who don't want to be locked to one provider's SDK, and at platform teams who want a single control point for all LLM traffic. The proxy is where it becomes infrastructure: central API-key management, fallbacks across providers, retries, caching, cost tracking, and rate limits for every app behind it.

Highlights

One format, many providers — OpenAI-compatible calls to 100+ models; swap models via config.
Gateway/proxy — self-hosted control point with key management, budgets, and per-team rate limits.
Fallbacks & retries — automatically route around a failing or rate-limited provider.
Caching & cost tracking — cut spend and latency, and attribute cost per key/team.
Library or server — embed in code or run centrally for the whole org.

In an AI-assisted workflow

from litellm import completion
# same call, any provider — just change the model string
completion(model="anthropic/claude", messages=[...])
completion(model="gpt-5",            messages=[...])

Run the proxy and point every app at it to centralize keys, fallbacks, and cost.

TIP

Use the library for simple multi-provider code; run the proxy when you want one place to manage keys, budgets, fallbacks, and cost across many apps — the gateway pattern in Calling Any Model.

Good to know

LiteLLM is open source (MIT) and free to self-host; an enterprise edition adds advanced gateway features and support. As a hosted-key gateway it's infrastructure you operate — plan for its availability. Compare the fully-hosted OpenRouter if you'd rather not run a proxy.

Frequently asked questions

What is LiteLLM?

LiteLLM is an open-source tool that lets you call 100+ LLM providers — Anthropic, Google, Azure, AWS Bedrock, local models, and more — through one OpenAI-format interface. It comes as a Python library for in-process calls and as a proxy server you run as a centralized gateway with key management, fallbacks, retries, caching, and cost tracking.

Is LiteLLM free?

Yes — LiteLLM is open source under MIT and free to self-host. An enterprise edition adds advanced gateway features and support.

LiteLLM vs OpenRouter?

Both put many models behind one OpenAI-compatible interface. LiteLLM is software you run — a library or a self-hosted proxy you operate, with full control over keys and policies — while OpenRouter is a fully hosted gateway with one key and one bill. Choose by whether you want to operate the gateway yourself.

When should I use the LiteLLM proxy instead of the library?

Use the library for simple multi-provider code inside one app. Run the proxy when you want a single control point for many apps — central API-key management, per-team budgets and rate limits, fallbacks, and cost tracking across the org.

Highlights

In an AI-assisted workflow

Good to know

Frequently asked questions

Related