# LLM Gateways Compared: Portkey vs Helicone vs LiteLLM for Caching & Cost Control

> How Portkey, Helicone, and LiteLLM compare for caching, cost control, and observability — each one's 2026 status and which fits self-hosted vs. hosted.

An LLM gateway centralizes caching, fallback, cost tracking, and budgets across your model traffic. Portkey is a gateway-plus-LLMOps platform; LiteLLM is an open-source library or self-hosted proxy; Helicone is observability-first with a one-line proxy, but is now in maintenance mode after its 2026 Mintlify acquisition. Pick by what you'll operate and the control plane you need.

Once more than one app talks to an LLM, you start wanting a single place to handle caching, fallback, keys, cost, and budgets — instead of reimplementing them in every service. That place is an **LLM gateway**. This guide compares the three most common choices for **caching and cost control** — [Portkey](/tools/portkey), [LiteLLM](/tools/litellm), and [Helicone](/tools/helicone) — including each one's current status, which matters more than usual in 2026.

## What a gateway gives you

- **Caching** — serve repeated calls from cache to cut cost and latency (the cost lever that matters most).
- **Reliability** — fallback across providers and load balancing so one outage doesn't take you down.
- **Cost control** — central key management, per-team budgets, cost tracking, and rate limits.
- **One interface** — usually OpenAI-compatible, so existing code and SDKs work with a base-URL change.

## The three, by shape

### [Portkey](/tools/portkey) — gateway + LLMOps control plane

The most platform-complete option. An **open-source (MIT) routing gateway** — 1,600+ models, retries, fallbacks, load balancing, and both **simple and semantic caching** — paired with a **freemium hosted control plane** for observability, prompt management, virtual keys, budgets, guardrails, and governance. Best when you want caching and cost control as a managed, batteries-included service. (Palo Alto Networks acquired Portkey in 2026 — unlike Helicone's, a continuity move: it becomes the gateway in PANW's AI-security platform and stays actively developed.)

### [LiteLLM](/tools/litellm) — open-source library or self-hosted proxy

Call 100+ models through one OpenAI-format interface as a **library**, or run its **proxy** as a self-hosted gateway with central keys, fallbacks, caching, cost tracking, and rate limits. Best when you want to **own** the gateway end-to-end — for data control, custom policy, or on-prem — with no third party in the request path. (It's also the unified-access layer covered in [Calling Any Model](/guides/concepts/calling-any-model-gateways).)

### [Helicone](/tools/helicone) — observability-first, one-line proxy

Famous for the lowest-friction on-ramp: change your base URL and your calls are logged, traced, and analyzed, with proxy-level **caching** and great cost/latency visibility. Open source (Apache-2.0) and self-hostable.

> [!WARNING]
> **Helicone's 2026 status:** Mintlify [acquired Helicone](https://www.helicone.ai/blog/joining-mintlify) in March 2026, and it's now in **maintenance mode** — security and bug fixes only, no new features or roadmap, with migration assistance for customers. The open-source proxy still works and self-hosts fine, so existing users aren't stranded, but for a **new** project weigh that it's no longer actively developed.

## Caching & cost control, head to head

All three cache and track cost; the difference is how much is managed for you:

| | Caching | Cost control | Form factor | 2026 status |
|---|---|---|---|---|
| **Portkey** | Simple + semantic | Budgets, virtual keys, rate limits, cost analytics | OSS gateway + hosted plane | Actively developed |
| **LiteLLM** | Proxy cache | Cost tracking, budgets, rate limits (self-run) | Library or self-hosted proxy | Actively developed |
| **Helicone** | Proxy cache | Cost/latency analytics | One-line proxy / self-host | Maintenance mode |

## Operate a self-hosted gateway like security infrastructure

A gateway sees **every prompt and every key** you route through it, so self-hosting one is a security decision, not just an ops one. 2026 made this concrete for LiteLLM: a brief **supply-chain compromise** of its PyPI packages (remediated in a clean release with a hardened CI pipeline) and a critical proxy **SQL-injection vulnerability** (CVE-2026-42208, patched) that was exploited soon after disclosure. None of this makes LiteLLM a bad choice — it's a mature, widely used project that responded with hardening — but it's the reminder that applies to **any** self-hosted gateway, Portkey's included: pin and verify package versions, patch promptly, lock down network and key access, and monitor the proxy.

## How to choose

- **Want a managed, batteries-included caching + cost-control plane** → **Portkey**.
- **Want to self-host and fully own the gateway** → **LiteLLM**.
- **Already running Helicone** → keep the self-hosted proxy if it serves you; **starting fresh** → factor in its maintenance-mode status and consider the actively-developed options.
- **Just need a hosted router with zero ops** (not a full control plane) → the hosted [OpenRouter](/tools/openrouter) is the lighter-weight cousin.

For the techniques these gateways automate — caching, right-sizing, and p95 budgets — see [LLM Cost and Latency Engineering](/guides/advanced/llm-cost-latency-engineering), restructure prompts for cache hits with the [prompt-cache-optimizer](/skills/performance/prompt-cache-optimizer), and let the [llm-cost-optimizer](/agents/data-ai/llm-cost-optimizer) run the whole optimization loop.

---

_Source: https://agentscamp.com/guides/advanced/llm-gateways-compared — Guide on AgentsCamp._
