Skip to content
agentscamp

DevOps & Infra — AI Agents, Skills & Tools

Agents, skills, guides, tools, and commands for devops & infra — 15 curated resources for building with AI coding agents.

Agent

LLM Cost Optimizer

Use this agent to cut the cost and latency of an application's LLM API usage without losing quality — audit where the tokens and dollars go, then apply caching, model right-sizing, prompt trimming, batching, and budgets, proven against an eval bar. Examples — "our OpenAI bill tripled, find where the spend is and cut it", "this endpoint's p95 is 8s, bring it down", "right-size models per task and add prompt caching to our chat feature".

sonnet6
Agent

Dependency Manager

Use this agent to upgrade project dependencies safely — batching low-risk bumps apart from breaking majors and verifying each step. Examples — clearing months of stale packages, taking a single major version with migration notes, resolving a peer-dependency conflict.

sonnet5
Agent

Cloud Architect

Use this agent to design a cloud architecture on AWS, GCP, or Azure — compute, networking, data stores, IAM, and cost trade-offs. Examples — choosing serverless vs containers for a new service, designing a multi-account network boundary, picking a database and estimating its monthly cost.

sonnet3
Agent

DevOps Engineer

Use this agent for CI/CD, infrastructure, and automation. Examples — writing a CI pipeline, containerizing an app, infrastructure-as-code changes.

sonnet
Agent

Kubernetes Specialist

Use this agent for Kubernetes — manifests, Helm, troubleshooting, scaling, and resource tuning. Examples — debugging a CrashLoopBackOff, writing a Deployment, tuning requests/limits.

sonnet
Agent

SRE Engineer

Use this agent to make reliability measurable: SLIs/SLOs and error budgets, observability, symptom-based alerting, incident response, and capacity. Examples — defining an SLO for a checkout API, fixing a noisy pager, writing a blameless postmortem.

sonnet6
Agent

Terraform Specialist

Use this agent for Terraform and infrastructure-as-code — module design, remote state, plan/apply safety, drift, and provider pinning. Examples — reviewing a plan for destroys before apply, designing a reusable module, resolving state drift after a console change.

sonnet6
Skill

Prompt Cache Optimizer

Restructure an LLM call to maximize prompt-cache hit rate and add response/semantic caching — move the stable prefix (system prompt, instructions, few-shot, context) to the front and variable input to the end, set cache breakpoints, and measure the hit rate and savings. Use when repeated calls share large common context and token cost or latency is too high.

invocablev1.0.0
Skill

Dependency Audit

Audit project dependencies for known vulnerabilities and turn the raw scanner output into a triaged, prioritized upgrade plan. Use when an audit is noisy, a CVE was reported, or you need to know which advisories actually matter.

invocablev1.0.0
Guide

LLM Cost and Latency Engineering: Caching, Right-Sizing, and p95 Budgets

A practical playbook for cutting LLM cost and tail latency — caching, model right-sizing, prompt trimming, and enforced p95 budgets — without losing quality.

3m read· AgentsCamp
Guide

LLM Gateways Compared: Portkey vs Helicone vs LiteLLM for Caching & Cost Control

How Portkey, Helicone, and LiteLLM compare for caching, cost control, and observability — each one's 2026 status and which fits self-hosted vs. hosted.

4m read· AgentsCamp
Tool

Helicone

Open-source LLM observability and AI gateway with one-line integration — logging, tracing, caching, and cost/latency tracking across providers.

open sourceobservability
Tool

LiteLLM

Call 100+ LLM APIs with one OpenAI-format interface — as a Python library or a self-hosted gateway/proxy.

open sourcesdk
Tool

Portkey

An AI gateway and LLMOps platform: route to many LLMs through one API with caching, retries, fallbacks, load balancing, guardrails, and full observability.

freemiumplatform
Command

Set Perf Budget

Define and enforce a cost and latency budget for an LLM feature or endpoint — set p95/p99 latency and cost-per-request ceilings, instrument to measure them against real traffic, and wire a check that fails when the budget is breached.

/set-perf-budget<the LLM endpoint/feature to budget, plus any target numbers (e.g. 'chat API, p95 < 2s, < $0.02/req')>