agentscamp

AI Safety & Security — AI Agents, Skills & Tools

Agents, skills, guides, tools, and commands for AI safety & security: 9 curated resources for building with AI coding agents.

Agent

Prompt Injection Auditor

Use this agent to audit an LLM app or agent for prompt-injection exposure — mapping where untrusted content enters the model's context (user, RAG, tools, web), assessing the blast radius if an injection succeeds, probing with adversarial inputs, and recommending architectural mitigations. Examples — "audit our RAG agent for indirect prompt injection", "what's the blast radius if our agent gets injected — which tools and credentials are exposed?", "review our LLM app's trust boundaries and tell us what to fix".

sonnet4
Skill

LLM Guardrails Designer

Design input and output guardrails for an LLM app — decide what to check (injection patterns, PII, secrets, policy, schema, leakage, toxicity), place them as input vs. output rails, implement with a library like NeMo Guardrails or LLM Guard, and fail closed. Use when adding a safety/validation layer around an LLM, not relying on the prompt alone.

invocable · v1.0.0
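
To make the "input vs. output rails" and "fail closed" ideas concrete, here is a minimal sketch in plain Python. It does not use the NeMo Guardrails or LLM Guard APIs; the names `check_injection`, `check_secrets`, `passes`, and `guarded_call`, along with both regex patterns, are hypothetical and chosen only for illustration. Real rails would need far broader detection than two patterns.

```python
import re

BLOCKED_MESSAGE = "Request blocked by safety checks."


def check_injection(text):
    """Hypothetical input rail: approve text with no obvious override phrase."""
    pattern = r"ignore (all )?(previous|prior) instructions"
    return re.search(pattern, text, re.I) is None


def check_secrets(text):
    """Hypothetical output rail: approve text with no API-key-like string."""
    return re.search(r"\bsk-[A-Za-z0-9]{16,}\b", text) is None


def passes(rails, text):
    """Return True only if every rail explicitly approves the text."""
    for rail in rails:
        try:
            if rail(text) is not True:
                return False
        except Exception:
            # Fail closed: a check that crashes counts as a rejection.
            return False
    return True


def guarded_call(model, prompt, input_rails, output_rails):
    """Run input rails, call the model, then run output rails."""
    if not passes(input_rails, prompt):
        return BLOCKED_MESSAGE
    response = model(prompt)
    if not passes(output_rails, response):
        return BLOCKED_MESSAGE
    return response
```

The design choice worth noting is in `passes`: anything other than an explicit `True`, including an exception inside a check, blocks the request, so a broken guardrail cannot silently let traffic through.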
Skill

Prompt PII Redactor

Detect and redact PII and secrets from prompts (and logs/traces) before they reach an LLM provider — mask or tokenize emails, phone numbers, names, IDs, and API keys, reversibly where the response needs the real values back. Use when sending user or document data to a third-party model, or when LLM request logs may capture sensitive data.

invocable · v1.0.0
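
Reversible tokenization, as described for this skill, can be sketched as follows. This is an illustrative example only, not the skill's actual implementation: the `PATTERNS` table, the `<LABEL_n>` placeholder format, and the `redact` and `restore` functions are all assumptions, and two regexes are nowhere near complete PII coverage (names and IDs, for instance, usually need an NER model or a dedicated library).

```python
import re

# Illustrative patterns only; real PII detection needs much broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Replace matches with placeholders; return (redacted_text, mapping)."""
    mapping = {}   # placeholder -> original value
    counters = {}  # label -> number of placeholders issued so far
    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the placeholder when the same value appears again.
            for token, original in mapping.items():
                if original == value:
                    return token
            counters[label] = counters.get(label, 0) + 1
            token = f"<{label}_{counters[label]}>"
            mapping[token] = value
            return token
        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into a model response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

In use, the redacted text is what goes to the provider (and into logs), while `mapping` stays on your side so the response can be restored:

```python
redacted, mapping = redact("Mail alice@example.com or call +1 415 555 0100.")
# redacted == "Mail <EMAIL_1> or call <PHONE_1>."
reply = restore("I will contact <EMAIL_1> today.", mapping)
```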
Guide

Defending Against Prompt Injection: A Practical Guide for LLM Apps

Prompt injection can't be solved at the model layer — so you defend in depth: trust boundaries, least privilege, human approval, guardrails, and red-teaming.

5m read · AgentsCamp
Guide

Securing AI Agents: The OWASP Agentic Top 10 in Practice

Agents add risks LLM-app security misses — autonomy, tools, memory, multi-agent trust. The key OWASP agentic threats and how to mitigate each in practice.

4m read · AgentsCamp
Tool

LLM Guard

An open-source security toolkit of input and output scanners for LLM apps — prompt injection, PII/anonymize, secrets, toxicity, and more, from Protect AI.

open source · sdk
Tool

NeMo Guardrails

NVIDIA's open-source toolkit for adding programmable guardrails to LLM apps — input, dialog, retrieval, and output rails defined in the Colang language.

open source · sdk
Tool

promptfoo

An open-source CLI for testing, comparing, and red-teaming LLM prompts, models, and apps.

open source · evaluation
Command

Red Team LLM

Red-team an LLM app or agent for prompt injection, jailbreaks, and data leakage — probe the real attack surface (input, RAG, tools, system prompt) with adversarial inputs and report what got through and how to fix it.

/red-team-llm <the app/endpoint/agent to test, or a description of its inputs, tools, and data>