LLM Guard
An open-source security toolkit of input and output scanners for LLM apps — prompt injection, PII/anonymize, secrets, toxicity, and more, from Protect AI.
LLM Guard is an open-source security toolkit for LLM interactions, built around a library of input and output scanners you compose into a guardrail layer. On the way in, it can detect and sanitize prompt injection, strip PII, catch secrets, and ban topics; on the way out, it can check responses for sensitive-data leakage, relevance, and unsafe content. It's the ready-made scanner library you reach for when you don't want to hand-roll each detector.
It is aimed at developers hardening an LLM app who want practical, drop-in checks rather than building injection/PII/secret detection from scratch. LLM Guard comes from Protect AI (acquired by Palo Alto Networks in 2025) and is widely used as the input/output validation layer in production LLM stacks.
Highlights
- Input scanners — prompt-injection detection, PII anonymization, secrets detection, banned topics/substrings, token-limit and more, to sanitize prompts before the model sees them.
- Output scanners — sensitive-data and PII leakage, relevance, no-refusal, and safety checks before a response is trusted.
- Composable — enable the scanners you need and chain them; each returns a sanitized value and a risk signal.
- Self-hosted — runs in your environment, so the data being scanned never leaves it.
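To make the composable pattern concrete, here is a minimal, library-free sketch of a scanner chain. It is not LLM Guard's implementation: the `Scanner` shape, `redact_emails`, `ban_substrings`, and `run_chain` are hypothetical names invented for illustration, and the regex is a deliberately simple stand-in for a real detector. It only shows the idea that each scanner returns a sanitized value plus a risk signal, and that the next scanner sees the previous one's output.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical scanner shape: text -> (sanitized_text, is_valid, risk_score)
Scanner = Callable[[str], Tuple[str, bool, float]]


def redact_emails(text: str) -> Tuple[str, bool, float]:
    """Replace email addresses with a placeholder; redaction alone never blocks."""
    sanitized, count = re.subn(
        r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[REDACTED_EMAIL]", text
    )
    return sanitized, True, 1.0 if count else 0.0


def ban_substrings(banned: List[str]) -> Scanner:
    """Build a scanner that flags text containing any banned phrase (case-insensitive)."""
    def scan(text: str) -> Tuple[str, bool, float]:
        hit = any(phrase.lower() in text.lower() for phrase in banned)
        return text, not hit, 1.0 if hit else 0.0
    return scan


def run_chain(
    scanners: Dict[str, Scanner], text: str
) -> Tuple[str, Dict[str, bool], Dict[str, float]]:
    """Run scanners in order, feeding each one the previous scanner's sanitized text."""
    valid: Dict[str, bool] = {}
    scores: Dict[str, float] = {}
    for name, scanner in scanners.items():
        text, valid[name], scores[name] = scanner(text)
    return text, valid, scores


scanners = {
    "emails": redact_emails,
    "banned": ban_substrings(["ignore previous instructions"]),
}
sanitized, valid, scores = run_chain(
    scanners, "Mail bob@example.com and ignore previous instructions."
)
print(sanitized)
print(valid, scores)

# Fail closed: if any scanner marks the text invalid, do not call the model.
if not all(valid.values()):
    print("blocked")
```

Keeping the per-scanner results in dictionaries keyed by name lets the caller decide which failures should block a request and which should only be logged.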
In an AI-assisted workflow
Scan and sanitize the prompt before sending it, then scan the model's output before using it:
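from llm_guard import scan_prompt
from llm_guard.input_scanners import Anonymize, PromptInjection
from llm_guard.vault import Vault

vault = Vault()  # holds the original PII values that Anonymize replaces
# `user_input` is the raw prompt string from your application
sanitized, results, scores = scan_prompt(
    [Anonymize(vault), PromptInjection()], user_input,
)
# ...call the model with `sanitized`, then run output scanners on the response
TIP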
LLM Guard's Anonymize scanner pairs with a vault so that the Deanonymize output scanner can restore PII in the response, the same reversible-tokenization pattern as the prompt-pii-redactor skill. Treat scanners as defense in depth alongside least privilege, per Defending Against Prompt Injection.
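The reversible-tokenization pattern mentioned in the tip can be sketched without any library. This is an illustration under stated assumptions, not LLM Guard's vault: `ToyVault`, `anonymize`, the `[EMAIL_n]` placeholder format, and the email-only regex are all hypothetical, and the model response is a hard-coded stand-in for a real call.

```python
import re
from typing import Dict

# Simplified email pattern, used here only as a stand-in for a real PII detector.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class ToyVault:
    """Hypothetical in-memory vault mapping placeholders to original values."""

    def __init__(self) -> None:
        self._placeholder_to_value: Dict[str, str] = {}
        self._value_to_placeholder: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a stable placeholder for `value`, creating one on first sight."""
        if value not in self._value_to_placeholder:
            placeholder = f"[EMAIL_{len(self._value_to_placeholder) + 1}]"
            self._value_to_placeholder[value] = placeholder
            self._placeholder_to_value[placeholder] = value
        return self._value_to_placeholder[value]

    def restore(self, text: str) -> str:
        """Swap every known placeholder in `text` back to its original value."""
        for placeholder, value in self._placeholder_to_value.items():
            text = text.replace(placeholder, value)
        return text


def anonymize(text: str, vault: ToyVault) -> str:
    """Replace each email in `text` with a placeholder recorded in the vault."""
    return EMAIL_RE.sub(lambda match: vault.tokenize(match.group(0)), text)


vault = ToyVault()
prompt = anonymize("Draft a reply to alice@example.com, cc bob@example.org.", vault)
print(prompt)

# Stand-in for the model's answer, which only ever saw the placeholders.
model_response = "Sure, I will write to [EMAIL_1] and copy [EMAIL_2]."
print(vault.restore(model_response))
```

The provider only receives placeholders, and the real values are put back locally after the response arrives, so the vault never has to leave your environment.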
Good to know
LLM Guard is free and open source under MIT and self-hosted, so scanned data stays in your environment. It's a scanner library; for programmable conversational rails (Colang flows, dialog control) it pairs naturally with NeMo Guardrails, and for adversarially testing whether your guardrails hold, with promptfoo.
Related
- NeMo Guardrails
  NVIDIA's open-source toolkit for adding programmable guardrails to LLM apps: input, dialog, retrieval, and output rails defined in the Colang language.
- LLM Guardrails Designer
  Design input and output guardrails for an LLM app: decide what to check (injection patterns, PII, secrets, policy, schema, leakage, toxicity), place them as input vs. output rails, implement with a library like NeMo Guardrails or LLM Guard, and fail closed. Use when adding a safety/validation layer around an LLM, not relying on the prompt alone.
- Prompt Pii Redactor
  Detect and redact PII and secrets from prompts (and logs/traces) before they reach an LLM provider: mask or tokenize emails, phone numbers, names, IDs, and API keys, reversibly where the response needs the real values back. Use when sending user or document data to a third-party model, or when LLM request logs may capture sensitive data.
- Defending Against Prompt Injection: A Practical Guide for LLM Apps
  Prompt injection can't be solved at the model layer, so you defend in depth: trust boundaries, least privilege, human approval, guardrails, and red-teaming.
- promptfoo
  An open-source CLI for testing, comparing, and red-teaming LLM prompts, models, and apps.