NeMo Guardrails
NVIDIA's open-source toolkit for adding programmable guardrails to LLM apps — input, dialog, retrieval, and output rails defined in the Colang language.
NeMo Guardrails is an open-source toolkit from NVIDIA for adding programmable guardrails to LLM-based applications. Instead of trusting a system prompt to keep a model on-topic and safe, you define explicit rails — rules that run at specific points in the request/response flow — to constrain what the model sees, says, and does. It's a structured way to build the input/output validation layer that prompt-injection and safety defenses depend on.
It is aimed at developers who want enforceable boundaries around an LLM app: keeping a conversation on allowed topics, detecting jailbreak/injection attempts, moderating output, and checking responses against policy or for hallucination. Rails are authored in Colang, NeMo's modeling language for conversational guardrails, and configured alongside your app.
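To make the idea of rails running at fixed points in the request/response flow concrete, here is a minimal plain-Python sketch of the control flow. It does not use the NeMo Guardrails API: `check_input`, `check_output`, `fake_model`, and `guarded_generate` are hypothetical names, and the keyword checks are toy stand-ins for real detectors.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."


def check_input(text: str) -> bool:
    """Toy input rail: reject one obvious injection phrase (illustrative only)."""
    return "ignore previous instructions" not in text.lower()


def check_output(text: str) -> bool:
    """Toy output rail: block replies that contain a marker string."""
    return "SECRET" not in text


def guarded_generate(user_input: str, model: Callable[[str], str]) -> str:
    # The input rail runs before the model ever sees the text.
    if not check_input(user_input):
        return REFUSAL
    reply = model(user_input)
    # The output rail runs before the reply reaches the user.
    if not check_output(reply):
        return REFUSAL
    return reply


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call so the sketch runs offline."""
    return "SECRET token" if "token" in prompt else f"Echo: {prompt}"
```

Both checks refuse when they fail rather than passing the text through, which is the fail-closed behavior a validation layer usually wants.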
Highlights
- Multiple rail types — input rails (e.g. jailbreak/injection detection), dialog rails (keep the conversation in bounds), retrieval rails (filter retrieved context), and output rails (moderation, fact-checking, policy).
- Colang — a purpose-built language for expressing conversational flows and guardrail logic declaratively.
- Composable checks — combine built-in and custom checks, and integrate third-party safety models/scanners as rails.
- Framework-friendly — works alongside common LLM app stacks and providers as a wrapping safety layer.
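As a rough illustration of what a dialog rail can look like, the sketch below writes a small config directory from Python: a `config.yml` naming the main model and a Colang file with one topic rail. The flow uses Colang 1.0 style syntax (newer Colang versions may differ), and the engine, model name, file name `topics.co`, and example utterances are all assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

CONFIG_YML = """\
models:
  - type: main
    engine: openai          # assumption: swap in whichever provider you use
    model: gpt-4o-mini      # assumption: placeholder model name
"""

# A dialog rail: example user utterances, a canned bot reply, and the flow
# that connects them.
COLANG_FLOW = """\
define user ask about politics
  "What do you think about the election?"
  "Which party should I vote for?"

define bot refuse politics
  "I can't discuss politics, but I'm happy to help with product questions."

define flow politics
  user ask about politics
  bot refuse politics
"""


def write_config(directory: Path) -> Path:
    """Write the model config and the Colang flow into one directory."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "config.yml").write_text(CONFIG_YML, encoding="utf-8")
    (directory / "topics.co").write_text(COLANG_FLOW, encoding="utf-8")
    return directory


config_dir = write_config(Path(tempfile.mkdtemp()) / "config")
```

A directory like this is the kind of path that `RailsConfig.from_path` loads; running it against a real model still requires provider credentials.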
In an AI-assisted workflow
Wrap your app with a guardrails config that defines the rails, then route calls through it:
```python
from nemoguardrails import RailsConfig, LLMRails

config = RailsConfig.from_path("./config")  # rails + Colang flows live here
rails = LLMRails(config)

user_input = "How do I reset my password?"  # example input
response = rails.generate(messages=[{"role": "user", "content": user_input}])
# input/dialog/output rails run around the model call
print(response["content"])  # the assistant message returned after the rails ran
```

TIP: Guardrails are defense in depth, not prevention. Pair NeMo Guardrails with least privilege and human approval for high-impact actions (see Defending Against Prompt Injection). Design which rails you actually need with the llm-guardrails-designer skill.
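The human-approval part of that advice sits outside the guardrails library, so it may help to see it as a separate gate. The following is a minimal sketch under stated assumptions: the action names, `Action`, `execute`, and `deny_all` are hypothetical and not part of NeMo Guardrails.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action names that should never run without sign-off.
HIGH_IMPACT = {"delete_records", "send_payment"}


@dataclass
class Action:
    name: str
    argument: str


def execute(action: Action, approve: Callable[[Action], bool]) -> str:
    """Run low-impact actions directly; require approval for high-impact ones."""
    if action.name in HIGH_IMPACT and not approve(action):
        return f"blocked: {action.name}"
    return f"executed: {action.name}"


def deny_all(action: Action) -> bool:
    """Stand-in approver; a real one would ask a human reviewer."""
    return False
```

Keeping the gate independent of the rails means a missed injection still cannot trigger a high-impact action on its own.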
Good to know
NeMo Guardrails is free and open source under Apache-2.0 and runs as a Python layer around your LLM app. It's strongest on programmable conversational rails; for a ready-made library of input/output scanners (PII, secrets, prompt injection, toxicity), LLM Guard is complementary — many teams use both.
Related
- LLM Guardrails Designer
  Design input and output guardrails for an LLM app: decide what to check (injection patterns, PII, secrets, policy, schema, leakage, toxicity), place them as input vs. output rails, implement with a library like NeMo Guardrails or LLM Guard, and fail closed. Use when adding a safety/validation layer around an LLM, not relying on the prompt alone.
- LLM Guard
  An open-source security toolkit of input and output scanners for LLM apps from Protect AI, covering prompt injection, PII/anonymize, secrets, toxicity, and more.
- Defending Against Prompt Injection: A Practical Guide for LLM Apps
  Prompt injection can't be solved at the model layer, so you defend in depth: trust boundaries, least privilege, human approval, guardrails, and red-teaming.
- Securing AI Agents: The OWASP Agentic Top 10 in Practice
  Agents add risks LLM-app security misses: autonomy, tools, memory, multi-agent trust. The key OWASP agentic threats and how to mitigate each in practice.