Prompt PII Redactor
Detect and redact PII and secrets from prompts (and logs/traces) before they reach an LLM provider — mask or tokenize emails, phone numbers, names, IDs, and API keys, reversibly where the response needs the real values back. Use when sending user or document data to a third-party model, or when LLM request logs may capture sensitive data.
Install to ~/.claude/skills/prompt-pii-redactor/SKILL.md
Strips PII and secrets from prompts (and logs/traces) before they leave for an LLM provider: it detects emails, phones, names, IDs, and API keys and masks or tokenizes them — reversibly when the response needs the originals restored — so sensitive data isn't sent to a third party or captured in logs.
Every prompt you send to a hosted model leaves your environment, and every request you log may persist sensitive data. This skill puts a redaction layer in front of that boundary: it detects PII and secrets in outgoing prompts (and in traces/logs), masks or tokenizes them before they're sent, and — where the model's answer needs the real values — restores them on the way back. The goal is that third parties and log stores never see data they shouldn't.
When to use this skill
- Sending user messages or document content to a third-party LLM API where PII/secrets shouldn't leave your environment.
- LLM request/response logging or tracing that could capture sensitive data in plaintext.
- A compliance or data-residency requirement to minimize personal data sent to or stored by external services.
Instructions
- Define what's sensitive here. Enumerate the categories that matter for this app and jurisdiction: direct identifiers (names, emails, phones, addresses), government/financial IDs (SSN, card numbers), and secrets (API keys, tokens, credentials). Don't over-redact data the task genuinely needs — redaction that breaks the use case gets turned off.
- Detect with layered methods. Combine high-precision pattern/format detection (regexes and validators for emails, cards, keys) with NER/model-based detection for free-form PII (names, locations). A library such as LLM Guard, with its Anonymize and Secrets scanners, covers much of this; tune it to your data.
- Choose mask vs. reversible tokenize. For data the model never needs in the clear, mask it with an irreversible placeholder. For data the response must reference or return, tokenize reversibly: replace each value with a stable placeholder, keep the placeholder-to-value map in a vault held only in your environment, and re-insert the originals into the model's output.
- Apply at the boundary, in both directions. Redact the request before it leaves for the provider, and de-tokenize the response if you tokenized. Apply the same redaction to anything written to logs/traces, which are an equally common leak path.
- Verify and measure. Test against representative data for both misses (sensitive data that slipped through) and over-redaction (broke the task), and log redaction counts (not the values) so coverage is auditable.
- State the residual risk. Detection is imperfect — novel formats and contextual PII evade detectors. Note what's covered and recommend pairing with least-data-collection and provider data-handling controls (no-retention/zero-retention options) rather than relying on redaction alone.
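The detection and mask-vs-tokenize steps above can be sketched roughly as follows. This is a minimal illustration using only regexes, not a complete detector: the patterns, the `sk-` key format, the placeholder shapes, and the names `PATTERNS`, `MASK_ONLY`, `redact`, and `restore` are all assumptions made for the example. Free-form PII such as names would still need an NER or model-based layer on top.

```python
import re

# Illustrative patterns only; real deployments need broader, validated detectors.
# Order matters: secrets are replaced first so later patterns cannot match inside them.
PATTERNS = {
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),  # assumed key format
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)"),
}
# Categories the model never needs in the clear are masked irreversibly.
MASK_ONLY = {"API_KEY"}


def redact(text):
    """Return (redacted_text, vault, counts); vault maps placeholder -> original."""
    vault = {}
    seen = {}    # original value -> placeholder, so repeats get the same token
    counts = {}  # per-category counts, safe to log (no values)

    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            value = match.group(0)
            counts[label] = counts.get(label, 0) + 1
            if label in MASK_ONLY:
                return f"[{label}]"
            if value not in seen:
                seen[value] = f"<{label}_{len(vault) + 1}>"
                vault[seen[value]] = value
            return seen[value]

        text = pattern.sub(replace, text)
    return text, vault, counts


def restore(text, vault):
    """Re-insert originals for reversible placeholders; masked values stay masked."""
    for placeholder, original in vault.items():
        text = text.replace(placeholder, original)
    return text
```

Returning per-category counts alongside the redacted text gives the audit signal described above without ever logging the values themselves.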
WARNING
Reversible tokenization only protects data if the placeholder-to-value mapping stays in your environment and never appears in the prompt. If you send the model a key to reverse the tokens, you have sent the data, which defeats the point. Keep the vault server-side and re-insert originals only after the response returns.
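One way to keep the vault out of the prompt is to wrap the provider call so that only placeholders cross the boundary. The sketch below assumes an email-only tokenizer for brevity, and `call_model` is a hypothetical callable (string in, string out) standing in for whatever provider client you use; neither is part of any specific SDK.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenize(text):
    """Swap each email for a placeholder; return the text and a local-only vault."""
    vault = {}  # placeholder -> original, never sent anywhere

    def swap(match):
        value = match.group(0)
        # Reuse the existing placeholder if this value was already seen.
        for placeholder, original in vault.items():
            if original == value:
                return placeholder
        placeholder = f"<EMAIL_{len(vault) + 1}>"
        vault[placeholder] = value
        return placeholder

    return EMAIL.sub(swap, text), vault


def guarded_call(prompt, call_model):
    """call_model is a hypothetical str -> str wrapper around your provider client."""
    safe_prompt, vault = tokenize(prompt)
    reply = call_model(safe_prompt)  # only placeholders leave the environment
    for placeholder, original in vault.items():
        reply = reply.replace(placeholder, original)  # restore on the way back
    return reply
```

The vault exists only as a local variable inside `guarded_call`, so there is no code path that could serialize it into the request.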
NOTE
Don't forget the logs. Teams redact the prompt to the provider but log the raw request for debugging — and the sensitive data lands in the log store anyway. Redact on the way to logs/traces too, or scrub at the logging layer.
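Scrubbing at the logging layer could look like the sketch below, which uses a standard-library `logging.Filter`. The two patterns and the `sk-` key format are assumptions for illustration, and the logger name `llm.requests` is hypothetical. The filter is attached to the handler rather than the logger so that records propagated from child loggers are also covered.

```python
import logging
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")  # assumed key format


class RedactingFilter(logging.Filter):
    """Rewrite each record's message so sensitive values never reach the log store."""

    def filter(self, record):
        message = record.getMessage()  # merge args into the final text first
        message = SECRET.sub("[API_KEY]", message)
        message = EMAIL.sub("[EMAIL]", message)
        record.msg = message
        record.args = ()  # args are already merged into msg
        return True       # keep the record, now redacted


handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logger = logging.getLogger("llm.requests")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

This only covers the formatted message. Structured fields passed through `extra`, exception tracebacks, and tracing spans would need their own scrubbing.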
Output
A redaction layer applied at the LLM boundary: the sensitive-data categories handled, the detection methods, the mask-vs-reversible-tokenize decisions, request/response and logging integration, and a coverage check (misses and over-redaction) — plus a clear statement of residual risk and the complementary controls (data minimization, provider no-retention) it should sit alongside.
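The coverage check can be sketched as a small harness over labeled samples. The sample format (`text`, `must_hide`, `must_keep`) and the function names here are assumptions for illustration, and `naive_redact` is a deliberately crude stand-in for your real redactor, chosen so that it produces both kinds of failure.

```python
import re


def coverage(redact, samples):
    """Count misses (sensitive strings that survive) and over-redactions
    (task-critical strings that were removed) across labeled samples."""
    report = {"misses": 0, "over_redactions": 0, "samples": len(samples)}
    for sample in samples:
        out = redact(sample["text"])
        report["misses"] += sum(1 for s in sample["must_hide"] if s in out)
        report["over_redactions"] += sum(1 for s in sample["must_keep"] if s not in out)
    return report


def naive_redact(text):
    # Stand-in redactor: hides every run of 4+ digits and nothing else.
    return re.sub(r"\d{4,}", "[NUM]", text)


samples = [
    {"text": "Order 20231105 for card 4111111111111111",
     "must_hide": ["4111111111111111"], "must_keep": ["20231105"]},
    {"text": "Contact jane@example.com",
     "must_hide": ["jane@example.com"], "must_keep": ["Contact"]},
]
report = coverage(naive_redact, samples)
```

Here the stand-in redactor hides the order number the task needs (one over-redaction) and lets the email through (one miss), which is the kind of signal to track as detectors change.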
Related
- LLM Guardrails Designer. Design input and output guardrails for an LLM app: decide what to check (injection patterns, PII, secrets, policy, schema, leakage, toxicity), place them as input vs. output rails, implement with a library like NeMo Guardrails or LLM Guard, and fail closed. Use when adding a safety/validation layer around an LLM, not relying on the prompt alone.
- LLM Guard. An open-source security toolkit from Protect AI with input and output scanners for LLM apps: prompt injection, PII/anonymize, secrets, toxicity, and more.
- Defending Against Prompt Injection: A Practical Guide for LLM Apps. Prompt injection can't be solved at the model layer, so you defend in depth: trust boundaries, least privilege, human approval, guardrails, and red-teaming.
- Secret Scanner. Scan a repo or a diff for committed secrets (API keys, tokens, private keys, .env files, and high-entropy strings), then triage real leaks from fixtures. Use before pushing, in review, or when a credential may have leaked.