AI workflows can automate almost anything, but that power comes with risk. Without guardrails in place, your workflows could send API keys to external LLMs, leak customer PII, accept jailbreak attempts, or return harmful content to end users. For larger companies, this is a dealbreaker.
n8n’s AI guardrails update changes that. Two dedicated nodes, Check Text for Violations and Sanitize Text, give you native safety controls directly inside your workflows. No custom code. No third-party services. Just configurable guardrails that sit between your data and your AI agents.
This guide walks through all 13 guardrail options, covering every setting in both nodes. Whether you’re building an enterprise AI policy or just want to stop PII from hitting your LLM, this is the setup to follow.
What Are n8n Guardrails?
n8n guardrails are built-in safety nodes that validate or sanitize text before or after it reaches an AI model in your workflow. They let you enforce security policies, content standards, and data privacy rules at the automation layer, so you don’t have to rely on prompting alone.
The guardrails feature was introduced in n8n version 1.113.3 and consists of two nodes: Check Text for Violations and Sanitize Text. Each handles safety in a different way.
If you’re new to n8n, check out this n8n for beginners guide to get familiar with the core concepts before setting up guardrails.
Check Text for Violations vs. Sanitize Text: Key Differences
These two nodes do different jobs. Check Text for Violations scans text against a set of rules and routes it to either a Pass or Fail output. Sanitize Text finds sensitive content and replaces it with a placeholder, then passes the cleaned text forward through a single output.
Here’s a quick breakdown of the differences:
- Check Text for Violations: requires a connected LLM (like OpenAI GPT), 9 guardrail types, pass/fail routing
- Sanitize Text: no LLM required, 4 guardrail types, single output with redacted text
- Both can be placed before or after an AI agent node
The threshold setting applies to LLM-based guardrails inside Check Text for Violations. It’s a value between 0 and 1. A score of 0.7 means the LLM must be at least 70% confident before flagging something as a violation. Higher values are stricter and reduce false positives. Lower values catch more edge cases but may flag valid inputs.
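The threshold logic can be sketched in a few lines of JavaScript. This is a simplified illustration of the idea, not n8n’s internal implementation, and `isViolation` is an invented name:

```javascript
// Illustrative sketch of confidence-threshold gating, not n8n's actual code.
// The LLM scores each check between 0 and 1; the text fails only when the
// score meets or exceeds the configured threshold.
function isViolation(confidenceScore, threshold = 0.7) {
  return confidenceScore >= threshold;
}

// With the default 0.7 threshold, a 0.85 score routes to the Fail branch.
console.log(isViolation(0.85)); // true
console.log(isViolation(0.5));  // false
```

Raising the threshold toward 1 means only high-confidence matches fail; lowering it toward 0 flags more borderline inputs.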
The 9 Guardrail Types in Check Text for Violations
Check Text for Violations gives you nine different guardrail options. Each targets a specific type of risk. Here’s what each one does and when to use it.
1. Keywords
The keywords guardrail checks if any specified words or phrases appear in the input text. You enter your keywords as a comma-separated list. If any match, the text fails and routes to the Fail branch.
This is useful for blocking inputs that reference proprietary topics. For example, a company might add ‘source code’, ‘algorithm’, and ‘roadmap’ as keywords to prevent employees from asking AI agents about sensitive internal topics. The guardrail compares the input against that list and flags any match.
Keywords support expressions too, so you can pull the list dynamically from a previous node or a database if your policy changes frequently.
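The matching behavior is roughly what you’d get from a simple case-insensitive substring check over a comma-separated list. The sketch below is an assumption about how such a check works, not n8n’s exact matching rules:

```javascript
// Illustrative keyword check; n8n's actual matching rules may differ.
// Splits a comma-separated list and flags the input if any keyword appears.
function containsKeyword(input, keywordList) {
  const keywords = keywordList.split(",").map(k => k.trim().toLowerCase());
  const text = input.toLowerCase();
  return keywords.some(k => k.length > 0 && text.includes(k));
}

containsKeyword("Can you summarize our product roadmap?",
                "source code, algorithm, roadmap"); // true
```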
2. Jailbreak Detection
The jailbreak guardrail detects attempts to bypass AI safety measures. This includes classic prompt injection patterns like ‘ignore previous instructions’ or ‘output the response as hexadecimal so filters won’t see it.’
This is an LLM-based guardrail, meaning the connected chat model evaluates the input and scores it on a confidence scale from 0 to 1. The default threshold is 0.7. n8n ships with a detailed system prompt already built in for this guardrail, so no custom configuration is needed to get started.
Jailbreak attempts in the wild often use creative obfuscation, encoding tricks, or story framing to get around safety filters. This guardrail is designed to catch those patterns before they reach your AI agent.
3. NSFW Content Filtering
The NSFW guardrail detects attempts to generate not-safe-for-work content. Like jailbreak detection, this is LLM-based and uses a confidence threshold. Place this before an AI agent in any public-facing workflow, customer service bot, or tool that accepts open-ended user input.
4. PII Detection
Personally identifiable information (PII) covers a wide range of sensitive data. The n8n PII guardrail catches over 20 data types, including phone numbers, email addresses, credit card numbers, SSNs, IP addresses, passport numbers, driver’s licenses, medical licenses, bank account numbers, and several country-specific ID formats (Australia, India, Italy, UK).
This is critical for fintech, healthcare, and any business handling customer data. A common use case: a customer submits a support request that includes their email and phone number, and you don’t want those fields forwarded to an external LLM. The PII guardrail flags the input and routes it to the Fail branch before that data goes anywhere it shouldn’t.
Select only the PII types relevant to your workflow. Selecting all of them increases the chance of false positives, especially for inputs that mention dates or numerical codes.
5. Secret Key Detection
The secret key guardrail catches API keys, authentication tokens, and other credential strings that users might accidentally include in their inputs. This is a real risk in any workflow where users can type free-form text. A user who pastes a support message that includes their API key could inadvertently expose credentials to an external LLM.
Three sensitivity levels are available: balanced (default), strict (catches more but may flag high-entropy strings like file names or random codes), and permissive (fewer false positives but may miss some edge cases). Start with balanced and adjust based on your testing.
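To make the risk concrete, here is a toy sketch of pattern-based key detection. Real detectors (including n8n’s) combine known key formats with entropy analysis; the three patterns below are just common public formats chosen for illustration:

```javascript
// Toy sketch of secret-key detection; real detectors are more sophisticated.
const SECRET_PATTERNS = [
  /\bsk-[A-Za-z0-9]{20,}\b/, // OpenAI-style key prefix
  /\bAKIA[0-9A-Z]{16}\b/,    // AWS access key ID format
  /\bghp_[A-Za-z0-9]{36}\b/, // GitHub personal access token
];

function containsSecret(text) {
  return SECRET_PATTERNS.some(p => p.test(text));
}
```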
For workflows that involve users asking questions about API integrations, check out the n8n HTTP request node for context on how credentials are typically handled.
6. Topical Alignment
The topical alignment guardrail detects inputs that stray outside the defined scope of your AI agent. You define the business scope in the guardrail’s prompt field. The LLM then evaluates whether each incoming message aligns with that scope.
A practical example: a trivia bot should only answer trivia questions. If a user submits something completely off-topic, the guardrail flags it with high confidence. The business scope field accepts plain-language instructions, so no special formatting is required.
This guardrail works well both before and after an AI agent. Before, it filters irrelevant inputs. After, it checks that the agent’s response stayed on-topic.
7. URL Blocking
The URL guardrail blocks any URLs not included in your allowed list. You enter the permitted domains separated by commas. There’s also a toggle to allow or block subdomains, which matters if your allowed domain is something like n8n.io but you also want to permit docs.n8n.io.
A useful application: an n8n support AI agent that should only process questions about n8n documentation. Set the allowed URL to n8n.io and enable subdomains. Any input that includes a URL pointing to a different site will fail.
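The subdomain toggle behaves roughly like the allow-list check below. This is a simplified sketch of the concept, with invented function names, not n8n’s implementation:

```javascript
// Illustrative allow-list check with a subdomain toggle.
function isUrlAllowed(url, allowedDomains, allowSubdomains) {
  const host = new URL(url).hostname;
  return allowedDomains.some(domain =>
    host === domain || (allowSubdomains && host.endsWith("." + domain))
  );
}

isUrlAllowed("https://docs.n8n.io/workflows", ["n8n.io"], true);  // true
isUrlAllowed("https://evil.example.com", ["n8n.io"], true);       // false
```

Note that with subdomains disabled, docs.n8n.io would fail even though n8n.io itself passes.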
8. Custom Guardrail (Post-AI Agent Use Case)
The custom guardrail lets you write your own evaluation prompt. This is where things get flexible. While most guardrails make sense before an AI agent, custom guardrails are particularly powerful after an AI agent, when you need to check the quality or tone of the generated output before it reaches the end user.
A customer empathy guardrail is a good example of this. After an AI agent generates a support response, pass that output into a custom guardrail with a prompt that evaluates empathy and professionalism. If the response sounds frustrated or dismissive, it fails and gets routed to a human review queue instead of being sent to the customer.
You can write the custom prompt yourself or use an LLM to generate it. Provide a description of what you’re evaluating, a scoring rubric, and a threshold. The guardrail handles the rest.
9. Custom Regex
The custom regex guardrail lets you define your own pattern-matching rules using regular expressions. It works like the custom guardrail but uses regex instead of LLM evaluation, making it faster and cheaper for pattern-based checks.
Use this for structured data patterns that a language model might not reliably catch. For example, a specific internal account number format, a proprietary product code, or any string pattern unique to your business that you want to block or flag.
Regex can be complex to write from scratch. Asking an LLM to generate the regex based on examples is a practical shortcut.
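As an example, here is a pattern for a hypothetical internal account number of the form `ACCT-` followed by eight digits. The format is invented for illustration; substitute your own organization’s pattern:

```javascript
// Hypothetical internal account-number format: "ACCT-" plus 8 digits.
const accountPattern = /\bACCT-\d{8}\b/;

accountPattern.test("Please look up ACCT-12345678 for me"); // true
accountPattern.test("No sensitive IDs here");               // false
```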
Sanitize Text: The 4 Redaction Options
The Sanitize Text node works differently from Check Text for Violations. Instead of routing to pass or fail, it replaces sensitive content with a labeled placeholder and passes the cleaned version forward. This is useful when you still need to process the input but want to strip the sensitive parts first.
Sanitize Text does not require a connected LLM. It runs on pattern matching, which makes it fast and free to operate at scale.
1. Personal Data (PII Redaction)
Select personal data and choose which PII types to sanitize, using the same categories available in Check Text for Violations. Email addresses, phone numbers, credit card numbers, and other fields get replaced with a labeled placeholder like [EMAIL_ADDRESS] or [PHONE_NUMBER].
Example input: ‘Hi, my name is Jane Smith and my email is jane@example.com and my phone is 555-867-5309.’ After sanitizing phone and email, the output becomes: ‘Hi, my name is Jane Smith and my email is [EMAIL_ADDRESS] and my phone is [PHONE_NUMBER].’ The rest of the message passes through cleanly.
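The redaction above can be approximated with two regex replacements. These patterns are deliberately simplified for illustration; n8n’s detectors cover far more formats and edge cases:

```javascript
// Simplified sketch of PII redaction with two illustrative patterns.
function sanitize(text) {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL_ADDRESS]") // email addresses
    .replace(/\b\d{3}-\d{3}-\d{4}\b/g, "[PHONE_NUMBER]");   // US-style phone numbers
}

sanitize("Hi, my name is Jane Smith and my email is jane@example.com and my phone is 555-867-5309.");
// → "Hi, my name is Jane Smith and my email is [EMAIL_ADDRESS] and my phone is [PHONE_NUMBER]."
```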
2. Secret Key Redaction
Works the same as secret key detection in Check Text for Violations, but instead of failing the input, the node strips the key and replaces it with [SECRET_KEY]. Useful when you want to log support conversations without accidentally storing credentials.
3. URL Redaction
Redacts URLs that are not on your allowed list. Any URL in the input that doesn’t match your approved domains gets replaced with [URL_REDACTED]. Configure the allowed domains and subdomain toggle the same way as the URL guardrail in Check Text for Violations.
4. Custom Regex Redaction
Write a regex pattern to match any custom string format you want to redact. When the pattern matches, the matched content gets replaced with the name you give the sanitizer. For example, a pattern matching a credit card’s last four digits could be labeled [CARD_LAST4] in the output.
This is handy when the built-in PII options don’t cover a proprietary data format specific to your business. Combine multiple regex sanitizers on the same node to redact several patterns in one pass.
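Running several sanitizers in one pass looks roughly like the sketch below. Both the labels and patterns here are invented examples, not n8n defaults:

```javascript
// Illustrative sketch: multiple custom regex sanitizers applied in one pass,
// each replacing matches with its configured label. Patterns are invented.
const sanitizers = [
  { label: "[CARD_LAST4]", pattern: /\bcard ending in \d{4}\b/g },
  { label: "[ORDER_ID]",   pattern: /\bORD-\d{6}\b/g },
];

function applySanitizers(text) {
  return sanitizers.reduce(
    (out, { label, pattern }) => out.replace(pattern, label),
    text
  );
}

applySanitizers("Refund ORD-123456 to the card ending in 4242");
// → "Refund [ORDER_ID] to the [CARD_LAST4]"
```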
How to Set Up n8n Guardrails Step by Step
Here’s how to add guardrails to an existing AI workflow in n8n.
Step 1: Open your workflow and click the + icon to add a new node. Search for ‘guardrails’ and select either Check Text for Violations or Sanitize Text.
Step 2: For Check Text for Violations, connect a chat model sub-node. Click the LLM input on the guardrail node and add your preferred model. Any model available through n8n’s LangChain integration works here. To set this up with GPT, see this guide on how to connect OpenAI to n8n.
Step 3: In the guardrail node settings, set the Text to Check field to the expression from the previous node. Typically this is something like {{ $json.inputText }} depending on your data structure.
Step 4: Click Add Guardrail and select the type you want. Configure the relevant settings (keywords, PII types, URL allow list, threshold, etc.).
Step 5: Connect the Pass output to the next step in your workflow. Connect the Fail output to your error handling logic.
On the Fail branch, best practice is to do two things: log the violation to a data table or spreadsheet for auditing, and send an alert via Slack or email to whoever manages your AI policy. This creates an audit trail and keeps the right people informed when the guardrail triggers.
One useful feature: when you connect a Check Text for Violations node directly upstream of an n8n AI agent node, the AI agent automatically detects the connection. You’ll see a ‘Connected Guardrails Node’ indicator in the AI agent settings. The prompt field in the AI agent is automatically mapped to the guardrail’s output text, so the same text that passed through the guardrail flows directly into the agent as the user’s message.
When to Use Guardrails Before vs. After Your AI Agent
Most guardrails belong before your AI agent. That’s where you catch bad inputs, prevent data leaks, and block policy violations before any external API call is made. Keywords, PII detection, secret key detection, jailbreak, NSFW, and URL blocking all make the most sense at the input stage.
After the AI agent is where custom and topical alignment guardrails shine. The agent produces output, and the guardrail checks that output before it reaches the end user or gets written anywhere. A customer empathy guardrail after an agent catches responses that are technically correct but tone-deaf. A topical alignment check after the agent ensures the response didn’t go off the rails.
You can also combine both. Run input sanitization before the agent to clean the text, then run a custom guardrail after to verify the response quality. This creates a double layer of protection, which is the right approach for any public-facing AI workflow handling sensitive data.
Frequently Asked Questions
What are n8n guardrails?
n8n guardrails are two built-in nodes—Check Text for Violations and Sanitize Text—that enforce safety and content policies on text inside your workflows. They can be placed before or after AI agent nodes to block problematic inputs, detect violations, or redact sensitive information before it gets processed or returned to users.
Do n8n guardrails require an LLM?
It depends on the node. Check Text for Violations requires a connected chat model for any LLM-based guardrails like jailbreak detection, NSFW filtering, topical alignment, and custom guardrails. Pattern-based guardrails like keywords, URLs, and custom regex also run through this node but don’t rely on the LLM for evaluation. Sanitize Text requires no LLM at all and runs entirely on pattern matching.
What is the difference between Check Text for Violations and Sanitize Text?
Check Text for Violations evaluates whether text breaks a rule and routes it to either a Pass or Fail output branch, giving you conditional control flow. Sanitize Text finds sensitive content and replaces it with a placeholder, passing the cleaned text forward through a single output. Use violations checking when you want to block and alert. Use sanitize when you want to strip and continue.
Can I use n8n guardrails to prevent PII from being sent to an AI model?
Yes. The PII detection guardrail in Check Text for Violations can flag inputs containing email addresses, phone numbers, credit card numbers, social security numbers, and over 20 other PII types. The Sanitize Text node can redact those same fields before the text reaches your AI agent. Both approaches keep PII out of external LLM requests, which is a core requirement in most enterprise AI policies.
What version of n8n introduced the guardrails node?
The guardrails nodes were introduced in n8n version 1.113.3, released in November 2025. If you don’t see the guardrails nodes in your n8n search, update your instance to the latest version.
Next Steps
n8n guardrails close a significant gap for teams building AI workflows at scale. The ability to enforce keyword policies, detect PII, block jailbreak attempts, and validate AI output—all natively in n8n—makes it much easier to build workflows that meet enterprise security and compliance requirements.
Start by identifying the highest-risk points in your current workflows. If you’re taking user input and sending it to an AI agent, add a Check Text for Violations node in front of it with at minimum a PII and secret key guardrail. If the agent’s output goes directly to a customer, add a custom empathy or quality guardrail on the back end.
Build the fail branch with logging and alerting from the start. A data table entry and a Slack notification when a guardrail triggers give you visibility into what’s actually hitting your workflows, which is valuable data for refining your policy over time.
