n8n tutorial - Lesson 07: Build an AI Email Classifier with n8n

Hi everyone, in this post we're building two production-grade email automation pipelines in n8n: an AI classifier that labels incoming Gmail messages every 15 minutes, and an AI draft reply generator triggered on demand. This is Session 7 of the n8n Workflow Automation Tutorial series, and it's one of the densest, most practical sessions yet.

How to do it:

Step 1 — Create the Gmail Labels for Classification

Before building either workflow, you need the destination labels ready in Gmail so the Gmail Add Label nodes have valid targets.

Run your existing T2-B3-Setup-Labels workflow (or create the labels manually in Gmail) to ensure these six labels exist: n8n/Quan-trọng (Important), n8n/Chờ-trả-lời (Awaiting reply), n8n/Hoá-đơn (Invoice), n8n/Newsletter, n8n/Spam-marketing, n8n/Đã-draft (Drafted).
In Gmail, go to Settings → See all settings → Labels and confirm each label is set to Show.
Note: the label counter next to each label shows unread count only, not total messages — so n8n/Đã-draft may show zero even after messages are labeled, because drafted emails are typically read.

Note — If you already ran T2-B3-Setup-Labels in a previous session, just run it one more time with the new n8n/Đã-draft entry added. You don't need to recreate the other five labels.
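
If you want to double-check which labels are still missing before moving on, the check is just a set difference. The sketch below is only an illustration: `existing` stands in for whatever label names your Gmail account currently has, and `missing_labels` is a hypothetical helper, not something n8n provides.

```python
# The six labels this lesson expects (names taken from Step 1).
REQUIRED_LABELS = [
    "n8n/Quan-trọng",
    "n8n/Chờ-trả-lời",
    "n8n/Hoá-đơn",
    "n8n/Newsletter",
    "n8n/Spam-marketing",
    "n8n/Đã-draft",
]


def missing_labels(existing: list[str]) -> list[str]:
    """Return the required labels that do not exist yet, in the original order."""
    present = set(existing)
    return [name for name in REQUIRED_LABELS if name not in present]


# Assumed situation: the first five labels were created in an earlier session.
existing = REQUIRED_LABELS[:5]
print(missing_labels(existing))
```

In that assumed situation only n8n/Đã-draft is reported, which matches the note above: you add one entry and rerun the setup workflow.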

Step 2 — Build the Email Classifier Workflow (T2-B4-Email-Classifier)

This 10-node workflow runs on a 15-minute schedule and automatically labels every new unread email using Claude AI — the core of your n8n email automation setup.

Create a new workflow named T2-B4-Email-Classifier.
Add a Schedule Trigger node. Set the interval to 15 minutes.
Add a Gmail → Get Many node with these settings:
- Query: is:unread -label:n8n/Quan-trọng -label:n8n/Chờ-trả-lời -label:n8n/Hoá-đơn -label:n8n/Newsletter -label:n8n/Spam-marketing
- Simplify: ON
- Limit: 10 (or your preferred batch size)
Add an Edit Fields node named "Filter 4 Core Fields". Map only these four fields: id, from, subject, snippet.
Add a Basic LLM Chain node named "Classify Email" with these settings:
- Model: Claude Haiku 4.5
- Temperature: 0
- Prompt format: XML-style with 5 few-shot examples
- Attach an Output Parser targeting the field {category}
Add a second Edit Fields node named "Merge ID + Category". Map both id (from the Filter node) and category (from the LLM output) into one item.
Add a Switch node named "Route by Category" with 5 output branches:
- Branch 1: category equals Quan-trọng
- Branch 2: category equals Chờ-trả-lời
- Branch 3: category equals Hoá-đơn
- Branch 4: category equals Newsletter
- Branch 5: category equals Spam-marketing
- Enable Fallback output for unmatched items.
Connect each branch to its corresponding Gmail → Add Label node (5 nodes total), passing the id field as the message identifier.
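
To make the routing logic concrete, here is a minimal sketch of what the Switch node plus the five Add Label nodes do together. It is plain Python, not n8n code, and it assumes exact, case-sensitive matching on the category value; the `route` helper and the "fallback" marker are names invented for this illustration.

```python
# Category value produced by the classifier -> Gmail label to apply.
CATEGORY_TO_LABEL = {
    "Quan-trọng": "n8n/Quan-trọng",
    "Chờ-trả-lời": "n8n/Chờ-trả-lời",
    "Hoá-đơn": "n8n/Hoá-đơn",
    "Newsletter": "n8n/Newsletter",
    "Spam-marketing": "n8n/Spam-marketing",
}

FALLBACK = "fallback"  # stands in for the Switch node's Fallback output


def route(item: dict) -> str:
    """Pick the label for an item shaped like {"id": ..., "category": ...}."""
    return CATEGORY_TO_LABEL.get(item.get("category", ""), FALLBACK)


items = [
    {"id": "a1", "category": "Newsletter"},
    {"id": "a2", "category": "newsletter"},  # wrong casing: no branch matches
]
for item in items:
    print(item["id"], "->", route(item))
```

The second item shows why the Fallback output is worth enabling: anything the model returns outside the five exact values lands there instead of disappearing.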

Tip — Do NOT activate this workflow immediately after testing. Run it manually for 1–2 days first and verify in Gmail that no personal emails are incorrectly labeled as Spam-marketing. Only activate scheduled execution after you're confident in the AI classification accuracy.

Step 3 — Understand How Gmail's Filter + Limit Works

A common misconception about the Gmail Get Many node is how filtering and the Limit setting interact — getting this wrong leads to confusing results.

The Gmail node does NOT fetch emails first and then filter them on n8n's side.
The query string (e.g., is:unread -label:n8n/Quan-trọng) is sent to Gmail's server, which filters the entire mailbox first, builds a matching pool, then returns the first N items up to your Limit.
So setting Limit to 10 with a 5-label exclusion filter returns up to 10 emails that match all conditions (exactly 10 whenever at least 10 match), not "10 emails minus excluded ones." This is how server-side queries generally behave: a SQL query, for example, applies its WHERE clause before LIMIT.

Note — If you expect 9 results because "10 minus 1 excluded," you're thinking of client-side filtering. Gmail's -label: syntax is server-side — the excluded emails never enter the result pool at all.
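
The difference between the two mental models is easy to see in a small simulation. This is a toy model in plain Python, not Gmail's actual implementation: the inbox is an invented list of 12 unread messages, and the first one already carries n8n/Newsletter.

```python
EXCLUDED = {"n8n/Newsletter"}

# Invented inbox: 12 unread messages, the first one is already labeled.
inbox = [
    {"id": f"m{i}", "unread": True, "labels": {"n8n/Newsletter"} if i == 0 else set()}
    for i in range(12)
]


def matches(msg: dict) -> bool:
    """Equivalent of the query: is:unread -label:n8n/Newsletter"""
    return msg["unread"] and not (msg["labels"] & EXCLUDED)


def server_side(messages: list[dict], limit: int) -> list[dict]:
    """Filter the whole mailbox first, then cut to the limit."""
    return [m for m in messages if matches(m)][:limit]


def client_side(messages: list[dict], limit: int) -> list[dict]:
    """Cut to the limit first, then filter (the mistaken mental model)."""
    return [m for m in messages[:limit] if matches(m)]


print(len(server_side(inbox, 10)))  # 10
print(len(client_side(inbox, 10)))  # 9
```

The server-side version returns a full batch of 10 because the labeled message never enters the pool; the client-side version loses one slot to it.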

Step 4 — Configure the AI Prompt with Output Parser (Not Just Prompt Instructions)

Getting the LLM to return clean, structured JSON requires more than telling it "return JSON only" — this is a critical lesson for any n8n workflow automation project.

In the Basic LLM Chain node, write a system prompt that:
- Defines the 5 valid category values
- Uses XML-style formatting to separate examples clearly
- Includes 5 few-shot examples showing input (subject + snippet) → output (category value only)
Even with a clear prompt saying "return only the category word, no JSON, no explanation," Claude Haiku 4.5 may still wrap its output in a Markdown json code fence and append an explanation paragraph. This was observed on 10/10 test emails.
The reliable fix: always attach an Output Parser to the LLM Chain node. The Output Parser enforces the schema and strips all markdown wrapping automatically.
Rule: if the LLM output needs to feed into a downstream node (Switch, Edit Fields, etc.), always use Output Parser — never rely on prompt instructions alone.
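
As an illustration of the prompt structure described above, here is one way to assemble an XML-style prompt with few-shot examples. Treat it as a sketch: the tag names, the instruction sentence, and the sample email are assumptions made for this example, not the exact prompt used in the workflow.

```python
from xml.sax.saxutils import escape

CATEGORIES = ["Quan-trọng", "Chờ-trả-lời", "Hoá-đơn", "Newsletter", "Spam-marketing"]


def build_prompt(examples: list[dict], subject: str, snippet: str) -> str:
    """Assemble an XML-style classification prompt with few-shot examples."""
    lines = [
        "Classify the email into exactly one category.",
        "<categories>" + ", ".join(CATEGORIES) + "</categories>",
        "<examples>",
    ]
    for ex in examples:
        if ex["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {ex['category']}")
        lines += [
            "  <example>",
            f"    <subject>{escape(ex['subject'])}</subject>",
            f"    <snippet>{escape(ex['snippet'])}</snippet>",
            f"    <category>{ex['category']}</category>",
            "  </example>",
        ]
    lines += [
        "</examples>",
        "<email>",
        f"  <subject>{escape(subject)}</subject>",
        f"  <snippet>{escape(snippet)}</snippet>",
        "</email>",
    ]
    return "\n".join(lines)


# Invented sample data; replace with five examples of your own.
examples = [
    {"subject": "Weekly digest", "snippet": "Top stories & updates", "category": "Newsletter"},
]
prompt = build_prompt(examples, "Invoice #1024", "Your payment is due")
print(prompt)
```

Escaping the subject and snippet keeps characters such as & or < in real emails from breaking the tag structure the model sees.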

Production tip — "Prompt-only" output control (Method 1) is unreliable even with temperature=0 and explicit instructions. Output Parser (Method 2) is the only approach you should use in production workflows where downstream nodes depend on structured data.
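
To see why schema enforcement matters, it helps to look at what a parser has to do with noisy model output. The sketch below is a rough stand-in written for this post: it strips a Markdown code fence, parses the JSON, and rejects any category outside the five allowed values. It is not how n8n's Output Parser is implemented, and `parse_category` is a hypothetical name.

```python
import json
import re

VALID_CATEGORIES = {"Quan-trọng", "Chờ-trả-lời", "Hoá-đơn", "Newsletter", "Spam-marketing"}

FENCE = "`" * 3  # three backticks, built this way to keep this example readable
FENCED = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE), re.DOTALL
)


def parse_category(raw: str) -> str:
    """Extract and validate the category from raw model output."""
    match = FENCED.search(raw)
    payload = match.group(1) if match else raw.strip()
    data = json.loads(payload)  # raises a ValueError subclass on non-JSON text
    category = data.get("category") if isinstance(data, dict) else None
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    return category


# Output shaped like the failure described above: fenced JSON plus an explanation.
noisy = (
    FENCE + 'json\n{"category": "Newsletter"}\n' + FENCE
    + "\n\nExplanation: looks like a mailing list."
)
print(parse_category(noisy))  # Newsletter
```

Failing loudly on an unexpected value is the point: a downstream Switch node should never receive a category it does not know.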

Step 5 — Understand the Snippet Field vs. Full Email Body

When you set Simplify to ON in the Gmail node, the available content fields change — knowing which field to use for AI classification prevents silent accuracy failures.

The three content fields Gmail returns are:
- snippet — preview text, approximately 200 characters (50–100 tokens). Available with Simplify ON.
- text — full plain-text body. Requires Simplify OFF.
- html — full HTML body. Requires Simplify OFF.
snippet is sufficient for classification in most cases, but fails in three scenarios:
- Emails where the subject is generic (e.g., "Hi") and the key context is in the body
- Invoices and receipts where the amount/vendor appears mid-body
- Newsletters where the snippet is just a logo alt-text or header
Fallback strategy: if snippet-based classification proves unreliable for these cases, switch Simplify to OFF and use text instead. Remember to re-check your field mappings in all downstream Edit Fields nodes after toggling Simplify.

Note — Every time you toggle the Simplify switch on a Gmail node, click Execute Step and re-inspect the JSON tab. The available fields change significantly between ON and OFF states.
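
If you do end up working with Simplify OFF, one possible pattern is to prefer the snippet and fall back to the full text only when the snippet is too thin to classify. The sketch below assumes each item carries both fields; the 40-character threshold and the 1,000-character cap are arbitrary values chosen for illustration, and `classification_text` is a hypothetical helper.

```python
def classification_text(email: dict, min_chars: int = 40, max_chars: int = 1000) -> str:
    """Pick the text to send to the classifier.

    Uses the snippet when it looks substantial enough, otherwise falls back
    to the (capped) plain-text body, if one is available.
    """
    snippet = (email.get("snippet") or "").strip()
    if len(snippet) >= min_chars:
        return snippet
    body = (email.get("text") or "").strip()
    return body[:max_chars] if body else snippet


# Invented example: a generic snippet with the real context in the body.
email = {
    "subject": "Hi",
    "snippet": "Hi there,",
    "text": "Hi there, could you send the signed contract back by Friday?",
}
print(classification_text(email))
```

Capping the body keeps token usage predictable, since full email bodies can be far longer than a snippet.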

Step 6 — Build the Draft Reply Workflow (T2-B5-Email-Drafter)

This 7-node workflow runs manually and generates AI-written draft replies for all emails labeled n8n/Chờ-trả-lời (Awaiting reply), then marks them as drafted to prevent duplicates. This is a key n8n email automation pattern.

Create a new workflow named T2-B5-Email-Drafter.
Add a Manual Trigger node (do not use Schedule — you want full control over when drafts are generated).
Add a Gmail → Get Many node with these settings:
- Query: label:n8n/Chờ-trả-lời -label:n8n/Đã-draft
- Simplify: OFF (required to get threadId and text)
- Limit: 5
Add an Edit Fields node named "Filter Fields for Draft". Map these six fields: id, threadId, from_name, from_email, subject, text.
Add a Basic LLM Chain node named "AI Draft Reply" with these settings:
- Model: Claude Haiku 4.5
- Temperature: 0.7
- Max tokens: 1024
- 5 few-shot examples showing real email → reply pairs in your writing style
- Output Parser targeting {subject_reply, body_reply}
Add a Gmail → Create Draft node:
- Subject: map from subject_reply
- Message: map from body_reply
- Thread ID: set to {{ $('Filter Fields for Draft').item.json.threadId }} — use threadId, NOT id
Add a Gmail → Add Label node:
- Label: n8n/Đã-draft
- Message ID: use the original email's id (from the Edit Fields node), NOT the newly created draft's ID

Production tip — The -label:n8n/Đã-draft exclusion in the Gmail Get Many query above is your deduplication guard. Run the workflow a second time after the first run: it should return 0 items because all processed emails are already labeled. If it still returns emails, check that the Add Label node is targeting the original email id correctly.
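
The deduplication guard can be simulated in a few lines. This is a toy model, not the workflow itself: the mailbox is an invented list of three messages, draft creation is left out, and `run_drafter` is a hypothetical stand-in for one manual run.

```python
AWAITING = "n8n/Chờ-trả-lời"
DRAFTED = "n8n/Đã-draft"

# Invented mailbox: three messages waiting for a reply.
mailbox = [{"id": f"m{i}", "labels": {AWAITING}} for i in range(3)]


def pending(messages: list[dict]) -> list[dict]:
    """Equivalent of the query: label:n8n/Chờ-trả-lời -label:n8n/Đã-draft"""
    return [m for m in messages if AWAITING in m["labels"] and DRAFTED not in m["labels"]]


def run_drafter(messages: list[dict], limit: int = 5) -> int:
    """One manual run: fetch a batch, (pretend to) draft, then label the originals."""
    batch = pending(messages)[:limit]
    for msg in batch:
        # Draft creation is omitted; labeling the ORIGINAL message is what matters.
        msg["labels"].add(DRAFTED)
    return len(batch)


print(run_drafter(mailbox))  # 3
print(run_drafter(mailbox))  # 0
```

The state lives entirely in the labels, so there is nothing extra to store or reset between runs.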

Step 7 — Fix the id vs. threadId Production Bug

Using the wrong ID field when creating Gmail drafts is a subtle bug that causes replies to appear as new conversations instead of continuing the thread — catch it before going to production.

Understand the Gmail data model:
- Every individual email message has its own unique id
- All messages in the same conversation share one threadId
If you pass id into the Gmail Create Draft node's Thread ID field, Gmail may create the draft as a standalone new message rather than a reply in the existing thread.
The correct approach:
- In Edit Fields "Filter Fields for Draft", always map threadId as a separate field
- In Gmail Create Draft, use {{ $('Filter Fields for Draft').item.json.threadId }} for the Thread ID field
- Keep id mapped separately — you still need it for the Add Label node

Note — This is a database-design-level insight: one thread, many messages, one shared threadId. Always verify your draft is attached to the correct thread in Gmail by opening the draft and confirming it appears inside the original conversation.
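
For readers curious about what sits underneath, the sketch below builds a request body in the shape the Gmail REST API accepts for creating a draft: a base64url-encoded message plus the threadId of the conversation. The n8n node handles this for you, so this is background only. The sample IDs and addresses are invented, and `build_draft_body` is a hypothetical helper.

```python
import base64
from email.message import EmailMessage


def build_draft_body(original: dict, subject_reply: str, body_reply: str) -> dict:
    """Build a draft payload that stays inside the original conversation."""
    msg = EmailMessage()
    msg["To"] = original["from_email"]
    msg["Subject"] = subject_reply
    msg.set_content(body_reply)
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    # threadId ties the draft to the conversation; the message id is NOT used here.
    return {"message": {"raw": raw, "threadId": original["threadId"]}}


# Invented sample item, shaped like the output of "Filter Fields for Draft".
original = {"id": "m-7", "threadId": "t-100", "from_email": "anna@example.com"}
body = build_draft_body(original, "Re: Quote", "Thanks, will do.")
print(body["message"]["threadId"])  # t-100
```

Keeping id and threadId as separate fields on the item makes it hard to mix them up: one goes to Create Draft, the other to Add Label.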

Step 8 — Handle No-Reply Emails Gracefully

AI will attempt to draft replies to no-reply addresses unless you filter them out — here are three approaches, listed from simplest to most robust.

Method 1 — Gmail filter (simplest): Add -from:noreply to your Gmail Get Many query string. This is fast and prevents no-reply emails from entering the pipeline at all.
Method 2 — AI decision flag: Add a should_reply boolean field to the LLM output schema. If the AI returns false, route that item to a No Operation branch instead of Create Draft.
Method 3 — Pre-classify in the classifier workflow: In T2-B4-Email-Classifier, ensure that automated system emails (security alerts, receipts from noreply addresses) are classified into Hoá-đơn or Spam-marketing, not Chờ-trả-lời. This prevents them from ever reaching the drafter workflow.
Production recommendation: combine Method 1 + Method 3. Method 1 is a safety net; Method 3 is the correct architectural fix.

Note — In testing, when AI received a Google Security Alert email and the schema required a body_reply output, the model correctly recognized it couldn't reply — but instead of an error, it output instructions addressed to the user explaining why no reply was needed. The schema was satisfied but the output was wrong. This is why filtering upstream is safer than relying on AI judgment for structural pipeline decisions.
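
Here is a small sketch of how an address check and the should_reply flag from Method 2 could work together as a guard before Create Draft. It is illustrative Python, not an n8n node: the address pattern covers only a few common spellings and is an assumption, and `next_node` is a hypothetical helper.

```python
import re

# Matches local parts such as noreply@, no-reply@, no_reply@, donotreply@, do-not-reply@
NO_REPLY = re.compile(r"^(no|do[-_.]?not)[-_.]?reply@", re.IGNORECASE)


def is_no_reply(address: str) -> bool:
    """Return True when the sender address looks like a no-reply mailbox."""
    return bool(NO_REPLY.match(address.strip()))


def next_node(item: dict) -> str:
    """Decide where an item goes after the LLM step."""
    if is_no_reply(item["from_email"]) or not item.get("should_reply", False):
        return "No Operation"
    return "Create Draft"


print(next_node({"from_email": "no-reply@accounts.example.com", "should_reply": True}))
print(next_node({"from_email": "anna@example.com", "should_reply": True}))
```

Note that the address check wins even when the model sets should_reply to true, which reflects the recommendation above: structural decisions are made deterministically, and the AI flag is only a second opinion.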

Step 9 — Debug Workflows Using Execute Step Correctly

The Execute Step and Execute Workflow buttons behave differently in n8n — misunderstanding this causes confusing debug results.

Execute Step on any non-trigger node traces backwards through all upstream nodes and runs the full pipeline up to and including that node. It does NOT fan out to sibling branches.
If you click Execute Step on the Gmail Add Label "Newsletter" node and it shows 0 items processed, that is correct behavior — it means no emails were routed to that branch, not that the node failed.
Execute Workflow (labeled Test Workflow in some n8n versions) runs the entire workflow from the trigger node downward, including all branches.
When testing with a Schedule Trigger, clicking Execute Workflow bypasses the schedule condition and emits a [{}] placeholder item so the rest of the workflow can run. Event-based triggers such as Webhook behave differently: they wait for a test event instead.
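
The "traces backwards, does not fan out" behavior is a plain upstream walk over the workflow graph. The sketch below models that idea only and says nothing about n8n's internals. The edge list mirrors the classifier workflow with two of the five label branches, and the Add Label node names are invented for the example.

```python
def upstream(edges: dict[str, list[str]], target: str) -> set[str]:
    """Return the target plus every node that feeds into it, directly or indirectly."""
    parents: dict[str, list[str]] = {}
    for source, destinations in edges.items():
        for destination in destinations:
            parents.setdefault(destination, []).append(source)

    seen = {target}
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


edges = {
    "Schedule Trigger": ["Gmail Get Many"],
    "Gmail Get Many": ["Filter 4 Core Fields"],
    "Filter 4 Core Fields": ["Classify Email"],
    "Classify Email": ["Merge ID + Category"],
    "Merge ID + Category": ["Route by Category"],
    "Route by Category": ["Add Label: Newsletter", "Add Label: Hoá-đơn"],
}

to_run = upstream(edges, "Add Label: Newsletter")
print(sorted(to_run))
```

The sibling node "Add Label: Hoá-đơn" is not in the result, which is why its panel stays empty when you execute a single step on the Newsletter branch.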

Step 10 — Customize the Few-Shot Examples in the Draft Workflow

The AI draft quality depends directly on the few-shot examples you provide — replace placeholder templates with your real emails for best results.

In the Basic LLM Chain "AI Draft Reply" node, locate the 5 few-shot example pairs currently using placeholder templates.
Replace each pair with a real email you received + the actual reply you sent, formatted as:
- Input: sender name, subject line, email body (shortened)
- Output: your actual reply subject and body
Choose examples that cover different tones and scenarios: a quick acknowledgment, a detailed answer, a polite decline, a follow-up, and a scheduling reply — this gives the AI range to match context.
After replacing examples, run the workflow manually on 2–3 test emails and verify the tone and style match your writing before using it regularly.

Tip — This is the difference between Option B (generic template few-shots) and Option A (your real email pairs). Option A gives you drafts that sound like you wrote them. The time investment of collecting 5 real pairs is small compared to the quality improvement.
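
If you keep your real email and reply pairs in a small list, formatting them consistently becomes mechanical. The sketch below shows one possible layout: the tag names, the 300-character cap on the incoming body, and the sample pair are all assumptions for illustration, and `format_pair` is a hypothetical helper.

```python
import textwrap


def format_pair(pair: dict, max_chars: int = 300) -> str:
    """Format one real email and reply pair as an XML-style few-shot example."""
    # Collapse whitespace and shorten the incoming body; keep the reply intact.
    body = textwrap.shorten(pair["body"], width=max_chars, placeholder=" ...")
    return "\n".join([
        "<example>",
        f"  <from>{pair['sender']}</from>",
        f"  <subject>{pair['subject']}</subject>",
        f"  <body>{body}</body>",
        f"  <reply_subject>{pair['reply_subject']}</reply_subject>",
        f"  <reply_body>{pair['reply_body']}</reply_body>",
        "</example>",
    ])


# Invented pair; use five of your own covering different tones.
pair = {
    "sender": "Anna",
    "subject": "Meeting next week?",
    "body": "Hi, are you free on Tuesday afternoon to go over the proposal?",
    "reply_subject": "Re: Meeting next week?",
    "reply_body": "Thanks, Tuesday works.",
}
print(format_pair(pair))
```

Only the incoming email is shortened. The reply is left untouched because it is the part the model imitates.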

Key Lessons from This Session

Always use Output Parser when LLM output feeds a downstream node. Prompt-only JSON control is unreliable even with temperature=0 and explicit instructions — verified across 10 emails with Claude Haiku 4.5.
Gmail server-side filtering and client-side filtering are completely different. The -label: query filters before the Limit is applied, so Limit=10 returns up to 10 matching items (a full 10 whenever at least 10 match), never "10 minus excluded."
Use threadId, not id, for the Gmail Create Draft Thread ID field. Each message has a unique id; all messages in a conversation share one threadId.
Deduplicate with label exclusion filters, not workflow state. The -label:n8n/Đã-draft query pattern is simple, reliable, and self-documenting.
Filter no-reply emails at the Gmail query level, not the AI level. Combine -from:noreply in the query with correct classification in the upstream classifier workflow.
Toggle Simplify ON/OFF only if you then re-inspect all downstream field mappings. Simplify changes which fields are available, and skipping the re-check silently breaks Edit Fields nodes.
Execute Step traces upstream; it does not run sibling branches. Zero items on a Switch branch means no emails were routed there — not a node failure.
Do not activate scheduled workflows until you manually verify AI accuracy for 1–2 days. A misclassified personal email in Spam-marketing is a production incident that erodes trust in the automation.

Conclusion:

In this session, you built two complete n8n email automation pipelines: a 10-node AI email classifier running on a 15-minute schedule, and a 7-node AI draft reply generator with deduplication. Along the way, you resolved real production-grade issues — Output Parser enforcement, the id vs. threadId bug, no-reply edge cases, and Gmail's server-side filter behavior — all of which apply to any advanced n8n workflow automation project. In the next session, we'll complete the trilogy by adding a daily 6 PM digest that summarizes important emails and sends them to Telegram, then combine all three pipelines into one unified AI email assistant system.

If you have any questions, feel free to leave a comment below. Thank you!

Tags: n8n email automation, n8n tutorial, n8n workflow automation, Gmail AI classifier, n8n Claude integration, email draft automation, n8n Output Parser, n8n beginner to advanced

QTitHow