n8n tutorial - Lesson 17: Generate Video Metadata with AI in n8n

Hi everyone. In this session of the n8n Workflow Automation Tutorial series, we build a practical n8n video metadata generator that uses AI to produce a YouTube description, 15 hashtags, and timestamped sections, all saved automatically to a Google Sheet for easy copy-paste when uploading. This is Session 17 of the series, and it covers a clean 5-node manual workflow that you can run ad hoc whenever you have a new video ready.

How to do it:

Step 1 — Create the Google Sheet

Set up a destination Sheet before building any nodes so the Append step has somewhere to write.
  1. Create a new Google Sheet named T5-Video-Metadata (Sheet ID used in this session: 1gQ7qdsNyySUJnGL5wOeNQImg8-dnUHUHEj7Jqc2mHys).
  2. Inside that sheet, create a tab named Metadata with exactly 6 column headers in order:
    • video_title
    • content_summary
    • description
    • hashtags
    • timestamps
    • created_at
  3. Do not add a YouTube video ID column or any column formatted as Plain Text. This sheet is metadata-only, not a post-upload tracker.

Note — Keeping the schema minimal here is intentional. A separate upload-tracking workflow (Session 18) will extend this sheet with status and video_id columns later.
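
As a quick sanity check on the schema above, the expected header row can be written down once and compared against whatever is actually in row 1 of the Metadata tab. This is only a sketch: checkHeaders is a hypothetical helper, not part of n8n or Google Sheets, and it assumes you pass it the header cells as an array of strings.

```javascript
// Expected header row for the Metadata tab, in the order listed in Step 1.
const EXPECTED_HEADERS = [
  "video_title",
  "content_summary",
  "description",
  "hashtags",
  "timestamps",
  "created_at",
];

// Hypothetical helper: compares a header row (array of cell values)
// with the expected headers and reports what is missing or extra.
function checkHeaders(headerRow) {
  const cleaned = headerRow.map((h) => String(h).trim());
  const missing = EXPECTED_HEADERS.filter((h) => !cleaned.includes(h));
  const extra = cleaned.filter((h) => !EXPECTED_HEADERS.includes(h));
  const ok =
    cleaned.length === EXPECTED_HEADERS.length &&
    cleaned.every((h, i) => h === EXPECTED_HEADERS[i]);
  return { ok, missing, extra };
}

console.log(checkHeaders([...EXPECTED_HEADERS]));
console.log(checkHeaders(["video_title", "video_id"]));
```

The second call shows the kind of report you would get if a video_id column had been added by mistake.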

Step 2 — Add a Manual Trigger Node

This workflow is designed to run on demand, not on a schedule, so a Manual Trigger is the correct entry point.
  1. Open n8n and create a new workflow named T5-B5-Video-Metadata-Generator.
  2. Add a Manual Trigger node as the first node.
  3. Leave the workflow status as Inactive — you will execute it manually each time you have a new video, not via an automatic schedule.

Tip — Keeping ad-hoc utility workflows inactive prevents accidental triggers and quota waste. Run it only when you actually need metadata generated.

Step 3 — Configure the Set Input Node

The Set Input node (an Edit Fields node) holds the three input variables you edit each time you run the workflow.
  1. Add an Edit Fields node after the Manual Trigger and name it Set Input.
  2. Add three fields, each with its value in Fixed mode rather than Expression mode:
    • video_title — the title of the video you are about to upload
    • content_summary — a short summary of what the video covers
    • transcript_text — optional; paste raw transcript with time markers if you have them (leave blank if not)
  3. Before each run, manually edit these three values directly in this node, then click Execute Workflow.

Tip — The transcript_text field is optional by design. When it is empty, the AI still generates a full description and hashtags but returns an empty string for timestamps. When it contains markers like 0:30, 1:15, 2:00, the AI parses them into formatted timestamp sections automatically.
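
In this workflow the model does the timestamp parsing, so no code is required for it. Still, if you want to check whether a transcript actually contains usable markers before you run the workflow, a small sketch like the following may help. findTimeMarkers is a hypothetical helper, and the marker formats it accepts (M:SS, MM:SS, H:MM:SS) are an assumption based on the examples in the tip above.

```javascript
// Hypothetical helper: returns the time markers found in raw transcript text.
// An empty or missing transcript yields an empty list.
function findTimeMarkers(transcriptText) {
  if (!transcriptText || transcriptText.trim() === "") return [];
  // Optional hours part, then minutes, then a two-digit seconds part (00-59).
  const pattern = /\b(?:\d{1,2}:)?\d{1,2}:[0-5]\d\b/g;
  return transcriptText.match(pattern) ?? [];
}

const sampleTranscript = [
  "0:30 Introduction to the workflow",
  "1:15 Building the nodes",
  "2:00 Testing the result",
].join("\n");

console.log(findTimeMarkers(sampleTranscript));
console.log(findTimeMarkers(""));
```

If the helper returns an empty list for a transcript you expected to have markers, the timestamps field will most likely come back empty as well.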

Step 4 — Build the Generate Metadata AI Node

This is the core node — a Basic LLM Chain that calls Claude Haiku 3.5 and returns structured metadata via an Output Parser.
  1. Add a Basic LLM Chain node after Set Input and name it Generate Metadata.
  2. Set the model to Claude Haiku 3.5 (or the equivalent Haiku model available in your n8n AI credentials).
  3. Set these model parameters:
    • Temperature: 0.7
    • Max Tokens: 3000
  4. Write the prompt as four XML-tagged blocks:
    • language_rule — auto-detect: if title/summary is Vietnamese, output in Vietnamese; if English, output in English
    • description_structure — 3-paragraph format (hook paragraph, main content paragraph, call-to-action paragraph)
    • hashtag_rules — exactly 15 hashtags: 5 broad + 5 medium + 5 long-tail
    • few_shot examples — include a Vietnamese-language example pattern so the model handles both languages correctly
  5. Attach a Structured Output Parser with this schema:
    • description — string (200–300 words)
    • hashtags — array of 15 strings
    • timestamps — string (formatted timestamp list, or empty string)
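
The prompt layout and the parser schema described above could be sketched as follows. Treat this as an illustration only: the four tag names follow this session, but the wording inside each block, the few_shot tag spelling, the extra input section, and the buildPrompt function itself are assumptions. The schema is written as JSON Schema, which is one common way to describe the three output fields.

```javascript
// Hypothetical prompt builder. The wording inside each block is placeholder text.
function buildPrompt({ video_title, content_summary, transcript_text = "" }) {
  return [
    "<language_rule>",
    "If the title and summary are Vietnamese, answer in Vietnamese. If they are English, answer in English.",
    "</language_rule>",
    "<description_structure>",
    "Write 3 paragraphs: a hook, the main content, and a call to action (200-300 words in total).",
    "</description_structure>",
    "<hashtag_rules>",
    "Return exactly 15 hashtags: 5 broad, 5 medium, 5 long-tail.",
    "</hashtag_rules>",
    "<few_shot>",
    "(Paste one Vietnamese example of the expected output here.)",
    "</few_shot>",
    // Assumption: the inputs are appended in their own block after the rules.
    "<input>",
    `Title: ${video_title}`,
    `Summary: ${content_summary}`,
    `Transcript: ${transcript_text}`,
    "</input>",
  ].join("\n");
}

// The three output fields from Step 4, written as JSON Schema.
const metadataSchema = {
  type: "object",
  properties: {
    description: { type: "string", description: "YouTube description, 200-300 words" },
    hashtags: { type: "array", items: { type: "string" }, minItems: 15, maxItems: 15 },
    timestamps: { type: "string", description: "Formatted timestamp list, or an empty string" },
  },
  required: ["description", "hashtags", "timestamps"],
};

console.log(buildPrompt({ video_title: "Demo title", content_summary: "Demo summary" }));
console.log(JSON.stringify(metadataSchema, null, 2));
```

Inside the n8n node itself you would type the prompt as text and pull in the inputs with expressions such as {{ $json.video_title }} instead of JavaScript template strings.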

Note — Set Max Tokens to 3000 rather than leaving a lower default. Vietnamese text uses roughly 3× as many tokens as English text of similar length, so a description, 15 hashtags, and timestamps in Vietnamese can easily exceed a 1000-token limit and be cut off mid-output.
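
To see why a cut-off response is a problem for structured output, consider what happens when JSON text is truncated partway through. The sketch below uses plain JSON.parse; parseModelOutput and the sample strings are made up for illustration and are not how n8n's parser is implemented internally.

```javascript
// Hypothetical helper: tries to parse model output as JSON and reports failure
// instead of throwing.
function parseModelOutput(rawText) {
  try {
    return { ok: true, value: JSON.parse(rawText) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// A complete response, and the same response cut off after 30 characters,
// standing in for output that hit the Max Tokens limit.
const completeText = '{"description":"...","hashtags":["#n8n"],"timestamps":""}';
const truncatedText = completeText.slice(0, 30);

console.log(parseModelOutput(completeText).ok);
console.log(parseModelOutput(truncatedText).ok);
```

The truncated text is no longer valid JSON, so the parse fails outright rather than returning a shorter description.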

Step 5 — Build the Build Row Node

The Build Row node normalizes the AI output and the original inputs into exactly 6 columns before writing to the Sheet.
  1. Add an Edit Fields node after Generate Metadata and name it Build Row.
  2. Add 6 fields mapping to the 6 Sheet columns:
    • video_title → expression: {{ $('Set Input').item.json.video_title }}
    • content_summary → expression: {{ $('Set Input').item.json.content_summary }}
    • description → expression: {{ $json.output.description }}
    • hashtags → expression: {{ $json.output.hashtags.join(', ') }}
    • timestamps → expression: {{ $json.output.timestamps }}
    • created_at → expression: {{ $now.toISO() }}
  3. For the video_title and content_summary fields, use the cross-node reference $('Set Input'). The output of Generate Metadata contains only the parsed result under output, so the original input fields are no longer available in $json from the previous node.
  4. For the hashtags field, set the field Type explicitly to String — do not leave it as auto-detect.

Tip — This is the most common mistake in this workflow. The expression {{ $json.output.hashtags.join(', ') }} is correct, but if the field Type is left on auto-detect, n8n sees that the source is an array and sets Type to Array. The node then expects an array, receives a joined string, and throws a validation error. Set Type to String manually whenever you use .join(), .toString(), or any other expression that converts an array to a string.
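
The six mappings and the type issue can be reproduced in plain JavaScript. In this sketch, setInput and llmItem are hypothetical stand-ins for the Set Input item and the Generate Metadata output, and buildRow is an illustrative function rather than anything n8n provides.

```javascript
// Hypothetical stand-ins for the two upstream items.
const setInput = {
  video_title: "Demo title",
  content_summary: "Demo summary",
};
const llmItem = {
  output: {
    description: "Demo description",
    hashtags: ["#n8n", "#automation", "#ai"], // shortened list for readability
    timestamps: "",
  },
};

// Mirrors the six Build Row fields from Step 5.
function buildRow(input, json, now = new Date()) {
  return {
    video_title: input.video_title,
    content_summary: input.content_summary,
    description: json.output.description,
    hashtags: json.output.hashtags.join(", "), // array in, string out
    timestamps: json.output.timestamps,
    created_at: now.toISOString(),
  };
}

const row = buildRow(setInput, llmItem);
console.log(Array.isArray(llmItem.output.hashtags)); // the source is an array
console.log(typeof row.hashtags);                    // the joined value is a string
console.log(row);
```

The source value is an array while the joined value is a string, which is exactly the mismatch behind the validation error. Note that toISOString() always returns UTC, whereas $now in n8n is a Luxon DateTime and toISO() includes the workflow's timezone offset, so the created_at text will look slightly different.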

Step 6 — Append to the Google Sheet

The final node writes the normalized row to the T5-Video-Metadata Sheet.
  1. Add a Google Sheets node after Build Row and name it Append to Metadata.
  2. Set the operation to Append Row (a plain append, not Append or Update Row).
  3. Select the spreadsheet T5-Video-Metadata (ID 1gQ7qdsNyySUJnGL5wOeNQImg8-dnUHUHEj7Jqc2mHys) and the tab Metadata.
  4. Set Mapping Column Mode to Map Automatically. n8n then matches the 6 field names from Build Row to the 6 column headers by name.
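
Matching by name means the field names in Build Row must be spelled exactly like the column headers. The idea can be sketched like this; autoMapRow is a hypothetical illustration of name-based matching and is not n8n's actual implementation.

```javascript
// Hypothetical illustration: order an item's values by the sheet's header row.
// Headers with no matching field get an empty cell.
function autoMapRow(headers, item) {
  return headers.map((header) =>
    Object.prototype.hasOwnProperty.call(item, header) ? item[header] : ""
  );
}

const sheetHeaders = [
  "video_title", "content_summary", "description",
  "hashtags", "timestamps", "created_at",
];

// Field order in the item does not matter; only the names do.
const buildRowItem = {
  created_at: "2024-01-02T03:04:05.000Z",
  hashtags: "#n8n, #automation",
  video_title: "Demo title",
  content_summary: "Demo summary",
  description: "Demo description",
  timestamps: "",
};

console.log(autoMapRow(sheetHeaders, buildRowItem));
```

A field named Hashtags or video title would find no matching header in this sketch, which is why the six names in Step 5 are kept identical to the headers from Step 1.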

Step 7 — Test the Workflow with Two Scenarios

Run two test cases to verify all paths work before using this workflow for real videos.
  1. Test 1 — Empty transcript:
    • In Set Input, fill video_title and content_summary, and leave transcript_text blank.
    • Click Execute Workflow and verify: description is 200–300 words, 15 hashtags appear as a comma-separated string, and timestamps is an empty string "".
    • Check the Sheet — 1 new row should appear with all 6 columns populated.
  2. Test 2 — Transcript with time markers:
    • In Set Input, paste a transcript_text containing markers like 0:30, 1:15, 2:00 with section labels.
    • Click Execute Workflow and verify: the timestamps field contains 3 formatted entries in MM:SS Section Name format.
    • Check the Sheet — a second row should appear correctly.

Note — After both tests pass, your Sheet will have 2 rows of sample data. Real usage follows the same flow: edit the 3 values in Set Input, execute, copy the description and hashtags from the Sheet when uploading on YouTube.
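
The checks from both test cases can also be written down as a small script, which is handy if you re-test after changing the prompt. This is a sketch under stated assumptions: checkRow and countWords are hypothetical helpers, words are counted by splitting on whitespace, and timestamps are assumed to come back as one entry per line.

```javascript
// Counts whitespace-separated words.
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Hypothetical checker: returns a list of problems, empty when the row passes.
// expectedTimestamps is 0 for Test 1 and 3 for Test 2.
function checkRow(row, expectedTimestamps) {
  const problems = [];

  const words = countWords(row.description);
  if (words < 200 || words > 300) {
    problems.push(`description has ${words} words, expected 200-300`);
  }

  const tags = row.hashtags.split(",").map((t) => t.trim()).filter(Boolean);
  if (tags.length !== 15) {
    problems.push(`found ${tags.length} hashtags, expected 15`);
  }

  // Assumption: one timestamp entry per line.
  const entries = row.timestamps.split("\n").filter((l) => l.trim() !== "");
  if (expectedTimestamps === 0 && row.timestamps !== "") {
    problems.push("timestamps should be an empty string");
  }
  if (expectedTimestamps > 0 && entries.length !== expectedTimestamps) {
    problems.push(`found ${entries.length} timestamp entries, expected ${expectedTimestamps}`);
  }
  return problems;
}

// Synthetic sample rows, not real model output.
const sampleRow = {
  description: Array(250).fill("word").join(" "),
  hashtags: Array.from({ length: 15 }, (_, i) => `#tag${i + 1}`).join(", "),
  timestamps: "",
};
console.log(checkRow(sampleRow, 0));
console.log(checkRow({ ...sampleRow, timestamps: "0:30 Intro\n1:15 Setup\n2:00 Demo" }, 3));
```

An empty list from both calls corresponds to both tests passing.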

Key Lessons from This Session

  1. Always set Type=String explicitly when converting arrays to strings in Edit Fields. Using .join() produces a string, but n8n's auto-detect sees the source array and sets Type=Array — causing a validation error at runtime.
  2. The Structured Output Parser drops input fields from $json. After an AI node with an Output Parser, reference upstream inputs using the $('Node Name') cross-node syntax, not $json.
  3. Set Max Tokens high enough for the language you are generating. Vietnamese content uses roughly 3× as many tokens as English; this session treats 3000 tokens as the safe floor for a description, 15 hashtags, and timestamps.
  4. Keep ad-hoc utility workflows as Manual Trigger and Inactive. This prevents accidental executions and keeps your active workflow list clean.
  5. Design the prompt with explicit hashtag count rules. Specifying 5 broad + 5 medium + 5 long-tail in the prompt produced exactly 15 hashtags in this session's runs without post-processing. Model output is not guaranteed, so spot-check the count from time to time.
  6. Test the optional field path explicitly. Always run once with transcript_text empty and once with data — the AI should handle both without errors and return an empty string (not null or missing) for timestamps when no transcript is provided.

Conclusion:

In this n8n tutorial, we built a complete n8n video metadata generator — a 5-node manual workflow that takes a video title and summary, calls an AI model to produce an SEO-ready description, exactly 15 hashtags, and formatted timestamps, then saves everything to a Google Sheet for quick copy-paste on YouTube. The key technical lesson of this session is the explicit Type=String requirement in Edit Fields whenever you convert an array to a string with .join(). The next session in this n8n workflow automation series builds on this sheet by creating a pipeline that reads the saved metadata and uploads videos directly from Google Drive to YouTube as unlisted.

If you have any questions, feel free to leave a comment below. Thank you!

Tags: n8n video metadata generator, n8n tutorial, n8n workflow automation, n8n AI automation, n8n Google Sheets, n8n Basic LLM Chain, YouTube metadata automation, n8n Edit Fields
