Web Scraper
Scrape a URL and pass the extracted content to the next node, powered by Apify.
The Web Scraper node connects to your Apify account and runs a scraper actor for each URL it receives from an upstream node. The extracted content is passed downstream as structured JSON — ready to feed into an AI node, a CSV export, or any other step.
Setup
You need an Apify account and an API token. Connect it once in the node settings — the token is stored securely and reused for every run.
Configuration
| Field | Description |
|---|---|
| Scraper | Which Apify actor to use. See the Scrapers section below. Only applies when Smart URL Routing is off. |
| Smart URL Routing | When enabled, the node inspects the URL and automatically selects the best actor for that platform. Falls back to the Generic Web Scraper for all other URLs. |
Scrapers
| Scraper | Best for |
|---|---|
| Generic Web Scraper (default) | Any public website — company pages, blogs, landing pages, news articles |
| Twitter / X | Tweets and thread content from a twitter.com or x.com URL |
| YouTube | Video metadata, description, channel info, view and like counts |
| Google Maps | Business listings — name, address, rating, reviews, phone, website |
| Amazon Products | Product pages — title, price, brand, rating, review count, ASIN |
With Smart URL Routing enabled, the node makes this choice for you; you don't need to select a scraper manually.
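The routing behaves roughly like a hostname lookup with a generic fallback. A minimal Python sketch, assuming hypothetical detection rules — the node's real matching logic is internal to the product and may differ:

```python
from urllib.parse import urlparse

def pick_scraper(url: str) -> str:
    # Illustrative only: map a URL's hostname to one of the scrapers
    # listed above, falling back to the Generic Web Scraper.
    parsed = urlparse(url)
    host = (parsed.hostname or "").removeprefix("www.")
    if host in ("twitter.com", "x.com"):
        return "Twitter / X"
    if host in ("youtube.com", "youtu.be"):
        return "YouTube"
    if host == "google.com" and parsed.path.startswith("/maps"):
        return "Google Maps"
    if host.startswith("amazon."):
        return "Amazon Products"
    return "Generic Web Scraper"  # fallback for all other URLs
```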
Inputs & Outputs
| Port | Name | Description |
|---|---|---|
| Input | json_data | JSON from an upstream node containing a url field to scrape |
| Output | action_result | Structured JSON with the scraped content (fields vary by scraper; see below) |
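For example, a minimal json_data payload needs only the url field; the URL shown here is illustrative:

```json
{
  "url": "https://example.com/pricing"
}
```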
Output fields by scraper
Generic Web Scraper
| Field | Description |
|---|---|
| url | The scraped URL |
| title | Page title |
| description | Meta description |
| author | Page author if present |
| keywords | Meta keywords |
| language | Detected language code |
| text | Full plain-text content |
| markdown | Content as Markdown |
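A sample action_result from the Generic Web Scraper might look like the following; all values are illustrative, and fields such as author may be absent when the page doesn't provide them:

```json
{
  "url": "https://example.com/blog/launch",
  "title": "Launch announcement",
  "description": "A short meta description of the page.",
  "author": "Jane Doe",
  "keywords": "launch, product",
  "language": "en",
  "text": "Full plain-text content of the page...",
  "markdown": "# Launch announcement\n\nFull content as Markdown..."
}
```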
Twitter / X
| Field | Description |
|---|---|
| url | Tweet URL |
| author | Username |
| text | Tweet text |
| likes | Like count |
| retweets | Retweet count |
| replies | Reply count |
| posted_at | Post timestamp |
YouTube
| Field | Description |
|---|---|
| url | Video URL |
| title | Video title |
| channel | Channel name |
| channel_url | Channel URL |
| subscribers | Subscriber count |
| description | Video description |
| views | View count |
| likes | Like count |
| duration | Video duration |
| published_at | Publish date |
| hashtags | Comma-separated hashtags |
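Because hashtags arrives as a single comma-separated string rather than a list, a downstream step that wants individual tags can split it. A minimal sketch; the helper name is hypothetical:

```python
def parse_hashtags(raw: str) -> list[str]:
    # Split the comma-separated hashtags field into a list,
    # trimming whitespace and dropping empty entries.
    return [tag.strip() for tag in raw.split(",") if tag.strip()]
```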
Google Maps
| Field | Description |
|---|---|
| url | Place URL |
| title | Business name |
| address | Full address |
| rating | Average rating score |
| reviews_count | Total review count |
| website | Business website |
| phone | Phone number |
| category | Business category |
Amazon Products
| Field | Description |
|---|---|
| url | Product URL |
| title | Product title |
| price | Listed price |
| brand | Brand name |
| rating | Average star rating |
| reviews_count | Total review count |
| description | Product description |
| asin | Amazon product identifier |
Tips
- Connect a Google Sheets Trigger upstream to scrape one URL per new row automatically
- Pass the output to a Use AI Model node to extract structured fields from the raw scraped content
- Use Smart URL Routing when your sheet contains a mix of URLs from different platforms — the node picks the right scraper for each one
- The Generic Web Scraper uses a lightweight HTTP crawler by default. It works well for most public pages and keeps costs low on the free Apify tier.
- If a URL returns no results (e.g. a login-gated page), the node fails and stops the flow — the Flow Completion Event node can catch this and send you an alert