# Custom Code
Write and execute your own Python logic directly inside a flow.
The Custom Code node lets you write arbitrary Python code that runs as a step in your flow. Input data from upstream nodes is passed in automatically, and you call `create_output()` to pass results downstream.
Use it when the built-in transformation or model nodes don't cover your exact logic — custom preprocessing, business rules, data enrichment, calling external APIs, generating plots, or anything else Python can do.
## Configuration
| Field | Description |
|---|---|
| Libraries | Select which libraries to import. Chosen imports are automatically added at the top of your code. |
| Output Names | Define the names of your outputs. Use these exact names in `create_output()` calls. |
| Python Code | Your code. Input blobs are available as `blob_0`, `blob_1`, etc. (or by name if tagged). |
| Timeout | Maximum execution time in seconds. Default: 300s. Maximum: 600s. |
## Available Libraries
| Library | Use it for |
|---|---|
| `polars` | Fast dataframe operations (default) |
| `plotly` | Creating charts and visualizations |
| `pyarrow` | Columnar data and Parquet files |
| `numpy` | Numerical computing |
| `PIL` | Image processing |
| `torch` | PyTorch deep learning |
| `transformers` | Hugging Face models |
| `sentence_transformers` | Sentence embeddings |
| `flair` | NLP sequence labeling |
| `requests` | HTTP requests to external APIs |
| `google_play_scraper` | Scrape Google Play reviews |
| `datetime`, `json`, `math`, `statistics`, `uuid`, `re`, `emoji` | Standard utilities |
## Inputs & Outputs
| | Name | Description |
|---|---|---|
| Input | `data` | Optional blobs from upstream nodes, accessible as `blob_0`, `blob_1`, … or by tag name |
| Output | `output_data` | Data blobs created via `create_output()` |
| Output | `plots` | Plotly figures created in your code |
| Output | `metrics` | Logs and metrics from execution |
## Writing Your Code
Input blobs from connected upstream nodes are injected as variables:
```python
# Access inputs by index
df = blob_0   # first connected input
df2 = blob_1  # second connected input

# Or by tag name if you tagged your Select Data / upstream nodes
df = my_dataset
```

Use `create_output()` to pass data to downstream nodes. The name must match one of your configured Output Names:

```python
filtered = df.filter(pl.col("age") > 18)
summary = filtered.group_by("category").agg(pl.count())

create_output("output", filtered)
create_output("result", summary)
```

Use `log()` to write messages visible in the node logs:

```python
log(f"Rows after filter: {len(filtered)}")
```

### Example: Filter and aggregate tabular data
```python
# Filter adults and summarize by category
filtered_df = blob_0.filter(pl.col("age") > 18)
summary_df = filtered_df.group_by("category").agg(pl.count().alias("count"))

create_output("output", filtered_df)
create_output("result", summary_df)
log(f"Filtered to {len(filtered_df)} rows")
```

### Example: Call an external API and structure the response
```python
# Fetch JSON from an external API and turn it into a dataframe
response = requests.get("https://api.example.com/data", headers={"Authorization": "Bearer TOKEN"})
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
data = response.json()

df = pl.DataFrame(data["items"])
create_output("output", df)
log(f"Fetched {len(df)} records")
```

## Tips
- Keep timeout in mind for large datasets or slow external requests — increase it if needed (max 600s)
- Connect a Preview Output node to inspect your outputs during development
- Use `polars` instead of `pandas` — it's significantly faster and is the default dataframe library across the platform
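The timeout tip applies to outbound requests as well: give requests an explicit per-request timeout so a stalled endpoint fails fast instead of consuming the node's whole execution budget. A hedged sketch — the URL is the same placeholder as above, and `print` stands in for the node's `log()` when run outside a flow:

```python
import requests

resp = None
try:
    # timeout=(connect, read) in seconds: fail fast on an unresponsive endpoint
    resp = requests.get("https://api.example.com/data", timeout=(5, 30))
    resp.raise_for_status()
except requests.RequestException as exc:
    # Inside a Custom Code node, report this via log(...) instead of print
    print(f"Request failed: {exc}")
```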