# Custom Code
Write and execute your own Python logic directly inside a flow.
The Custom Code node lets you write arbitrary Python code that runs as a step in your flow. Input data from upstream nodes is passed in automatically, and you call `create_output()` to pass results downstream.
Use it when the built-in transformation or model nodes don't cover your exact logic — custom preprocessing, business rules, data enrichment, calling external APIs, generating plots, or anything else Python can do.
## Configuration
| Field | Description |
|---|---|
| Libraries | Select which libraries to import. Chosen imports are automatically added at the top of your code. |
| Output Names | Define the names of your outputs. Use these exact names in `create_output()` calls. |
| Python Code | Your code. Input blobs are available as `blob_0`, `blob_1`, etc. (or by name if tagged). |
| Timeout | Maximum execution time in seconds. Default: 300s. Maximum: 600s. |
## Available Libraries
| Library | Use it for |
|---|---|
| `polars` | Fast dataframe operations (default) |
| `plotly` | Creating charts and visualizations |
| `pyarrow` | Columnar data and Parquet files |
| `numpy` | Numerical computing |
| `PIL` | Image processing |
| `torch` | PyTorch deep learning |
| `transformers` | Hugging Face models |
| `sentence_transformers` | Sentence embeddings |
| `flair` | NLP sequence labeling |
| `requests` | HTTP requests to external APIs |
| `google_play_scraper` | Scrape Google Play reviews |
| `datetime`, `json`, `math`, `statistics`, `uuid`, `re`, `emoji` | Standard utilities |
## Inputs & Outputs
| | Name | Description |
|---|---|---|
| Input | `data` | Optional blobs from upstream nodes, accessible as `blob_0`, `blob_1`, … or by tag name |
| Output | `output_data` | Data blobs created via `create_output()` |
| Output | `plots` | Plotly figures created in your code |
| Output | `metrics` | Logs and metrics from execution |
## Writing Your Code
Input blobs from connected upstream nodes are injected as variables:
```python
# Access inputs by index
df = blob_0   # first connected input
df2 = blob_1  # second connected input

# Or by tag name if you tagged your Select Data / upstream nodes
df = my_dataset
```

Use `create_output()` to pass data to downstream nodes. The name must match one of your configured Output Names:

```python
filtered = df.filter(pl.col("age") > 18)
summary = filtered.group_by("category").agg(pl.count())

create_output("output", filtered)
create_output("result", summary)
```

Use `log()` to write messages visible in the node logs:

```python
log(f"Rows after filter: {len(filtered)}")
```

### Example: Filter and aggregate tabular data
```python
# Filter adults and summarize by category
filtered_df = blob_0.filter(pl.col("age") > 18)
summary_df = filtered_df.group_by("category").agg(pl.count().alias("count"))

create_output("output", filtered_df)
create_output("result", summary_df)
log(f"Filtered to {len(filtered_df)} rows")
```

### Example: Call an external API and structure the response
```python
# Fetch JSON from an external API and turn it into a dataframe
response = requests.get("https://api.example.com/data", headers={"Authorization": "Bearer TOKEN"})
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
data = response.json()

df = pl.DataFrame(data["items"])
create_output("output", df)
log(f"Fetched {len(df)} records")
```

## Tips
- Keep timeout in mind for large datasets or slow external requests — increase it if needed (max 600s)
- Connect a Preview Output node to inspect your outputs during development
- Use `polars` instead of `pandas` — it's significantly faster and is the default dataframe library across the platform
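The timeout tip applies to outbound requests as well: give requests an explicit per-request timeout so a stalled endpoint fails fast instead of consuming the node's whole execution budget. A hedged sketch — the URL is the same placeholder as above, and `print` stands in for the node's `log()` when run outside a flow:

```python
import requests

resp = None
try:
    # timeout=(connect, read) in seconds: fail fast on an unresponsive endpoint
    resp = requests.get("https://api.example.com/data", timeout=(5, 30))
    resp.raise_for_status()
except requests.RequestException as exc:
    # Inside a Custom Code node, report this via log(...) instead of print
    print(f"Request failed: {exc}")
```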