SigLIP Cross-Encoder

Cross-modal reranker based on SigLIP for image-text relevance scoring

SigLIP Cross-Encoder scores how relevant each candidate image is to a text or image query, using SigLIP's cross-modal alignment. Use it as a second-stage reranker after embedding-based retrieval.

When to use:

  • Second-stage reranking in image-text retrieval pipelines
  • Product recommendation: find images most relevant to a text query
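
A minimal sketch of where a second-stage reranker sits in a retrieval pipeline. It is illustrative only: `retrieve_then_rerank`, `cosine`, and the `rerank_scores_fn` callback are hypothetical names, not part of this product's API. The first stage is assumed to be a cosine-similarity search over precomputed embeddings, and the callback stands in for whatever call returns the cross-encoder's scores for the shortlisted candidates.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query_vec: Sequence[float],
    candidate_vecs: Sequence[Sequence[float]],
    rerank_scores_fn: Callable[[List[int]], List[float]],  # hypothetical hook
    k: int,
) -> List[int]:
    """Return candidate indices: top-k by embedding similarity, then reranked."""
    # Stage 1: cheap embedding-based retrieval keeps only the k best candidates.
    shortlist = sorted(
        range(len(candidate_vecs)),
        key=lambda i: cosine(query_vec, candidate_vecs[i]),
        reverse=True,
    )[:k]

    # Stage 2: the reranker scores only the shortlist (one score per entry).
    scores = rerank_scores_fn(shortlist)
    order = sorted(range(len(shortlist)), key=lambda j: scores[j], reverse=True)
    return [shortlist[j] for j in order]
```

The point of this split is that the costlier scoring step runs only on the `k` shortlisted candidates rather than on the full collection.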

Input:

  • Query Text (optional): Text query for image matching
  • Query Image (optional): Image to match against candidates
  • Candidate Images (required): List of images to rank

Output:

  • Scores: Relevance score per candidate image
  • Ranking: Candidate indices sorted by relevance score
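
The relationship between the two outputs can be sketched as follows. This assumes the ranking lists candidate indices from highest to lowest score, which the page does not state explicitly; `rank_candidates` is a hypothetical helper, not part of the product.

```python
from typing import List, Sequence


def rank_candidates(scores: Sequence[float]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score.

    Assumption: higher score means more relevant. Ties keep their
    original candidate order because Python's sort is stable.
    """
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


scores = [0.10, 0.90, 0.50]        # one score per candidate image
ranking = rank_candidates(scores)  # candidate 1 first, then 2, then 0
```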

Inference Settings

No inference-time settings. Scoring is deterministic.
