Qwen-VL-2 Embedding

Multilingual multimodal embeddings with strong OCR and visual understanding

Qwen-VL-2 generates 3584-dimensional joint embeddings from images and text across 32+ languages. It performs well on OCR-heavy documents, charts, and other visual content.

When to use:

  • Multilingual image-text retrieval (32+ languages)
  • Documents with embedded text, charts, or OCR content
  • Cross-modal similarity search in multilingual settings
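
For the similarity-search use case, a minimal sketch of ranking stored embeddings against a query embedding by cosine similarity might look like the following. The helper names (`cosine_similarity`, `rank_by_similarity`), the file names, and the 4-dimensional toy vectors are assumptions for illustration only; real Qwen-VL-2 embeddings have 3584 dimensions, and this page does not say which similarity metric is recommended.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], index: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (item id, score) pairs sorted from most to least similar."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 4-dimensional stand-ins for stored embeddings (hypothetical data).
index = {
    "invoice_de.png": [0.9, 0.1, 0.0, 0.2],
    "chart_ja.png": [0.1, 0.8, 0.3, 0.0],
    "photo_cat.jpg": [0.0, 0.1, 0.9, 0.4],
}
query = [0.85, 0.15, 0.05, 0.1]  # stand-in for a query embedding

ranking = rank_by_similarity(query, index)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

For large collections, a vector database or approximate nearest-neighbor index would normally replace this linear scan.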

Input:

  • Image (required): Image to encode
  • Text (optional): Text to pair with the image

Output: 3584-dimensional multimodal embedding vector
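
The input and output contract above can be captured in a small sketch. `EmbeddingInput` and `check_embedding` are hypothetical names, not part of any documented API; only the required image, the optional text, and the 3584-dimensional output come from this page. Representing the image as raw bytes is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

EMBEDDING_DIM = 3584  # output dimensionality stated on this page


@dataclass(frozen=True)
class EmbeddingInput:
    """Hypothetical container mirroring the documented inputs."""

    image: bytes                # required: encoded image data (assumed to be raw bytes)
    text: Optional[str] = None  # optional: text to pair with the image

    def __post_init__(self) -> None:
        if not self.image:
            raise ValueError("image is required and must not be empty")


def check_embedding(vector: Sequence[float]) -> Sequence[float]:
    """Verify that a returned vector has the documented dimensionality."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector


image_only = EmbeddingInput(image=b"\x89PNG...")                         # placeholder bytes
image_with_text = EmbeddingInput(image=b"\x89PNG...", text="Rechnung 2024")
```

Checking the vector length before writing to a vector store helps catch a mismatched model or index configuration early.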

Inference Settings

No inference-time settings. Embeddings are computed deterministically.
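
Because embeddings are deterministic, identical inputs can be cached safely. The sketch below shows one possible approach, keyed by a SHA-256 hash of the image bytes and optional text. `cache_key`, `EmbeddingCache`, and the `embed_fn` callable are hypothetical; `embed_fn` stands in for whatever function actually calls the model.

```python
import hashlib
from typing import Callable, Dict, List, Optional

EmbedFn = Callable[[bytes, Optional[str]], List[float]]


def cache_key(image: bytes, text: Optional[str] = None) -> str:
    """Stable key derived from the exact input content."""
    h = hashlib.sha256()
    h.update(len(image).to_bytes(8, "big"))  # length prefix separates image from text
    h.update(image)
    if text is None:
        h.update(b"\x00")  # marker: no text supplied
    else:
        h.update(b"\x01")  # marker: text supplied (possibly empty)
        h.update(text.encode("utf-8"))
    return h.hexdigest()


class EmbeddingCache:
    """In-memory cache that calls embed_fn only for unseen inputs."""

    def __init__(self, embed_fn: EmbedFn) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times embed_fn was actually invoked

    def get(self, image: bytes, text: Optional[str] = None) -> List[float]:
        key = cache_key(image, text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(image, text)
        return self._store[key]
```

This kind of cache is only valid while the model version stays fixed; including a model identifier in the key would be a reasonable extension.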
