Direkt zum Inhalt
ThinDeep
CAPTCHA
Bild-CAPTCHA
Neues Captcha erzeugen
Geben Sie die Zeichen ein, die im Bild gezeigt werden.
Diese Sicherheitsfrage überprüft, ob Sie ein menschlicher Besucher sind und verhindert automatisches Spamming.

Hauptnavigation

  • Startseite
CAPTCHA
Bild-CAPTCHA
Neues Captcha erzeugen
Geben Sie die Zeichen ein, die im Bild gezeigt werden.
Diese Sicherheitsfrage überprüft, ob Sie ein menschlicher Besucher sind und verhindert automatisches Spamming.
Benutzermenü
  • Anmelden

Pfadnavigation

  1. Startseite

Qwen2.5 VL

ollama run qwen2.5vl:7b

6GB

 

ollama run qwen2.5vl:32b

21GB

 

 

Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.

The key features include:

  • Understand things visually: Qwen2.5-VL is not only proficient in recognizing common objects such as flowers, birds, fish, and insects, but it is highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
  • Being agentic: Qwen2.5-VL directly plays as a visual agent that can reason and dynamically direct tools, which is capable of computer use and phone use.
  • Capable of visual localization in different formats: Qwen2.5-VL can accurately localize objects in an image by generating bounding boxes or points, and it can provide stable JSON outputs for coordinates and attributes.
  • Generating structured outputs: for data like scans of invoices, forms, tables, etc. Qwen2.5-VL supports structured outputs of their contents, benefiting usages in finance, commerce, etc.

Performance

We evaluate our models with the SOTA models as well as the best models of similar model sizes. In terms of the flagship model Qwen2.5-VL-72B-Instruct, it achieves competitive performance in a series of benchmarks covering domains and tasks, including college-level problems, math, document understanding, general question answering, math, and visual agent. Notably, Qwen2.5-VL achieves significant advantages in understanding documents and diagrams, and it is capable of playing as a visual agent without task-specific fine tuning.

https://ollama.com/library/qwen2.5vl

Page tags
Qwen2.5
VL
RSS-Feed

Language switcher

  • English
  • Chinese, Simplified
  • Chinese, Traditional
  • Japanese
  • German
  • French
  • Korean
  • Italian
  • Russian
  • Portuguese, Brazil
  • Spanish
Unterstützt von Drupal

Tag Cloud

Links

機器人叫獸Youtube頻道 | 暴龍隊Youtube頻道 | 台灣機器人學校Youtube頻道 | Ubipilot 輔助駕駛 Youtube頻道 

Pi10t | 機器人叫獸 | HARU | 暴龍隊 | ThinDeep | TAIBOT | Ubipilot | Robot School | LEE School | LEE, Shih-yuan | Haru Tel  | Zeison  | Powro