Skip to main content
ThinDeep
CAPTCHA
Image CAPTCHA
Get new captcha!
Enter the characters shown in the image.
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.

Main navigation

  • Home
CAPTCHA
Image CAPTCHA
Get new captcha!
Enter the characters shown in the image.
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
User account menu
  • Log in

Breadcrumb

  1. Home

Qwen2.5 VL

ollama run qwen2.5vl:7b

6GB

 

ollama run qwen2.5vl:32b

21GB

 

 

Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.

The key features include:

  • Understand things visually: Qwen2.5-VL is not only proficient in recognizing common objects such as flowers, birds, fish, and insects, but it is highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
  • Being agentic: Qwen2.5-VL directly plays as a visual agent that can reason and dynamically direct tools, which is capable of computer use and phone use.
  • Capable of visual localization in different formats: Qwen2.5-VL can accurately localize objects in an image by generating bounding boxes or points, and it can provide stable JSON outputs for coordinates and attributes.
  • Generating structured outputs: for data like scans of invoices, forms, tables, etc. Qwen2.5-VL supports structured outputs of their contents, benefiting usages in finance, commerce, etc.

Performance

We evaluate our models with the SOTA models as well as the best models of similar model sizes. In terms of the flagship model Qwen2.5-VL-72B-Instruct, it achieves competitive performance in a series of benchmarks covering domains and tasks, including college-level problems, math, document understanding, general question answering, math, and visual agent. Notably, Qwen2.5-VL achieves significant advantages in understanding documents and diagrams, and it is capable of playing as a visual agent without task-specific fine tuning.

https://ollama.com/library/qwen2.5vl

Page tags
Qwen2.5
VL
RSS feed

Language switcher

  • English
  • Chinese, Simplified
  • Chinese, Traditional
  • Japanese
  • German
  • French
  • Korean
  • Italian
  • Russian
  • Portuguese, Brazil
  • Spanish
Powered by Drupal

Tag Cloud

Links

機器人叫獸Youtube頻道 | 暴龍隊Youtube頻道 | 台灣機器人學校Youtube頻道 | Ubipilot 輔助駕駛 Youtube頻道 

Pi10t | 機器人叫獸 | HARU | 暴龍隊 | ThinDeep | TAIBOT | Ubipilot | Robot School | LEE School | LEE, Shih-yuan | Haru Tel  | Zeison  | Powro