移至主內容
ThinDeep 瑞泰創新科技
CAPTCHA
圖片的 CAPTCHA
Get new captcha!
請輸入圖片上的文字。
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.

主導覽

  • 首頁
CAPTCHA
圖片的 CAPTCHA
Get new captcha!
請輸入圖片上的文字。
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
使用者帳號選單
  • 登入

導航連結

  1. 首頁

Qwen2.5 VL

ollama run qwen2.5vl:7b

6GB

 

ollama run qwen2.5vl:32b

21GB

 

 

Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.

The key features include:

  • Understand things visually: Qwen2.5-VL is not only proficient in recognizing common objects such as flowers, birds, fish, and insects, but it is highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
  • Being agentic: Qwen2.5-VL directly plays as a visual agent that can reason and dynamically direct tools, which is capable of computer use and phone use.
  • Capable of visual localization in different formats: Qwen2.5-VL can accurately localize objects in an image by generating bounding boxes or points, and it can provide stable JSON outputs for coordinates and attributes.
  • Generating structured outputs: for data like scans of invoices, forms, tables, etc. Qwen2.5-VL supports structured outputs of their contents, benefiting usages in finance, commerce, etc.

Performance

We evaluate our models with the SOTA models as well as the best models of similar model sizes. In terms of the flagship model Qwen2.5-VL-72B-Instruct, it achieves competitive performance in a series of benchmarks covering domains and tasks, including college-level problems, math, document understanding, general question answering, math, and visual agent. Notably, Qwen2.5-VL achieves significant advantages in understanding documents and diagrams, and it is capable of playing as a visual agent without task-specific fine tuning.

https://ollama.com/library/qwen2.5vl

Page tags
Qwen2.5
VL
RSS feed

Language switcher

  • English
  • Chinese, Simplified
  • Chinese, Traditional
  • Japanese
  • German
  • French
  • Korean
  • Italian
  • Russian
  • Portuguese, Brazil
  • Spanish
Powered by Drupal

Tag Cloud

1776Adob​​e IllustratorAEAIAIdeaLabAI FontAIGCAI LabAILabs.twAI monitoringAIPAI SentryAI人權AI佛AI佛祖AI哨兵AI字型AI字體AI戦略会議AI搜尋引擎AI教育AI模型AI白皮書AI算力AI造字AI醫療AMDAndrew NgAPBig DataBipedCAMPFIREChatGPTCNNCyber​​AgentDecentralized VPNDeeper NetworkDeep LearningDeepLearning.aiDeepMindDeepSeekDGX H100DPNDPRElon MuskFoxNewsGANGenerative AIGGUFGoogleGPTGPT-4HotokeHotoke AIHuaTuoHuggingFaceHyperCLOVAIPAJapan AIJDLAKazuma IeiriLaMDALINELINE AiCallLLaMALLMLLM ChatbotMax TegmarkMicrosoftMira MuratiMLPOllamaOpenAIOptimusPaLM 2PANSPerplexityPythonQuadrupedQwenQwen2.5Qwen3RIKENRNNRobotSal KhanSciMakerSEOStability AIStockGPTTAIDETAIGUUTaiwanProTedThinDeepTransformerTruthGPTUBIViral LoopVLVPNWorks Mobile JapanX.AIX.AI CorpYCYOLOYOLOv9Yuval Noah Hararizi2zizi2zi-pytorch上海世界AI大會世界AI大會中國AI五笔笔形亞洲無人機AI創新應用研發中心亞洲航空公司人工智慧監測人工通用智慧六種深度神經網路模型北京人工智能白皮書千問千問3台灣AI白皮書台灣人工智慧實驗室史塔克股神吳恩達和製GPT大模型大規模言語大規模語言模型大語言模型天空飛行科技公司孫正義宮川潤一家入一真岡野原大輔岸田文雄工藤郁子平將明徐挺耀微軟情報處理推進機構提示詞文心文心大模型新創公司日本AI日本AI白皮書日本GPT日本LLM智慧醫療東京大學松尾豊松尾豐楊立昆機器人機械佛永字八法深度學習 深度求索深度求索深度神經網路模型無人機無人機產業無條件基本收入特斯拉特斯拉機器人理化学研究所生命之未來研究所生成AI生成式AI生成式 AI 的使用指南百度百度輸入法盤古大模型眾籌神經網路模型筆跡經濟學人総理大臣官邸美國國家科學院群眾募資義竹脳情報通信融合研究センター自民黨自駕車英國AI白皮書華駝蘇姿丰越獄版軟銀軟體銀行輝達通用人工智慧通義千問鐵馬克開源大型語言模型阿特曼阿里巴巴陸奇雅虎革新知能統合研究センター顏擇雅飛槳數據馬斯克

Links

機器人叫獸Youtube頻道 | 暴龍隊Youtube頻道 | 台灣機器人學校Youtube頻道 | Ubipilot 輔助駕駛 Youtube頻道 

Pi10t | 機器人叫獸 | HARU | 暴龍隊 | ThinDeep | TAIBOT | Ubipilot | Robot School | LEE School | LEE, Shih-yuan | Haru Tel  | Zeison  | Powro