Alibaba Cloud launches open source Large Vision Language Model Qwen-VL


Image credit: 123RF

On August 25, Alibaba Cloud launched an open-source Large Vision Language Model (LVLM) named Qwen-VL. The LVLM is built on Alibaba Cloud’s 7-billion-parameter foundational language model Qwen-7B. In addition to capabilities such as image-text recognition, description, and question answering, Qwen-VL introduces new features including visual location recognition and image-text comprehension, the company said in a statement. These capabilities enable the model to identify locations in pictures and to provide users with guidance based on the information extracted from images, the firm added. The model can be applied in various scenarios including image- and document-based question answering, image caption generation, and fine-grained visual recognition. Currently, both Qwen-VL and its visual AI assistant Qwen-VL-Chat are available for free and for commercial use on Alibaba’s “Model as a Service” platform ModelScope. [Alibaba Cloud statement, in Chinese]
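For readers who want to try the model, the sketch below shows roughly how Qwen-VL-Chat could be loaded from ModelScope and queried with an image-plus-text prompt. The model identifier qwen/Qwen-VL-Chat, the example image URL, and the from_list_format/chat helpers exposed by the model’s remote code are assumptions based on typical ModelScope usage, not details given in the statement.

```python
# Minimal sketch (assumed API): load Qwen-VL-Chat from ModelScope and ask a
# question about an image. Model ID, image URL, and the chat helpers are
# assumptions, not confirmed by the article.
from modelscope import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "qwen/Qwen-VL-Chat"  # assumed ModelScope identifier

# trust_remote_code allows ModelScope to load the model's custom classes.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", trust_remote_code=True
).eval()

# Build an image-plus-text query (hypothetical image URL).
query = tokenizer.from_list_format([
    {"image": "https://example.com/street_scene.jpg"},
    {"text": "What place is shown in this picture, and how would I get there?"},
])

# Single-turn visual question answering.
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```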


Copyright for syndicated content belongs to the linked source: TechNode – https://technode.com/2023/08/28/alibaba-cloud-launches-open-source-large-vision-language-model-qwen-vl/
