Alibaba, one of China’s biggest tech companies, announced the release of two new A.I. models on Friday that dramatically level up the possibilities of artificial intelligence.
The open-source models, called Qwen-VL and Qwen-VL-Chat, are vision language models, meaning they can “read” images as well as text, unlike text-focused competitors ChatGPT and Google Bard. Qwen-VL-Chat promises complex features like providing directions by scanning street signs, solving math equations based on a photo, and weaving together a narrative based on multiple pictures. For example, it can scan an image of a sign in a hospital written in Mandarin and then translate it into English, or help a news organization write a caption for a photo, the company says.
Qwen-VL, the other release Friday, is an updated version of Alibaba’s existing image-reading chatbot that can now read pictures in higher resolution.
Alibaba declined to comment to Fortune beyond its public announcement.
These new iterations of A.I. are the latest shots fired in the arms race among developers to create increasingly sophisticated tools, as the technology graduates from gimmick to genuine game-changer. For example, Alibaba says its new image-scanning technology has significant opportunities to help visually impaired people with shopping, allowing them, for instance, to scan an item and have the chatbot recite the label back to them.
Both models will be made available on Alibaba Cloud’s proprietary model-as-a-service platform ModelScope and on Hugging Face, the popular startup that hosts a library of A.I. models.
Alibaba’s release comes just a day after Meta launched an A.I. model fine-tuned for writing code, built on the open-source Llama 2 model released in July. Alibaba has been trying to keep up with Meta’s A.I. rollouts for the last few months. Earlier this month, Alibaba unveiled its first two open-source large language models, Qwen-7B and Qwen-7B-Chat—the same ones that form the basis for Friday’s releases. In July, the two companies struck an agreement to make Meta’s Llama 2 model available to the Chinese market via Alibaba’s cloud division.
By making these new models open-source, Alibaba is letting users tweak the tools to develop their own apps or conduct research. Most A.I. companies hope that users will adapt open-source models into tools for highly specific use cases, without having to undertake the onerous task of building a large language model from scratch. Alongside the open-source offerings, the companies offer their proprietary models as a service, hoping to capture market share in the burgeoning industry.
A.I. development is a priority for the Chinese government
Just last month, China became one of the first countries to issue comprehensive regulations for A.I., a development that experts say gave Alibaba and other Chinese tech companies the green light to make their products public.
Alibaba is also preparing to undergo a complete restructuring that would spin off Alibaba Cloud, the cloud computing division that houses its A.I. research, into an independent division, a move that investors welcome. Since A.I. technology requires significant computing power that can only be properly serviced with a cloud network, having the two in the same division would boost A.I.’s efficiencies. The current CEO and chairman of Alibaba Cloud, Daniel Zhang, is set to step down in September, to be replaced by two of Alibaba’s cofounders: Eddie Wu as CEO and Joseph Tsai as chairman.
The Chinese government has on more than one occasion indicated that it considers A.I. critical to its technological future, setting up an arms race with the U.S. Even seemingly innocuous tools like those released by Alibaba on Friday could be implicated because of their underlying technology and how other developers might use them. A.I. “has become a proxy in the battle for primacy between China and the U.S.,” Kerry Brown, director of the Lau China Institute at King’s College London, told Fortune earlier this month.
So far, it seems that Chinese tech companies are slightly lagging their U.S. counterparts. The open-source version of Meta’s Llama 2 model is based on roughly 70 billion variables (called parameters in A.I. parlance), about 10 times larger than Alibaba’s new releases (Alibaba does say it has bigger models that aren’t open-source). Despite the U.S.’s advantage, U.S. government officials are concerned the Chinese government will ultimately co-opt some A.I. tech developed by private firms for military or surveillance purposes, according to Axios.