Tom’s Guide
Technology
Scott Younker

Microsoft’s new tiny language model can read images — here’s what you can use it for

Microsoft image for Phi-3 language model.

During Build 2024, Microsoft announced a new version of the company’s small language AI model, Phi-3, which is capable of analyzing images and telling users what’s in them.

The new version, Phi-3-vision, is a multimodal model, meaning it can process both text and images. That puts it in the same broad category as OpenAI’s GPT-4o and Google’s recently updated Gemini.

Phi-3-vision is small enough to run on mobile devices, with just 4.2 billion parameters. A model’s parameter count is a rough shorthand for its complexity and how much it can learn from its training data. Microsoft has built each Phi model on the one before it: Phi-2 learned from Phi-1 and gained new capabilities, and Phi-3 was likewise trained on Phi-2 and adds capabilities of its own.

Phi-3-vision can perform general visual reasoning tasks, such as analyzing charts and images. Unlike other more well-known models, like OpenAI’s DALL-E, Phi-3-vision can only “read” an image; it cannot generate images. 

Microsoft has released several of these small AI models. They’re designed to run locally, on a wider range of devices than larger models like Google’s Gemini or even ChatGPT, so no internet connection is required. They also cut the computing power needed for certain tasks, such as solving math problems, as Microsoft’s small Orca-Math model does.

The first iteration of Phi-3 was announced in April when Microsoft released the tiny Phi-3-mini. In benchmark tests, it performed quite well against larger models like Meta’s Llama 2. The mini model has just 3.8 billion parameters. There are two other models, Phi-3-small and Phi-3-medium, which feature 7 billion parameters and 14 billion parameters, respectively. 

Phi-3-vision is available in preview right now. The three other Phi-3 models, Phi-3-mini, Phi-3-small and Phi-3-medium, are accessible via the Azure Machine Learning model catalog and collections. To use them, you’ll need a paid Azure account and an Azure AI Studio hub.
