
Apple's AI developments have been much mocked, but could the Cupertino giant emerge as a surprise leader in AI-driven 3D? A host of tech companies are researching tools for simpler, faster creation of 3D scenes, environments and digital twins, and Apple's just made a pretty big leap.
SHARP is an experimental AI model that can quickly turn 2D images into 3D Gaussian splats that can then be viewed on Vision Pro. Some now think that, through a combination of its hardware and software, Apple could have the edge in developing AI-driven 3D workflows.
People are underestimating @Apple in AI. I just ran Apple's new SHARP model locally and watched my photos turn into 3D Gaussian splats in seconds, then stepped inside them on Vision Pro. This feels like the beginning of something special. You really have to try it. pic.twitter.com/cEVYAsZyzd — December 17, 2025
Instead of traditional polygons, Gaussian splatting uses millions of fuzzy 3D ellipsoids with defined position, size, orientation, colour and transparency to represent and render intricate 3D scenes in real time so that they look highly accurate from a particular viewpoint.
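To make that concrete, here is a minimal sketch (not Apple's code) of the parameters that define one such splat and how its opacity falls off away from its centre. For simplicity the ellipsoid is axis-aligned; real splats also carry a rotation, usually as a quaternion or covariance matrix.

```python
import math
from dataclasses import dataclass

@dataclass
class Splat:
    position: tuple   # (x, y, z) centre of the ellipsoid
    scale: tuple      # per-axis radii (axis-aligned for simplicity)
    color: tuple      # RGB
    opacity: float    # base transparency, 0..1

def density(splat: Splat, point: tuple) -> float:
    """Gaussian falloff of the splat's opacity at a 3D point."""
    d2 = sum(((p - c) / s) ** 2
             for p, c, s in zip(point, splat.position, splat.scale))
    return splat.opacity * math.exp(-0.5 * d2)

s = Splat(position=(0, 0, 0), scale=(1, 1, 1), color=(255, 0, 0), opacity=0.8)
print(density(s, (0, 0, 0)))  # peaks at the centre: 0.8
```

A renderer blends millions of these contributions per pixel, front to back, which is what makes the result look photographic from the captured viewpoint.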
Most techniques require lots – sometimes hundreds – of images of a scene from different angles (see our pick of the best 3D scanners). But Apple’s SHARP uses AI to predict the scene from just one photo in under a second on a standard GPU.
Apple trained SHARP on swathes of synthetic and real-world data to teach it to identify frequent depth and geometry patterns so it can predict the position and appearance of 3D Gaussians via a single forward pass through a neural network.
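The "single forward pass" idea can be sketched as follows. This is purely illustrative and does not reproduce SHARP's actual architecture: the stand-in "network" below uses random placeholder weights and assumes one Gaussian per pixel, whereas the real model learns the image-to-parameters mapping from training data.

```python
import random

def predict_gaussians(image):
    """Stand-in 'network': maps an H x W image (rows of RGB triples)
    to one Gaussian per pixel, each with 14 parameters (3 position,
    3 scale, 4 rotation, 3 colour, 1 opacity). A trained model would
    produce these in one forward pass; here the weights are random."""
    rng = random.Random(0)
    h, w = len(image), len(image[0])
    return [[rng.gauss(0, 1) for _ in range(14)] for _ in range(h * w)]

photo = [[(0, 0, 0)] * 4 for _ in range(4)]   # tiny dummy "photo"
splats = predict_gaussians(photo)
print(len(splats), len(splats[0]))            # 16 14
```

The key contrast with classic splatting pipelines is that nothing is optimised per scene: the network amortises that cost at training time, which is why inference can finish in under a second.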
According to the research paper, distances and scale remain consistent in real-world terms: the representation is metric, with absolute scale, supporting metric camera movements.
The compromise is that SHARP only accurately renders nearby viewpoints, not unseen parts of the scene, which means users can't venture far from that viewpoint.
With the code available on GitHub, people have been testing out the tool and sharing the results on social media (see below). Others are wondering why Apple chose to illustrate the model with an image of a horse.
From 2 second (faster on higher end GPUs) Gaussian splat generations, using Apple's "Sharp Monocular View Synthesis in Less Than a Second" code, running locally on my system. Imported into Octane Render 2026 which has fully path traced rendering of Gaussian splats. Stuff like… pic.twitter.com/YhTXHb4Wpm — December 18, 2025
Apple's SHARP model generates photorealistic 3D Gaussian reps from a single img in secs. GitHub: https://t.co/wU6yTWRdCl Paper: https://t.co/xUtr40pEJ9 SHARP enables photorealistic NVS from one photo by regressing 3D Gaussian params via single NN fwd pass (<1s on std GPU).… pic.twitter.com/Wo6EyZIPvL — December 17, 2025
This week also saw the launch of SpAItial AI's Echo, which can turn 2D images into editable 3D worlds to which users can apply different styles. The company hopes to add full prompt-based scene manipulation, allowing users to add, remove, rearrange, or restyle objects.