Apple LGTM — 4K Gaussian Splatting Without the Memory Explosion

Feed-forward Gaussian splatting has been stuck at a hard ceiling: push resolution past 1080p and memory explodes. Apple just published a paper at ICLR 2026 that breaks through that ceiling — and reaches 4K without even flinching.

LGTM — Apple 4K Gaussian Splatting: comparison between baseline methods and LGTM at 4K resolution
LGTM enables 4K feed-forward Gaussian splatting — a capability previously out of reach. Source: yxlao.github.io/lgtm

The Problem That Capped 3DGS at 1080p

3D Gaussian Splatting has taken over the novel view synthesis world. Fast, high-quality, real-time — everyone loves it. But feed-forward methods (the ones that reconstruct a scene in a single pass, no per-scene training) have a dirty secret: they predict one Gaussian primitive per pixel. Go from 1080p to 4K and you go from ~2 million to ~8 million primitives. Memory explodes quadratically. Methods like NoPoSplat literally crash with OOM errors at 2K.

This isn’t a hardware limitation. It’s a fundamental architectural problem baked into how pixel-aligned Gaussian prediction works. Until LGTM, nobody had a clean fix.

What LGTM Actually Does

LGTM stands for Less Gaussians, Texture More. The insight is elegant: if you decouple geometry from appearance, you can stop scaling the number of Gaussians with resolution. The system has two networks working together:

  • Primitive Network: Takes a low-resolution input and predicts a compact set of 3D Gaussian primitives — far fewer than pixel count
  • Texture Network: Takes a high-resolution input and attaches a small RGBA texture map to each Gaussian primitive

The geometry stays lightweight. The detail lives in the texture maps. When you rasterize, each Gaussian gets sampled based on viewing angle and composited in depth order — you get sharp 4K output without a 4K-sized point cloud.

LGTM dual-network architecture: primitive network for geometry plus texture network for high-res appearance
The LGTM dual-network pipeline. Geometry and resolution are fully decoupled. Source: yxlao.github.io/lgtm

The Numbers Are Wild

Going from 512×288 to 4096×2304 is a 64× increase in pixel count. With pixel-aligned methods, that means 64× more Gaussians. With LGTM, it costs 1.80× more memory and 1.47× more inference time. That’s it. That’s the whole trade-off. You get 4K for almost free.

LGTM trains at 2K and 4K using under 30 GB of VRAM. NoPoSplat cannot even complete training at 2K due to OOM errors. The gap is not marginal — it’s a different category.

And it’s not niche. LGTM works across all the major feed-forward scenarios:

  • Single-view (replaces Flash3D)
  • Two-view, pose-free (beats NoPoSplat)
  • Two-view, posed (improves DepthSplat)
  • Multi-view (works with VGGT)
LGTM vs Flash3D single-view comparison showing sharper textures and finer details
Single-view reconstruction: Flash3D (left) vs LGTM (right). The texture and sharpness gap at 4K is massive. Source: arXiv 2603.25745

Why You Should Care

Right now, most real-time 3D reconstruction tools cap their output at 720p or 1080p — not because 4K is hard to display, but because generating it costs too much. Splat captures of environments, objects, people — all of it stays low-res in practice.

LGTM points toward a world where:

  • AR/VR headsets can reconstruct spaces at native display resolution in real time
  • Object scanning tools generate production-grade splats from a handful of photos
  • Instant 3D from single images stops looking like it came from 2020
  • Video-to-3D pipelines can output 4K frames without a render farm

This is Apple ML Research, so it also signals that Apple is thinking hard about 4K Gaussian splatting in real-world device contexts — Vision Pro being the obvious one. The timing is not a coincidence.

Try It / Follow It

The code isn’t public yet — the repo shows “coming soon” at the time of writing. But you can:

IK3D Lab Take

This is exactly the kind of paper that doesn’t get enough hype because it’s “just infrastructure.” No flashy app, no viral demo reel, no product launch. Just a research team at Apple quietly solving the thing that was blocking 3DGS from reaching its full potential.

When the code drops, expect to see it folded into every serious splat pipeline within months. Feed-forward 4K reconstruction is about to become the baseline expectation. LGTM is the reason why.

Keep an eye on this one.

Sharing is caring!

Leave a Reply

Your email address will not be published. Required fields are marked *