While everyone else is racing to generate prettier worlds, a New York lab is quietly betting that the real prize is teaching machines to move through them. General Intuition just put a number on that bet: it’s in talks to raise around $300 million at a ~$2 billion valuation — eight months after a $134M seed — with Jeff Bezos and Eric Schmidt now on the cap table. The fuel? Two billion first-person game clips a year.
The Story
General Intuition spun out of Medal, the clip-sharing app where 10 million gamers a month upload their best (and worst) moments — roughly 2 billion videos a year. Founder Pim de Witte looked at that firehose and saw something nobody else had a license to: not spectator footage like YouTube or Twitch, but first-person, interactive gameplay. Every clip is a human seeing a world and reacting to it, frame by frame, with the controller inputs that caused the next frame. That is exactly the signal a machine needs to learn spatial-temporal reasoning — how things move through space and time.
Here’s the part that should make any 3D or AI nerd sit up. The co-founders (Eloi Alonso, Adam Jelley, and Vincent Micheli) are the team behind DIAMOND (DIffusion As a Model Of eNvironment Dreams), the NeurIPS 2024 paper that showed a diffusion model could serve as a playable game engine. They trained it on a fixed dataset of recorded Counter-Strike: Global Offensive gameplay and ended up with a neural network you could literally walk around inside. No mesh, no physics engine, no renderer: just a diffusion model dreaming the next frame from the previous frames and your inputs.
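To make "dreaming the next frame from your inputs" concrete, here is a minimal sketch of an action-conditioned, autoregressive rollout loop. It is not DIAMOND's code: `toy_denoiser` is a hypothetical stand-in for a trained network, a "frame" is just four numbers rather than pixels, and the constants are made up for illustration. Only the control flow is the point: start from noise, denoise a few steps conditioned on recent frames and the player's action, then feed the result back in as context.

```python
import random
from collections import deque

FRAME_SIZE = 4      # toy "frame": four numbers instead of an image (assumption)
CONTEXT = 2         # how many past frames the model conditions on (assumption)
DENOISE_STEPS = 3   # refinement steps per generated frame (assumption)


def toy_denoiser(noisy, context, action):
    """Hypothetical stand-in for a trained denoising network.

    It moves the noisy frame halfway toward "last frame shifted by the action".
    A real model would learn this mapping from gameplay data.
    """
    target = [pixel + action for pixel in context[-1]]
    return [n + 0.5 * (t - n) for n, t in zip(noisy, target)]


def dream_next_frame(context, action, rng):
    """Generate one frame: begin with pure noise, then denoise step by step."""
    frame = [rng.gauss(0.0, 1.0) for _ in range(FRAME_SIZE)]
    for _ in range(DENOISE_STEPS):
        frame = toy_denoiser(frame, context, action)
    return frame


def rollout(initial_frames, actions, seed=0):
    """Autoregressive loop: every generated frame becomes context for the next."""
    rng = random.Random(seed)
    context = deque(initial_frames, maxlen=CONTEXT)
    frames = []
    for action in actions:
        frame = dream_next_frame(list(context), action, rng)
        context.append(frame)   # the oldest frame drops out automatically
        frames.append(frame)
    return frames


if __name__ == "__main__":
    start = [[0.0] * FRAME_SIZE for _ in range(CONTEXT)]
    for frame in rollout(start, actions=[1, 1, -1]):
        print([round(value, 2) for value in frame])
```

There is no mesh or physics step anywhere in the loop. The only "engine" is the function that maps past frames plus an action to the next frame, which is why swapping the toy for a learned model yields something playable.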
DIAMOND’s other headline was that visual details matter: by keeping pixels sharp instead of compressing scenes into discrete tokens, its agents hit a mean human-normalized score of 1.46 on the Atari 100k benchmark (where 1.0 matches the human reference), at the time a record for agents trained entirely inside their own imagined world. General Intuition is that same idea, scaled from arcade cabinets to a planet’s worth of games.
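For readers unfamiliar with the metric, a human-normalized score rescales a raw game score so that 0 corresponds to a random policy and 1 to a human reference player; the benchmark figure is the mean across games. A small sketch of the arithmetic follows. The game names and scores are invented for illustration and are not DIAMOND's results.

```python
def human_normalized_score(agent: float, random_score: float, human: float) -> float:
    """Rescale a raw score: 0.0 = random policy, 1.0 = human reference."""
    if human == random_score:
        raise ValueError("human and random scores must differ")
    return (agent - random_score) / (human - random_score)


def mean_human_normalized_score(results: dict) -> float:
    """Average the per-game normalized scores."""
    scores = [
        human_normalized_score(r["agent"], r["random"], r["human"])
        for r in results.values()
    ]
    return sum(scores) / len(scores)


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
example_results = {
    "game_a": {"agent": 30.0, "random": 10.0, "human": 20.0},  # 2.0: twice the human margin
    "game_b": {"agent": 15.0, "random": 5.0, "human": 25.0},   # 0.5: halfway to human
}

if __name__ == "__main__":
    print(mean_human_normalized_score(example_results))  # 1.25
```

A mean above 1.0 therefore does not mean the agent beats humans at every game; a few large wins can offset games where it falls short.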
Why You Should Care
Most of the world-model gold rush (World Labs’ Marble, Google DeepMind’s Genie, NVIDIA’s Cosmos) is about generating explorable scenes. General Intuition flips the product: it isn’t selling the world model at all. The agent is the product. The world model is just the cheap, infinite gym you train it in. That’s a fundamentally different thesis, and it’s why the same tech points at search-and-rescue drones that navigate GPS-denied buildings, robot arms, and autonomous vehicles, not just prettier game backdrops.
- Game devs: agents that genuinely understand level geometry and timing — trained on real human play, not hand-scripted behavior trees — are coming for your NPCs and playtesting.
- 3D & spatial creators: the line between “rendered scene” and “learned scene” is dissolving. A diffusion model that can be played is a new kind of interactive medium.
- Everyone: de Witte’s stated motivation is humanitarian — drones that can search a collapsed building without GPS. Gameplay data as a path to physical-world AI is one of 2026’s most underrated bets.
Try It / Follow Them
General Intuition’s first product is expected late summer / early fall 2026 — for now it’s a research lab, not a download. But you can get hands-on with the technology that started it: the DIAMOND code and weights are open on GitHub, and you can watch the playable CS:GO and Atari world models on the DIAMOND project page. Follow the company at generalintuition.com, and keep an eye on Eloi Alonso’s research page for what comes next.
IK3D Lab Take
We’ve covered a dozen world models that let you walk into a generated place. General Intuition is the first that’s interesting because of what it does after you walk in: it learns to act. Pairing the DIAMOND team’s “playable diffusion” research with Medal’s absurd dataset moat is the kind of unfair-advantage combo that’s hard to copy — you can’t just scrape 2 billion interactive first-person clips. The risk is obvious (no shipped product yet, and “agents from game clips” still has to prove it transfers to a real drone), but the bet is gorgeous: the thing we built to waste time might be the best curriculum we ever made for teaching machines to understand the physical world. We’re watching this one closely.



