Every AI 3D generator on the planet hands you a beautiful, dead statue. Mesh? Gorgeous. Textures? 4K. But try to make it walk and you hit the wall that has haunted this whole field: rigging and skinning. AniGen, a SIGGRAPH 2026 paper from VAST AI Research — yes, the Tripo team — just generated a fully rigged, animation-ready character from a single photo. Skeleton included. Skin weights included. One shot.
The Story
Text- and image-to-3D generation has been on an absolute tear. Give Tripo, Meshy, Rodin or Hunyuan3D a prompt or a picture and you get production-grade geometry in seconds. We’ve covered most of them right here in the Lab. But there’s a dirty secret behind all that progress: the output is a static mesh. To animate it, an artist still has to build a skeleton, place every joint, then paint skinning weights so the surface deforms correctly when the bones move. It’s the most technical, least glamorous step in the pipeline, and it’s where hours disappear.
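If "skinning weights" sounds abstract, here is a minimal sketch of linear blend skinning, the standard deformation model most engines use. It is a 2D toy written for this post, not code from AniGen; the function names and the elbow example are our own assumptions for illustration.

```python
import math

# A 2D affine transform stored as (a, b, c, d, tx, ty):
#   x' = a*x + b*y + tx
#   y' = c*x + d*y + ty
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def rotation_about(pivot, angle_rad):
    """Transform that rotates points around `pivot` (the joint position)."""
    px, py = pivot
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c, -s, s, c, px - c * px + s * py, py - s * px - c * py)


def apply(transform, point):
    """Apply an affine transform to a single 2D point."""
    a, b, c, d, tx, ty = transform
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def skin_vertex(rest_position, weights, bone_transforms):
    """Linear blend skinning: blend where each bone would carry the vertex."""
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("skin weights must sum to 1")
    x = y = 0.0
    for weight, transform in zip(weights, bone_transforms):
        bx, by = apply(transform, rest_position)
        x += weight * bx
        y += weight * by
    return (x, y)


# Hypothetical arm: the upper arm stays put, the forearm bends 90 degrees
# around an elbow joint at (1, 0).
bones = [IDENTITY, rotation_about((1.0, 0.0), math.pi / 2)]
wrist = skin_vertex((2.0, 0.0), [0.0, 1.0], bones)      # follows the forearm only
elbow_skin = skin_vertex((2.0, 0.0), [0.5, 0.5], bones)  # blended between both bones
```

Every vertex on the mesh needs a weight like this for every bone that influences it. Get them wrong and the surface pinches, stretches or tears when the skeleton moves, which is why weight painting eats so much time.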
The usual fix is a sequential pipeline: generate the mesh, then run an auto-rigger on it, then run an auto-skinner on top of that. The problem is obvious once you say it out loud — errors compound. A slightly off mesh produces a misplaced skeleton, which produces garbage weights, and the whole chain collapses. AniGen’s insight is to stop treating these as three separate problems.
Its core trick is what the authors call S³ Fields — Shape, Skeleton, and Skin represented as three mutually consistent fields defined over one shared spatial domain. Instead of generating geometry and bolting a rig on afterward, AniGen learns all three jointly, so they agree by construction. A two-stage flow-matching pipeline first synthesizes a sparse structural scaffold (the rough shape and bone layout), then fills in dense geometry and articulation inside a structured latent space. Two clever bits make it robust: a confidence-decaying skeleton field that gracefully handles ambiguous joint placement at boundary regions, and a dual skin feature field that decouples skinning weights from any fixed joint count — so a fixed-size network can predict rigs of wildly different complexity, from a four-legged dog to a multi-limbed machine.
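To see why decoupling weights from joint count matters, here is one generic way it can be done: give every surface point and every joint a feature vector of the same size, then turn similarities into weights with a softmax. To be clear, this is our own illustration and an assumption on our part, not AniGen's actual formulation; the function name, the `temperature` parameter and the feature values are all hypothetical.

```python
import math


def skin_weights(vertex_feature, joint_features, temperature=1.0):
    """Softmax over vertex-to-joint feature similarity.

    Works for any number of joints, because the output length is set by
    len(joint_features), not by a fixed-size output layer.
    """
    scores = [
        sum(v * j for v, j in zip(vertex_feature, joint_feature)) / temperature
        for joint_feature in joint_features
    ]
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


# The same vertex feature scored against a 2-joint rig and a 5-joint rig.
vertex = [1.0, 0.0, 0.5]
two_joints = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
five_joints = two_joints + [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [-1.0, 0.0, 0.0]]

weights_small = skin_weights(vertex, two_joints)
weights_large = skin_weights(vertex, five_joints)
```

The point of the sketch is the shape of the output: the same function returns two weights for the small rig and five for the large one, and each set sums to 1. A network with a hard-coded number of output channels per joint cannot do that.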
Why You Should Care
Because this is the missing link that turns AI 3D generation from a concept-art toy into an actual animation pipeline. A rigged, skinned asset can be dropped straight into Blender, Maya, Unreal or Unity and driven by off-the-shelf motion data — mocap, a walk cycle library, a retargeted animation clip. No rig-from-scratch. No weight painting marathon.
And it isn’t limited to humanoids, which is where most auto-riggers quietly fall apart. AniGen handles animals, humanoids and machinery, generalizing to in-the-wild images across categories. The authors report it substantially outperforms state-of-the-art sequential baselines on both rig validity (does the skeleton actually make sense?) and animation quality (does it deform without exploding?). In the horse example, the auto-generated skeleton follows the spine, legs and neck the way a technical director (TD) would actually place it.
For indie game devs and solo animators, this is enormous. The thing that used to require a dedicated rigging artist — or a paid auto-rig service per asset — collapses into the same generation step that already gives you the mesh. For studios, it’s a way to populate a scene with dozens of animation-ready creatures without a rigging queue.
Try It / Follow Them
The best part: it’s open. The code is on GitHub under an MIT license (one third-party component, CUBVH, is research/non-commercial), the pretrained weights are on Hugging Face, and there’s a live demo Space you can poke at right now.
- Project page: yihua7.github.io/AniGen_web — videos of every result, rotating with and without skeletons
- Code: github.com/VAST-AI-Research/AniGen
- Live demo: Hugging Face Space — drop in an image, get a GLB
- Paper: arXiv 2604.08746
Export is GLB, so the output drops cleanly into Blender for a sanity check before you push it down your animation pipeline. If you’re already running Tripo for geometry, this is the same lab solving the very next step.
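Before you even open Blender, you can confirm that a GLB actually carries a rig. The sketch below reads the GLB container as laid out in the glTF 2.0 specification (a 12-byte header followed by a JSON chunk) and counts skins and joints. It uses only the Python standard library; the function names are ours, and we have not verified what a particular AniGen export contains, so treat it as a generic GLB check.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
JSON_CHUNK = 0x4E4F534A  # ASCII "JSON" read as a little-endian uint32


def read_glb_json(data: bytes) -> dict:
    """Return the glTF JSON document stored in the first chunk of a GLB."""
    # Header: magic, container version, total file length (unused here).
    magic, _version, _length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    # First chunk header: byte length and type; the payload starts at byte 20.
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != JSON_CHUNK:
        raise ValueError("first chunk is not JSON")
    return json.loads(data[20:20 + chunk_length].decode("utf-8"))


def rig_summary(gltf: dict) -> dict:
    """Count skins, joints per skin, and animations in a glTF document."""
    skins = gltf.get("skins", [])
    return {
        "skins": len(skins),
        "joints_per_skin": [len(skin.get("joints", [])) for skin in skins],
        "animations": len(gltf.get("animations", [])),
    }
```

To use it, read the file as bytes and pass it through both functions, for example `rig_summary(read_glb_json(open("character.glb", "rb").read()))`, where `character.glb` is a placeholder filename. A static mesh reports zero skins; a rigged one should report at least one skin with a non-empty joint list.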
IK3D Lab Take
We’ve spent a year watching AI 3D generation get scary-good at geometry while quietly ignoring the question every animator actually asks: “Okay, but can I move it?” AniGen is the first release we’ve seen that treats rigging as a generation problem instead of a post-process, and the joint S³ Fields formulation is the kind of clean idea that, in hindsight, feels obvious. Is it perfect? No. Skinning on tricky topology and very mechanical rigs will still need an artist’s pass, and the demo’s resolution won’t replace a hand-built hero rig. But as a starting point that gets you 80% of the way for background characters, creatures and crowd assets, it’s a genuine leap. The static-statue era of AI 3D is ending. Drop a photo in, get something out that can dance. That’s the part we’ve been waiting for.



