Animating Animals from Video

Note: we have changed our project to "lbsNeRF: Animatable Volumetric Avatars from Video"

Neural radiance fields (NeRF) have emerged as a promising representation for encoding the geometry and appearance of static scenes and objects. However, extending these representations to capture the non-rigid deformations common in categories such as humans and animals remains an open challenge. Towards this goal, we propose lbsNeRF, a framework that learns an actor-specific neural avatar from multi-view videos with associated skeleton motions and subsequently allows rendering under arbitrary query articulations and viewpoints. Inspired by classic work on skinning models that allow controlled deformation of meshes, our approach similarly constrains the allowed deformation under articulation. It models appearance under articulation using a canonical-space NeRF that is associated with the view space via a (neural) blending weight field, which induces a per-point transformation. Unlike existing works that rely on predefined surface models such as SMPL, our approach allows learning a deformable model without such a surface, from posed skeleton inputs alone. This makes it applicable to other generic categories, e.g., cats and elephants. We plan to empirically validate our approach across multiple datasets for both humans and animals. Preliminary results show that our method generalizes to unseen poses while performing comparably to prior methods that require stronger supervision.
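
To make the idea of a blending weight field inducing a per-point transformation concrete, here is a minimal NumPy sketch of classic linear blend skinning. It is an illustration under stated assumptions, not the lbsNeRF implementation: the function names (`blend_skinning`, `toy_weights`), the distance-based softmax used in place of a learned neural weight field, and the warp direction (canonical points mapped to posed space) are all hypothetical choices made for this example.

```python
import numpy as np


def toy_weights(points, joint_positions, temperature=0.1):
    """Hypothetical stand-in for a learned blending weight field.

    Returns an (N, B) array of weights: a softmax over the negative
    squared distance from each point to each joint. Rows sum to 1.
    """
    diff = points[:, None, :] - joint_positions[None, :, :]   # (N, B, 3)
    logits = -(diff ** 2).sum(axis=-1) / temperature          # (N, B)
    logits = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def blend_skinning(points, weights, bone_transforms):
    """Warp points by a weighted blend of per-bone rigid transforms.

    points:          (N, 3) query points
    weights:         (N, B) blending weights, rows summing to 1
    bone_transforms: (B, 4, 4) homogeneous transform per bone
    returns:         (N, 3) warped points
    """
    # One blended 4x4 matrix per point: T_n = sum_b w[n, b] * G[b].
    blended = np.einsum("nb,bij->nij", weights, bone_transforms)
    # Lift points to homogeneous coordinates and apply each point's matrix.
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.concatenate([points, ones], axis=1)      # (N, 4)
    warped = np.einsum("nij,nj->ni", blended, homogeneous)
    return warped[:, :3]


# Toy usage: two joints, with the second bone translated upward.
joints = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
query = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])
bones = np.stack([np.eye(4), np.eye(4)])
bones[1, :3, 3] = [0.0, 1.0, 0.0]
posed = blend_skinning(query, toy_weights(query, joints), bones)
```

In this toy setup, points near the first joint stay roughly in place, points near the second joint follow its translation, and points in between receive a blend of the two. Because the weights are defined at every point in space, no surface mesh is needed, only the skeleton's per-bone transforms.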


  • Hang Gao, UC Berkeley
  • Shubham Tulsiani, Facebook AI Research
  • Angjoo Kanazawa, UC Berkeley


August 31, 2021