InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images

1Google Research   2Cornell Tech, Cornell University   3UC Berkeley

ECCV 2022 (Oral Presentation)

Training only on collections of single photographs, we learn perpetual view generation from an input RGB image.


We present a method for learning to generate unbounded flythrough videos of natural scenes starting from a single view. This capability is learned from a collection of single photographs, without requiring camera poses or even multiple views of each scene. To achieve this, we propose a novel self-supervised view generation training paradigm in which we sample and render virtual camera trajectories, including cyclic ones, allowing our model to learn stable view generation from a collection of single views. At test time, despite never seeing a video during training, our approach can take a single image and generate long camera trajectories comprising hundreds of new views with realistic and diverse content. We compare our approach with recent state-of-the-art supervised view generation methods that require posed multi-view videos and demonstrate superior performance and synthesis quality.
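The key idea in the abstract, sampling a virtual camera path that returns to its starting pose so the final rendering can be supervised against the original single image, can be illustrated with a toy sketch. This is not the authors' implementation; the names `sample_cyclic_trajectory`, `cycle_consistency_loss`, and the `render_step` callback are hypothetical, and the trajectory here varies only camera translation for simplicity.

```python
import numpy as np

def sample_cyclic_trajectory(n_steps=8, max_offset=0.5, seed=0):
    """Sample a virtual camera path that returns to its start.

    Poses are 4x4 camera-to-world matrices; only the translation varies
    in this toy sketch. The path moves outward for n_steps//2 steps and
    then retraces itself, so the final pose equals the first (cyclic).
    """
    rng = np.random.default_rng(seed)
    direction = rng.normal(size=3)
    direction = max_offset * direction / np.linalg.norm(direction)
    half = n_steps // 2
    # Translation scales go 0 -> 1 and back to 0.
    scales = np.concatenate([np.linspace(0.0, 1.0, half + 1),
                             np.linspace(1.0, 0.0, half + 1)[1:]])
    poses = []
    for s in scales:
        pose = np.eye(4)
        pose[:3, 3] = s * direction  # translate along the sampled direction
        poses.append(pose)
    return poses

def cycle_consistency_loss(image, poses, render_step):
    """Render step by step along the cyclic path.

    Because the path ends where it began, the final rendering can be
    compared against the original image itself, giving a training signal
    from a single photograph with no ground-truth poses or second view.
    `render_step(image, src_pose, dst_pose)` stands in for the
    learned render-and-refine module.
    """
    current = image
    for src, dst in zip(poses[:-1], poses[1:]):
        current = render_step(current, src, dst)
    return float(np.mean((current - image) ** 2))
```

With a perfect (identity) `render_step`, the loss on any image is zero; during training, minimizing this loss pushes the generator toward view sequences that remain consistent with the input view when the camera loops back.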


Thanks to Andrew Liu, Richard Bowen, and Richard Tucker for fruitful discussions and helpful comments.


@inproceedings{li2022infinitenaturezero,
  title     = {InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images},
  author    = {Li, Zhengqi and Wang, Qianqian and Snavely, Noah and Kanazawa, Angjoo},
  booktitle = {ECCV},
  year      = {2022}
}