Discrete-continuous optimization for large-scale structure from motion

David Crandall, Andrew Owens, Noah Snavely, Dan Huttenlocher

Recent work in structure from motion (SfM) has successfully built 3D models from large unstructured collections of images downloaded from the Internet. Most approaches use incremental algorithms that solve progressively larger bundle adjustment problems. These incremental techniques scale poorly as the number of images grows, and can drift or fall into bad local minima. We present an alternative formulation for SfM based on finding a coarse initial solution using a hybrid discrete-continuous optimization, and then improving that solution using bundle adjustment. The initial optimization step uses a discrete Markov random field (MRF) formulation, coupled with a continuous Levenberg-Marquardt refinement. The formulation naturally incorporates various sources of information about both the cameras and the points, including noisy geotags and vanishing point estimates. We test our method on several large-scale photo collections, including one with measured camera positions, and show that it can produce models that are similar to or better than those produced with incremental bundle adjustment, but more robustly and in a fraction of the time.
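
As a rough illustration of the two-stage idea described above (a coarse discrete labeling followed by a continuous refinement), the toy sketch below estimates 2D camera headings from pairwise relative-heading measurements. It is not the formulation from the paper. Several things are assumptions made only for this example: the function names (wrap, build_neighbors, discrete_stage, continuous_stage), the parameters k and tau, the use of iterated conditional modes (ICM) in place of belief propagation on the MRF, and the use of a simple coordinate descent in place of Levenberg-Marquardt.

```python
import math

def wrap(a):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def build_neighbors(n, edges):
    """edges holds (i, j, rel), meaning theta_j - theta_i was measured as rel."""
    nbrs = [[] for _ in range(n)]
    for i, j, rel in edges:
        nbrs[j].append((i, rel))     # theta_j should be near theta_i + rel
        nbrs[i].append((j, -rel))    # theta_i should be near theta_j - rel
    return nbrs

def discrete_stage(n, edges, k=36, tau=0.5, sweeps=10):
    """Coarse stage: pick one of k quantized headings per camera.

    The pairwise cost is a truncated absolute angular error (robust to
    gross outliers). ICM is a simple stand-in for belief propagation.
    """
    nbrs = build_neighbors(n, edges)
    step = 2 * math.pi / k
    labels = [None] * n
    labels[0] = 0                    # fix camera 0 to remove the global rotation
    for _ in range(sweeps):
        for v in range(1, n):
            def cost(lab):
                total = 0.0
                for o, off in nbrs[v]:
                    if labels[o] is not None:   # ignore cameras not yet labeled
                        r = wrap(lab * step - labels[o] * step - off)
                        total += min(abs(r), tau)
                return total
            labels[v] = min(range(k), key=cost)
    return [lab * step for lab in labels]

def continuous_stage(theta, edges, iters=200):
    """Refinement stage: coordinate descent on sum(1 - cos(residual)).

    Each update sets a heading to the circular mean of the values its
    neighbors predict for it. This is a stand-in for Levenberg-Marquardt.
    """
    nbrs = build_neighbors(len(theta), edges)
    theta = list(theta)
    for _ in range(iters):
        for v in range(1, len(theta)):          # camera 0 stays fixed
            s = sum(math.sin(theta[o] + off) for o, off in nbrs[v])
            c = sum(math.cos(theta[o] + off) for o, off in nbrs[v])
            theta[v] = math.atan2(s, c)
    return theta

# Synthetic example: five cameras, seven noise-free pairwise measurements.
true = [0.0, 0.4, 1.3, -2.0, 2.9]
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3), (0, 4)]
edges = [(i, j, true[j] - true[i]) for i, j in pairs]

coarse = discrete_stage(len(true), edges)       # accurate to roughly one bin
refined = continuous_stage(coarse, edges)       # removes the quantization error
print([round(wrap(t), 3) for t in coarse])
print([round(wrap(t), 3) for t in refined])
```

The discrete stage only needs to land each variable in the right neighborhood. The continuous stage then removes the quantization error, which is the division of labor the formulation above relies on.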

For more details, please see our CVPR 2011 paper and slides from our CVPR talk.

Sample reconstruction videos

Central Rome
(after multi-view stereo)

Cornell Arts Quad

Cornell Arts Quad
(after multi-view stereo)


Papers and presentations

BibTeX entries:

    author = {David Crandall and Andrew Owens and Noah Snavely and Daniel Huttenlocher},
    title = {{SfM with MRFs}: Discrete-Continuous Optimization for Large-Scale Structure from Motion},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)},
    year = {2013},
    month = {December},
    volume = {35},
    number = {12},
    pages = {2841--2853}

    author = {David Crandall and Andrew Owens and Noah Snavely and Daniel Huttenlocher},
    title = {Discrete-Continuous Optimization for Large-scale Structure from Motion},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2011}


In the CVPR 2011 paper, the reference to “V. Govindu. Lie-algebraic averaging for globally consistent motion estimation. CVPR, 2004” should instead be to “V. Govindu. Combining Two-view Constraints For Motion Estimation. CVPR, 2001”.


We would like to thank the Cornell Facilities Team for helping us collect the ground truth Arts Quad dataset. We also gratefully acknowledge the support of the following:

National Science Foundation, MIT Lincoln Labs, Google, Intel Corporation, Lilly Endowment, IU Data to Insight Center
The IU Computer Vision Lab's projects and activities have been funded, in part, by grants and contracts from the Air Force Office of Scientific Research (AFOSR), the Defense Threat Reduction Agency (DTRA), Dzyne Technologies, EgoVid, Inc., ETRI, Facebook, Google, Grant Thornton LLP, IARPA, the Indiana Innovation Institute (IN3), the IU Data to Insight Center, the IU Office of the Vice Provost for Research through an Emerging Areas of Research grant, the IU Social Sciences Research Commons, the Lilly Endowment, NASA, National Science Foundation (IIS-1253549, CNS-1834899, CNS-1408730, BCS-1842817, CNS-1744748, IIS-1257141, IIS-1852294), NVidia, ObjectVideo, Office of Naval Research (ONR), Pixm, Inc., and the U.S. Navy. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.