This release contains implementations of algorithms from the paper
"Discrete-Continuous Optimization for Large-Scale Structure from Motion"
(CVPR 2011), specifically:

- Estimating camera translations using discrete belief propagation (bp-trans).
- Estimating camera rotations using discrete belief propagation (bp-rot).
- Data files for the Acropolis and Arts Quad datasets.

This release is primarily intended for people who are interested in
large-scale optimization problems. It is not a complete structure from
motion pipeline and does not include code for the continuous refinement
steps, processing geotags, or bundle adjustment. Please let us know if
you have any questions, or if there is some other part of the system you
would like to know more about.

=== Installation ===

1. Run ./configure, which configures several required libraries.
2. Install the Google sparsehash library by executing
   "sudo ./install_deps.sh" (requires root permission).
3. Run "make". This builds bp-rot, bp-trans, and the rest of the
   dependencies.

=== Using bp-trans ===

Running: See run_transbp.sh for an example of running bp-trans on the
Acropolis dataset.

Input:
(1) Estimated relative translation directions (global.translations.txt)
(2) Geotags and estimated camera rotations (global.poses.txt)

The rotation estimates were obtained as described in the paper (i.e.
using discrete BP followed by non-linear least squares refinement). Note
that these datasets contain only camera-point edges. Points have no
rotation, so their rotation matrices are all-zero in the input files.

Parameters: The grid's dimensions and the size of each grid cell can be
passed to bp-trans via the --labelspace flag. Specifically,
"--labelspace N D" specifies an N x N grid with grid cells of size
(1/D)^2 square meters. For example, "--labelspace 400 2" would give a
400 x 400 grid of 0.5 m x 0.5 m cells (0.25 square meters each),
covering a 200 m x 200 m region.

Output: On each iteration, bp-trans prints the total energy and the node
labels of the current MAP solution.

=== Using bp-rot ===

Running: See run_rotbp.sh for an example of running bp-rot on the
Acropolis dataset.

Input:
(1) Geotags (geoplanar.geotags.txt)
(2) Camera tilt/twist estimated from vanishing points (tilt_twist.txt)
(3) Relative pose estimates (pairs.txt)

Output: As with bp-trans, the energy and node labels are printed each
iteration.

=== Arts Quad dataset ===

We have provided the Arts Quad relative poses, camera geotags, and
tilt/twist estimates in the data/artsquad directory (pairs.txt,
tilt_twist.txt, and geoplanar.geotags.txt). High-precision "ground
truth" geotags and raw images are available on the project web page
(http://vision.soic.indiana.edu/disco). We have also provided the files
required to run bp-rot and bp-trans. Unfortunately, you will probably
not be able to run bp-trans on this dataset as-is, since it requires a
large amount of memory and CPU time. For the experiments in our paper we
used a MapReduce implementation of bp-trans instead (please email us if
you would like that code).

=== Pairs file format (pairs.txt) ===

This file contains relative poses derived from pairwise feature matches
and has the following format:

  num_images num_pairs           # header
  cam_i cam_j R_ij t_ij c1 c2    # relative pose for each matched image pair
  ...

where cam_i and cam_j are the (0-indexed) camera ids, R_ij is the
relative rotation matrix (9 entries in row-major order), and t_ij is a
3-vector representing the relative translation direction. Please see
equations 1 and 2 in the paper for more details. You can ignore c1 and
c2 if you only care about relative pose (they are related to the
distances of the cameras to the 3D points in the scene).
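For concreteness, here is a minimal Python sketch of a pairs.txt reader
based on the format described above. It is not part of this release: the
helper name load_pairs, the use of NumPy, and the assumption of one pair
per line after the header are our own choices.

    import numpy as np

    # Minimal sketch of a pairs.txt reader (assumes one pair per line
    # after the "num_images num_pairs" header).
    def load_pairs(path="pairs.txt"):
        pairs = []
        with open(path) as f:
            num_images, num_pairs = map(int, f.readline().split())
            for line in f:
                fields = line.split()
                if len(fields) != 16:  # cam_i cam_j + 9 (R_ij) + 3 (t_ij) + c1 c2
                    continue           # skip blank or unexpected lines
                cam_i, cam_j = int(fields[0]), int(fields[1])  # 0-indexed ids
                R_ij = np.array(fields[2:11], float).reshape(3, 3)  # row-major
                t_ij = np.array(fields[11:14], float)  # translation direction
                # fields[14:16] are c1 and c2; ignorable for relative pose
                pairs.append((cam_i, cam_j, R_ij, t_ij))
        return num_images, num_pairs, pairs

The reshape relies on NumPy's default row-major (C) ordering, which
matches the 9-entry row-major layout of R_ij described above.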
Note that the camera IDs are *0-indexed*. Some images may have no
adjacent edges, which is why the Arts Quad relative pose data appears to
start at index 1: image 0 has no edges.

=== Updates ===

- Added bp-rot and the Arts Quad dataset. Fixed an incorrect unary
  weight parameter in the run script. (10/10/2011)
- Clarified this README to make it clear that camera IDs use 0-based
  indexing. (07/20/2014)

=== Copying ===

This code is released under the Apache License 2.0, except where
otherwise stated.