3D reconstruction project (2024 version)
Half the course consists of a 3D reconstruction project. The project is conducted in groups of four or five students. You may propose groups yourselves, but do not worry: everyone who wants to participate will be assigned to a group when the project starts.
Date  Milestone 

April 25  Project start and group assignment 
May 2  Design plan due 
May 19 (a Sunday)  Report submission for peer review 
May 21  Final seminar with project presentations 
Required functionalities
Each group should implement a system with the following functionality:

- Robust correspondence extraction using RANSAC and the epipolar constraint on a sequence of frames (i.e. like CE1, but for many image pairs; you may also want to try different correspondences, e.g. from OpenCV).
- Computation of an essential matrix E (e.g. from F and a known K), extraction of the twisted-pair solutions for R and t from E, and selection of the correct one using the visible-points constraint (see LE6).
- Robust Perspective-n-Point (PnP) estimation for adding a new view and detecting outliers. Use an existing P3P solver and your own RANSAC loop (see LE5 and LE6).
- Bundle adjustment for N cameras (N ≥ 2) using known correspondences, intrinsic camera parameters (the K matrix), and an initial guess for the set of camera poses (see LE7).
- Improved efficiency for bundle adjustment. Options:
  - Use a sparsity mask for the Jacobian in the bundle adjustment step (see CE1 extra task).
  - Use pytorch and autograd (see Micro Bundle Adjustment).
  - Program in C++ and use Ceres Solver, which also has automatic differentiation and a Schur complement solver (discuss with your guide).
- Evaluation of the robustness to noise of the implemented system, by measuring the camera pose errors of your reconstruction compared to reference poses. Details can be found here.
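The twisted-pair extraction and visible-points selection can be sketched in plain numpy. This is a minimal illustration with exact synthetic correspondences in normalised (calibrated) coordinates; a real pipeline would start from RANSAC-filtered image matches and a known K:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth relative pose (camera 1 at the origin) and 3D points
# that lie in front of both cameras.
ang = 0.1
R_gt = np.array([[np.cos(ang), 0, np.sin(ang)],
                 [0, 1, 0],
                 [-np.sin(ang), 0, np.cos(ang)]])
t_gt = np.array([1.0, 0.2, 0.0])
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(50, 3))

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

E = skew(t_gt) @ R_gt  # essential matrix (defined up to scale)

# Exact correspondences in normalised coordinates.
x1 = X / X[:, 2:]
Xc2 = X @ R_gt.T + t_gt
x2 = Xc2 / Xc2[:, 2:]

def triangulate(a1, a2, R, t):
    """DLT triangulation of one correspondence for P1 = [I|0], P2 = [R|t]."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([a1[0] * P1[2] - P1[0], a1[1] * P1[2] - P1[1],
                   a2[0] * P2[2] - P2[0], a2[1] * P2[2] - P2[1]])
    Xh = np.linalg.svd(A)[2][-1]
    return Xh[:3] / Xh[3]

# Twisted-pair decomposition: E = U diag(1, 1, 0) V^T gives four (R, t) candidates.
U, _, Vt = np.linalg.svd(E)
if np.linalg.det(U) < 0: U = -U    # force proper rotations
if np.linalg.det(Vt) < 0: Vt = -Vt
W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
candidates = [(U @ W @ Vt, U[:, 2]), (U @ W @ Vt, -U[:, 2]),
              (U @ W.T @ Vt, U[:, 2]), (U @ W.T @ Vt, -U[:, 2])]

def n_visible(R, t):
    """Count points with positive depth in both views (the visible-points test)."""
    Xs = np.array([triangulate(a1, a2, R, t) for a1, a2 in zip(x1, x2)])
    return np.sum((Xs[:, 2] > 0) & ((Xs @ R.T + t)[:, 2] > 0))

R_best, t_best = max(candidates, key=lambda c: n_visible(*c))
```

Only the correct candidate places (nearly) all triangulated points in front of both cameras; the recovered translation is only defined up to scale, so it comes out as a unit vector.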
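The "own RANSAC loop" around an existing minimal solver might look like the sketch below. For brevity it uses a six-point DLT resection as a stand-in for a real P3P solver (P3P needs only three points and returns up to four pose hypotheses), so treat the solver as a placeholder:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2D-3D correspondences with a known camera and 20% gross outliers.
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
R_gt = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])
t_gt = np.array([0.1, -0.2, 6.0])
P_gt = K @ np.hstack([R_gt, t_gt.reshape(3, 1)])

X = rng.uniform(-2, 2, size=(80, 3))
Xh = np.hstack([X, np.ones((80, 1))])
x = Xh @ P_gt.T
x = x[:, :2] / x[:, 2:]
outliers = rng.choice(80, size=16, replace=False)
x[outliers] = rng.uniform(0, 640, size=(16, 2))  # wrong matches

def resect_dlt(Xs, xs):
    """Stand-in minimal solver: DLT resection from six 2D-3D correspondences.
    (Replace with a real P3P solver in the project.)"""
    rows = []
    for Xw, (u, v) in zip(Xs, xs):
        Xw1 = np.append(Xw, 1.0)
        rows.append(np.concatenate([Xw1, np.zeros(4), -u * Xw1]))
        rows.append(np.concatenate([np.zeros(4), Xw1, -v * Xw1]))
    return np.linalg.svd(np.array(rows))[2][-1].reshape(3, 4)

def reproj_errors(P):
    """Per-point reprojection error in pixels."""
    xp = Xh @ P.T
    xp = xp[:, :2] / xp[:, 2:]
    return np.linalg.norm(xp - x, axis=1)

# The RANSAC loop: sample a minimal set, solve, count inliers, keep the best.
best_P, best_inliers = None, np.zeros(80, dtype=bool)
for _ in range(200):
    sample = rng.choice(80, size=6, replace=False)
    P = resect_dlt(X[sample], x[sample])
    inliers = reproj_errors(P) < 2.0  # pixel threshold
    if inliers.sum() > best_inliers.sum():
        best_P, best_inliers = P, inliers
```

After the loop, the usual final step is to re-estimate the pose from all inliers of the best hypothesis.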
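For the sparsity-mask option, scipy.optimize.least_squares accepts a jac_sparsity argument that tells the finite-difference Jacobian which entries can be nonzero. A minimal two-camera sketch (first camera fixed at the origin, synthetic noise-free observations, normalised coordinates) could look like:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.sparse import lil_matrix

rng = np.random.default_rng(2)
n = 30  # number of 3D points

def rodrigues(w):
    """Axis-angle vector to rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    Kx = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * Kx + (1 - np.cos(th)) * (Kx @ Kx)

# Synthetic ground truth: camera 1 = [I|0], camera 2 = [R|t], exact observations.
X_gt = rng.uniform([-1, -1, 4], [1, 1, 8], size=(n, 3))
w_gt = np.array([0.05, -0.1, 0.02])
t_gt = np.array([0.5, 0.0, 0.1])
obs1 = X_gt[:, :2] / X_gt[:, 2:]
Xc2 = X_gt @ rodrigues(w_gt).T + t_gt
obs2 = Xc2[:, :2] / Xc2[:, 2:]

def residuals(p):
    """Reprojection residuals; p = [axis-angle(3), t(3), points(3n)]."""
    w, t, Xp = p[:3], p[3:6], p[6:].reshape(n, 3)
    r1 = Xp[:, :2] / Xp[:, 2:] - obs1
    Xc = Xp @ rodrigues(w).T + t
    r2 = Xc[:, :2] / Xc[:, 2:] - obs2
    return np.concatenate([r1.ravel(), r2.ravel()])

# Sparsity mask: each camera-1 residual depends on one point only;
# each camera-2 residual also depends on the 6 pose parameters.
S = lil_matrix((4 * n, 6 + 3 * n), dtype=int)
for i in range(n):
    S[2 * i:2 * i + 2, 6 + 3 * i:6 + 3 * i + 3] = 1
    S[2 * n + 2 * i:2 * n + 2 * i + 2, :6] = 1
    S[2 * n + 2 * i:2 * n + 2 * i + 2, 6 + 3 * i:6 + 3 * i + 3] = 1

# Initial guess = ground truth plus noise; method 'trf' exploits the sparsity.
p0 = np.concatenate([w_gt + 0.02, t_gt - 0.03,
                     (X_gt + rng.normal(0, 0.05, X_gt.shape)).ravel()])
res = least_squares(residuals, p0, jac_sparsity=S, method="trf")
```

Note the gauge freedom: with exact observations there is a one-parameter family of zero-residual solutions (overall scale), so compare poses only up to scale when evaluating.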
Optional functionalities
In addition to the required functionalities, groups of four students should do at least one extra task from the list below; groups of five students should do at least two.
- Densification using e.g. PatchMatch, with cross-checking and the fundamental matrix constraint (see CE2).
- Visualisation of the 3D model. Pick one of the following:
  - Use screened Poisson surface reconstruction to obtain a volumetric representation. Try e.g. the version in Meshlab, or Kazhdan's own implementation (recommended, as it has more options). Both require you to estimate colour and surface normals for the 3D points, and save these in PLY format. Use visibility in cameras to determine normal signs (see LE4). Theoretical details on screened PSR can be found in Kazhdan and Hoppe.
  - Use space carving (as in Fitzgibbon, Cross and Zisserman) to generate a volumetric representation of your object.
  - Try a radiance field method, e.g. 3D Gaussian Splatting, Plenoxels, or some NeRF variant (discuss with your supervisor).
- Replace incremental SfM with global SfM: use global estimation of all rotation matrices as an initialisation of bundle adjustment (as in Martinec and Pajdla). Graph Attention Networks could possibly be fun to try here, but it is probably too difficult this year (GIT repo).
- Make use of the points on the turntable to make camera pose estimation more accurate. This is not trivial, as the turntable is a dominant plane. You may e.g. use the fact that a view change of a plane induces a homography.
- Add retriangulation to your pipeline, and compare optimal triangulation and LOST (e.g. from GTSAM) in terms of execution time and final result quality.
- An idea of your own (discuss with your supervisor).
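For the normal-sign step of the Poisson option above: a normal estimated by local plane fitting is sign-ambiguous, and visibility resolves the ambiguity by requiring each normal to face a camera that observes the point. A minimal sketch (the array names are hypothetical):

```python
import numpy as np

# Hypothetical inputs: per-point normals with arbitrary signs from the fit,
# and the centre of one camera that sees each point.
points = np.array([[0.0, 0, 5], [1, 0, 6]])
normals = np.array([[0.0, 0, 1], [0, 0, -1]])  # signs arbitrary
cam_center = np.zeros(3)                        # camera at the origin

# Flip each normal so that it faces the observing camera,
# i.e. enforce n . (c - X) > 0.
to_cam = cam_center - points
flip = np.sum(normals * to_cam, axis=1) < 0
normals[flip] *= -1
```

With several observing cameras you can instead take a majority vote over the sign tests before writing the oriented normals to the PLY file.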
Datasets
- The turntable datasets produced for the course.
- For debugging it is useful to work with a noise-free dataset. For this purpose, we have "cleaned" the dinosaur dataset of noise (up to the numerical accuracy of Matlab) and produced a dataset that contains 2D image points, 3D points, and the camera matrices. These are found in the BAdino2.mat file in the dataset repo above (use e.g. scipy.io.loadmat to load the file).
- The VGG datasets from Oxford University. These include e.g. the dinosaur sequence.
- Some datasets by Carl Olsson at LTH: Carl Olsson's SfM datasets
- The EPFL multi-view stereo datasets are also highly recommended: Multi-view stereo datasets
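Loading a .mat file with scipy.io.loadmat returns a dict of numpy arrays. The round trip below uses made-up field names for illustration; check the actual keys in BAdino2.mat yourself after loading:

```python
import os
import tempfile
import numpy as np
from scipy.io import loadmat, savemat

# Write a small .mat file so the example is self-contained; the field names
# "points2d", "points3d" and "cameras" are invented for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.mat")
savemat(path, {"points2d": np.zeros((2, 10)),
               "points3d": np.zeros((3, 10)),
               "cameras": np.zeros((3, 4))})

data = loadmat(path)           # dict of arrays (plus '__header__' etc.)
points3d = data["points3d"]    # MATLAB matrices come back as 2D numpy arrays
keys = sorted(k for k in data if not k.startswith("__"))
```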
Code
Please look at the utility code for CE1 and CE2. Also look at the extra exercises in the lab sheets; they are meant to help you get started with the project.
You are absolutely NOT allowed to use complete SfM systems such as COLMAP, SBA, Bundler, Visual SFM, SSBA, OpenMVG, or GTSAM.
On the other hand, you are encouraged to use a nonlinear least-squares package (or pytorch), and if you intend to make a more advanced representation of your 3D model, you may use a package for that as well. Some useful (and allowed) packages are listed below.
- Micro Bundle Adjustment by Johan Edstedt.
- Ceres Solver, a library for efficient nonlinear optimization. Python bindings here.
- sparseLM, a sparse Levenberg-Marquardt optimisation package.
- lmfit, a package for nonlinear least-squares minimization and curve fitting in python.
- PMVS, a package for computing 3D models from images and camera poses.
- OpenCV, which implements several local invariant features and some geometry functions.
- Lambda Twist P3P by Mikael Persson and Klas Nordberg.
- P3P by Laurent Kneip.
- OpenGV, a collection of geometric solvers for calibrated epipolar geometry.
- Open3D, a python and C++ library for visualizing 3D models and point clouds.
Deliverables
In order to pass, each project group should do the following:

- Make a design plan and get it approved by the guide. The design plan should contain:
  - A list of the tasks/functionalities that will be implemented
  - A group member list, with responsibilities for each person (e.g. what task)
  - A flowchart of the system components
  - A brief description of the components
- Hand in a written report to the project guide. The report should contain:
  - A group member list, with responsibilities for each person
  - A description of the problem that is solved
  - How the problem is solved
  - What the result is (i.e. performance relative to ground truth)
  - Why the result is what it is
  - References to used methods
- Deliver an oral presentation at the seminar.
- Prepare an opposition on another group, and hand in an opposition report after the seminar.
We recommend that you use the CVPR LaTeX template when writing the documents.
References
Most of these are cached locally in the course repository (except IREG).
- C. Barnes et al., PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2009.
- A. Fitzgibbon, G. Cross and A. Zisserman, Automatic 3D Model Construction for Turn-Table Sequences, in 3D Structure from Multiple Images of Large-Scale Environments, Editors Koch & Van Gool, Springer Verlag, 1998, pages 155-170.
- Y. Furukawa and C. Hernández, Multi-View Stereo: A Tutorial, Foundations and Trends in Computer Graphics and Vision, Vol. 9, No. 1-2, 2013.
- Michael Kazhdan and Hugues Hoppe, Screened Poisson Surface Reconstruction, ACM Transactions on Graphics, 2013.
- K. Madsen et al., Methods for Non-Linear Least Squares Problems, 2nd ed., DTU technical report, 2004.
- Daniel Martinec and Tomáš Pajdla, Robust Rotation and Translation Estimation in Multiview Reconstruction, CVPR 2007.
- Klas Nordberg, Introduction to Representations and Estimation in Geometry (IREG).
- Johannes Schönberger and Jan-Michael Frahm, Structure-from-Motion Revisited, CVPR 2016.
- Triggs, McLauchlan, Hartley and Fitzgibbon, Bundle Adjustment - A Modern Synthesis, in Vision Algorithms: Theory and Practice, Springer Verlag, Lecture Notes in Computer Science No. 1883, 2000, pages 298-372.
- Daniel Scharstein and Richard Szeliski, A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, IJCV 47(1-3), 2002.
- Brian Curless and Marc Levoy, A Volumetric Method for Building Complex Models from Range Images, SIGGRAPH'96, 1996.
- William E. Lorensen and Harvey E. Cline, Marching Cubes: A High Resolution 3D Surface Construction Algorithm, SIGGRAPH'87, 1987.
- R. Newcombe et al., KinectFusion: Real-Time Dense Surface Mapping and Tracking, ISMAR'11, 2011.
- C. Loop and Z. Zhang, Computing Rectifying Homographies for Stereo Vision, ICCV 1999.