17.07.2024 • News • Hyperspectral image processing • Artificial Intelligence • 3D Vision

From two images to a 3D object

More precise 3D reconstructions for autonomous driving and preservation of cultural artefacts.

In recent years, neural methods have become widespread in camera-based 3D reconstruction. In most cases, however, they need hundreds of camera perspectives. Conventional photometric methods, by contrast, can compute highly precise reconstructions even of objects with textureless surfaces, but they typically work only under controlled lab conditions. Daniel Cremers from the Technical University of Munich (TUM) and his team have developed a method that combines the two approaches. It pairs a neural network representation of the surface with a precise model of the illumination process that accounts for light absorption and the distance between the object and the light source. The brightness in the images is then used to determine the angle and distance of the surface relative to the light source.
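To illustrate why image brightness carries information about both angle and distance, here is a minimal sketch of a textbook near point-light model: Lambertian shading with inverse-square falloff. It is not the team's illumination model. The function name `near_light_brightness` and the parameters `albedo` and `intensity` are hypothetical and chosen only for this example.

```python
import math


def near_light_brightness(point, normal, light_pos, albedo=1.0, intensity=1.0):
    """Brightness of a surface point lit by a nearby point light.

    Assumes a Lambertian (perfectly diffuse) surface and inverse-square
    falloff. This is an illustrative simplification, not the published model.
    """
    # Vector from the surface point to the light source.
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist = math.sqrt(sum(c * c for c in to_light))
    if dist == 0.0:
        raise ValueError("light source coincides with the surface point")
    light_dir = [c / dist for c in to_light]

    # Normalize the surface normal so the dot product is a cosine.
    n_len = math.sqrt(sum(c * c for c in normal))
    unit_normal = [c / n_len for c in normal]

    # Cosine of the angle between normal and light direction, clamped at 0
    # for surfaces that face away from the light.
    cos_angle = max(0.0, sum(n * l for n, l in zip(unit_normal, light_dir)))

    # Brightness drops with the square of the distance to the light.
    return albedo * intensity * cos_angle / (dist * dist)


# Same surface point and normal, three light positions.
facing = near_light_brightness((0, 0, 0), (0, 0, 1), (0, 0, 2))
farther = near_light_brightness((0, 0, 0), (0, 0, 1), (0, 0, 4))
tilted = near_light_brightness((0, 0, 0), (0, 0, 1), (2, 0, 2))
```

Doubling the light distance from 2 to 4 units cuts the brightness to a quarter, and moving the light off to the side lowers it further. A reconstruction method inverts this kind of relation: given the observed brightness, it searches for the surface orientation and distance that explain it.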

Fields of application for 3D reconstructions include autonomous driving and monument conservation. (Source: TUM)

“That enables us to model the objects with much greater precision than existing processes. We can use the natural surroundings and can reconstruct relatively textureless objects for our reconstructions,” says Daniel Cremers. The method can be used to preserve historical monuments or digitize museum exhibits. If these are destroyed or decay over time, photographic images can be used to reconstruct the originals and create authentic replicas.

The team also develops neural camera-based reconstruction methods for autonomous driving, where a camera films the vehicle's surroundings. The autonomous car can model its surroundings in real time, build a three-dimensional representation of the scene, and use it to make decisions. The process is based on neural networks that predict a 3D point cloud for each video frame; these point clouds are then merged into a large-scale model of the roads travelled. (Source: TUM)
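The merging step can be sketched as follows, under strong assumptions: each frame's point cloud is given in camera coordinates, and the camera pose (a rotation matrix and a translation vector) is already known for every frame. The neural network prediction is replaced by hard-coded points, and the names `transform_points` and `merge_frames` are hypothetical, not taken from the team's software.

```python
def transform_points(points, rotation, translation):
    """Map points from camera coordinates to world coordinates (R * p + t)."""
    world = []
    for p in points:
        world.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return world


def merge_frames(frames):
    """Combine per-frame point clouds into one cloud in the world frame.

    `frames` is a list of (points, rotation, translation) tuples, one per
    video frame. Poses are assumed to be known here.
    """
    merged = []
    for points, rotation, translation in frames:
        merged.extend(transform_points(points, rotation, translation))
    return merged


IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# Two frames that each see one point 5 units ahead of the camera.
# Between the frames the camera moved 2 units forward along z.
frames = [
    ([(0.0, 0.0, 5.0)], IDENTITY, (0.0, 0.0, 0.0)),
    ([(0.0, 0.0, 5.0)], IDENTITY, (0.0, 0.0, 2.0)),
]
model = merge_frames(frames)
```

In the merged model the two observations land at different positions along the road, because each one is shifted by the pose of the camera that saw it. A real system would also have to estimate the poses and fuse overlapping points, which this sketch leaves out.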

Reference: M. Brahimi et al.: Sparse Views Near Light: A Practical Paradigm for Uncalibrated Point-light Photometric Stereo, Proc. IEEE/CVF Conf. Computer Vision and Pattern Recog. (CVPR), 11862 (2024); DOI: 10.48550/arXiv.2404.00098

Link: Computer Vision, Munich Center for Machine Learning, Technical University Munich TUM, Munich, Germany