Combining learning and computational imaging for 3D inference

Author(s): Guo, Xinqing
Date Accessioned: 2018-05-17T11:42:25Z
Date Available: 2018-05-17T11:42:25Z
Publication Date: 2017
SWORD Update: 2018-02-20T20:41:53Z
Abstract: Acquiring the 3D geometry of a scene is a key task in computer vision. Applications are numerous, from classical object reconstruction and scene understanding to the more recent visual SLAM and autonomous driving. Recent advances in computational imaging have enabled many new solutions to the problem of 3D reconstruction: by modifying the camera's components, computational imaging optically encodes the scene, then decodes it with tailored algorithms.

This dissertation explores new computational imaging techniques, combined with recent advances in deep learning, to infer the 3D geometry of a scene. In general, our approaches can be categorized into active and passive 3D sensing.

For active illumination methods, we propose two solutions. First, we present a multi-flash (MF) system implemented on a mobile platform. Using the sequence of images captured by the MF system, we can extract the depth edges of the scene and further estimate a depth map on a mobile device. Next, we show a portable immersive system capable of acquiring and displaying high-fidelity 3D reconstructions using a set of RGB-D sensors. The system is based on the structured light technique and recovers the 3D geometry of the scene in real time. We have also developed a visualization system that allows users to dynamically view the event from new perspectives at arbitrary time instances in real time.

For passive sensing methods, we focus on light field based depth estimation. For depth inference from a single light field, we present an algorithm tailored for barcode images. Our algorithm analyzes the statistics of raw light field images and performs depth estimation at real-time speed for fast refocusing and decoding. To mimic the human vision system, we investigate dual light field input and propose a unified deep learning based framework to extract depth from both the disparity cue and the focus cue. To facilitate training, we have created a large dual focal stack database with ground truth disparity. While the above solution focuses on fusing depth from focus and stereo, we also explore combining depth from defocus and stereo, with an all-focus stereo pair and a defocused image of one of the stereo views as input. We adopt the hourglass network architecture to extract depth from the image triplets, and we have studied and explored multiple neural network architectures to improve depth inference. We demonstrate that our deep learning based approaches preserve the strengths of the focus/defocus cue and the disparity cue while effectively suppressing their weaknesses.
Advisor: Yu, Jingyi
Degree: Ph.D.
Department: University of Delaware, Department of Computer and Information Sciences
DOI: https://doi.org/10.58088/kw5e-5431
Unique Identifier: 1035837989
URL: http://udspace.udel.edu/handle/19716/23209
Language: en
Publisher: University of Delaware
URI: https://search.proquest.com/docview/2023675353?accountid=10457
Keywords: Applied sciences
Keywords: Computational imaging
Keywords: Deep learning
Keywords: Depth estimation
Keywords: Light field
Title: Combining learning and computational imaging for 3D inference
Type: Thesis
Files
Original bundle
Name: Guo_udel_0060D_13132.pdf
Size: 27.16 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed to upon submission