Combining learning and computational imaging for 3D inference

Author(s): Guo, Xinqing
Date Accessioned: 2018-05-17T11:42:25Z
Date Available: 2018-05-17T11:42:25Z
Publication Date: 2017
SWORD Update: 2018-02-20T20:41:53Z
Abstract: Acquiring the 3D geometry of a scene is a key task in computer vision. Applications are numerous, from classical object reconstruction and scene understanding to the more recent visual SLAM and autonomous driving. Recent advances in computational imaging have enabled many new solutions to the problem of 3D reconstruction: by modifying the camera's components, computational imaging optically encodes the scene, then decodes it with tailored algorithms.

This dissertation explores new computational imaging techniques, combined with recent advances in deep learning, to infer the 3D geometry of a scene. In general, our approaches can be categorized into active and passive 3D sensing.

For active illumination methods, we propose two solutions. First, we present a multi-flash (MF) system implemented on a mobile platform. Using the sequence of images captured by the MF system, we can extract the depth edges of the scene and further estimate a depth map on a mobile device. Next, we show a portable immersive system capable of acquiring and displaying high-fidelity 3D reconstructions using a set of RGB-D sensors. The system is based on the structured light technique and recovers the 3D geometry of the scene in real time. We have also developed a visualization system that allows users to dynamically view the event from new perspectives at arbitrary time instances in real time.

For passive sensing methods, we focus on light field based depth estimation. For depth inference from a single light field, we present an algorithm tailored for barcode images. Our algorithm analyzes the statistics of raw light field images and performs depth estimation at real-time speed for fast refocusing and decoding. To mimic the human vision system, we investigate dual light field input and propose a unified deep learning based framework to extract depth from both the disparity cue and the focus cue. To facilitate training, we have created a large dual focal stack database with ground truth disparity. While the above solution focuses on fusing depth from focus and stereo, we also explore combining depth from defocus and stereo, with an all-focus stereo pair and a defocused image of one of the stereo views as input. We adopt the hourglass network architecture to extract depth from the image triplets, and we have studied and explored multiple neural network architectures to improve depth inference. We demonstrate that our deep learning based approaches preserve the strengths of the focus/defocus cue and the disparity cue while effectively suppressing their weaknesses.
Advisor: Yu, Jingyi
Degree: Ph.D.
Department: University of Delaware, Department of Computer and Information Sciences
DOI: https://doi.org/10.58088/kw5e-5431
Unique Identifier: 1035837989
URL: http://udspace.udel.edu/handle/19716/23209
Language: en
Publisher: University of Delaware
URI: https://search.proquest.com/docview/2023675353?accountid=10457
Keywords: Applied sciences
Keywords: Computational imaging
Keywords: Deep learning
Keywords: Depth estimation
Keywords: Light field
Title: Combining learning and computational imaging for 3D inference
Type: Thesis
Files
Original bundle
Name: Guo_udel_0060D_13132.pdf
Size: 27.16 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed to upon submission