Exploiting advantages of non-centric imaging for computer vision

Date
2017
Publisher
University of Delaware
Abstract
Employing image features pertaining to scene geometry for reliable scene understanding and reconstruction is an important task in computer vision. A pinhole camera strictly follows the principle of perspective projection: 3D lines project to 2D lines, and distant objects appear smaller than nearby ones. Although this projection closely matches human vision, perspective images seem to lack effective features that provide cues about scene structure. In contrast, images captured by non-centric cameras are generally distorted (e.g., 3D lines project to curves). These multi-perspective distortions produce unique geometric features that facilitate scene understanding tasks.

In this thesis, I comprehensively exploit the advantages of general non-centric cameras, and the XSlit camera in particular, in the context of scene understanding. In addition to the vanishing point (VP), I first show that another geometric feature exists in non-centric cameras, called the coplanar common point (CCP). A CCP is a point in the image plane at which the projections of all lines lying on a common 3D plane intersect. I explore the existence of CCPs in general non-centric cameras and show their potential in scene recovery tasks. I show that CCPs generally exist in non-centric cameras and derive the necessary and sufficient conditions for a CCP to exist. Specifically, I conduct a comprehensive analysis from the perspectives of ray space and caustics and show how to determine whether a CCP exists for a general non-centric camera. Experiments show that the CCP analysis provides useful insights for localizing planar structures.

Another useful feature exhibited in non-centric images is the depth-dependent aspect ratio (DDAR): the aspect ratio (AR) of an object in the image changes with its depth from the camera (a simplified model of this effect is sketched after the abstract). I first conduct a comprehensive analysis to characterize DDAR, infer object depth from AR, and model the recoverable depth range, sensitivity, and error. I show that repeated shape patterns in real Manhattan-world scenes can be used for 3D reconstruction from a single XSlit image. I also extend the analysis to model line slopes: parallel 3D lines exhibit depth-dependent slopes (DDS) in the image, which can likewise be used to infer their depths. I validate the analyses using real XSlit cameras, XSlit panoramas, and catadioptric mirrors. Experiments show that DDAR and DDS provide important depth cues and enable effective single-image scene reconstruction.

Finally, I prove that structure from motion (SfM) with an XSlit camera automatically avoids the scale ambiguity that plagues perspective-camera-based solutions. I demonstrate that viewpoint transforms under an XSlit camera can also be derived using the fundamental matrix, analogous to the perspective case. To address non-linearity and mitigate depth-dependent distortions in XSlit images, I further develop a novel feature matching algorithm based on non-uniform Gaussian kernels (an illustrative kernel sketch appears after the abstract). I also extend bundle adjustment to XSlit images to refine the estimated camera poses. Experiments demonstrate that the XSlit-based SfM approach can reliably estimate camera motion and scene geometry while avoiding scale ambiguity.
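To make the DDAR effect concrete, here is a minimal derivation under a simplified crossed-slit model assumed for illustration only (two axis-aligned slits parallel to the image plane; the thesis's own parametrization may differ). Place the image plane at $z = 0$, a vertical slit (parallel to the $y$-axis) at depth $Z_1$, and a horizontal slit (parallel to the $x$-axis) at depth $Z_2$. The unique ray through a scene point $(x, y, z)$ with $z > \max(Z_1, Z_2)$ that meets both slits hits the image plane at

$u = -\dfrac{x\,Z_1}{z - Z_1}, \qquad v = -\dfrac{y\,Z_2}{z - Z_2}.$

An axis-aligned square of side $s$ at depth $z$ therefore images with width $s\,Z_1/(z - Z_1)$ and height $s\,Z_2/(z - Z_2)$, so its apparent aspect ratio

$\mathrm{AR}(z) = \dfrac{Z_2\,(z - Z_1)}{Z_1\,(z - Z_2)}$

varies monotonically with depth on the valid range and can be inverted to recover the depth, $z = \dfrac{Z_1 Z_2\,(\mathrm{AR} - 1)}{Z_1\,\mathrm{AR} - Z_2}$.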
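The same toy model can be exercised numerically. The Python/NumPy sketch below projects an axis-aligned square through two hypothetical slits, measures its apparent aspect ratio, and inverts the formula above to recover depth. All names and parameter values are illustrative assumptions, not the thesis's implementation.

import numpy as np

# Simplified crossed-slit (XSlit) model: image plane at z = 0, a vertical slit
# (parallel to the y-axis) at depth Z1, and a horizontal slit (parallel to the
# x-axis) at depth Z2.  All values here are illustrative assumptions.
Z1, Z2 = 1.0, 2.0

def xslit_project(point, z1=Z1, z2=Z2):
    """Map a scene point (x, y, z), with z > max(z1, z2), through both slits onto z = 0."""
    x, y, z = point
    u = -x * z1 / (z - z1)   # horizontal coordinate, fixed by the vertical slit
    v = -y * z2 / (z - z2)   # vertical coordinate, fixed by the horizontal slit
    return np.array([u, v])

def apparent_aspect_ratio(z, side=1.0):
    """Image-space height/width of an axis-aligned square of the given side at depth z."""
    origin = xslit_project((0.0, 0.0, z))
    corner_x = xslit_project((side, 0.0, z))
    corner_y = xslit_project((0.0, side, z))
    width = abs(corner_x[0] - origin[0])
    height = abs(corner_y[1] - origin[1])
    return height / width

def depth_from_ar(ar, z1=Z1, z2=Z2):
    """Invert AR(z) = z2 * (z - z1) / (z1 * (z - z2)) for the depth z."""
    return z1 * z2 * (ar - 1.0) / (z1 * ar - z2)

if __name__ == "__main__":
    for z_true in (5.0, 10.0, 50.0):
        ar = apparent_aspect_ratio(z_true)
        print(f"depth {z_true:5.1f} -> AR {ar:.4f} -> recovered depth {depth_from_ar(ar):5.1f}")

Running the script recovers each true depth from the measured aspect ratio alone, which is the single-image depth cue the abstract refers to.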
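The abstract does not specify the form of the non-uniform Gaussian kernels used for feature matching. As a rough illustration only, the sketch below builds an axis-aligned anisotropic Gaussian kernel whose spreads differ along the two image axes, which is one plausible building block for matching patches in images whose two axes are distorted unequally; it should not be read as the thesis's actual algorithm.

import numpy as np

def anisotropic_gaussian_kernel(sigma_u, sigma_v, radius=None):
    """Axis-aligned 2D Gaussian with different spreads along the two image axes.

    sigma_u and sigma_v are illustrative parameters: in an XSlit image the two
    axes are tied to different slits and are distorted unequally, so an
    unequal-spread kernel is one plausible choice for patch weighting.
    """
    if radius is None:
        radius = int(np.ceil(3 * max(sigma_u, sigma_v)))
    coords = np.arange(-radius, radius + 1)
    g_u = np.exp(-coords**2 / (2 * sigma_u**2))   # spread along the u (horizontal) axis
    g_v = np.exp(-coords**2 / (2 * sigma_v**2))   # spread along the v (vertical) axis
    kernel = np.outer(g_v, g_u)                   # rows vary with v, columns with u
    return kernel / kernel.sum()

# Example: a kernel stretched twice as far horizontally as vertically.
k = anisotropic_gaussian_kernel(sigma_u=2.0, sigma_v=1.0)
print(k.shape, k.sum())  # (13, 13), sums to 1.0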