Stereo matching is the central problem of the stereovision paradigm. Area-based techniques provide dense disparity maps and are therefore preferred for stereo correspondence. Normalized cross correlation (NCC), the sum of squared differences (SSD) and the sum of absolute differences (SAD) are the linear correlation measures generally used in area-based stereo matching. In this paper, a similarity measure based on fuzzy relations is used to establish correspondence in the presence of intensity variations between the stereo images. The strength of the relationship between the fuzzified data of two windows, one in the left image and one in the right image of the stereo pair, is determined using appropriate fuzzy aggregation operators. However, these measures fail to establish pixel correspondence when the corresponding windows contain occluded pixels. A second stereo matching algorithm, based on fuzzy relations of fuzzy data, is used in such regions of the images; it builds on a weighted normalized cross correlation (WNCC) of the intensity data in the left and right windows of the stereo pair. The properties of the similarity measures used in these algorithms are also discussed. Experiments with various real stereo images demonstrate the superiority of these algorithms over NCC under nonideal conditions.
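For reference, the three linear correlation measures named above can be sketched as follows. This is a minimal NumPy sketch of generic window matching, not the paper's fuzzy similarity measure; the windows `L` and `R` are assumed to be equal-sized float arrays.

```python
import numpy as np

def ssd(L, R):
    # Sum of squared differences: lower means more similar.
    return float(np.sum((L - R) ** 2))

def sad(L, R):
    # Sum of absolute differences: lower means more similar.
    return float(np.sum(np.abs(L - R)))

def ncc(L, R):
    # Zero-mean normalized cross correlation: 1.0 means identical
    # up to an affine intensity change (gain and offset).
    l = L - L.mean()
    r = R - R.mean()
    denom = np.sqrt(np.sum(l * l) * np.sum(r * r))
    return float(np.sum(l * r) / denom) if denom > 0 else 0.0
```

Note that NCC is invariant to a gain/offset intensity change of the window, which is exactly why it degrades more gracefully than SSD/SAD under illumination differences.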
In this paper, we propose a new method to simultaneously achieve segmentation and dense matching in a pair of stereo images. In contrast to conventional methods based on similarity or correlation techniques, this method is based on geometry and uses correlation only on a limited number of key points. Stemming from the observation that our environment is abundant in planes, the method focuses on the segmentation and matching of planes in an observed scene. Neither prior knowledge about the scene nor camera calibration is needed. Using two uncalibrated images as input, the method starts with a rough identification of a potential plane, defined by only three points. From these three points, a plane homography is calculated and used for validation. Starting from a seed region defined by the original three points, the method grows the current region by successive move/confirmation steps until an occlusion and/or a surface discontinuity occurs; in that case, the homography-based mapping of points between the two images is no longer valid. This condition is detected by the correlation used in the confirmation process. In particular, the method grows a region even across different colors, as long as the region is planar. Experiments on real images validated our method and showed its capability and performance.
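The homography-based confirmation step can be illustrated with a minimal sketch: a correspondence belongs to the plane when mapping the left point through the plane homography lands close to the observed right point. The pixel tolerance `tol` and the point format are illustrative assumptions, not values from the paper.

```python
import numpy as np

def transfer_error(H, p_left, p_right):
    # Map a left-image point through the plane homography and
    # measure the distance to the observed right-image point.
    x = np.array([p_left[0], p_left[1], 1.0])
    y = H @ x
    y = y / y[2]  # back to inhomogeneous pixel coordinates
    return float(np.hypot(y[0] - p_right[0], y[1] - p_right[1]))

def on_plane(H, p_left, p_right, tol=2.0):
    # Confirm a correspondence as lying on the plane when the
    # homography transfer error is below a pixel tolerance.
    return transfer_error(H, p_left, p_right) < tol
```

Occlusions and surface discontinuities produce large transfer errors, which is the signal that stops the region growing.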
We are interested in the semantic interpretation of 3-D data obtained by a stereovision algorithm, in the context of indoor scenes. This paper presents a high-level process built to find the main plane structures of the scene and to label them with a semantic description. This process is made up of three stages: first, the sorting and enhancement of the input data (3-D segments); second, the generation of affine planes from the sorted 3-D segments; and finally, the semantic description of these planes by a classification expert system using the interpreter CLASSIC. This process has been tested on various stereo images, and the quality of the results, in both robustness and accuracy, is quite good. The advantages of using a knowledge-based system are discussed.
This paper addresses the problem of computing the fundamental matrix, which describes the geometric relationship between a pair of stereo images: the epipolar geometry. In the uncalibrated case, the epipolar geometry captures all the 3D information available from the scene. It is of central importance for problems such as 3D reconstruction, self-calibration and feature tracking; hence, the computation of the fundamental matrix is of great interest. Existing classical methods use two steps: a linear step followed by a nonlinear one. However, in some cases the linear step does not yield a closed-form solution for the fundamental matrix, resulting in more iterations for the nonlinear step, which is not guaranteed to converge to the correct solution. In this paper, a novel method based on virtual parallax is proposed. The problem is formulated differently: instead of computing the 3 × 3 fundamental matrix directly, we compute a homography with one epipole position, and show that this is equivalent to computing the fundamental matrix. Simple equations are derived by reducing the number of parameters to estimate. As a consequence, we obtain an accurate fundamental matrix with a stable linear computation. Experiments with simulated and real images validate our method and clearly show the improvement over the classical 8-point method.
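The equivalence exploited here, recovering F from a plane homography H and the epipole e′ in the second image via F = [e′]× H, can be sketched as follows. The values of H and e′ below are placeholders for illustration, not the paper's estimation procedure.

```python
import numpy as np

def skew(e):
    # Cross-product (skew-symmetric) matrix [e]x, so that [e]x v = e x v.
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

def fundamental_from_homography(H, epipole):
    # F = [e']x H : a plane homography together with the epipole in
    # the second image determines the fundamental matrix (rank 2).
    return skew(epipole) @ H
```

For any point on the plane, x2 ∝ H x1, so x2ᵀ F x1 = x2ᵀ [e′]× x2 = 0, which is the epipolar constraint.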
This paper addresses the problem of computing the camera motion and the Euclidean 3D structure of an observed scene from uncalibrated images. Given at least two images with pixel correspondences, the motion of the camera (translation and rotation) and the 3D structure of the scene are calculated simultaneously. We do not assume knowledge of the intrinsic parameters of the camera; however, an approximation of these parameters is required. Such an approximation is always available, either from the camera manufacturer's data or from previous experiments. Classical methods based on the essential matrix are highly sensitive to image noise, and this sensitivity is amplified when the intrinsic parameters of the camera contain errors. To overcome such instability, we propose a method in which a particular choice of 3D Euclidean coordinate system, together with a different parameterization of the motion/structure problem, significantly reduces the total number of unknowns. In addition, the simultaneous calculation of the camera motion and the 3D structure makes the computation less sensitive to errors in the intrinsic parameters of the camera. All steps of our method are linear; however, a final nonlinear optimization step may be added to improve the accuracy of the results and to enforce the orthogonality of the rotation matrix.
Experiments with real images validated our method and showed that good-quality motion and structure can be recovered from a pair of uncalibrated images. Intensive experiments with simulated images have shown the relationship between the errors in the intrinsic parameters and the accuracy of the recovered 3D structure.
At present, manual needle-positioning techniques known as "triangulation" and "keyhole surgery" are implemented during percutaneous nephrolithotomy (PCNL) to gain initial kidney access. These techniques do not ensure correct needle placement inside the kidney, resulting in multiple needle punctures, unnecessary hemorrhage, excessive radiation exposure to all involved and increased surgery time. A cost-effective fluoroscopy-guided needle-positioning system is proposed for aiding urologists in gaining accurate and repeatable kidney calyx access. Guidance is realized by modeling a C-arm fluoroscopic system as an adapted pinhole camera model and utilizing stereovision principles on an image pair. Targeting is realized with the aid of a graphical user interface operated by the surgeon. An average target registration error of 2.5 mm (SD = 0.8 mm) was achieved in a simulated environment. Similar results were achieved in the operating room environment with successful needle access in two in-vitro porcine kidneys.
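Stereovision-guided targeting of this kind rests on triangulating a 3D point from its projections in two views. A generic linear (DLT) triangulation sketch, not the paper's calibrated C-arm model, is given below; `P1` and `P2` are assumed to be 3×4 projection matrices and `u1`, `u2` the matched 2D pixel coordinates.

```python
import numpy as np

def triangulate(P1, P2, u1, u2):
    # Linear (DLT) triangulation: each view contributes two rows of
    # the form u*(p3 . X) - p1 . X = 0; the 3D point is the null
    # vector of the stacked 4x4 system, found by SVD.
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```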
This paper describes and compares three different approaches to simultaneous localization and mapping (SLAM) in dynamic outdoor environments. SLAM has been intensively researched in recent years in the fields of robotics and intelligent vehicles, and many approaches have been proposed, including occupancy grid mapping methods (Bayesian, Dempster-Shafer and fuzzy logic) and localization estimation methods (direct scan matching on edge or point features, probabilistic likelihood, EKF, particle filters). A number of promising approaches and recent developments in this literature are first reviewed. However, SLAM estimation in dynamic outdoor environments remains a difficult task, since the numerous moving objects present may bias the feature selection. We propose a possibilistic SLAM with a RANSAC approach and implement it with three different matching algorithms. Real outdoor experimental results show the effectiveness and efficiency of our approach.
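The role of RANSAC here is outlier rejection: measurements on moving objects should not bias the estimate. A minimal RANSAC sketch on the toy problem of fitting a 2D line illustrates the hypothesize-and-verify loop; the threshold, iteration count and seed are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=0.1, seed=0):
    # Fit y = a*x + b to an (N, 2) point array while rejecting
    # outliers (e.g. features on moving objects).
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        # Hypothesize: a line from a minimal sample of two points.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # skip degenerate (vertical) samples
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Verify: count points within the residual tolerance.
        inliers = np.abs(points[:, 1] - (a * points[:, 0] + b)) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set with least squares.
    a, b = np.polyfit(points[best_inliers, 0], points[best_inliers, 1], 1)
    return a, b, best_inliers
```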
Laser and millimeter-wave radars have shown good performance in measuring relative speed and distance in a highway driving environment. However, the accuracy of these systems decreases in an urban traffic environment, where more confusion occurs due to factors such as parked vehicles, guardrails, poles and motorcycles. A trinocular stereo-based sensing system provides an effective supplement to radar-based road scene analysis with its much wider field of view and more accurate lateral information. This paper presents an efficient solution: a trinocular stereo-based 3D representation of the driving environment that employs the "U-V-disparity" concept. It is used to classify a 3D road scene into relative surface planes and to characterize the features of road pavement surfaces, roadside structures and obstacles. A real-time implementation of the trinocular stereo disparity calculation and the "U-V-disparity" classification algorithm is also presented.
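The "U-V-disparity" representation can be sketched as two histograms accumulated from an integer disparity map: the V-disparity (one histogram row per image row) turns the road plane into a slanted line, while the U-disparity (one histogram per image column) turns upright obstacles into horizontal segments. This is the generic construction, not the paper's real-time implementation.

```python
import numpy as np

def uv_disparity(disp, d_max):
    # disp: (H, W) integer disparity map; values outside [0, d_max]
    # are treated as invalid and skipped.
    h, w = disp.shape
    u_disp = np.zeros((d_max + 1, w), dtype=np.int32)  # per-column histogram
    v_disp = np.zeros((h, d_max + 1), dtype=np.int32)  # per-row histogram
    for v in range(h):
        for u in range(w):
            d = disp[v, u]
            if 0 <= d <= d_max:
                u_disp[d, u] += 1
                v_disp[v, d] += 1
    return u_disp, v_disp
```

Line fitting (e.g. by Hough transform or RANSAC) in the V-disparity image then recovers the road profile, and peaks in the U-disparity image flag obstacles.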
In this paper, we propose a novel method for mobile robot localization and navigation based on multispectral visual odometry (MVO). The proposed approach combines visible and infrared images to localize the mobile robot under different conditions (day, night, indoor and outdoor). The depth image acquired by the Kinect sensor is highly sensitive to infrared luminosity, which makes it of little use for outdoor localization. We therefore propose an efficient solution to this Kinect limitation based on three navigation modes: indoor localization based on RGB/depth images, night localization based on depth/IR images, and outdoor localization using multispectral RGB/IR stereovision. For automatic selection of the appropriate navigation mode, we propose a fuzzy logic controller based on image energies. To overcome the limitations of multimodal visual navigation (MMVN), especially during navigation mode switching, a smooth variable structure filter (SVSF) is implemented to fuse the MVO pose with the wheel odometry (WO) pose based on variable structure theory. The proposed approaches are validated experimentally for trajectory tracking with the Pioneer P3-AT mobile robot.
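A crisp stand-in for the mode-selection idea is sketched below. The energy measure, thresholds and rules are all illustrative assumptions; the paper's actual controller is fuzzy, with membership functions over the image energies rather than hard thresholds.

```python
import numpy as np

def image_energy(img):
    # Mean squared intensity as a crude "energy" cue; the paper's
    # actual fuzzy inputs are an assumption here.
    return float(np.mean(np.asarray(img, dtype=float) ** 2))

def select_mode(rgb_energy, ir_energy, low=0.2, high=0.6):
    # Hypothetical crisp rules standing in for the fuzzy controller:
    # bright RGB and strong IR  -> outdoor RGB/IR stereovision,
    # near-dark RGB             -> night depth/IR,
    # otherwise                 -> indoor RGB/depth.
    if rgb_energy >= high and ir_energy >= high:
        return "outdoor RGB/IR"
    if rgb_energy < low:
        return "night depth/IR"
    return "indoor RGB/depth"
```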
This chapter surveys the contributions of projective geometry to computer vision. Projective geometry deals elegantly with the general case of perspective projection and therefore provides interesting understanding of the geometric aspect of image formation. It also provides useful tools like perspective invariants. First the major definitions and results of this geometry are presented. Applications are then provided for several domains of 3-D computer vision, including location of the viewer for uncalibrated cameras, properties of epipolar lines in stereovision and object recognition.
During the last decade, significant progress has been made towards the goal of using machine vision as an aid to highway driving. This chapter describes a few pieces of representative work which have been done in the area.
The two most important tasks to be performed by an automatic vehicle are road following and collision avoidance. Road following requires the recognition of the road and of the position of the vehicle with respect to the road so that appropriate lateral control commands (steering) can be generated. Collision avoidance requires the detection of obstacles and other vehicles, and the measurement of the distances of these objects to the vehicle.
We first explain the significance of vision-based automatic road vehicle guidance. We then describe the different road models, and contrast the approaches based on model-based lane marker detection with adaptive approaches. We describe in detail the important approach of road following by recursive parameter estimation, which is the basis for the most successful systems. We then address the issue of obstacle detection, first detailing monocular approaches. We finally describe an integrated stereo approach which is beneficial not only for obstacle detection, but also for road following.
A stereovision algorithm is proposed for visual odometry, estimating the motion of a mobile robot by providing a sequence of feature pairs. It is composed of feature extraction, matching and tracking. First, corners are extracted as features by the Harris operator with grid-based optimization. In feature matching and tracking, serious problems are caused by illumination variation between the stereo images. An improved Moravec's normalized cross correlation (MNCC) algorithm is presented to reduce the effect of illumination when computing corner correspondences. On the current stereo image pair, extracted corners are matched by a correlation-based bidirectional algorithm, and outliers are rejected by the epipolar constraint. Matched corners are then tracked in pre-estimated search windows. The computational cost is greatly reduced by limiting the number of corners, pre-estimating the search windows and locally updating features. Simulation results validate that our algorithm is efficient and reliable.
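The bidirectional (left-right consistency) matching step can be sketched on a precomputed score matrix. Here `score_lr[i, j]` is assumed to hold the correlation score (e.g. MNCC) between left corner i and right corner j; a pair is kept only when each feature is the other's best match.

```python
import numpy as np

def bidirectional_matches(score_lr):
    # Left-to-right: best right candidate for each left corner.
    l2r = np.argmax(score_lr, axis=1)
    # Right-to-left: best left candidate for each right corner.
    r2l = np.argmax(score_lr, axis=0)
    # Keep (i, j) only if the two directions agree; ambiguous
    # one-sided matches are discarded as likely outliers.
    return [(i, j) for i, j in enumerate(l2r) if r2l[j] == i]
```

This cheap symmetry check removes many ambiguous matches before the epipolar-constraint rejection stage.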