
  • Article (No Access)

    A Hierarchical Regression Approach for Unconstrained Face Analysis

    Head pose and facial feature detection are important for face analysis. Although many studies report good results in constrained environments, performance can degrade under large variations in facial appearance, pose, illumination, occlusion, expression and make-up. In this paper, we propose a hierarchical regression approach, Dirichlet-tree enhanced random forests (D-RF), for face analysis in unconstrained environments. D-RF introduces a Dirichlet-tree probabilistic model into the regression RF framework in a hierarchical way to achieve efficiency and robustness. To reduce the influence of noise in unconstrained environments, facial patches extracted from the face area are classified as positive or negative, and only positive patches are used for face analysis. The proposed hierarchical D-RF works in two iterative procedures. First, a coarse head pose is estimated to constrain facial feature detection, and the head pose is then updated based on the estimated facial features. Second, the facial feature localization is refined based on the updated head pose. To further improve efficiency and robustness, multiple probabilistic models are learned in the leaves of the D-RF, i.e. the patch classification, the head pose probabilities, the locations of facial points and face deformation models (FDM). Moreover, our algorithm uses a composite weighted voting method, in which each patch extracted from the image can directly cast a vote for the head pose or for each of the facial features. Extensive experiments have been carried out on several publicly available databases. The experimental results demonstrate that the proposed approach is robust and efficient for head pose and facial feature detection.
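
The patch filtering and weighted voting described in this abstract can be illustrated with a minimal sketch. It is not the authors' D-RF: the `PatchVote` type, its fields, and the use of a plain weighted mean over a single yaw angle are assumptions made for illustration, since the abstract does not specify how the composite weights are computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatchVote:
    """One vote cast by an image patch (hypothetical structure)."""
    is_positive: bool  # patch was classified as a facial (positive) patch
    yaw_deg: float     # head yaw predicted by the leaf the patch reached
    weight: float      # vote confidence (assumed; the abstract gives no formula)


def aggregate_votes(votes: List[PatchVote]) -> float:
    """Weighted mean of the yaw votes cast by positive patches only."""
    # Negative patches are discarded, mirroring the positive/negative split.
    kept = [v for v in votes if v.is_positive and v.weight > 0]
    if not kept:
        raise ValueError("no positive patches available to vote")
    total = sum(v.weight for v in kept)
    return sum(v.weight * v.yaw_deg for v in kept) / total


votes = [
    PatchVote(True, 10.0, 1.0),
    PatchVote(True, 20.0, 3.0),
    PatchVote(False, 90.0, 5.0),  # ignored: negative patch
]
print(aggregate_votes(votes))
```

Discarding negative patches before aggregation keeps a confident but off-face patch (the third vote here) from pulling the estimate away from the face.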

  • Article (No Access)

    Visual Focus of Attention and Spontaneous Smile Recognition Based on Continuous Head Pose Estimation by Cascaded Multi-Task Learning

    Multi-person visual focus of attention (M-VFOA) and spontaneous smile (SS) recognition are important for understanding and analyzing people's behavior in class. Recently, promising results have been reported using special hardware in constrained environments. However, M-VFOA and SS recognition remain challenging in natural, crowded classroom environments because of varied poses, occlusion, expressions, illumination and poor image quality. In this study, a robust and non-invasive M-VFOA and SS recognition system is developed based on continuous head pose estimation in the natural classroom. A novel cascaded multi-task Hough forest (CM-HF), combined with weighted Hough voting and multi-task learning, is proposed for continuous head pose estimation, nose-tip localization and SS recognition; it improves recognition accuracy and reduces training time. M-VFOA is then recognized from the estimated head poses, environmental cues and prior states in the natural classroom. Meanwhile, SS is classified using CM-HF with local cascaded mouth-eye areas normalized by the estimated head poses. The method is rigorously evaluated for continuous head pose estimation, multi-person VFOA recognition and SS recognition on several publicly available datasets and real-class video sequences. Experimental results show that our method greatly reduces training time and outperforms state-of-the-art methods in both performance and robustness, with an average accuracy of 83.5% on head pose estimation, 67.8% on M-VFOA recognition and 97.1% on SS recognition in challenging environments.

  • Article (Free Access)

    Head Pose Estimation Based on Multi-Level Feature Fusion

    Head Pose Estimation (HPE) has a wide range of applications in computer vision, but still faces challenges: (1) Existing studies commonly use Euler angles or quaternions as pose labels, which may lead to discontinuity problems. (2) HPE does not effectively address regression via rotation matrices. (3) Recognition rates are low in complex scenes, and computational requirements are high. This paper presents an improved unconstrained HPE model to address these challenges. First, a rotation matrix form is introduced to solve the problem of ambiguous rotation labels. Second, a continuous 6D rotation matrix representation is used for efficient and robust direct regression. The lightweight RepVGG-A2 framework is used for feature extraction, and a multi-level feature fusion module and a coordinate attention mechanism with residual connection are added to improve the network's ability to perceive contextual information and attend to features. The model's accuracy is further improved by replacing the network activation function and improving the loss function. On the BIWI dataset with a 7:3 split between training and test sets, the mean absolute error of HPE for the proposed network model is 2.41. When trained on the 300W_LP dataset and tested on the AFLW2000 and BIWI datasets, the mean absolute errors of HPE for the proposed network model are 4.34 and 3.93, respectively. The experimental results demonstrate that the improved network has better HPE performance.
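
The abstract does not say how the 6D output is turned into a rotation matrix. A common construction, assumed here, treats the six numbers as two 3-vectors and orthonormalizes them with Gram-Schmidt, taking the third axis as their cross product. The sketch below uses only the standard library, and the function name `rotation_from_6d` is hypothetical.

```python
import math
from typing import List, Sequence

Vec = List[float]


def _normalize(v: Vec) -> Vec:
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]


def _dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def _cross(a: Vec, b: Vec) -> Vec:
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(x: Sequence[float]) -> List[Vec]:
    """Map a 6D vector to a 3x3 rotation matrix (rows of the result).

    Assumed convention: x[:3] and x[3:6] are the first two columns before
    orthonormalization; the third column is their cross product.
    """
    if len(x) != 6:
        raise ValueError("expected exactly 6 values")
    a1, a2 = list(x[:3]), list(x[3:6])
    b1 = _normalize(a1)
    d = _dot(b1, a2)
    # Remove the component of a2 along b1, then normalize (Gram-Schmidt).
    b2 = _normalize([a2[i] - d * b1[i] for i in range(3)])
    b3 = _cross(b1, b2)
    # Assemble the matrix with b1, b2, b3 as columns.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


R = rotation_from_6d([2.0, 0.0, 0.0, 1.0, 3.0, 0.0])
for row in R:
    print(row)
```

Because any two non-parallel 3-vectors map to a valid rotation, a network can regress the six values freely, without the wrap-around that affects Euler angle labels.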

  • Article (No Access)

    Learning a Deep Regression Forest for Head Pose Estimation from a Single Depth Image

    Robust head pose estimation significantly improves the performance of face analysis applications in Cyber-Physical Systems (CPS), such as driving assistance and expression recognition. However, two main challenges remain: large pose variations and the inhomogeneity of the facial feature space. Large pose variations make distinctive facial features, such as the nose or lips, invisible, especially in extreme cases. Additionally, features extracted from a head do not change in a stationary manner with respect to the head pose, which results in an inhomogeneous feature space. To deal with these problems, we propose an end-to-end framework to estimate the head pose from a single depth image. Specifically, the PointNet network is adopted to automatically select distinctive facial feature points from the visible surface of a head and to extract discriminative features. The Deep Regression Forest is used to handle the nonstationary property of the facial feature space and to learn the head pose distributions. Experimental results show that our proposed method achieves state-of-the-art performance on the Biwi Kinect Head Pose Dataset, the Pandora Dataset and the ICT-3DHP Dataset.

  • Article (No Access)

    SCALE ROBUST HEAD POSE ESTIMATION BASED ON RELATIVE HOMOGRAPHY TRANSFORMATION

    Head pose estimation has been widely studied in recent decades because of its many significant applications. Unlike most current methods, which use face models to estimate head position, we develop an algorithm based on relative homography transformation that is robust to large changes in the scale of the head. In the proposed method, salient Harris corners are detected on a face, and local binary pattern features are extracted around each corner. The relative homography transformation is then calculated using the RANSAC optimization algorithm, which applies a homography to a region of interest (ROI) on an image and calculates the transformation of a planar object moving in the scene relative to a virtual camera. In this way, the face center initialized in the first frame is tracked frame by frame. Meanwhile, a Chamfer matching method based on a head-shoulder model is proposed to estimate the head centroid. The head pose is estimated from the face center and the detected head centroid. The experiments show the effectiveness and robustness of the proposed algorithm.
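
As a minimal illustration of the step that applies a homography to an ROI, the sketch below maps ROI corner points through a given 3x3 matrix using homogeneous coordinates. It does not cover corner detection, feature matching or the RANSAC estimation itself; the matrix `H` and the function names are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = Sequence[Sequence[float]]


def apply_homography(H: Matrix, pt: Point) -> Point:
    """Map one 2D point through a 3x3 homography."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return from homogeneous to image coordinates.
    return (u / w, v / w)


def warp_roi(H: Matrix, corners: Sequence[Point]) -> List[Point]:
    """Map every corner of an ROI polygon through H."""
    return [apply_homography(H, p) for p in corners]


# Hypothetical homography: uniform scale by 2 plus a shift of (10, 5).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 5.0],
     [0.0, 0.0, 1.0]]
roi = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(warp_roi(H, roi))
```

A scale change of the head between frames appears in such a matrix as a change in the upper-left 2x2 block, which is one reason a homography can absorb scale variation that a pure translation tracker cannot.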

  • Chapter (No Access)

    STEREO CAMERA BASED HEAD ORIENTATION ESTIMATION FOR REAL-TIME SYSTEM

    This paper presents a new system for estimating human head pose in an interactive indoor environment with dynamic illumination changes and a large working space. The main idea is a new morphological feature for estimating the head angle from a stereo disparity map. When a disparity map is obtained from a stereo camera, a matching confidence value can be derived by measuring the correlation of the stereo images. Applying a threshold to the confidence value yields the morphological shape of the disparity map. The head pose is then estimated by analyzing this morphological property. The algorithm is simple and fast in comparison with algorithms that apply facial templates, 2D or 3D models, or optical flow. Our system can automatically segment the head and estimate its pose over a wide range of head motion without the manual initialization required by optical flow systems. In our experiments, we obtained reliable head orientation data in real time.
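
The confidence thresholding step can be sketched as follows. This is an illustrative assumption about the data layout (row-major lists, with `None` marking rejected pixels), not the paper's implementation, and it stops before the morphological analysis, which the abstract does not detail.

```python
from typing import List, Optional

Grid = List[List[float]]


def mask_disparity(disparity: Grid, confidence: Grid,
                   threshold: float) -> List[List[Optional[float]]]:
    """Keep disparity values whose matching confidence meets the threshold."""
    if len(disparity) != len(confidence):
        raise ValueError("disparity and confidence must have the same shape")
    masked: List[List[Optional[float]]] = []
    for d_row, c_row in zip(disparity, confidence):
        if len(d_row) != len(c_row):
            raise ValueError("disparity and confidence must have the same shape")
        # Pixels with low matching confidence are rejected (set to None).
        masked.append([d if c >= threshold else None
                       for d, c in zip(d_row, c_row)])
    return masked


disparity = [[5.0, 6.0], [7.0, 8.0]]
confidence = [[0.9, 0.2], [0.5, 0.1]]
print(mask_disparity(disparity, confidence, threshold=0.5))
```

The shape formed by the surviving pixels is what a later stage would analyze for head orientation.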