In recent years, research on eye-tracking development and applications has attracted much attention, and interacting with a computer using gaze information alone is becoming increasingly feasible. Efforts in eye tracking cover a broad spectrum of fields, and mathematical modeling of the system is an important aspect of this research. Expressions relating the elements and variables of the gaze tracker make it possible to establish geometric relations and to identify symmetrical behaviors of the human eye when looking at a screen. To this end, a deep knowledge of projective geometry as well as eye physiology and kinematics is essential. This paper presents a model for a bright-pupil eye tracker fully based on realistic parameters describing the system elements. The resulting model is superior to models obtained from generic linear or quadratic expressions. Moreover, knowledge of the model's symmetry leads to simpler and more effective calibration strategies, requiring just two calibration points to fit the optical axis and only three points to adjust the visual axis. This considerably reduces the time spent by systems employing more calibration points and makes the model more attractive.
In this paper, an improved Mean-shift algorithm is integrated with the standard tracking–learning–detection (TLD) tracker to improve the tracking performance of the standard TLD model and to enhance its robustness to occlusion and to similar targets. The target region obtained by the improved Mean-shift algorithm and the target region obtained by the TLD tracker are fused to achieve favorable tracking results. The optimized TLD tracking system is then applied to human eye tracking. In the tests, the model adapts to partial occlusions such as eyeglasses, closed eyes, and hand occlusion, while the roll angle can approach 90°, the yaw angle 45°, and the pitch angle 60°. In addition, the model never mistakenly transfers the tracking region to the other eye (a similar target on the same face) during long-term tracking. Experimental results indicate that (1) the optimized TLD model shows sound tracking stability even when targets are partially occluded or rotated, and (2) its tracking speed and accuracy are superior to those of the standard TLD and several mainstream tracking methods. In summary, the optimized TLD model shows higher robustness and stability and better meets complex eye-tracking requirements.
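The abstract does not specify how the two trackers' outputs are combined, so the sketch below is only a minimal illustration, not the authors' implementation. It runs OpenCV's standard Mean-shift (on a hue back-projection of the eye region) alongside the contrib TLD tracker and fuses the two boxes with a hypothetical weighted average (`fuse_boxes`); the video path and the interactive selection of the initial eye region are placeholders, and opencv-contrib-python is assumed for the TLD tracker.

```python
# Minimal sketch (not the authors' implementation): fuse a Mean-shift region
# with a TLD region computed on the same frames.
import cv2

def init_meanshift(frame, bbox):
    """Build an HSV hue histogram of the initial eye region for back-projection."""
    x, y, w, h = bbox
    roi = frame[y:y + h, x:x + w]
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist

def fuse_boxes(box_ms, box_tld, alpha=0.5):
    """Hypothetical fusion rule: element-wise weighted average of the two boxes."""
    return tuple(int(alpha * a + (1 - alpha) * b) for a, b in zip(box_ms, box_tld))

cap = cv2.VideoCapture("eye_video.avi")            # illustrative input video
ok, frame = cap.read()
bbox = cv2.selectROI("init", frame, False)          # initial eye region (x, y, w, h)
hist = init_meanshift(frame, bbox)

tld = cv2.legacy.TrackerTLD_create()                # standard TLD tracker
tld.init(frame, bbox)

term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
ms_window = bbox
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    _, ms_window = cv2.meanShift(back_proj, ms_window, term_crit)
    ok_tld, tld_box = tld.update(frame)
    # Fuse the two regions when TLD succeeds; fall back to Mean-shift otherwise.
    target = fuse_boxes(ms_window, tuple(map(int, tld_box))) if ok_tld else ms_window
    x, y, w, h = target
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("fused eye tracking", frame)
    if cv2.waitKey(1) == 27:
        break
```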
A new method based on eye-tracking data, visual momentum (VM), was introduced to quantitatively evaluate a dynamic interactive visualization interface. We extracted dimensionless factors from the raw eye-tracking data, including the fixation time factor T, the saccade amplitude factor D, and the fixation number factor N. A predictive regression model of VM was derived from these eye-movement factors and the performance response time (RT). In Experiment 1, the experimental visualization materials were designed with six effectiveness levels according to the design techniques proposed by Woods to improve VM (total replacement, fixed-format data replacement, long shot, perceptual landmark, and spatial representation) and were tested in six parallel subject groups. The coefficients of the regression model were calculated from the data of 42 valid subjects in Experiment 1. The mean VM of each group increased as more design techniques were applied. Performance and eye-tracking data differed significantly among the combined high-, middle-, and low-VM groups. The data analysis indicates that the results were consistent with previous qualitative research on VM. We tested and verified the regression model in Experiment 2 with another dynamic interactive visualization. The results indicated that the VM calculated by the regression model was significantly correlated with the performance data. Therefore, the derived parameter VM can serve as a quantitative indicator for evaluating dynamic visualization, and it could be a useful evaluation method for dynamic visualizations in general working environments.
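The abstract gives the VM factors (T, D, N) and RT but not the functional form or the coefficients of the regression; the sketch below assumes a simple multiple linear regression fitted with scikit-learn, and every numeric value is an illustrative placeholder rather than data from the study.

```python
# Minimal sketch (assumed linear form; the paper's actual model and
# coefficients are not given in the abstract): fit VM from the dimensionless
# eye-movement factors T, D, N and the response time RT.
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative per-subject factors: [T (fixation time), D (saccade amplitude),
# N (fixation number), RT (response time)] and hypothetical criterion VM values.
X = np.array([
    [0.42, 0.31, 0.55, 0.48],
    [0.61, 0.27, 0.40, 0.35],
    [0.38, 0.45, 0.62, 0.57],
    [0.70, 0.22, 0.33, 0.30],
])
vm = np.array([0.52, 0.68, 0.44, 0.75])

model = LinearRegression().fit(X, vm)
print("coefficients:", model.coef_, "intercept:", model.intercept_)

# Predict VM for a new interface/subject from its eye-movement factors.
print("predicted VM:", model.predict([[0.50, 0.30, 0.45, 0.40]]))
```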
Traditional dashboard information extraction techniques are easily affected by external factors and lack robustness. To improve the safety of a pilot's performance on the dashboard, this paper proposes an eye-tracking-based method for extracting dashboard feature information, which first acquires the gaze point on a simulated dashboard. It then uses Mask R-CNN to detect the gazed region and extract the target feature information. Finally, it fuses the two sets of data to determine the target region the pilot is gazing at in the scene. Experimental results show that the proposed dashboard information extraction method achieves better accuracy.
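The dashboard-specific training data and fusion rule are not detailed in the abstract; the sketch below only illustrates the general idea with a generic pretrained Mask R-CNN from torchvision and a simple "gaze point inside a detected instance mask" check. The function name, file path, and gaze coordinates are hypothetical.

```python
# Minimal sketch (generic pretrained Mask R-CNN and a simple gaze-in-mask
# fusion rule; not the paper's trained dashboard model).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def gazed_instrument(image_path, gaze_xy, score_thresh=0.7):
    """Return the detection (label, score, box) whose mask contains the gaze point."""
    img = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]
    gx, gy = int(gaze_xy[0]), int(gaze_xy[1])
    for box, label, score, mask in zip(out["boxes"], out["labels"],
                                       out["scores"], out["masks"]):
        if score < score_thresh:
            continue
        if mask[0, gy, gx] > 0.5:          # gaze point falls inside this instance
            return int(label), float(score), box.tolist()
    return None

# Example call with an illustrative frame and gaze coordinates (pixels).
print(gazed_instrument("dashboard_frame.png", gaze_xy=(412, 230)))
```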
Investigating realistic visual exploration is quite challenging in sport climbing, but it promises a deeper understanding of how performers adjust their perception-action couplings during task completion. However, the number of participants and trials analyzed in such experiments is often reduced to a minimum because of the time-consuming processing of the eye-tracking data. Notably, mapping successive points of gaze from local views to the global scene is generally performed manually by watching the eye-tracking video data frame by frame. This manual procedure is not suitable for processing large numbers of datasets. Consequently, this study developed an automatic method for global point-of-gaze localization in indoor sport climbing. Specifically, an eye-tracking device was used to acquire local image frames and points of gaze from a climber's local views. Artificial landmarks, designed as four-color-disk groups, were distributed on the wall to facilitate localization. Global points of gaze were computed based on planar homography transforms between the local and global positions of the detected landmarks. Thirty climbing trials were recorded and processed by the proposed method. The success rates (mean ± SD) were up to 85.72% ± 13.90%, and the errors (mean ± SD) were up to 0.1302 ± 0.2051 m. The proposed method will be employed for computing global points of gaze in our current climbing dataset to understand the dynamic intertwining of gaze and motor behaviors during climbs.
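The homography step can be illustrated directly with OpenCV; the sketch below assumes the landmark disks have already been detected in the local eye-tracker frame and that their global positions on the wall are known, then maps a local point of gaze into global coordinates. All coordinates are illustrative placeholders.

```python
# Minimal sketch of the planar-homography step (landmark detection omitted):
# map a point of gaze from the local eye-tracker frame into global wall
# coordinates.
import cv2
import numpy as np

# Corresponding landmark centers detected in the local frame (pixels) and
# their known positions on the global climbing-wall map (meters).
local_pts = np.array([[120, 80], [560, 95], [540, 410], [130, 400]], dtype=np.float32)
global_pts = np.array([[1.0, 3.5], [2.5, 3.5], [2.5, 2.0], [1.0, 2.0]], dtype=np.float32)

H, _ = cv2.findHomography(local_pts, global_pts, cv2.RANSAC)

# Local point of gaze from the eye tracker (pixels) -> global wall coordinates.
gaze_local = np.array([[[330.0, 250.0]]], dtype=np.float32)   # shape (1, 1, 2)
gaze_global = cv2.perspectiveTransform(gaze_local, H)
print("global point of gaze (m):", gaze_global.ravel())
```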
Attention-deficit/hyperactivity disorder (ADHD) is a common neurodevelopmental disorder in children and adolescents. Traditional diagnosis methods of ADHD focus on observed behavior and reported symptoms, which may lead to a misdiagnosis. Studies have focused on computer-aided systems to improve the objectivity and accuracy of ADHD diagnosis by utilizing psychophysiological data measured from devices such as EEG and MRI. Despite their performance, their low accessibility has prevented their widespread adoption. We propose a novel ADHD prediction method based on the pupil size dynamics measured using eye tracking. Such data typically contain missing values owing to anomalies including blinking or outliers, which negatively impact the classification. We therefore applied an end-to-end deep learning model designed to impute the dynamic pupil size data and predict ADHD simultaneously. We used the recorded dataset of an experiment involving 28 children with ADHD and 22 children as a control group. Each subject conducted an eight-second visuospatial working memory task 160 times. We treated each trial as an independent data sample. The proposed model effectively imputes missing values and outperforms other models in predicting ADHD (AUC of 0.863). Thus, given its high accessibility and low cost, the proposed approach is promising for objective ADHD diagnosis.
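The abstract does not describe the network architecture, so the sketch below is only a minimal illustration of the general idea: a recurrent model with one head that reconstructs missing pupil-size samples and another that outputs an ADHD logit, trained with a joint loss. The dimensions, sampling rate, losses, and data are all assumptions.

```python
# Minimal sketch (architecture and losses are assumptions; not the paper's
# exact end-to-end model): jointly impute missing pupil-size samples and
# classify ADHD from one 8-second trial.
import torch
import torch.nn as nn

class ImputeAndClassify(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Input per time step: [pupil_size (0 where missing), missing_mask]
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.impute_head = nn.Linear(hidden, 1)   # reconstruct pupil size
        self.cls_head = nn.Linear(hidden, 1)      # ADHD logit

    def forward(self, pupil, mask):
        x = torch.stack([pupil * mask, mask], dim=-1)      # (B, T, 2)
        h, _ = self.rnn(x)                                  # (B, T, H)
        imputed = self.impute_head(h).squeeze(-1)           # (B, T)
        logit = self.cls_head(h[:, -1]).squeeze(-1)         # (B,)
        return imputed, logit

# Joint training step on a toy batch (values are synthetic).
B, T = 8, 480                                   # e.g., 8 s at 60 Hz
pupil = torch.rand(B, T)
mask = (torch.rand(B, T) > 0.1).float()         # 1 = observed, 0 = missing (blink)
label = torch.randint(0, 2, (B,)).float()

model = ImputeAndClassify()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
imputed, logit = model(pupil, mask)
loss = (nn.functional.mse_loss(imputed * mask, pupil * mask)     # fit observed samples
        + nn.functional.binary_cross_entropy_with_logits(logit, label))
loss.backward()
opt.step()
print("joint loss:", float(loss))
```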
This paper proposes an improved eye-tracking method based on natural semantic information, combining Needleman–Wunsch and SubsMatch techniques for psychological assessment. The natural semantic information of the self-assessment scale used in psychological measurement was combined with the participants' eye-tracking data, and a time-space similarity method was proposed to calculate the eye-tracking differences between participants during questionnaire response and to carry out visual pattern recognition for screening the target population in psychological measurement. The proposed method was evaluated by screening a sample at high risk of depression. The comparative results showed that the average screening accuracy under the stimulation of a single item was 80.13%, which increased to 97.37% after dimensionality reduction. This paper verified the objectivity and effectiveness of the semantic-information-based eye-tracking time-space similarity method and showed it to be a promising objective tool to assist psychological measurement.
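Of the two named techniques, Needleman–Wunsch is easy to illustrate on scanpaths encoded as strings of area-of-interest (AOI) labels; the sketch below computes a global alignment score between two such sequences. The match/mismatch/gap scores are assumptions, not the paper's settings.

```python
# Minimal sketch: Needleman-Wunsch global alignment score between two
# scanpaths encoded as AOI-label strings (scoring values are assumptions).
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[n][m]

# Two participants' scanpaths over questionnaire AOIs (letters are AOI labels).
print(needleman_wunsch("ABBCD", "ABCCD"))   # higher score = more similar gaze sequences
```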
This study used the recording of eye-tracking-based target domain (TD) fixations as the primary approach to explore the correlation between occupying fixation and sight interpretation (SI) performance. We recorded gaze plots and gaze durations during sight interpretation and analyzed their correlation with interpretation performance. First, we designed a nine-point tracking calibration for a noninvasive study. Second, we carried out pre-experiments to determine the best experimental conditions. Finally, after eye rest, we performed the formal test. Extensive experiments were conducted to examine the factors that affect SI performance, including the number of TD occupying fixations, the time cost of TD occupying fixations, and the concentration of TD occupying fixations. Statistical analysis of the experiments concluded that the psychological dictionary (mental lexicon), long-term memory (LTM) information, and bilingual conversion skills are the main factors affecting the number and duration of eye-tracking TD occupying fixations.
In this paper, we investigated parameters related to the eye-movement patterns of individuals while viewing images of natural and man-made scenes. These parameters are the number of fixations and saccades, fixation duration, saccade amplitude, and the distribution of fixation locations. We explored the way in which individuals look at images of different semantic categories and used this information for automatic image classification. We showed that observers' eye movements and the contents of their fixation locations differ for images of different semantic categories, and these differences were used effectively in automatic image categorization. Another goal of this study was to answer the question of whether the image patches around fixation points carry sufficient information for image categorization. To this end, patches of different sizes were extracted from two different image categories. These patches, selected at the locations of eye fixation points, were used to form a feature vector based on the K-means clustering algorithm. Different statistical classifiers were then trained for categorization. The results showed that it is possible to predict the image category using the feature vectors derived from the image patches. We found significant differences in eye-movement parameters between the two image categories (averaged across subjects) and could categorize images using these parameters as features. The results also confirmed that the image category can be predicted from the patches around the subjects' fixation points.
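The patch-based pipeline described above resembles a bag-of-visual-words approach; the sketch below extracts fixed-size patches at fixation points, clusters them with K-means into a visual vocabulary, represents each image as a histogram over that vocabulary, and trains a classifier. Patch size, vocabulary size, the SVM classifier, and all data are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the fixation-patch pipeline (patch size, vocabulary size,
# and classifier are assumptions).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def extract_patches(image, fixations, size=16):
    """Crop a (size x size) grayscale patch around each fixation point."""
    half = size // 2
    patches = []
    for x, y in fixations:
        patch = image[y - half:y + half, x - half:x + half]
        if patch.shape == (size, size):
            patches.append(patch.ravel())
    return np.array(patches)

def bow_histogram(patches, kmeans):
    """Represent one image as a normalized histogram over the visual vocabulary."""
    words = kmeans.predict(patches)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# images, fixation_lists, labels: illustrative placeholders
# (0 = natural scene, 1 = man-made scene).
rng = np.random.default_rng(0)
images = [rng.integers(0, 255, (240, 320)) for _ in range(6)]
fixation_lists = [[(60, 80), (150, 120), (200, 180)] for _ in images]
labels = np.array([0, 0, 0, 1, 1, 1])

all_patches = np.vstack([extract_patches(im, fx)
                         for im, fx in zip(images, fixation_lists)])
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(all_patches)

X = np.array([bow_histogram(extract_patches(im, fx), kmeans)
              for im, fx in zip(images, fixation_lists)])
clf = SVC(kernel="linear").fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```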
Ontology visualization plays an important role in human-data interaction by offering clarity and insight for complex structured datasets. Recent usability studies of ontology visualization techniques have added to our understanding of the features desired when assisting users in the interactive process. However, user behavioral data such as eye gaze and event logs have largely been used as indirect evidence to explain why a user may have carried out certain tasks in a controlled environment, as opposed to direct input that informs the underlying visualization system. Although findings from usability studies have contributed to the refinement of ontology visualizations as a whole, the visualization techniques themselves remain a one-size-fits-all approach, where all users are presented with the same visualizations and interactive features. By contrast, this paper investigates the feasibility of using behavioral data, such as user gaze and event logs, as real-time indicators of how appropriate or effective a given visualization may be for a specific user at a moment in time, which in turn may be used to adapt the visualization to the user on the fly. To this end, we apply established predictive modeling techniques from machine learning to predict user success from gaze data and event logs. We present a detailed analysis of a controlled experiment and demonstrate that such predictions are not only feasible but can also be significantly better than a baseline classifier during visualization usage. These predictions can then be used to drive the adaptation of visualization systems, providing ad hoc visualizations on a per-user basis, which in turn may increase individual user success and performance. Furthermore, we examine the prediction performance of several different feature sets and report results from several notable classifiers, of which a decision-tree-based learning model using a boosting algorithm produced the best overall results.
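As an illustration of the modeling setup, the sketch below trains a boosted decision-tree classifier on aggregate gaze and event-log features to predict task success and compares it against a majority-class baseline; the feature set, data, and hyperparameters are assumptions, not those of the study.

```python
# Minimal sketch (features and hyperparameters are assumptions): a boosted
# decision-tree model predicting user success from gaze and event-log
# features, compared against a majority-class baseline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Illustrative per-task features: mean fixation duration, fixation count,
# mean saccade amplitude, number of interaction events, time on task.
X = rng.random((120, 5))
y = rng.integers(0, 2, 120)        # 1 = task completed successfully

boosted = GradientBoostingClassifier(n_estimators=100, max_depth=3)
baseline = DummyClassifier(strategy="most_frequent")

print("boosted trees:", cross_val_score(boosted, X, y, cv=5).mean())
print("baseline:     ", cross_val_score(baseline, X, y, cv=5).mean())
```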
In many applications of human–computer interaction, a prediction of the human's next intended action is highly valuable. To control the direction and orientation of the body when walking towards a goal, a walking person relies on visual input obtained through eye and head movements. Analyzing these parameters might allow us to infer the intended goal of the walker. However, such a prediction of human locomotion intentions is a challenging task, since the interactions between these parameters are nonlinear and highly dynamic. We employed machine learning models to investigate whether walk and gaze data can be used for locomotor prediction. We collected training data for the models in a virtual reality experiment in which 18 participants walked freely through a virtual environment while performing various tasks (walking in a curve, avoiding obstacles, and searching for a target). The recorded position, orientation, and eye-tracking data were used to train an LSTM model to predict the future position of the walker on two different time scales: short-term predictions of 50 ms and long-term predictions of 2.5 s. The trained LSTM model predicted free walking paths with a mean error of 5.14 mm for the short-term prediction and 65.73 cm for the long-term prediction. We then investigated how much the different features (direction and orientation of the head and body, and direction of gaze) contributed to the prediction quality. For short-term predictions, position was the most important feature, while orientation and gaze did not provide a substantial benefit. For long-term predictions, gaze and the orientation of the head and body provided significant contributions. Gaze offered the greatest predictive utility in situations in which participants were walking short distances or changing their walking speed.
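The paper's exact network configuration is not given in the abstract; the sketch below is a minimal PyTorch LSTM that maps a short window of position, orientation, and gaze features to the walker's future 2D position. The feature layout, window length, and network size are assumptions.

```python
# Minimal sketch (input features, window length, and network size are
# assumptions): an LSTM mapping a window of walk-and-gaze features to the
# walker's future position.
import torch
import torch.nn as nn

class WalkPredictor(nn.Module):
    def __init__(self, n_features=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)          # future (x, y) position

    def forward(self, x):                          # x: (B, T, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])               # predict from last hidden state

# Toy training step: features = position (2) + body/head orientation (4) + gaze (2).
B, T = 16, 30                                      # e.g., 30 samples of history
x = torch.randn(B, T, 8)
y = torch.randn(B, 2)                              # position 50 ms (or 2.5 s) ahead

model = WalkPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
print("MSE:", float(loss))
```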
Robust and accurate eye gaze tracking can advance medical telerobotics by providing complementary data for surgical training, interactive instrument control, and augmented human–robot interaction. However, current gaze tracking solutions for systems such as the da Vinci Surgical System (dVSS) require complex hardware installations. Additionally, existing methods do not account for operator head movement inside the surgeon console, which invalidates the original calibration. This work provides an initial solution to these challenges that can integrate seamlessly into console devices beyond the dVSS. Our approach relies on simple and unobtrusive wearable eye-tracking glasses and provides calibration routines that can contend with operator head movements. An external camera measures movement of the glasses through trackers mounted on them in order to detect when head movement or slippage has invalidated the prior calibration. Movements beyond a threshold of 5 cm or 9° prompt another calibration sequence. In a study where users moved freely in the surgeon console after an initial calibration procedure, we show that our system tracks the eye-tracking glasses and initiates recalibration procedures. Recalibration can reduce the mean tracking error by up to 89% compared with the prevailing approach, which relies on the initial calibration only. This work is an important first step towards incorporating user movement into gaze-based applications for the dVSS.
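The recalibration trigger can be sketched directly from the stated thresholds (5 cm translation, 9° rotation); the snippet below assumes the external camera provides the glasses pose as a translation vector and rotation matrix, and recovers the relative rotation angle from the trace of the relative rotation. The pose representation is an assumption; only the thresholds come from the abstract.

```python
# Minimal sketch of the recalibration trigger (pose representation is an
# assumption; thresholds follow the abstract: 5 cm, 9 degrees).
import numpy as np

def needs_recalibration(t_cal, R_cal, t_now, R_now,
                        trans_thresh_m=0.05, rot_thresh_deg=9.0):
    """Compare the glasses pose now to the pose at calibration time."""
    translation = np.linalg.norm(t_now - t_cal)
    R_rel = R_now @ R_cal.T
    # Rotation angle of the relative rotation matrix, recovered from its trace.
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rotation_deg = np.degrees(np.arccos(cos_angle))
    return translation > trans_thresh_m or rotation_deg > rot_thresh_deg

# Example: glasses moved 6 cm along x with no rotation -> recalibrate.
I = np.eye(3)
print(needs_recalibration(np.zeros(3), I, np.array([0.06, 0.0, 0.0]), I))  # True
```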
Many previous eye-tracking studies have examined how adult readers process different written languages, but relatively few have observed the reading process of Arab children. This study investigated the influence of orthographic regularity on Saudi elementary-grade students' English and Arabic word recognition. The eye movements of 15 grade-four students and 15 grade-six students were recorded while they read words differing in frequency and regularity. Analysis of the visual information from the word-recognition process shows differences in the students' eye movements between the two languages. There were statistically significant differences in total fixation duration and fixation count between the two languages and between the two groups. All the students showed longer processing times for English sentences than for Arabic ones. However, Arabic-speaking students were influenced by English orthography, showing greater processing difficulty for English irregular words. The visual information shows that more cross-linguistic differences are found in the grade-four students' results. Grade-four students transferred their first-language (L1) reading strategies to read English words; however, Arabic reading methods cannot be effectively applied to irregular orthographies like English. This explains the increased eye-movement measures of grade-four students compared to grade-six students, who fixated more on unfamiliar English words. Although orthographic regularity had a major effect on the word-recognition process in this study, the development of the students' Arabic and English orthographic knowledge affected the progress of their visual word recognition across the two grade levels.
Crime scene investigation is one of the most important steps in the investigation and trial process. The crime scene can provide many clues, such as physical evidence and modus operandi, which help an investigator find links to the suspect. Crime scene investigators' ability to collect evidence and assess crime scenes can help them solve crimes efficiently and accurately. There are currently few studies on investigative thinking and decision making during crime scene investigation. The aim of this research is to compare the investigative thinking of experts and novices through interviews and eye-tracking data.
In this study, we used an omnidirectional camera to scan a crime scene in order to construct a virtual reality (VR) scenario. Forty-eight participants were recruited from Central Police University and other forensic departments and categorized as experts or trained novices based on their practical experience. The expert and novice investigators wore a VR headset with eye trackers, and their investigation strategies in simulated crime scenes were compared using eye-tracking data in the form of gaze plots, heat maps, and hotspot gazes. The participants were interviewed about their investigative and logical thinking during the simulated crime scene investigation.
The results show that the experts searched for evidence more efficiently than the novices because of their prior experience, while making deductions from the crime scenes more conservatively. Finally, we suggest that a VR system with eye tracking could be a useful tool for training investigators in assessing crime scenes.
This study extended the project "The Study on Rotational Motion Perception and Eye Tracking in Motion Forms". Using eye-tracking equipment and techniques, visual movement and changes in gaze during perception were investigated, and the relationship between motion-illusion perception and eye movement was analyzed. Related studies on the motion perception of a rotating-column illusion in kinetic art are classified into "form change" and "continuous line graphics on the form". However, because previous work on surface graphics focused on continuous line graphics, it did not extend to discontinuous line graphics. This study therefore focused on discontinuous line graphics and combined eye tracking with vision tracking. Taking the best geometric form for inducing motion-illusion perception as the standard, different types of discontinuous line graphics were observed and compared to identify the differences.
The golden ratio (GR) is an irrational number (close to 1.618) that occurs repeatedly in nature as well as in masterpieces of art. The GR has been considered a proportion perfectly representing beauty since ancient times, and it has been investigated in several scientific fields, but with conflicting results. This study aims to investigate whether this proportion is associated with a judgment of beauty independently of the type of stimulus, and which factors may affect this aesthetic preference. In Experiment 1, an online psychophysical questionnaire was administered to 256 volunteers who were asked to choose among three possible proportions between the parts of the same stimulus (GR, 1.5, and 1.8). In Experiment 2, we recorded the eye movements of 15 participants who had to express an aesthetic judgment on the same stimuli as in Experiment 1. The results revealed a slight overall preference for the GR (53%, p < 0.001), with higher preferences for stimuli representing humans, anthropomorphic sculptures, and paintings, regardless of cultural level. In Experiment 2, a shorter dwell time was significantly associated with a better aesthetic judgment (p = 0.005), suggesting that the GR could be associated with easier visual processing and could hence be considered a visual affordance.