Detecting worker fatigue is an important part of construction site monitoring. Exhaustion among construction workers leads to tiredness and drowsiness, and predicting mental exhaustion has become critical as job demands have increased over the years. Accurate fatigue detection is also important for analyzing the stress level of work on construction sites. However, recording worker activities and detecting fatigue remains demanding for site supervisors. In recent years, there has been an increasing trend towards vision-based action recognition, and the identification of worker activities in far-field surveillance video has received considerable attention. Hence, this research proposes a hierarchical statistical approach for detecting worker activity in far-field security footage. In this paper, an extended equilibrium optimization with capsule autoencoder network (EECAEN) is proposed for fatigue detection. Video data is collected for monitoring. First, the video is converted into frames, and the frames are pre-processed using normalization. Statistical features are then extracted from the normalized frames to improve prediction performance; these features help identify symptoms of fatigue, and workers showing those symptoms are identified from the extracted features. Feature selection is performed using a hybrid of the Battle Royale Optimization and Particle Swarm Optimization algorithms (hybrid BRO-PSO). Finally, based on the selected features, fatigue classification is performed using EECAEN. The performance of fatigue detection is analyzed on a real-time video dataset and compared with existing techniques, namely Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN).
The proposed method outperforms the existing approaches in terms of F1-score, accuracy, recall, precision, specificity, and sensitivity.
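To make the pre-processing stage concrete, the following is a minimal sketch of frame normalization followed by statistical feature extraction. It is an illustration under stated assumptions, not the implementation used in this work: the abstract does not specify the normalization scheme or the feature set, so min-max scaling and the four moments below (mean, variance, skewness, kurtosis) are assumed choices, and the names `normalize_frame`, `statistical_features`, and `extract_features` are hypothetical. Frames are assumed to be non-empty grayscale images given as nested lists of pixel intensities.

```python
import math
from typing import Dict, List

# Assumption: a frame is a non-empty grayscale image, stored as rows of intensities.
Frame = List[List[float]]


def normalize_frame(frame: Frame) -> Frame:
    """Min-max scale pixel intensities to [0, 1] (an assumed normalization)."""
    pixels = [p for row in frame for p in row]
    lo, hi = min(pixels), max(pixels)
    span = hi - lo
    if span == 0:
        # A constant frame carries no contrast; map every pixel to 0.0.
        return [[0.0 for _ in row] for row in frame]
    return [[(p - lo) / span for p in row] for row in frame]


def statistical_features(frame: Frame) -> Dict[str, float]:
    """Compute first- to fourth-order statistics of a frame's pixel values."""
    pixels = [p for row in frame for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    std = math.sqrt(variance)
    if std == 0:
        # Standardized moments are undefined for constant data; report 0.0.
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((p - mean) / std) ** 3 for p in pixels) / n
        kurtosis = sum(((p - mean) / std) ** 4 for p in pixels) / n
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


def extract_features(frames: List[Frame]) -> List[Dict[str, float]]:
    """Normalize each frame, then extract its statistical feature vector."""
    return [statistical_features(normalize_frame(f)) for f in frames]
```

In a full pipeline, the per-frame feature dictionaries returned by `extract_features` would be passed to the feature selection stage (hybrid BRO-PSO in the proposed method) before classification; those stages are not sketched here.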