Research Paper

OTRN-DCN: An optimized transformer-based residual network with deep convolutional network for action recognition and multi-object tracking of adaptive segmentation using soccer sports video

K. Kausalya

Department of Information Technology, Easwari Engineering College, Ramapuram, Chennai, Tamil Nadu, India

E-mail Address: kkausalya828@gmail.com

Corresponding author.

and

S. Kanaga Suba Raja

https://orcid.org/0009-0002-9187-6042

Department of Information Technology, Easwari Engineering College, Ramapuram, Chennai, Tamil Nadu, India

E-mail Address: skanagasubaraja@gmail.com

https://doi.org/10.1142/S0219691323500340

Cited by: 2 (Source: Crossref)

Abstract

Video analysis is now widely used to recognize sport-related movement, which has become a significant part of everyday life. The aim of this approach is to understand players’ activities using prior information from object tracking. It also assesses a player’s potential to lead the team to victory. When players change location frequently, object tracking and action recognition become challenging. Over the course of a game, multiple athletes and other objects are considered to help the system recognize each player’s actions. Many earlier models have been implemented, yet they struggle to deliver promising performance. To address this, a new multi-athlete tracking model for action recognition in soccer is designed using deep learning approaches. Initially, the multi-object tracking video is given as input to the pre-processing phase, where occlusion and background clutter removal and contrast enhancement are applied. The pre-processed video is then passed to the multi-object tracking phase, where jersey numbers are observed during tracking to avoid the identity switch problem. Multi-object tracking is performed by an adaptive YOLOv5, whose parameters are tuned by a newly proposed algorithm, the Random-based Cheetah Red Deer Algorithm (RCRDA). Next, in the action recognition phase, the tracked object is extracted from the video based on the Region of Interest (ROI) and passed to an action recognition model named Optimized Transformer-based Residual Network with Deep Convolutional Network (OTRN-DCN). First, the ROI is given as input to the TRN to obtain feature vectors. Then, optimal weighted vector extraction is performed, with the weights tuned by the developed RCRDA. Finally, the optimally weighted vectors are passed to the DCN phase, which outputs the recognized action. Hence, the developed multi-object tracking and action recognition model secures a higher recognition rate than traditional frameworks.
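The weighted vector extraction step can be illustrated with a minimal sketch, under stated assumptions. The abstract does not give the RCRDA update rules or the fitness function, so plain random search stands in for RCRDA here, and the fitness is a toy nearest-centroid accuracy. The names `apply_weights`, `fitness`, and `random_search`, along with the toy data, are hypothetical and not taken from the paper.

```python
import random


def apply_weights(features, weights):
    """Scale each feature by its weight (element-wise product)."""
    return [f * w for f, w in zip(features, weights)]


def centroid(vectors):
    """Mean vector of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness(weights, samples, labels):
    """Toy fitness (an assumption): nearest-centroid accuracy on weighted features."""
    weighted = [apply_weights(s, weights) for s in samples]
    classes = sorted(set(labels))
    centroids = {
        c: centroid([v for v, l in zip(weighted, labels) if l == c])
        for c in classes
    }
    correct = 0
    for v, l in zip(weighted, labels):
        pred = min(classes, key=lambda c: sq_dist(v, centroids[c]))
        correct += int(pred == l)
    return correct / len(samples)


def random_search(samples, labels, dim, iterations=200, seed=0):
    """Stand-in for RCRDA: keep the best of several random weight vectors."""
    rng = random.Random(seed)
    best_w = [1.0] * dim  # start from unweighted features
    best_fit = fitness(best_w, samples, labels)
    for _ in range(iterations):
        cand = [rng.random() for _ in range(dim)]
        cand_fit = fitness(cand, samples, labels)
        if cand_fit > best_fit:
            best_w, best_fit = cand, cand_fit
    return best_w, best_fit


if __name__ == "__main__":
    # Hypothetical feature vectors: first feature informative, second noisy.
    toy_samples = [[0.1, 5.0], [0.2, -4.0], [0.9, 4.5], [1.0, -5.0]]
    toy_labels = [0, 0, 1, 1]
    w, fit = random_search(toy_samples, toy_labels, dim=2)
    print("weights:", w, "fitness:", fit)
```

In this sketch the search can only match or improve on the unweighted baseline, because a candidate replaces the current best only when its fitness is strictly higher. A population-based metaheuristic such as the one the abstract names would replace the sampling loop, while the weighting and fitness evaluation would keep the same shape.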


Vol. 22, No. 01

History

Received 30 March 2023

Revised 16 June 2023

Accepted 27 June 2023

Published: 7 September 2023
