Hand Pose Detection Using YOLOv8-pose
Hand detection and pose estimation are prominent problems in computer vision, with applications in augmented and virtual reality, human-robot interaction, and gesture recognition, which can be used to control various interfaces such as those in assistive technology. The hand detection problem involves three sub-problems: hand localisation, hand classification, and pose estimation. Various hand detection methods approach this problem in multiple stages. However, there is scope to train an end-to-end network that addresses all three sub-problems at once. In this paper, we contribute to hand detection, classification, and pose estimation by first modifying the FreiHAND dataset so that both left- and right-hand images, along with their annotations, are present for training. We then train YOLOv8-pose networks from nano to extra-large sizes and compare the performance of each network. Further, we perform quantitative and qualitative analysis on three public hand datasets, which shows the strengths and limitations of YOLOv8-pose networks. Our experiments show that the mean average precision score increases with network size. We also conclude that the ratio of hand size to image size in the training data affects the confidence score and classification at inference.
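
The abstract does not say how the FreiHAND annotations were modified to cover both hands. One plausible approach, sketched here purely as an illustration, is to mirror each image horizontally and mirror its label to match. The function name `mirror_pose_label`, the class mapping (0 = right, 1 = left), and the label layout (class, normalised box centre and size, then normalised keypoints) are assumptions for this sketch, not details taken from the paper.

```python
def mirror_pose_label(line, swap_class=None, dims=2):
    """Mirror one YOLO-style pose label line about the vertical image axis.

    Assumed layout (hypothetical, normalised to [0, 1]):
        class cx cy w h  x1 y1 [v1]  x2 y2 [v2] ...
    dims=2 means keypoints are (x, y); dims=3 adds a visibility flag.
    """
    if swap_class is None:
        # Assumed class ids: 0 = right hand, 1 = left hand.
        swap_class = {0: 1, 1: 0}

    values = line.split()
    cls = int(values[0])
    cx, cy, w, h = (float(v) for v in values[1:5])
    kpts = [float(v) for v in values[5:]]
    if len(kpts) % dims != 0:
        raise ValueError("keypoint values do not match dims")

    # A horizontal flip changes handedness and the x-coordinate of the box centre.
    out = [str(swap_class[cls])]
    out += [f"{v:.6f}" for v in (1.0 - cx, cy, w, h)]

    for i in range(0, len(kpts), dims):
        x, y = kpts[i], kpts[i + 1]
        vis = kpts[i + 2] if dims == 3 else None
        # Treat (0, 0) or visibility 0 as an unlabelled keypoint and leave it alone.
        missing = (vis == 0) if dims == 3 else (x == 0.0 and y == 0.0)
        new_x = x if missing else 1.0 - x
        out += [f"{new_x:.6f}", f"{y:.6f}"]
        if dims == 3:
            out.append(str(int(vis)))

    return " ".join(out)


# Example: a right-hand label with two keypoints becomes a left-hand label.
mirrored = mirror_pose_label("0 0.25 0.5 0.2 0.4 0.3 0.6 0.1 0.2")
```

Unlike body-pose datasets, the keypoint order needs no left/right index swap here, because each label describes a single hand whose joints keep their identity under mirroring; only the class and the x-coordinates change. The image itself would need the same horizontal flip applied.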
Item Type | Book Section |
---|---|
Additional information | © 2024 IEEE. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1109/ICEI64305.2024.10912185 |
Keywords | yolo, deep learning, hand detection, hand pose estimation, fluid flow and transfer processes, artificial intelligence, computer vision and pattern recognition, information systems, mechanical engineering, health informatics, media technology, control and optimization |
Date Deposited | 10 Jun 2025 14:54 |
Last Modified | 11 Jun 2025 00:04 |