Robotic Vision and Multi-View Synergy: Action and activity recognition in assisted living scenarios
The significance of Human-Robot Interaction (HRI) becomes increasingly evident as robots are integrated into human-centric settings. A crucial component of effective HRI is Human Activity Recognition (HAR), which enables robots to respond appropriately in the presence of humans, especially within Ambient Assisted Living (AAL) environments. Because robots are generally mobile and their visual perception is often degraded by motion and noise, this paper evaluates methods that merge the robot's mobile perspective with a static viewpoint using multi-view deep learning models. We introduce a dual-stream Convolutional 3D (C3D) model to improve vision-based HAR accuracy for robotic applications. Using the Robot House Multiview (RHM) dataset, which comprises a robot view alongside three static views (Front, Back, Top), we examine the efficacy of our model and compare it with the dual-stream ConvNet and SlowFast models. The primary objective of this study is to improve recognition accuracy from the robot viewpoint by fusing it with static views in dual-stream models. Evaluation metrics are Top-1 and Top-5 accuracy. Our findings show that integrating static views with the robot perspective significantly improves HAR accuracy on both Top-1 and Top-5 metrics across all models tested, and that the proposed dual-stream C3D model outperforms the other contemporary models in our evaluations.
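As a concrete illustration of the dual-stream fusion idea described in the abstract, the sketch below pairs two small C3D-style backbones, one per camera view, and fuses their clip-level features by concatenation before classification. This is a minimal PyTorch sketch, not the authors' implementation: the layer sizes, the late-fusion-by-concatenation strategy, the clip shape, and the class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class C3DBackbone(nn.Module):
    """Small 3D-conv feature extractor. Illustrative only; the paper's C3D
    backbone presumably follows the standard C3D design with more layers."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # pool space, keep time
            nn.Conv3d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),           # pool space and time
            nn.AdaptiveAvgPool3d(1),               # global pool to a fixed vector
        )
        self.fc = nn.Linear(128, out_dim)

    def forward(self, clip):  # clip: (B, 3, T, H, W)
        x = self.features(clip).flatten(1)
        return self.fc(x)

class DualStreamC3D(nn.Module):
    """Two C3D streams (robot view + static view), late-fused by concatenation."""
    def __init__(self, num_classes, feat_dim=256):
        super().__init__()
        self.robot_stream = C3DBackbone(feat_dim)
        self.static_stream = C3DBackbone(feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, robot_clip, static_clip):
        fused = torch.cat([self.robot_stream(robot_clip),
                           self.static_stream(static_clip)], dim=1)
        return self.classifier(fused)

# Usage: synchronised 16-frame clips from the robot and one static camera.
# num_classes=14 is a placeholder, not the RHM label count.
model = DualStreamC3D(num_classes=14)
robot = torch.randn(2, 3, 16, 112, 112)
static = torch.randn(2, 3, 16, 112, 112)
logits = model(robot, static)  # shape (2, 14); argmax/top-5 give the reported metrics
```

Concatenation is only one plausible fusion choice; the same two-stream layout works with averaging or a learned fusion layer, and the reported Top-1/Top-5 scores would be computed from the resulting logits.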
Item Type | Book Section
---|---
Additional information | © 2024 IEEE. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1109/BioRob60516.2024.10719749