dc.contributor.author | Shahabian Alashti, Mohamad Reza | |
dc.date.accessioned | 2025-02-24T15:29:01Z | |
dc.date.available | 2025-02-24T15:29:01Z | |
dc.date.issued | 2024-06-06 | |
dc.identifier.uri | http://hdl.handle.net/2299/28802 | |
dc.description.abstract | Human activity recognition (HAR) is crucial in assistive technology and human-robot
interaction (HRI) as it enables robots and assistive devices to understand and respond
to the individual’s movements and actions, facilitating personalised assistance for those
with mobility challenges or disabilities. In the context of HAR for ambient assisted living (AAL)
environments, integrating additional cameras with the robot’s perspective has the potential for
significant advancement of detection outcomes. However, considering the computational limits
in robots, the caveat is that processing additional video streams presents challenges in both
computational complexity and data integration.
The primary goal of this research is to create an efficient multi-view skeleton-based HAR
system that optimises accuracy without sacrificing efficiency. By leveraging the strengths of
the skeleton-based models and incorporating diverse perspectives, this system aims to enhance
overall performance in AAL scenarios. To support this goal, an open dataset for skeletal data
in HAR is developed and utilised. For objective evaluation, this work considers computational
needs and algorithmic efficiency in HAR methods, exploring the potential of multi-view systems
to improve human-robot interaction.
This thesis is grounded in a thorough literature review, including extensive dataset analysis.
It addresses pivotal research questions centred on the effectiveness of skeleton-based models
in multi-view settings compared with image-based models, examines the role of perspectives in
multi-view HAR, and investigates the optimal models for multi-view recognition. These inquiries
lead to the development and evaluation of a novel lightweight and multi-view HAR architecture.
This thesis significantly contributes to the field by introducing a multi-view skeleton-based
dataset, dataset analysis metrics to evaluate and compare different perspectives, and a novel
lightweight HAR architecture. Performance analysis supports the importance of integrating robot
vision with observations from additional cameras. These results reveal variations in performance
based on different views. For instance, the results highlight how the robot’s tracking of the human
subject during action performance can yield higher-quality data in activities such as ascending
and descending stairs, while proximity to the subject may result in missed body joints.
Conversely, other views offer a wider perspective of the scene, presenting unique advantages
and challenges. In this study, integrating the additional view with the Robot-view resulted in an
accuracy increase of up to 25%.
Notably, the proposed skeleton-based architecture exhibits improved efficiency compared
to its image-based counterpart when applied to the same dataset, demonstrating a notable
15% improvement in HAR. Moreover, the comparison between image-based and skeleton-based
methods reveals that robot movement affects them differently: whereas image-based methods can
conflate the subject’s motion with the camera’s motion, the skeleton-based method is far less
affected by the robot’s movement.
In addition, this work introduces various multi-view architectures for comparative analysis,
shedding light on different data combination methodologies. The proposed system achieves
high accuracy (approximately 90%), with a minimal number of training parameters (0.6M), and
demonstrates significantly lower computational demands (0.00106 GFLOPs) compared to
well-known CNN and GCN models. For instance, ResNet with 11.2M parameters and MobileNet
with 2.2M parameters achieved 90.9% and 83.5% accuracy, respectively.
Given the pivotal role of HAR in diverse applications, the emphasis on its efficiency and
effectiveness is crucial. This work not only addresses these concerns but also establishes a
foundation for future research directions. The proposed skeleton-based architecture lays the
groundwork for various applications such as ambient assisted living scenarios, offering a flexible
platform for the development of efficient multi-person activity recognition, continual and real-time
HAR, and utilisation in Human-Robot Interaction (HRI) studies. | en_US |
dc.language.iso | en | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.rights | Attribution 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/us/ | * |
dc.subject | Human Activity Recognition (HAR) | en_US |
dc.subject | Skeleton-based HAR | en_US |
dc.subject | Ambient Assisted Living (AAL) | en_US |
dc.subject | Human-Robot Interaction (HRI) | en_US |
dc.subject | Convolutional Neural Network (CNN) | en_US |
dc.subject | Human Body Pose Estimation | en_US |
dc.subject | Efficient Deep Learning (DL) | en_US |
dc.title | Human and Activity Detection in Ambient Assisted Living Scenarios | en_US |
dc.type | info:eu-repo/semantics/article | en_US |
dc.type.qualificationlevel | Doctoral | en_US |
dc.type.qualificationname | PhD | en_US |
dcterms.dateAccepted | 2024-06-06 | |
rioxxterms.funder | Default funder | en_US |
rioxxterms.identifier.project | Default project | en_US |
rioxxterms.version | NA | en_US |
rioxxterms.licenseref.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
rioxxterms.licenseref.startdate | 2025-02-24 | |
herts.preservation.rarelyaccessed | true | |
rioxxterms.funder.project | ba3b3abd-b137-4d1d-949a-23012ce7d7b9 | en_US |