A Multimodal Graph Contrastive Learning Approach for Human Activity Recognition Using Deep Learning Techniques
Abstract
Deep learning techniques for Human Activity Recognition (HAR) have achieved remarkable improvements in recent years in recognizing complex action classes in real-world contexts. This research advances a unified deep learning framework, the Hybrid Dense Temporal Transformer Network (HDTTN), that captures spatial, temporal, and semantic information for better human activity detection. We introduce DenseNet to improve spatial feature extraction from visual inputs, Temporal Convolutional Networks (TCNs) to learn short-term motion patterns, and Transformer encoders to learn the long-range temporal dependencies that are crucial for processing complex and subtle activities. The study employs an early multimodal feature fusion strategy to further enhance representational coherence, making it easier to incorporate heterogeneous cues at the feature level and to learn dynamic multimodal representations. Moreover, a hybrid optimization approach is integrated for parameter fine-tuning to improve efficiency, reduce overfitting, and boost model robustness. The proposed HDTTN framework is shown to be effective on the large-scale and challenging Kinetics dataset, which contains a wide range of unconstrained human activities. Experimental results show that the proposed model achieves 93% accuracy, outperforming several existing state-of-the-art baseline approaches. Qualitative and quantitative analyses further validate HDTTN's ability to identify intricate and nuanced activities across a multitude of environments.
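
The abstract does not specify the exact architecture, so the following PyTorch code is only a minimal sketch of the described pipeline under stated assumptions: the class name HDTTNSketch, the densenet121 variant, the layer sizes, two input modalities (e.g. RGB and optical flow), and fusion by concatenation followed by a linear projection are all illustrative choices, not the authors' configuration. It shows how per-frame DenseNet features, early multimodal fusion, a dilated-convolution TCN for short-term motion, and a Transformer encoder for long-range temporal dependencies could be composed.

import torch
import torch.nn as nn
from torchvision.models import densenet121

class HDTTNSketch(nn.Module):
    def __init__(self, num_classes=400, feat_dim=256, num_modalities=2):
        super().__init__()
        # Spatial branch: a DenseNet backbone applied frame by frame,
        # with its classifier replaced by a projection to feat_dim.
        backbone = densenet121(weights=None)
        backbone.classifier = nn.Linear(backbone.classifier.in_features, feat_dim)
        self.spatial = backbone
        # Early fusion (assumed scheme): concatenate per-frame features
        # from all modalities, then project back to a shared embedding.
        self.fuse = nn.Linear(feat_dim * num_modalities, feat_dim)
        # TCN: dilated 1-D convolutions over time for short-term motion.
        self.tcn = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
        )
        # Transformer encoder for long-range temporal dependencies.
        enc_layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4,
                                               batch_first=True)
        self.temporal = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, clips):
        # clips: list of (B, T, 3, H, W) tensors, one per modality.
        feats = []
        for clip in clips:
            b, t = clip.shape[:2]
            f = self.spatial(clip.flatten(0, 1))         # (B*T, feat_dim)
            feats.append(f.view(b, t, -1))               # (B, T, feat_dim)
        x = self.fuse(torch.cat(feats, dim=-1))          # early feature fusion
        x = self.tcn(x.transpose(1, 2)).transpose(1, 2)  # short-term motion
        x = self.temporal(x)                             # long-range dependencies
        return self.head(x.mean(dim=1))                  # clip-level logits

For a Kinetics-style setup with two streams, clips would be a two-element list of (B, T, 3, 224, 224) tensors; averaging the Transformer outputs over time before the classification head is one common way to obtain a clip-level prediction.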
Copyright (c) 2026 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.