
GLULA: Linear attention-based model for efficient human activity recognition from wearable sensors

Published online by Cambridge University Press: 05 April 2024

Aldiyar Bolatov*
Affiliation:
Department of Computer Science, Nazarbayev University, Astana, Kazakhstan
Aigerim Yessenbayeva
Affiliation:
Department of Computer Science, Nazarbayev University, Astana, Kazakhstan
Adnan Yazici
Affiliation:
Department of Computer Science, Nazarbayev University, Astana, Kazakhstan
Corresponding author: Aldiyar Bolatov; Email: aldiyar.bolatov@nu.edu.kz

Abstract

Body-worn sensor data is used to monitor patient activity during rehabilitation and can also be extended to controlling rehabilitation devices based on the person's activity. Research has focused primarily on effectively capturing the spatiotemporal dependencies in the data collected by these sensors and on efficiently classifying human activities. As models grow in complexity and size, there is increasing emphasis on optimizing memory usage and inference time for real-time use on mobile computers. While hybrid models combining convolutional and recurrent neural networks have performed strongly compared to traditional approaches, self-attention-based networks have demonstrated even better results. However, rather than relying on the standard transformer architecture, there is an opportunity to develop a novel framework that incorporates recent advancements to improve speed and memory efficiency, tailored specifically to human activity recognition (HAR) tasks. Following this approach, we present GLULA, a novel architecture for HAR. GLULA combines gated convolutional networks, branched convolutions, and linear self-attention into an efficient and powerful solution. To improve the performance of the proposed architecture, we employed manifold mixup as an augmentation variant, which proved beneficial in limited-data settings. Extensive experiments were conducted on five benchmark datasets: PAMAP2, SKODA, OPPORTUNITY, DAPHNET, and USC-HAD. Our findings show that GLULA not only outperforms recent models in the literature on the latter four datasets but also has the lowest parameter count and close to the fastest inference time among state-of-the-art models.
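The linear self-attention mentioned above is what gives GLULA its efficiency edge over standard softmax attention. As a purely illustrative aid (the paper's exact implementation is not reproduced here; the elu-based feature map, head split, and tensor shapes below are assumptions following the standard kernelized formulation of Katharopoulos et al., 2020), a minimal PyTorch sketch of such a block might look as follows:

# Minimal sketch of linear (kernelized) self-attention; shapes and
# feature map are assumptions, not the authors' exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim), e.g. a window of sensor readings
        b, n, d = x.shape
        h = self.heads
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # split heads: (batch, heads, seq_len, head_dim)
        q, k, v = (t.view(b, n, h, d // h).transpose(1, 2) for t in (q, k, v))
        # positive feature map phi(x) = elu(x) + 1 (Katharopoulos et al., 2020)
        q, k = F.elu(q) + 1, F.elu(k) + 1
        # associativity: phi(Q) (phi(K)^T V) costs O(n) instead of O(n^2)
        kv = torch.einsum("bhnd,bhne->bhde", k, v)
        z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.to_out(out)

Because the (key, value) summary kv is computed once and reused for every query position, the cost grows linearly with window length; for example, LinearSelfAttention(dim=64)(torch.randn(8, 100, 64)) returns a tensor of shape (8, 100, 64) without ever materializing a 100 x 100 attention matrix, which is what makes this kind of block attractive for on-device HAR.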

Information

Type
Research Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Graphical representation of the proposed model's structure. Data preprocessing and each layer are shown and numbered following the model description given in the methodology. While the GCN, Linear Attention (L-Att), and Softmax Self-Attention (S-Att) could all potentially serve as the main block (4), GCN has been shown to outperform the others in this role, as highlighted in equation (1), and is therefore illustrated as the sole type for the main block. In contrast, all three network types (GCN, L-Att, S-Att) were fully tested as the additional block (9), so the additional block is shown as a choice among the three. The detailed structure of each network type tested as (9) is given in Figure 2.


Figure 2. Graphical portrayal of each network type tested as the additional block: Linear Attention (a), Softmax Self-Attention (b), and Gated Convolutional Network (c).
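To make the Gated Convolutional Network variant above concrete, here is a minimal, hedged sketch of a GLU-style gated 1-D convolution over a sensor window; the kernel size, residual connection, and channel layout are illustrative assumptions in the spirit of Dauphin et al. (2017), not the authors' exact block:

# Minimal sketch of a gated convolutional (GLU-style) block over a
# sensor time series; hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2  # keep the temporal length unchanged
        self.value = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.gate = nn.Conv1d(channels, channels, kernel_size, padding=pad)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); the sigmoid gate decides, per
        # channel and time step, how much of the convolved signal passes
        out = self.value(x) * torch.sigmoid(self.gate(x))
        return x + out  # residual connection (an assumption)

The sigmoid branch acts as a learned, per-time-step gate on the convolved features, which lets the block suppress irrelevant sensor channels without recurrence.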


Table 1. Structure of the presented datasets


Table 2. F1-weighted scores (with standard deviation) on PAMAP2 for different training methods applied to the GLULA and GLUSA models


Table 3. Results obtained on different datasets using the proposed GLULA and its variations


Table 4. Speed comparison: average forward-pass time of our model and its variations on different datasets


Table 5. Size and score (F1-weighted/F1-macro) comparison of our model with the listed methods on benchmark datasets. Bold values indicate the best or near-best result for the corresponding metric in each column among the presented models.


Table 6. Comparison of LOSO cross-validation F1-weighted scores. The asterisk marks a reported result of a model (row) on the corresponding dataset (column) that was affected by data leakage: labels were inadvertently included as part of the input.


Table 7. Input dimensions of the models after all preprocessing procedures, per dataset


Table 8. Average inference time of the proposed model and existing methods on benchmark datasets, for speed comparison