Combining Public Human Activity Recognition Datasets to Mitigate Labeled Data Scarcity

by   Riccardo Presotto, et al.

The use of supervised learning for Human Activity Recognition (HAR) on mobile devices leads to strong classification performances. Such an approach, however, requires large amounts of labeled data, both for the initial training of the models and for their customization on specific clients (whose data often differ greatly from the training data). This is actually impractical to obtain due to the costs, intrusiveness, and time-consuming nature of data annotation. Moreover, even with the help of a significant amount of labeled data, model deployment on heterogeneous clients faces difficulties in generalizing well on unseen data. Other domains, like Computer Vision or Natural Language Processing, have proposed the notion of pre-trained models, leveraging large corpora, to reduce the need for annotated data and better manage heterogeneity. This promising approach has not been implemented in the HAR domain so far because of the lack of public datasets of sufficient size. In this paper, we propose a novel strategy to combine publicly available datasets with the goal of learning a generalized HAR model that can be fine-tuned using a limited amount of labeled data on an unseen target domain. Our experimental evaluation, which includes experimenting with different state-of-the-art neural network architectures, shows that combining public datasets can significantly reduce the number of labeled samples required to achieve satisfactory performance on an unseen target domain.


page 1

page 2

page 3

page 4


Semantic-Discriminative Mixup for Generalizable Sensor-based Cross-domain Activity Recognition

It is expensive and time-consuming to collect sufficient labeled data to...

SelfAct: Personalized Activity Recognition based on Self-Supervised and Active Learning

Supervised Deep Learning (DL) models are currently the leading approach ...

Domain Adaptations for Computer Vision Applications

A basic assumption of statistical learning theory is that train and test...

Benchmark of DNN Model Search at Deployment Time

Deep learning has become the most popular direction in machine learning ...

Attention-based Convolutional Neural Network for Weakly Labeled Human Activities Recognition with Wearable Sensors

Unlike images or videos data which can be easily labeled by human being,...

Learning Cross-domain Generalizable Features by Representation Disentanglement

Deep learning models exhibit limited generalizability across different d...

Please sign up or login with your details

Forgot password? Click here to reset