Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition

by Hongda Liu, et al.

Skeleton-based action recognition has recently made significant progress. However, data imbalance remains a major challenge in real-world scenarios. The performance of current action recognition algorithms declines sharply when the training data suffers from heavy class imbalance. Imbalanced data degrades the representations learned by these methods and becomes the bottleneck for action recognition. Learning unbiased representations from imbalanced action data is therefore the key to long-tailed action recognition. In this paper, we propose a novel balanced representation learning method to address the long-tailed problem in action recognition. Firstly, a spatial-temporal action exploration strategy is presented to expand the sample space effectively, generating more valuable samples in a rebalanced manner. Secondly, we design a detached action-aware learning schedule to further mitigate the bias in the representation space. The schedule detaches the representation learning of tail classes from training and introduces an action-aware loss to impose more effective constraints. Additionally, a skip-modal representation is proposed to provide complementary structural information. The proposed method is validated on four skeleton datasets: NTU RGB+D 60, NTU RGB+D 120, NW-UCLA, and Kinetics. It not only achieves consistently large improvements over state-of-the-art (SOTA) methods, but also demonstrates superior generalization capacity in extensive experiments. Our code is available at




Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning

Skeleton-based human action recognition has attracted increasing attenti...

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

Class imbalance is a common challenge in real-world recognition tasks, w...

Hierarchical Contrast for Unsupervised Skeleton-based Action Representation Learning

This paper targets unsupervised skeleton-based action representation lea...

Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning

Skeleton-based action recognition has made great progress recently, but ...

Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition

Hand action recognition is essential. Communication, human-robot interac...

Action Recognition Using Volumetric Motion Representations

Traditional action recognition models are constructed around the paradig...

Multi-Expert Human Action Recognition with Hierarchical Super-Class Learning

In still image human action recognition, existing studies have mainly le...
