AlignMix: Improving representation by interpolating aligned features

Mixup is a powerful data augmentation method that interpolates between two or more examples in the input or feature space and between the corresponding target labels. Many recent mixup methods focus on cutting and pasting two or more objects into one image, which is more about efficient processing than interpolation. However, how to best interpolate images is not well defined. In this sense, mixup has been connected to autoencoders, because autoencoders often "interpolate well", for instance generating an image that continuously deforms into another. In this work, we revisit mixup from the interpolation perspective and introduce AlignMix, where we geometrically align two images in the feature space. The correspondences allow us to interpolate between two sets of features, while keeping the locations of one set. Interestingly, this gives rise to a situation where mixup retains mostly the geometry or pose of one image and the texture of the other, connecting it to style transfer. Moreover, we show that an autoencoder can still improve representation learning under mixup, without the classifier ever seeing decoded images. AlignMix outperforms state-of-the-art mixup methods on five different benchmarks.
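The idea of aligning two sets of features and then interpolating them, while keeping the locations of one set, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how correspondences are computed, so the softmax-over-dot-product matching below is an assumption made purely for illustration, as are the names `align_and_mix`, `lam`, and `temperature`. Feature maps are represented as plain lists of per-location feature vectors.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def align_and_mix(feats_a, feats_b, label_a, label_b, lam, temperature=0.1):
    """Mix two feature sets after soft-aligning B to the locations of A.

    feats_a, feats_b: lists of feature vectors, one per spatial location.
    label_a, label_b: one-hot (or soft) label vectors.
    lam: interpolation factor in [0, 1]; lam = 1 keeps A unchanged.
    The soft correspondence used here (softmax of dot products) is an
    illustrative assumption, not the matching used in the paper.
    """
    mixed = []
    for fa in feats_a:
        # Soft correspondence between this location of A and every location of B.
        weights = softmax([dot(fa, fb) for fb in feats_b], temperature)
        # Features of B re-expressed at A's location.
        aligned_b = [
            sum(w * fb[k] for w, fb in zip(weights, feats_b))
            for k in range(len(fa))
        ]
        # Interpolate feature values; the location (index) comes from A.
        mixed.append([lam * x + (1 - lam) * y for x, y in zip(fa, aligned_b)])
    # Labels are interpolated with the same factor, as in standard mixup.
    mixed_label = [lam * p + (1 - lam) * q for p, q in zip(label_a, label_b)]
    return mixed, mixed_label


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 1.0]]
    b = [[0.0, 1.0], [1.0, 0.0]]  # same content as a, at swapped locations
    feats, label = align_and_mix(a, b, [1.0, 0.0], [0.0, 1.0], lam=0.5)
    print(feats, label)
```

Because the output keeps one mixed vector per location of the first input, the result follows the spatial layout of that input while its feature values blend in content from the second, which is the behaviour the abstract relates to style transfer.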



An analysis on the use of autoencoders for representation learning: fundamentals, learning task case studies, explainability and challenges

In many machine learning tasks, learning a good representation of the da...

Deformable Style Transfer

Both geometry and texture are fundamental aspects of visual style. Exist...

Unbiased Image Style Transfer

Recent fast image style transferring methods use feed-forward neural net...

ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States

Mixup style data augmentation algorithms have been widely adopted in var...

A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification

New conversation topics and functionalities are constantly being added t...

Beyond a Video Frame Interpolator: A Space Decoupled Learning Approach to Continuous Image Transition

Video frame interpolation (VFI) aims to improve the temporal resolution ...

Autoencoder-Aided Visualization of Collections of Morse Complexes

Though analyzing a single scalar field using Morse complexes is well stu...
