Push it to the Limit: Discover Edge-Cases in Image Data with Autoencoders

10/07/2019
by Ilja Manakov, et al.
In this paper, we focus on the problem of identifying semantic factors of variation in large image datasets. By training a convolutional autoencoder on the image data, we create encodings that describe each datapoint at a higher level of abstraction than pixel space. We then apply Principal Component Analysis (PCA) to the encodings to disentangle the factors of variation in the data. Sorting the dataset according to the values of individual principal components, we find that samples at the high and low ends of the distribution often share specific semantic characteristics. We refer to these groups of samples as semantic groups. When applied to real-world data, this method can help discover unwanted edge cases.
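
The PCA-and-sort step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the autoencoder encodings are already available as a 2-D array (random data stands in for them here), and the helper names `pca_project` and `extreme_samples`, the code dimension, and the group size `k` are all hypothetical choices made for the example.

```python
import numpy as np


def pca_project(encodings, n_components):
    """Project encodings onto their top principal components via SVD."""
    centered = encodings - encodings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def extreme_samples(scores, component, k):
    """Return indices of the k lowest and k highest samples along one component."""
    order = np.argsort(scores[:, component])
    lowest = order[:k]            # most negative scores first
    highest = order[-k:][::-1]    # most positive scores first
    return lowest, highest


# Stand-in for real autoencoder encodings (assumption): 1000 samples, 32-dim codes.
rng = np.random.default_rng(0)
encodings = rng.normal(size=(1000, 32))

scores = pca_project(encodings, n_components=5)
low_idx, high_idx = extreme_samples(scores, component=0, k=10)
```

With real data, `low_idx` and `high_idx` would index back into the image dataset, so the images at both ends of a component can be inspected by eye for a shared semantic characteristic.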
