How Can We Tame the Long-Tail of Chest X-ray Datasets?

by Arsh Verma, et al.

Chest X-rays (CXRs) are a medical imaging modality used to infer a large number of abnormalities. While it is hard to define an exhaustive list of these abnormalities, which may co-occur on a single chest X-ray, a few of them are commonly observed and abundantly represented in the CXR datasets used to train deep learning models for automated inference. However, it is challenging for current models to learn independent discriminative features for labels that are rare but may be of high significance. Prior works address the combination of the multi-label and long-tailed problems by introducing novel loss functions or some mechanism for re-sampling or re-weighting the data. Instead, we propose that significant performance gains can be achieved merely by choosing a model initialization that is closer to the domain of the target dataset. This method can complement the techniques proposed in the existing literature and scales easily to new labels. Finally, we examine the veracity of synthetically generated data used to augment the tail labels and analyse its contribution to improving model performance.
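To make the re-weighting idea mentioned above concrete, here is a minimal sketch of one common heuristic: weighting each label by its negative-to-positive ratio so that rare (tail) labels contribute more to the loss. This is an illustration of the general prior-work strategy, not the method proposed by the authors; the function name `positive_class_weights` and the toy label matrix are assumptions made for this example.

```python
from typing import List


def positive_class_weights(labels: List[List[int]]) -> List[float]:
    """Return one weight per label: (#negatives / #positives).

    `labels` is a multi-label matrix: one row per image, one 0/1 column
    per abnormality. Rare labels receive larger weights.
    """
    n_samples = len(labels)
    n_labels = len(labels[0])
    weights = []
    for j in range(n_labels):
        positives = sum(row[j] for row in labels)
        negatives = n_samples - positives
        # Guard against labels that have no positive examples at all.
        weights.append(negatives / max(positives, 1))
    return weights


# Toy data (hypothetical): column 0 is a common label, column 1 a tail label.
toy_labels = [
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
]
weights = positive_class_weights(toy_labels)
```

Weights of this kind are typically passed to a weighted binary cross-entropy loss, for example the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`. The abstract's claim is that a domain-appropriate initialization can complement such techniques rather than replace them.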




