Multi-level Multiple Instance Learning with Transformer for Whole Slide Image Classification

by Ruijie Zhang, et al.

A whole slide image (WSI) is a high-resolution scanned tissue image that is widely used in computer-assisted diagnosis (CAD). The extremely high resolution and the limited availability of region-level annotations make it challenging to apply deep learning methods to WSI-based digital diagnosis. Multiple instance learning (MIL) is a powerful tool for addressing the weak annotation problem, and the Transformer has shown great success on visual tasks. Combining the two should provide new insights for deep learning based image diagnosis. However, because of the limitations of single-level MIL and the constraints that the attention mechanism places on sequence length, directly applying a Transformer to WSI-based MIL tasks is not practical. To tackle this issue, we propose a Multi-level MIL with Transformer (MMIL-Transformer) approach. By introducing a hierarchical structure to MIL, this approach enables efficient handling of MIL tasks that involve a large number of instances. To validate its effectiveness, we conducted a set of experiments on a WSI classification task, where MMIL-Transformer demonstrates superior performance compared to existing state-of-the-art methods. Our proposed approach achieves a test AUC of 94.74 and a test accuracy of 94.37. Pre-trained models are available at:
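To make the hierarchical idea in the abstract concrete, the following is a minimal sketch, not the authors' architecture. It assumes a two-level scheme in which instances are split into fixed-size groups, each group is pooled into one embedding, and the group embeddings are pooled again into a bag embedding. Plain softmax attention pooling with a fixed query vector stands in for the Transformer blocks, and all names (`attention_pool`, `two_level_pool`, `pairwise_attention_cost`, `group_size`) are hypothetical. The cost helper counts the pairwise comparisons that full self-attention would need, to show why grouping relieves the sequence-length constraint.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Pool equal-length vectors into one vector.

    Weights are softmax(dot(v, query) / sqrt(d)); this is a simple
    stand-in for a learned Transformer layer (assumption).
    """
    d = len(query)
    scores = [sum(a * b for a, b in zip(v, query)) / math.sqrt(d) for v in vectors]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]


def two_level_pool(instances, group_size, query_low, query_high):
    """Level 1: pool each group of instances. Level 2: pool the group embeddings."""
    groups = [instances[i:i + group_size] for i in range(0, len(instances), group_size)]
    group_embeddings = [attention_pool(g, query_low) for g in groups]
    return attention_pool(group_embeddings, query_high), len(groups)


def pairwise_attention_cost(n, group_size=None):
    """Count token pairs compared by self-attention over n instances.

    Without grouping: n * n. With grouping: the sum of squared group
    sizes, plus one pass over the group embeddings.
    """
    if group_size is None:
        return n * n
    sizes = [min(group_size, n - start) for start in range(0, n, group_size)]
    return sum(s * s for s in sizes) + len(sizes) ** 2


# Toy bag: 10 instances with 4 features each (random stand-ins for patch features).
random.seed(0)
instances = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]
bag_embedding, n_groups = two_level_pool(instances, 3, [1.0] * 4, [1.0] * 4)
print(n_groups, bag_embedding)
print(pairwise_attention_cost(10000), pairwise_attention_cost(10000, 100))
```

Under these assumptions, a bag of 10,000 instances needs 100,000,000 pairwise comparisons with flat self-attention, but 1,010,000 when split into groups of 100. That gap is the kind of saving a hierarchical structure is meant to provide.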




Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels

Whole Slide Image (WSI) analysis is a powerful method to facilitate the ...

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

Multiple instance learning (MIL) is a powerful tool to solve the weakly ...

ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification

Progress in digital pathology is hindered by high-resolution images and ...

Shuffle Instances-based Vision Transformer for Pancreatic Cancer ROSE Image Classification

The rapid on-site evaluation (ROSE) technique can significantly acceler...

Detecting cutaneous basal cell carcinomas in ultra-high resolution and weakly labelled histopathological images

Diagnosing basal cell carcinomas (BCC), one of the most common cutaneous...

Dynamically Visual Disambiguation of Keyword-based Image Search

Due to the high cost of manual annotation, learning directly from the we...

Cross-scale Multi-instance Learning for Pathological Image Diagnosis

Analyzing high resolution whole slide images (WSIs) with regard to infor...
