Inadequately Pre-trained Models are Better Feature Extractors

03/09/2022
by Andong Deng, et al.

Pre-training has become a popular learning paradigm in the deep learning era, especially in annotation-scarce scenarios. Previous research has shown, from an architectural perspective, that better ImageNet pre-trained models transfer better to downstream tasks. In this paper, however, we find that within a single pre-training run, models from intermediate epochs, which are inadequately pre-trained, can outperform fully trained models when used as feature extractors (FE), while fine-tuning (FT) performance continues to grow with source performance. This reveals that top-1 accuracy on ImageNet is not solidly positively correlated with transfer results on target data. Motivated by this contradiction between FE and FT, namely that a better feature extractor does not fine-tune better accordingly, we conduct comprehensive analyses of the features before the softmax layer to provide insightful explanations. Our findings suggest that, during pre-training, models first learn the spectral components corresponding to large singular values, while the residual components contribute more during fine-tuning.
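The claim about spectral components can be made concrete with a small singular value decomposition (SVD) exercise. The sketch below is illustrative rather than the paper's actual analysis: `spectral_split` is a hypothetical helper that separates a penultimate-layer feature matrix into the part built from its k largest singular values (what the abstract suggests is learned first during pre-training) and the residual (what it suggests contributes more during fine-tuning).

```python
import numpy as np

def spectral_split(features: np.ndarray, k: int):
    """Split an (n_samples, d) feature matrix into the component built from
    its k largest singular values and the residual from the remaining ones."""
    U, S, Vt = np.linalg.svd(features, full_matrices=False)
    top = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]  # dominant spectral part
    residual = features - top                    # small-singular-value part
    return top, residual

# Toy usage: random stand-in for features before the softmax layer.
rng = np.random.default_rng(0)
F = rng.standard_normal((512, 2048))
top, residual = spectral_split(F, k=50)

frob2 = lambda M: np.linalg.norm(M, "fro") ** 2
print(f"energy in top-50 components: {frob2(top) / frob2(F):.3f}")
```

Tracking how this energy split evolves across pre-training epochs would be one way to probe the observation that large-singular-value components are learned first.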


