Poster

Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization

Hongjing Niu · Hanting Li · Bin Li · Feng Zhao

Strong blind review: This paper was not made available on public preprint services during the review process.
Wed 2 Oct 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Pre-training on large-scale datasets has become a fundamental method for training deep neural networks. Pre-training provides a better set of initial parameters than random initialization, which reduces the cost of training on the target task. In addition, pre-training provides a rich set of feature representations that may improve generalization. However, this potential advantage has not received enough attention and is often destroyed by crude fine-tuning. Based on exploratory experiments, this paper rethinks the fine-tuning process and offers a new perspective for understanding it. Moreover, this paper proposes several plug-and-play fine-tuning strategies as alternatives to simple fine-tuning. These strategies better preserve pre-trained features by letting some neurons idle during fine-tuning, leading to better generalization.
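The abstract does not specify how neuron idling is implemented; one minimal sketch of the general idea is to skip the gradient update for a random subset of output neurons at each fine-tuning step, so those neurons keep their pre-trained weights. The function name, masking scheme, and parameters below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def finetune_step_with_idling(W, grad, lr=0.1, idle_frac=0.5, rng=None):
    """One fine-tuning update that leaves a random subset of output
    neurons 'idle': their weight rows are not updated, so they retain
    the pre-trained features. (Illustrative sketch, not the paper's
    exact strategy.)

    W: (out_dim, in_dim) pre-trained weight matrix.
    grad: gradient of the loss w.r.t. W, same shape as W.
    idle_frac: expected fraction of output neurons idled this step.
    """
    rng = np.random.default_rng(rng)
    idle = rng.random(W.shape[0]) < idle_frac       # True -> neuron idles
    mask = (~idle)[:, None].astype(W.dtype)         # zero gradients of idle rows
    return W - lr * grad * mask, idle

# Idle neurons keep their pre-trained rows exactly; active neurons update.
W0 = np.ones((4, 3))                # stand-in for pre-trained weights
g = np.full((4, 3), 0.5)            # stand-in for a fine-tuning gradient
W1, idle = finetune_step_with_idling(W0, g, lr=0.2, idle_frac=0.5, rng=0)
```

Because the idle mask is resampled each step, every neuron is occasionally updated, but in expectation each one spends part of fine-tuning anchored to its pre-trained value.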
