Unlocking Tuning-free Generalization: Minimizing the PAC-Bayes Bound with Trainable Priors

Abstract

It is widely recognized that the generalization ability of neural networks can be greatly enhanced by carefully tuning the training procedure. The current state-of-the-art recipe combines stochastic gradient descent (SGD) or Adam with additional regularization techniques such as weight decay, dropout, or noise injection. Achieving optimal generalization, however, requires extensive tuning of a multitude of hyper-parameters, which is time-consuming and necessitates an additional validation dataset. To address this issue, we present a nearly tuning-free PAC-Bayes training framework that requires no extra regularization. It achieves test performance comparable to that of SGD/Adam, even when the latter are tuned via a complete grid search and supplemented with additional regularization terms.
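
For context, a classical PAC-Bayes bound of the kind such a framework minimizes is the McAllester-style bound sketched below (a standard form, not necessarily the exact bound used in the paper): for any prior P chosen independently of the n training samples, any posterior Q, and any δ ∈ (0, 1), with probability at least 1 − δ over the draw of the sample S,

\[
L_{\mathcal{D}}(Q) \;\le\; \widehat{L}_{S}(Q) \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]

where \(L_{\mathcal{D}}(Q)\) is the expected loss and \(\widehat{L}_{S}(Q)\) the empirical loss of the randomized predictor drawn from Q. Treating the right-hand side itself as the training objective, with the posterior Q (and, as the title suggests, parameters of the prior P) made trainable, yields a self-contained regularized objective with no separately tuned weight-decay or noise-injection hyper-parameters; note that making the prior depend on the data requires bounds adapted to that setting, which is part of what such a framework must address.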

Xitong Zhang