"Implicitly Regularized Empirical Risk Minimization"
Chou, Hung-Hsu

Empirical risk minimization (ERM) plays a crucial role not only in statistical learning theory but also in practice. For overparameterized models such as neural networks, there are often infinitely many solutions to ERM. Remarkably, the ones obtained from simple gradient methods often generalize well and have certain desirable properties, such as sparsity. Implicit regularization, a possible explanation for this phenomenon, suggests that there is a built-in regularization in gradient descent and ERM. I will present a framework that combines implicit regularization and ERM, denoted IRERM, with an analytic quantification of the solutions and proofs of convergence. I will also show concrete examples where IRERM transforms constrained optimization problems into unconstrained optimization problems.
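
The implicit-regularization phenomenon described above can be illustrated on the simplest overparameterized model. The sketch below (not from the talk; a standard textbook example) runs plain gradient descent on an underdetermined least-squares problem: with zero initialization, the iterates stay in the row space of the data matrix, so gradient descent converges to the minimum-l2-norm interpolating solution even though the objective contains no explicit regularizer. All names and numbers here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                        # fewer samples than parameters: infinitely many ERM solutions
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                     # zero init keeps all iterates in the row space of X
lr = 0.005                          # step size below 2 / lambda_max(X^T X) for this draw
for _ in range(50_000):
    w -= lr * X.T @ (X @ w - y)     # plain gradient step on 0.5 * ||Xw - y||^2

# The minimum-l2-norm interpolator is the pseudoinverse solution.
w_min_norm = np.linalg.pinv(X) @ y
print(np.linalg.norm(w - w_min_norm))  # near zero: GD implicitly selected the min-norm solution
```

Among the infinitely many interpolating solutions, gradient descent "chose" one particular solution determined by its initialization and dynamics; this is the built-in regularization the abstract refers to, and frameworks like IRERM aim to quantify which solution is selected.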