Regularization

Note

  • Regularization controls model complexity and helps against overfitting and multicollinearity
  • It typically raises the error on the training set slightly while lowering the error on the validation set
  • Some methods modify the loss function; others modify the data, e.g. data augmentation

Regularization for shallow models

Methods

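  • Training minimizes the sum of two functions, the loss L(w) and a penalty R(w) on the weights:

    $$J(w) = L(w) + \lambda R(w)$$

  • λ is a hyperparameter that sets the strength of the penalty and has to be tuned experimentally, e.g. on a validation set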
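
The "sum of two functions" can be sketched in a few lines of plain Python. This is only an illustration under assumptions the notes do not state: a linear model without intercept, a squared-error loss, and the hypothetical helper names `squared_error`, `l1_penalty`, `l2_penalty` and `objective`.

```python
def squared_error(w, X, y):
    """Sum of squared residuals of a linear model without intercept."""
    return sum(
        (sum(wj * xj for wj, xj in zip(w, row)) - yi) ** 2
        for row, yi in zip(X, y)
    )

def l1_penalty(w):
    """Sum of absolute weights (the Lasso penalty)."""
    return sum(abs(wj) for wj in w)

def l2_penalty(w):
    """Sum of squared weights (the Ridge penalty)."""
    return sum(wj ** 2 for wj in w)

def objective(w, X, y, lam, penalty):
    """Regularized objective: loss plus lam times the penalty."""
    return squared_error(w, X, y) + lam * penalty(w)

# Toy data, made up for illustration: w fits it perfectly, so the loss is 0
# and the objective consists of the penalty term only.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
w = [1.0, 2.0]
lam = 0.5

print(objective(w, X, y, lam, l1_penalty))  # 1.5
print(objective(w, X, y, lam, l2_penalty))  # 2.5
```

Passing the penalty as a function keeps the loss and the regularizer separate, which mirrors the two terms of the objective.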

Lasso (L1)

  • Lasso (L1): penalizes the absolute values of the coefficients, so some coefficients go exactly to 0
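
Why some coefficients become exactly 0 is easiest to see in a special case. The sketch below assumes an orthonormal design (XᵀX = I) and the objective ½‖y − Xw‖² + λ‖w‖₁; under those assumptions each Lasso coefficient is the soft-thresholded least-squares coefficient. The function name `soft_threshold` and the numbers are made up for illustration.

```python
def soft_threshold(w_ols, lam):
    """Lasso coefficient for one feature, assuming an orthonormal design."""
    if w_ols > lam:
        return w_ols - lam
    if w_ols < -lam:
        return w_ols + lam
    return 0.0  # coefficients with |w_ols| <= lam are set exactly to zero

# Hypothetical least-squares coefficients.
ols_coefs = [3.0, -0.4, 0.2, -2.5]
lasso_coefs = [soft_threshold(c, lam=0.5) for c in ols_coefs]

print(lasso_coefs)  # [2.5, 0.0, 0.0, -2.0]
```

Large coefficients are shrunk by λ, while the small ones are removed entirely, which is the sparsity effect described above.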

Ridge

  • Ridge (L2): penalizes large coefficients, makes the model robust to small changes in the input data, and is differentiable everywhere

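  • With a squared-error loss, the minimization task becomes the sum of the loss function and the squared weights:

    $$J(w) = \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2$$

  • Setting the gradient of J(w) to zero yields the closed-form solution for the weights w:

    $$w = (X^\top X + \lambda I)^{-1} X^\top y$$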
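
A minimal NumPy sketch of the Ridge solution, assuming a squared-error loss and no intercept, in which case the weights solve (XᵀX + λI) w = Xᵀy. The function name `ridge_fit` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge weights: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(A, X.T @ y)

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # a larger lam shrinks the weights toward 0

print("OLS weights:  ", w_ols)
print("ridge weights:", w_ridge)
```

For any λ > 0 the matrix XᵀX + λI is invertible, even when the columns of X are collinear, which is why Ridge helps with multicollinearity.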

ElasticNet
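
ElasticNet combines the Lasso and Ridge penalties in a single objective. A sketch of that objective, where λ₁ and λ₂ are assumed names for the two penalty weights (libraries often parameterize them differently, e.g. as an overall strength and a mixing ratio):

```latex
J(w) = L(w) + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
```

Setting λ₂ = 0 recovers Lasso and setting λ₁ = 0 recovers Ridge.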

Regularization for deep neural networks

Resources