gradient descent
Note
- Data normalization speeds up the convergence of the algorithm because it makes the loss surface we optimize more rounded (less elongated), so gradient steps point more directly toward the minimum (see the sketch below)
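A minimal sketch of why this helps, assuming NumPy and a toy two-feature design matrix (the scales and sample size are made up for illustration): the condition number of $X^\top X$ measures how elongated the quadratic loss surface is, and standardizing the features shrinks it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix with features on very different scales
X = np.column_stack([rng.normal(0, 1, 200),      # feature ~ N(0, 1)
                     rng.normal(0, 100, 200)])   # feature ~ N(0, 100)

def condition_number(A):
    # Ratio of largest to smallest eigenvalue of A^T A, the Hessian of the
    # squared loss; smaller means a "rounder", easier-to-optimize surface.
    eigvals = np.linalg.eigvalsh(A.T @ A)
    return eigvals[-1] / eigvals[0]

print("raw features:         ", condition_number(X))

# Standardize each feature to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print("standardized features:", condition_number(X_std))
```

With these scales the raw ratio comes out orders of magnitude larger than the standardized one, which is what forces a small learning rate and slow convergence on unnormalized data.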
Steps of the algorithm
- Initialize trainable parameters
- Compute the gradient $\nabla_\theta L(\theta)$ of the loss function with respect to the trainable parameters $\theta$
- Update: $\theta \leftarrow \theta - \eta \, \nabla_\theta L(\theta)$, where $\eta$ is the learning rate hyperparameter
- Decide if it is time to stop or continue
- The stopping decision can be based on
    - a limited computational budget: the number of iterations or the amount of time allowed
    - the value of the selected ML metric on the validation set having stabilized and no longer changing much
- If continuing, return to the gradient-computation step (a code sketch of the full loop follows this list)
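Below is a minimal sketch of the loop described above, assuming NumPy; `gradient_descent`, its arguments, and the toy quadratic problem are illustrative names and values, not part of the original note. Stopping combines the iteration budget with a patience check on the validation metric.

```python
import numpy as np

def gradient_descent(loss_grad, validation_metric, theta_init,
                     lr=0.1, max_iters=1_000, tol=1e-4, patience=5):
    theta = np.asarray(theta_init, dtype=float)   # initialize trainable parameters
    best_metric, stale = np.inf, 0

    for _ in range(max_iters):                    # stop on the iteration budget at the latest
        grad = loss_grad(theta)                   # compute the gradient of the loss
        theta = theta - lr * grad                 # update with learning rate lr

        metric = validation_metric(theta)         # validation metric for the stopping decision
        if best_metric - metric < tol:            # metric has stabilized / stopped improving
            stale += 1
            if stale >= patience:
                break
        else:
            best_metric, stale = metric, 0
    return theta

# Toy usage: minimize f(theta) = ||theta - c||^2, using f itself as the "validation metric"
c = np.array([3.0, -2.0])
theta = gradient_descent(loss_grad=lambda t: 2 * (t - c),
                         validation_metric=lambda t: float(np.sum((t - c) ** 2)),
                         theta_init=np.zeros(2))
print(theta)  # close to [3.0, -2.0]
```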