softmax
Note
- normalized exponential function; converts a vector of K real numbers into a probability distribution over K possible outcomes
- used in multi-class classification problems (including next token prediction tasks) as a generalization of logistic regression, see ^027f8a
Formula
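- $\sigma(\mathbf{z})_i = \dfrac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$ for $i = 1, \dots, K$, of input elements $\mathbf{z} = [z_1, \dots, z_K]$
- each output of softmax is strictly positive regardless of the input's sign, and the K outputs sum to 1
- dividing each input $z_i$ by a temperature parameter $T > 0$ before exponentiation, i.e. using $e^{z_i / T}$, controls the entropy of the output distribution: a higher $T$ flattens it toward uniform, a lower $T$ concentrates it on the largest input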
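A minimal sketch of the points above in plain Python, to make the positivity and temperature behaviour concrete. The function names `softmax` and `entropy` and the sample inputs are illustrative assumptions, not part of this note. Subtracting the maximum before exponentiating is a common numerical-stability trick added here; it does not change the result.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of real numbers, with an optional temperature T > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide each input by T before exponentiating.
    scaled = [x / temperature for x in logits]
    # Shift by the max for numerical stability (assumption: not in the note;
    # the shift cancels out in the ratio, so the output is unchanged).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical inputs with mixed signs.
logits = [2.0, -1.0, 0.5]
for T in (0.5, 1.0, 2.0):
    probs = softmax(logits, T)
    print(f"T={T}: probs={[round(p, 3) for p in probs]}, entropy={entropy(probs):.3f}")
```

Running this prints one distribution per temperature; the entropy grows as T increases, while the position of the largest probability stays the same.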
Resources
Links to this File
table file.inlinks, file.outlinks from [[]] and !outgoing([[]]) AND -"Changelog"