softmax

Note

  • normalized exponential function; converts a vector of K real numbers into a probability distribution over K possible outcomes
  • used in multi-class classification problems (including next token prediction tasks) as a generalization of logistic regression, see ^027f8a
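
A short worked case may make the link to logistic regression concrete. It assumes the standard definition of softmax and writes the logistic sigmoid as $\sigma(x) = 1/(1+e^{-x})$; the symbols $z_1, z_2$ are illustrative names for the two inputs. With K = 2:

$$
\operatorname{softmax}(z)_1
= \frac{e^{z_1}}{e^{z_1} + e^{z_2}}
= \frac{1}{1 + e^{-(z_1 - z_2)}}
= \sigma(z_1 - z_2)
$$

So with two classes only the difference $z_1 - z_2$ matters, and softmax reduces to the sigmoid used in binary logistic regression.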

Formula

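  • $\operatorname{softmax}(z)_i = \dfrac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$ for $i = 1, \dots, K$, where $z = [z_1, \dots, z_K]$ is the vector of input elements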
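  • each softmax output is strictly positive regardless of the sign of the input, and the K outputs sum to 1
  • dividing each input z_i by a temperature parameter T > 0 before exponentiating (using $e^{z_i/T}$) controls the entropy of the output distribution: a larger T flattens it toward uniform, a smaller T concentrates it on the largest input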
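
The points above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `softmax` and `entropy`, the `temperature` argument, and the sample inputs are all assumptions made for the example. Subtracting the maximum before exponentiating is a common numerical-stability trick that leaves the result unchanged.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of a list of real numbers, with optional temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide every input by T before exponentiating.
    scaled = [z / temperature for z in logits]
    # Shift by the max so exp() never overflows; the ratios are unaffected.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

if __name__ == "__main__":
    logits = [2.0, -1.0, 0.5]  # illustrative inputs with mixed signs
    for T in (0.5, 1.0, 2.0):
        probs = softmax(logits, T)
        print(T, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

Running it on inputs of mixed sign shows every output staying positive and summing to 1, with the entropy rising as T grows.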

Resources

