GPU characteristics
scroll ↓ to Resources
Note
- Peak FLOPS: the maximum number of floating-point operations a GPU can perform per second when all its compute engines are fully utilized. The value depends on the precision of the model's weights (e.g. 32, 16 or 8 bits).
- GPU memory size: the total amount of memory available on the GPU. It is where an LLM keeps its data while running: the model weights, the activations and, at inference time, the KV cache.
- GPU memory bandwidth: the maximum speed at which data can be transferred (read or written) between the compute engines (e.g. CUDA cores) and the GPU memory.
- Example: the Nvidia A100 peaks at 312 TFLOPS for FP16 (on Tensor Cores) or 19.5 TFLOPS for FP32.
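
The three characteristics above can be combined into quick back-of-the-envelope checks. The sketch below is only illustrative: the `GPU` class and helper names are hypothetical, the peak FLOPS figures are the A100 values from this note, and the 80 GB memory size and 2 TB/s bandwidth are assumed round numbers that should be replaced with the datasheet values of the actual card.

```python
from dataclasses import dataclass

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


@dataclass(frozen=True)
class GPU:
    name: str
    peak_flops: dict              # precision -> peak FLOP/s
    memory_bytes: float           # GPU memory size
    bandwidth_bytes_per_s: float  # GPU memory bandwidth


def weights_bytes(n_params: float, precision: str) -> float:
    """Memory taken by the weights alone (no activations, no KV cache)."""
    return n_params * BYTES_PER_PARAM[precision]


def fits_in_memory(gpu: GPU, n_params: float, precision: str) -> bool:
    """True if the weights alone fit in GPU memory."""
    return weights_bytes(n_params, precision) <= gpu.memory_bytes


def ridge_point(gpu: GPU, precision: str) -> float:
    """FLOPs the GPU can perform per byte moved from memory at peak rates.

    Workloads doing fewer FLOPs per byte than this are limited by
    memory bandwidth rather than by compute.
    """
    return gpu.peak_flops[precision] / gpu.bandwidth_bytes_per_s


a100 = GPU(
    name="A100",
    peak_flops={"fp16": 312e12, "fp32": 19.5e12},  # from the note above
    memory_bytes=80e9,             # ASSUMPTION: 80 GB variant
    bandwidth_bytes_per_s=2.0e12,  # ASSUMPTION: roughly 2 TB/s
)

if __name__ == "__main__":
    for n_params in (7e9, 70e9):
        gb = weights_bytes(n_params, "fp16") / 1e9
        ok = fits_in_memory(a100, n_params, "fp16")
        print(f"{n_params / 1e9:.0f}B params in FP16: {gb:.0f} GB, fits: {ok}")
    print(f"FP16 ridge point: {ridge_point(a100, 'fp16'):.0f} FLOPs/byte")
```

Under these assumptions, a 7B-parameter model in FP16 needs 14 GB for its weights and fits on one card, while a 70B-parameter model needs 140 GB and does not. The ridge point shows how many operations per byte a workload must perform before peak FLOPS, rather than memory bandwidth, becomes the limit.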
Resources