Ok, props to Intel: with launching #sapphirerapids, they're also launching/releasing the specific *per port* throughput and latency per instruction.
If you want to wring out every last clock cycle, this is the way to do it.
#sapphirerapids #hpc #xeon #avx512 #avx512fp16