you can write pretty performant code based on your textbook knowledge and accumulated wisdom...but writing high-performance code has always, and will i think always, be an empirical process.
you *have* to measure
you *have* to experiment
you *have* to tinker
you *have* to remember making sense isn't a prerequisite
compilers are weird. hardware is fallible. developers are myopic.
#coding #gpulife #hpc #scientificcomputing
fp8 is here for #machinelearning use, with a collab spec from nvidia, arm, and intel.
Maybe now I can push for fp16 in more of the cisTEM codebase since it isn't the new kid on the block anymore : )
#machinelearning #gpulife #cryoem