Computer scientists have discovered a new way to multiply large matrices faster than ever before by eliminating a previously unknown inefficiency, reports Quanta Magazine. This could eventually ...
If you start with a naive implementation, without a library that already works around it, memory access is the bottleneck, not the arithmetic. Have a look at how much effort it takes to avoid that, for example with blocking (tiling) algorithms.
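To make the blocking idea concrete, here is a minimal sketch of a naive triple loop next to a tiled version. The function names and the default block size of 32 are my own assumptions for illustration, not taken from any particular library. In pure Python this will not show a real speedup; it only illustrates the loop structure that a compiled implementation would use to keep each tile in cache.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                # b is walked down a column here, which in a row-major
                # layout means a strided, cache-unfriendly access pattern.
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, block=32):
    """Same product, computed tile by tile (block size is an assumed value)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer three loops pick a tile; inner three loops work inside it.
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + block, m)):
                        aik = row_a[k]
                        row_b = b[k]
                        # Innermost loop runs along rows of b and c,
                        # so accesses are contiguous within the tile.
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

Both functions return the same result; the tiled one just reorders the work so that a small piece of each matrix is reused many times before moving on. Real libraries go much further than this (packing, vectorization, multiple levels of tiling), which is the effort the comment is pointing at.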