I work in this space... and let's just say that MKL is definitely NOT well optimized for AMD's chips. You'll be lucky to get 10-20% efficiency. Never mind OpenBLAS.
Is it anything like the way their compiler detected SSEn in a way that guaranteed it wouldn't use those instructions on AMD processors even if they supported them?
Of course. It's very much intentional, and "not optimized for AMD" is putting it very, very mildly. They don't need to optimize purely for stepping level; they could provide sane code paths for when the CPU flags indicate certain features.
Yes, but it wasn't a simple "if not AMD" check. It was a series of checks based on specific families of Intel CPUs, such as Haswell, Sandy Bridge, etc. So it was never actually querying whether the CPU supported instruction X; it was asking what family the CPU belonged to and then applying static rules based on that. Maybe that's a nuance, but it also has the potential to hurt their own processors if the rules aren't kept up to date, so maybe less malice and more convenience?
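The two dispatch styles being argued about here (matching on known family/model numbers versus querying feature flags) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Cpu` record, the path names, and the `KNOWN_MODELS` table are hypothetical and are not Intel's actual dispatcher logic.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cpu:
    """Hypothetical stand-in for what CPUID reports about a processor."""
    vendor: str
    family: int
    model: int
    features: frozenset = field(default_factory=frozenset)


# Hypothetical allow-list keyed on (family, model). The two entries are
# illustrative; a real table would be much longer.
KNOWN_MODELS = {
    (6, 0x2A): "avx",   # a Sandy Bridge model number
    (6, 0x3C): "avx2",  # a Haswell model number
}


def dispatch_by_model(cpu: Cpu) -> str:
    """Static rules: anything not in the table gets the slow baseline path."""
    return KNOWN_MODELS.get((cpu.family, cpu.model), "baseline")


def dispatch_by_features(cpu: Cpu) -> str:
    """Feature-based rules: pick the best path the CPU says it supports."""
    for path in ("avx2", "avx", "sse2"):
        if path in cpu.features:
            return path
    return "baseline"


ALL = frozenset({"sse2", "avx", "avx2"})
haswell = Cpu("GenuineIntel", 6, 0x3C, ALL)
zen = Cpu("AuthenticAMD", 0x17, 0x01, ALL)        # supports AVX2, unknown model
future_intel = Cpu("GenuineIntel", 7, 0x01, ALL)  # made-up non-6 family
```

With these definitions, `zen` and `future_intel` both fall through to `"baseline"` under `dispatch_by_model` even though they advertise AVX2, while `dispatch_by_features` gives all three CPUs the `"avx2"` path. Note that the table lookup excludes AMD without ever looking at the vendor string, which is the point being made above.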
They've been explicit about their motivations in this regard (claiming innocence). Then they backtracked when convenient (surprise!), but in a way that still broke AMD processors. See here: https://www.agner.org/optimize/blog/read.php?i=49#49
By the way, it's interesting to note that Intel has a disclaimer on every MKL documentation page about this; my speculation: this was required by terms of a settlement.
From the above link:
>The Intel CPU dispatcher does not only check the vendor ID string and the instruction sets supported. It also checks for specific processor models. In fact, it will fail to recognize future Intel processors with a family number different from 6. When I mentioned this to the Intel engineers they replied:
> > You mentioned we will not support future Intel processors with non-'6' family designations without a compiler update. Yes, that is correct and intentional. Our compiler produces code which we have high confidence will continue to run in the future. This has the effect of not assuming anything about future Intel or AMD or other processors. You have noted we could be more aggressive. We believe that would not be wise for our customers, who want a level of security that their code (built with our compiler) will continue to run far into the future. Your suggested methods, while they may sound reasonable, are not conservative enough for our highly optimizing compiler. Our experience steers us to issue code conservatively, and update the compiler when we have had a chance to verify functionality with new Intel and new AMD processors. That means there is a lag sometime in our production release support for new processors.
> In other words, they claim that they are optimizing for specific processor models rather than for specific instruction sets. If true, this gives Intel an argument for not supporting AMD processors properly. But it also means that all software developers who use an Intel compiler have to recompile their code and distribute new versions to their customers every time a new Intel processor appears on the market. Now, this was three years ago. What happens if I try to run a program compiled with an old version of Intel's compiler on the newest Intel processors? You guessed it: It still runs the optimal code path. But the reason is more difficult to guess: Intel have manipulated the CPUID family numbers on new processors in such a way that they appear as known models to older Intel software. I have described the technical details elsewhere.
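For anyone wondering what "family number" means in the quote: it is decoded from the processor signature that CPUID leaf 1 returns in EAX. Below is a minimal Python sketch of that decoding, following the rule given in Intel's manuals (the extended family is added only when the base family is 0xF, and the extended model is used only when the base family is 6 or 0xF). The function name `decode_signature` is made up, and the two sample signatures are commonly reported values used here purely as inputs.

```python
def decode_signature(eax: int) -> tuple:
    """Decode a CPUID leaf-1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family only counts when the base family is saturated at 0xF.
    if base_family == 0xF:
        family = base_family + ext_family
    else:
        family = base_family

    # The extended model supplies the high nibble for families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return family, model, stepping


# Sample inputs (assumed, not measured here): a Haswell-era Intel
# signature and a first-generation Zen signature.
intel_sig = decode_signature(0x000306C3)
amd_sig = decode_signature(0x00800F11)
```

The first sample decodes to family 6, model 0x3C; the second to family 0x17, model 0x01. A dispatcher that only recognizes family 6 would skip the second CPU regardless of which instruction sets it supports.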
I feel like at this point, if you use an Intel library or compiler, you should know it's Intel-only. If you aren't using it in a controlled environment, stick to clang/gcc.
I can’t really blame them. Why support your competitor?
https://github.com/flame/blis
Both are well optimized for AMD CPUs.