More Details Uncovered On AMD’s ZEN Cores
Our well-informed industry sources have shared a few more details about AMD’s 2016 Zen cores, and it now appears that the architecture won’t use a shared FPU as Bulldozer did.
The new Zen uses simultaneous multithreading (SMT), much like Intel’s Hyper-Threading, so each core can process two threads at once. AMD has told a select few that it is dropping the “core pair” approach that was the foundation of Bulldozer. This means that there will no longer be a shared FPU.
Zen will use a scheduling model similar to Intel’s, and AMD will use competitive hardware and simulation to define any needed scheduling or NUMA changes.
Two cores will still share the L3 cache but not the FPU. This is because at 14nm there is enough space for an FPU inside each Zen core, and this approach might be faster.
We mentioned this in late April, when we released a few details about the 16-core, 32-thread Zen-based processor with a Greenland-based graphics stream processor.
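The 16-core, 32-thread figure follows directly from two-way SMT. As a minimal sketch (the function name and its default are ours, purely for illustration), the thread count works out like this:

```python
def logical_threads(physical_cores: int, threads_per_core: int = 2) -> int:
    """Return the number of hardware threads a chip exposes.

    threads_per_core=2 reflects the two-way SMT described above;
    the helper itself is a hypothetical illustration, not an AMD tool.
    """
    if physical_cores < 0 or threads_per_core < 1:
        raise ValueError("core and thread counts must be positive")
    return physical_cores * threads_per_core


if __name__ == "__main__":
    # The 16-core part mentioned in the article.
    print(logical_threads(16))
```

With a Bulldozer-style "core pair" the marketing core count and the number of FPUs diverged; under SMT each physical core simply shows up as two logical threads.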
Zen will apparently be ISA compatible with Haswell/Broadwell-style compute, so existing software will run without requiring any programming changes.
Zen also focuses on various compiler optimisations, including GCC, with a target of a SPECint v6 based score at common compiler settings, and Microsoft Visual Studio, with a target of parity of supported ISA features with Intel.
For the benchmarking and performance compiler, LLVM, the target is a SPECint v6 rate score at performance compiler settings.
We cannot predict any instructions per clock (IPC) improvement over Intel Skylake, but it helps that Intel will replace Skylake with another 14nm processor in the later part of 2016. If Zen makes it to market in 2016, AMD might have a fighting chance to narrow the performance gap with Intel’s greatest offerings.
Courtesy-Fud
AMD Coherent Data Reaches 100GB/s
After a lot of asking around, we can give you some actual numbers about AMD’s coherent fabric.
The interconnect technology already sounded very promising, but now we have actual numbers. The HSA (Heterogeneous System Architecture) MCM (Multi Chip Module) that AMD is working on can deliver almost seven times the bandwidth of the traditional PCIe interface.
Our industry sources have confirmed that with 4 GMI (Global Memory Interconnect) links, AMD’s CPU and GPU can talk at 100GB/s. The traditional PCIe x16 provides 15GB/s at about 500 ns latency. The Data Fabric eliminates PCIe latency too.
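To put those figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 100GB/s is split evenly across the four GMI links, which our sources did not state, and all function and constant names are ours:

```python
# Figures quoted in the article.
GMI_LINKS = 4
FABRIC_GB_PER_S = 100.0
PCIE_X16_GB_PER_S = 15.0
PCIE_LATENCY_NS = 500.0


def per_link_bandwidth(total_gb_per_s: float, links: int) -> float:
    """Bandwidth per link, ASSUMING an even split across links."""
    return total_gb_per_s / links


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """How many times faster the first interface is than the second."""
    return fast_gb_per_s / slow_gb_per_s


def transfer_time_ms(size_gb: float, gb_per_s: float, latency_ns: float = 0.0) -> float:
    """Idealised time to move size_gb: one fixed latency plus size / bandwidth."""
    return size_gb / gb_per_s * 1e3 + latency_ns * 1e-6


if __name__ == "__main__":
    print(f"per GMI link: {per_link_bandwidth(FABRIC_GB_PER_S, GMI_LINKS):.1f} GB/s")
    print(f"fabric vs PCIe x16: {speedup(FABRIC_GB_PER_S, PCIE_X16_GB_PER_S):.2f}x")
    print(f"1 GB over PCIe x16: {transfer_time_ms(1, PCIE_X16_GB_PER_S, PCIE_LATENCY_NS):.2f} ms")
    print(f"1 GB over fabric:   {transfer_time_ms(1, FABRIC_GB_PER_S):.2f} ms")
```

The ratio comes out at roughly 6.7, which is where the "almost seven times" claim comes from. For large transfers the 500 ns latency barely registers; it matters for the small, frequent exchanges that HSA-style workloads tend to make.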
AMD will be using this technology in its next-gen Multi Chip Module, which packs a Zeppelin CPU (most likely with a bunch of Zen cores) and a Greenland GPU that of course comes with super-fast HBM (High Bandwidth Memory). Greenland and HBM can communicate at 500GB/s and can provide a highest-performance GPU with 4+ teraflops.
This new MCM-based chip will also talk to DDR4-3200 memory at 100GB/s, making it quite attractive for HSA computation-oriented customers.
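As a sanity check on the DDR4 figure, here is a small sketch of the peak-bandwidth arithmetic. It assumes a standard 64-bit DDR4 channel; the resulting channel count is our own inference and not something our sources confirmed:

```python
import math


def ddr_channel_bandwidth_gb_per_s(mega_transfers_per_s: float, bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth of one DDR channel in GB/s.

    bus_width_bits=64 is the standard DDR4 channel width (an assumption
    about this chip, not a detail from our sources).
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


def channels_needed(target_gb_per_s: float, per_channel_gb_per_s: float) -> int:
    """Smallest whole number of channels whose combined peak meets the target."""
    return math.ceil(target_gb_per_s / per_channel_gb_per_s)


if __name__ == "__main__":
    per_channel = ddr_channel_bandwidth_gb_per_s(3200)
    print(f"DDR4-3200, one channel: {per_channel:.1f} GB/s")
    print(f"channels for 100 GB/s:  {channels_needed(100, per_channel)}")
```

One DDR4-3200 channel peaks at 25.6GB/s, so the quoted 100GB/s would be consistent with a four-channel memory controller (102.4GB/s theoretical peak).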