AMD, Intel & Nvidia Go OpenGL
AMD, Intel and Nvidia teamed up to tout the advantages of the OpenGL multi-platform application programming interface (API) at this year’s Game Developers Conference (GDC).
Sharing a stage at the event in San Francisco, the three major chip designers explained how, with a little tuning, OpenGL can offer developers between seven and 15 times better performance as opposed to the more widely recognised increases of 1.3 times.
AMD manager of software development Graham Sellers, Intel graphics software engineer Tim Foley and Nvidia OpenGL engineer Cass Everitt and senior software engineer John McDonald presented their OpenGL techniques on real-world devices to demonstrate how these techniques are suitable for use across multiple platforms.
During the presentation, Intel’s Foley talked up three techniques that can help OpenGL increase performance and reduce driver overhead: persistent-mapped buffers for faster streaming of dynamic geometry, MultiDrawIndirect (MDI) for faster submission of many draw calls, and packing 2D textures into arrays so that texture changes no longer break batches.
They also mentioned during their presentation that with proper implementations of these high-level OpenGL techniques, driver overhead could be reduced to almost zero. This is something that Nvidia’s software engineers have already claimed is impossible with Direct3D and only possible with OpenGL.
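To make the multi-draw-indirect and texture-array ideas described above more concrete, here is a minimal Python sketch, not taken from the talk. It only models the data side: it counts how many batches a naive renderer would need when every texture change forces a new submission, and it packs the same draws into a single indirect command buffer using the five 32-bit fields of OpenGL's DrawElementsIndirectCommand layout. The draw list, the function names and the choice to carry the texture-array layer in the baseInstance field are illustrative assumptions; no OpenGL calls are made.

```python
import struct
from itertools import groupby

# One DrawElementsIndirectCommand as laid out by OpenGL:
# count, instanceCount, firstIndex (uint32), baseVertex (int32), baseInstance (uint32).
INDIRECT_COMMAND = struct.Struct("<IIIiI")


def count_batches(draws):
    """Count submissions when each texture change forces a new batch."""
    return sum(1 for _ in groupby(draws, key=lambda d: d["texture"]))


def pack_indirect_buffer(draws):
    """Pack every draw into one buffer for a single multi-draw submission.

    Assumption for illustration: baseInstance carries the texture-array
    layer, so switching textures no longer splits the batch.
    """
    return b"".join(
        INDIRECT_COMMAND.pack(
            d["index_count"], 1, d["first_index"], d["base_vertex"], d["texture"]
        )
        for d in draws
    )


# Hypothetical scene: three meshes alternating between two textures.
draws = [
    {"index_count": 36, "first_index": 0, "base_vertex": 0, "texture": 0},
    {"index_count": 36, "first_index": 36, "base_vertex": 24, "texture": 1},
    {"index_count": 36, "first_index": 72, "base_vertex": 48, "texture": 0},
]

naive_batches = count_batches(draws)
indirect_buffer = pack_indirect_buffer(draws)
```

In this hypothetical scene the naive path needs three separate batches, whereas the packed buffer holds all three commands and could, in a real renderer, be handed to one multi-draw call.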
Nvidia’s VP of game content and technology, Ashu Rege, blogged his account of the GDC joint session on the Nvidia blog.
“The techniques presented apply to all major vendors and are suitable for use across multiple platforms,” Rege wrote.
“OpenGL can cut through the driver overhead that has been a frustrating reality for game developers since the beginning of the PC game industry. On desktop systems, driver overhead can decrease frame rate. On mobile devices, however, driver overhead is even more insidious, robbing both battery life and frame rate.”
The talk was entitled Approaching Zero Driver Overhead.
At the Game Developers Conference (GDC), Microsoft also unveiled the latest version of its graphics API, DirectX 12, with Direct3D 12 for more efficient gaming.
Showing off the new DirectX 12 API during a demo of Xbox One racing game Forza 5 running on a PC with an Nvidia GeForce Titan Black graphics card, Microsoft said DirectX 12 gives applications the ability to directly manage resources and perform synchronisation. As a result, developers of advanced applications can control the GPU to develop games that run more efficiently.
Is AMD Worried?
AMD’s Mantle has been a hot topic for quite some time and, despite its delayed birth, it has finally delivered performance in Battlefield 4. Microsoft is not sleeping, though: it has its own answer to Mantle, which we mentioned previously.
Oddly enough, we heard some industry people calling it DirectX 12 or DirectX Next, but it looks like Microsoft is getting ready to finally update DirectX. From what we heard, the next generation DirectX will fix some of the driver overhead problems that were addressed by Mantle, which is a good thing for the whole industry and of course gamers.
AMD got back to us, officially stating that “AMD would like you to know that it supports and celebrates a direction for game development that is aligned with AMD’s vision of lower-level, ‘closer to the metal’ graphics APIs for PC gaming. While industry experts expect this to take some time, developers can immediately leverage efficient API design using Mantle.”
AMD also told us that we can expect some information about this at the Game Developers Conference that starts on March 17th, or in less than two weeks from now.
We have a feeling that Microsoft is finally ready to talk about DirectX Next, DirectX 11.X, DirectX 12 or whatever they end up calling it, and we would not be surprised to see Nvidia’s 20nm Maxwell chips support this API, as well as future GPUs from AMD, possibly again 20nm parts.
AMD’s Richland Shows Up
Kaveri is coming in a few months, but before it ships AMD will apparently spice up the Richland line-up with a few low-power parts.
CPU World has come across an interesting listing, which points to two new 45W chips, the A8-6500T and the A10-6700T. Both are quads with 4MB of cache. The A8-6500T is clocked at 2.1GHz and can hit 3.1GHz on Turbo, while the A10-6700T’s base clock is 2.5GHz and it maxes out at 3.5GHz.
The prices are $108 and $155 for the A8 and A10 respectively, which doesn’t sound too bad although they are still significantly pricier than regular FM2 parts.
AMD’s Kaveri Coming In Q4
AMD really needs to make up its mind and figure out how it interprets its own roadmaps. A few weeks ago the company said desktop Kaveri parts should hit the channel in mid-February 2014. The original plan called for a launch in late 2013, but AMD insists the chip was not delayed.
Now though, it told Computerbase.de that the first desktop chips will indeed appear in late 2013 rather than 2014, while mobile chips will be showcased at CES 2014 and they will launch in late Q1 or early Q2 2014.
As we reported earlier, the first FM2+ boards are already showing up on the market, but at this point it’s hard to say when Kaveri desktop APUs will actually be available. The most logical explanation is that they will be announced sometime in Q4, with retail availability coming some two months later.
Kaveri is a much bigger deal than Richland, which was basically Trinity done right. Kaveri is based on new Steamroller cores, it packs GCN graphics and it’s a 28nm part. It is expected to deliver a significant IPC boost over Piledriver-based chips, but we don’t have any exact numbers to report.
ARM & Oracle Optimize Java
ARM’s upcoming ARMv8 architecture will form the basis for several processors that will end up in servers. Now the firm has announced that it will work with Oracle to optimise Java SE for the architecture to squeeze out as much performance as possible.
ARM’s chip licensees are looking to the 64-bit ARMv8 architecture to make a splash in the low-power server market and go up against Intel’s Atom processors. However, unlike Intel, which can make use of software already optimised for x86, ARM and its vendors need to work with software firms to ensure that the new architecture will be supported at launch.
Oracle’s Java is a vital piece of software that is used by enterprise firms to run back-end systems, so poor performance from the Java virtual machine could be a serious problem for ARM and its licensees. To prevent that, ARM said it will work with Oracle to improve performance, boot-up performance and power efficiency, and optimize libraries.
Henrik Stahl, VP of Java Product Management at Oracle said, “The long-standing relationship between ARM and Oracle has enabled our mutual technologies to be deployed across a broad spectrum of products and applications.
“By working closely with ARM to enhance the JVM, adding support for 64-bit ARM technology and optimizing other aspects of the Java SE product for the ARM architecture, enterprise and embedded customers can reap the benefits of high-performance, energy-efficient platforms based on ARM technology.”
A number of ARM vendors including x86 stalwart AMD are expected to bring out 64-bit ARMv8 processors in 2014, though it is thought that Applied Micro will be the first to market with an ARMv8 processor chip later this year.
AMC Goes To The Clouds
Applied Micro Circuits has released its cloud chip which takes networking and computing and crams it all onto one SoC.
The X-Gene server on a chip is being billed as the first 64-bit-capable ARM-based server in existence. According to the company, it is the first chip to contain a software-defined network (SDN) controller on the die that will offer network services such as load balancing and ensuring service-level agreements on the chip.
Paramesh Gopi, president and CEO of Applied Micro, said that these new chips have now made it past the prototype stage and are being used by Dell and Red Hat. Gopi expects physical servers containing the X-Gene to hit the market by the end of this year.
The chip is manufactured at 40 nanometers and has eight 2.4 GHz ARM cores, four smaller ARM Cortex A5 cores running the SDN controller software, four 10-gigabit Ethernet ports, and various ports that can support more Ethernet, SSDs, SATA drives or accelerator cards such as those from Fusion-io.
The company claims that the cost of ownership, which includes power requirements, is about half that of a comparable x86 product, but it wouldn’t discuss actual power consumption.
ARM Goes High-End
Nvidia is itself an ARM chip licensee that has seen significant design wins with its Tegra 3 system-on-chip (SoC) processor. However, the firm doesn’t see ARM-based servers being able to do heavy lifting in server tasks for two years. Sumit Gupta, GM of Nvidia’s Tesla Accelerated Computing business unit, said that even with GPGPUs, ARM-based servers are not yet able to provide the computing power needed to drive high performance servers.
Gupta said, “Performance of these ARM cores is still not where it needs to be for servers. It is getting there; the new ARM64 [processor] is going to get it part of the way.” However, he did say that eventually ARM SoCs could hit X86-like performance levels. “One day I think ARM will at least get to similar performance levels as X86 performance. The belief is that over the next one or two years these ARM SoCs will be good enough for cloud applications and web serving. I think it will take some more time to be good enough for accelerated computing.”
As for Nvidia using its Tegra chips to push work to the firm’s GPGPUs, a scenario that would make the firm’s accountants very happy, Gupta said he was surprised at the level of interest from developers and questioned the need for powerful CPUs. “We did a small development kit called Karma that has a Tegra 3 and a Nvidia GPU, [and] I was shocked by the number of those kits that have been sold. The interest in this ARM plus GPU is far larger than even I expected. If the GPU can do dynamic parallelism, it becomes more independent, then how powerful CPUs do you need? I believe the first thing that will happen is that people will start using lower performing [Intel] Xeons […] then at some point when these Atom based processors become available they might use that, and when ARM64 is available they’ll use that.”
ARM Seeing Growth
ARM and Vivante have achieved significant market share gains in the system-on-chip (SoC) GPU market while Imagination and Qualcomm have seen their market shares fall.
ARM has been aggressively pushing its Mali GPU design for the last two years, while Vivante has ridden the surge in Chinese tablet sales, and these factors have resulted in both firms increasing market shares. Analyst outfit Jon Peddie Research claimed that ARM and Vivante scored first half 2012 SoC GPU market shares of 12.9 percent and 9.8 percent, respectively, while the SoC GPU market share leaders Imagination and Qualcomm both suffered declines.
ARM more than doubled its market share from the same period a year ago while Vivante went even better by almost quadrupling its market share. Not only were both firms claiming large pieces of the pie, Jon Peddie Research claimed the SoC GPU market had increased by 91.3 percent, suggesting that Qualcomm and Imagination are having a harder time getting new business. Jon Peddie told The INQUIRER that new vendors are entering the market, typically with lower prices to earn customers.
Nvidia’s SoC GPU operations accounted for 2.5 percent of the total smartphone and tablet market, which given that the firm doesn’t license out its GPU designs is pretty impressive. Nvidia could see its market share increase if Microsoft’s Surface tablet sells well.
Will ARM Get OpenCL Certification?
ARM has submitted its Mali-T604 GPU for OpenCL certification.
ARM’s Mali GPUs have so far shied away from GPGPU support. However, as smartphones and tablets are now expected to see an ever growing number of processor cores, the cries for OpenCL support in its GPUs have been growing louder. Now ARM has submitted its Mali-T604 GPU to the Khronos consortium for full profile OpenCL certification.
The Khronos consortium oversees the development of OpenCL and the high-level language is supported by a number of firms including AMD, Nvidia and Intel on their latest GPUs. However, until now there hasn’t been an OpenCL certified GPU that is used in smartphones, though firms such as Zii Labs also boast OpenCL support for their chips.
ARM said, “Building on a scalable multicore, multi-pipeline architecture design, the Mali-T600 Series GPU includes a number of advanced features. In particular, native scalar and vector operations for OpenCL’s integer and floating point data types (including 64-bit); support for static and dynamic compilation; hardware accelerated image and sampler data types; fast atomic operations and compliance to IEEE754-2008 precision requirements.”
ARM Profits On The Rise
ARM has reported good second quarter financial results, with profit rising by 23 per cent to $102.97 million.
ARM has been riding high in the public consciousness thanks to firms such as Qualcomm, Texas Instruments and Nvidia pushing its chip architecture into smartphones and tablets. The firm announced it managed to take in $209.78 million in revenue during the second quarter, a 15 per cent increase from the same period last year, while net income rose even faster by 23 per cent to $102.97 million.
ARM said two billion chips using the firm’s various design models were shipped during the quarter, which represented a nine per cent increase from last year. The firm revealed that its core money making operation, processor royalties, rose by 14 per cent.
Warren East, CEO of ARM said, “ARM’s royalty revenues continued to outperform the overall semiconductor industry as our customers gained market share within existing markets and launched products which are taking ARM technology into new markets.
“This quarter we have seen multiple market leaders announce exciting new products including computers and servers from Dell and Microsoft, and embedded applications from Freescale and Toshiba. In addition, ARM and TSMC announced a partnership to optimise next generation ARM processors and physical IP and TSMC’s FinFET process technology.”