SEATTLE — There is an almost obsessive focus at the supercomputing conference here on reaching exascale computing within this decade, a level of computing power roughly 1,000 times greater than anything running today.
In the lives of most people, something that is eight or nine years off may seem like a long time, but at SC11, it feels as if it is just around the corner. Part of the push is coming from the U.S. Department of Energy, which will fund these massive systems. The DOE told the industry this summer that it wants an exascale system delivered in the 2019-2020 timeframe that won’t use more than 20 MW of power. The government has been seeking proposals about how to achieve this.
To put 20 MW of power in perspective, consider the supercomputer that IBM Corp. is building for the DOE’s Lawrence Livermore National Laboratory. This system will be capable of speeds of 20 petaflops. It will be one of the largest supercomputers in the world as well as one of the most energy efficient. But when it is completely turned on next year, it will still use somewhere in the range of 7 to 8 MW of power, according to IBM. An exascale system has the compute power of 1,000 petaflops. (A petaflop is a quadrillion floating-point operations per second.)
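The comparison above can be worked through as a rough calculation. The sketch below uses only the figures quoted in this article (20 petaflops at 7 to 8 MW for the Livermore system, 1,000 petaflops at 20 MW for the DOE target); the function and constant names are illustrative, and the result is a back-of-the-envelope ratio against this one system, not an official DOE or IBM metric.

```python
# Figures quoted in the article; the 7-8 MW range is IBM's estimate.
SEQUOIA_PETAFLOPS = 20
SEQUOIA_MW_RANGE = (7, 8)
EXASCALE_PETAFLOPS = 1000
EXASCALE_MW_CAP = 20


def gigaflops_per_watt(petaflops, megawatts):
    """Energy efficiency in gigaflops per watt.

    1 petaflop = 1e15 flops and 1 MW = 1e6 W, so petaflops / MW
    equals 1e9 flops per watt, i.e. gigaflops per watt.
    """
    return petaflops / megawatts


def efficiency_summary():
    """Return the exascale target efficiency and, for each end of the
    quoted power range, (MW, current efficiency, improvement needed)."""
    target = gigaflops_per_watt(EXASCALE_PETAFLOPS, EXASCALE_MW_CAP)
    rows = []
    for mw in SEQUOIA_MW_RANGE:
        current = gigaflops_per_watt(SEQUOIA_PETAFLOPS, mw)
        rows.append((mw, current, target / current))
    return target, rows


if __name__ == "__main__":
    target, rows = efficiency_summary()
    print(f"Exascale target: {target:.1f} gigaflops per watt")
    for mw, current, factor in rows:
        print(f"At {mw} MW: {current:.2f} gigaflops per watt, "
              f"{factor:.1f}x improvement needed")
```

On these numbers, the 20 MW target works out to 50 gigaflops per watt, which is 17.5 to 20 times the efficiency of the 20-petaflop system, depending on where in the 7 to 8 MW range it lands.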
“We’re in a power constrained world now,” said Steve Scott, the CTO of Nvidia Corp.’s Tesla business, “where the performance we can get on a chip is constrained not by the number of transistors we can put on a chip, but rather by the power.”
Scott sees x86 processing as limited by its overhead processes. GPU processors, in contrast, provide throughput with very little overhead, and with less energy per operation.
Nvidia has been building HPC (high-performance computing) systems that pair its GPUs with CPUs, often from Advanced Micro Devices. This hybrid approach is also moving toward ARM processors, widely used in cell phones, which may lead to an integrated GPU and ARM hybrid processor.
Scott believes the DOE’s 20 MW goal can be achieved by 2022. But if the government’s exascale program comes through with funding, it may enable Nvidia to be more aggressive in circuit and architectural techniques, making it possible to achieve that power level goal by 2019.
Scott said reaching that level of efficiency will require improving power usage by 50 times.
While 20 MW may seem like a lot of power, Scott points out that there are cloud computing facilities that require as much as 100 MW of power.
Rajeeb Hazra, general manager of Technical Computing at Intel Corp., said his company plans to meet the exascale, 20 MW goal by 2018, one year ahead of the U.S. government’s expectation. He made this remark while unveiling Knights Corner, the company’s new 50-core processor that’s capable of one teraflop of sustained performance.
While the hardware makers deal with power and performance issues, exascale, like petaflop computing, is presenting HPC users with challenges in scaling codes to fully use these systems.
Before reaching exascale, vendors will produce systems that can scale into the hundreds of petaflops. IBM, for instance, says its new Blue Gene/Q system will be capable of 100 petaflops.
Kim Cupps, the computing division leader and Sequoia project manager at Lawrence Livermore, will be happy with 20 petaflops.
“We’re thrilled to have this machine so close to our grasp,” said Cupps, of her 20 petaflop system. “We are going to solve many problems of national importance, ranging from materials modeling, weapons science, climate change and energy modeling.”
Of IBM’s claim that its system can scale to 100 petaflops, “that’s IBM saying that,” said Cupps, “I’ll vouch for 20.”