One of technology’s most interesting battles is occurring among server chip makers like IBM Corp., AMD Inc. and Intel Corp. These vendors continually look to one-up each other, pushing out eight-, 12- and 16-core processors and bringing unprecedented levels of high-performance computing (HPC) power to enterprise IT shops.
But while the focus is often on the number of cores these new chips offer, enterprises must also pay attention to the servers, systems and applications used to take advantage of this new processing power.
Neil Bunn, a technology architect for deep computing at IBM Canada Ltd., said that one of the challenges with technology in the last couple of years has been the dramatic increase in the number of processor cores available to applications. The software industry, he said, has lagged in developing apps to correctly parallelize workloads to take advantage of multiple cores.
“Being able to run a single application or a single job against an extremely large system is definitely a very large issue in HPC today,” Bunn said. “In fact, it’s an issue where there are a lot of perspectives on how we’re going to solve it, but no clearly defined path on which one is going to win out.”
Carl Claunch, a vice-president and analyst covering servers and storage for Gartner Research Inc., agreed, adding that the problem will only get more dramatic as computing cores and clusters increase.
“It’s kind of easy to hack at a program and say, ‘I can take this part that’s doing something and this other part that’s doing something else and separate them into two pieces,’” he said. “But when you’re trying to do that over 64 or 128 cores, the bigger the number gets, the harder it is to divide evenly.”
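Claunch’s point about dividing work evenly can be illustrated with a little arithmetic. The Python sketch below is only an illustration, not anything Gartner published: it assumes a job made of equal-cost tasks that cannot be split further, and the function names `chunk_sizes` and `efficiency` are hypothetical.

```python
def chunk_sizes(n_tasks, n_cores):
    """Split n_tasks indivisible, equal-cost tasks as evenly as possible."""
    base, extra = divmod(n_tasks, n_cores)
    # The first `extra` cores each take one additional task.
    return [base + 1 if i < extra else base for i in range(n_cores)]


def efficiency(n_tasks, n_cores):
    """Fraction of total core time spent on useful work.

    The job finishes only when the busiest core finishes, so every
    other core sits idle for the difference.
    """
    busiest = max(chunk_sizes(n_tasks, n_cores))
    if busiest == 0:
        return 0.0  # no work at all
    return n_tasks / (n_cores * busiest)


if __name__ == "__main__":
    # Assumed workload: 100 equal tasks spread over more and more cores.
    for cores in (2, 8, 64, 128):
        print(cores, "cores:", round(efficiency(100, cores), 3))
```

Under these assumptions, 100 tasks keep two cores fully busy, but on 64 cores some cores get two tasks and others one, so the machine is doing useful work only about 78 per cent of the time.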
Claunch added that even though almost every major tech vendor has pumped serious resources into developing a general solution for this problem, no quick fix yet exists.
“Even at a practical level for your readers, a lot of the programmers IT managers have don’t really know how to write parallel code,” he said. “It’s harder. It requires different skill sets. And it requires different tools to debug the programs.”
One long-standing technique used for computer clusters is the Message Passing Interface (MPI), said Roch Archambault, a software developer for IBM’s Rational Group. MPI is a specification for a message-passing library, callable from languages such as C, C++ and Fortran, that lets programs running on many separate computer systems exchange data with each other.
“People have been using this method for the last 20 or 30 years, but people are starting to resent it because it’s not easy to program,” Archambault said.
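To give a rough sense of why the message-passing style takes effort, here is a minimal Python sketch of the pattern. It is not MPI and does not use any MPI library: threads inside one process stand in for separate machines, standard-library queues stand in for the network, and the names `worker` and `sum_of_squares` are hypothetical. The point is only that the programmer must spell out who sends what to whom.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, reduce it, and send the result back.

    The worker sees only the data that arrives as a message.
    """
    chunk = inbox.get()                            # blocking "receive"
    outbox.put((rank, sum(x * x for x in chunk)))  # "send" a partial result


def sum_of_squares(data, n_workers):
    """Compute sum(x*x) by explicitly scattering data and gathering results."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()

    # "Scatter": the coordinator decides which worker gets which slice.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])

    # "Gather and reduce": wait for exactly one message per worker.
    total = 0
    for _ in range(n_workers):
        _rank, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(sum_of_squares(list(range(10)), n_workers=3))
```

Even in this toy version, the data layout, the number of messages and the order of sends and receives are all the programmer’s responsibility, which is the burden Archambault describes.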
But in the last few years, the industry has seen a big drive toward building this kind of cluster-wide parallelism directly into programming languages.
Archambault cited the Unified Parallel C extension, which enables users of the C language to write a program that will run across the nodes of a cluster. Similar add-ons, such as the Co-array Fortran extension, exist for other programming languages, he said.
Another application programming interface, called OpenMP, brings multi-platform shared-memory parallel programming to the C, C++ and Fortran languages. “There has been some effort lately to extend OpenMP to deal with task parallelism,” Archambault said.
He also pointed to Open MPI, which he said is a standards-compliant, open source implementation of the MPI library specification for parallel processing. Open MPI is used by many of the world’s fastest supercomputers and was developed from the work of a handful of notable MPI implementations at major academic research labs.
STAY AWAY FROM PROPRIETARY
For Bunn, one approach that will give a company a big advantage is to stick with standard programming models and paradigms. A common occurrence in the HPC world, he said, is for companies to “chase performance” and opt for a proprietary offering.
“Some vendor will come up with a brilliant, quick little method that can give you an extra 10 per cent of performance, but those rarely last over the long term,” Bunn said.
He said that IBM encourages clients to focus on programming standards, which also have a considerably longer shelf life than proprietary code.
“There’s OpenMP for single node parallelism, there’s MPI for cross-cluster parallelism and there’s OpenCL, which is a programming paradigm for taking advantage of accelerator units, including graphics processors.”
Bunn said that companies will often not be able to go back and completely redesign a system every four years, so keeping things open and well-documented is the key to being able to integrate future technologies.
“There are a lot of technologies at the processor level, systems level and the software level that aren’t public yet, but are working their way through,” he said. “If you stick to the standard, fundamental elements that have existed and been engrained over the last few years, you’ll do very well into the future.”
Bruce Wright, chief technology officer at Mountain View, Calif.-based Kosmix Corp., said his company is an AMD server shop which custom-builds its own applications. He said that the biggest concern with writing multi-threaded apps is memory allocation.
“When you’re allocating and de-allocating memory to try and keep 48 cores busy, you have to allocate and de-allocate a lot more small memory chunks,” he said. Overall, for enterprises using custom apps, there is no simple solution other than meticulous programming, he added.
Wright praised solid state drives (SSDs) for server systems as a “game-changer” that overcomes the fact that the number of operations a hard drive can perform hasn’t increased that much over the last 25 years compared to computing power and memory densities.
“Having the ability to put SSD into one of these 48-core systems really helps balance that equation of how much resources you give to each of the main components: CPU, memory and disk I/O,” he said.
THINK BEFORE YOU LEAP
Programming options aside, IT leaders should also ask themselves why they are buying multi-core HPC systems in the first place.
“It’s easy to jump in without needing it,” said James Staten, a principal analyst with Forrester Research Inc. “It’s very easy to say, ‘I want to go down the path of AMD, Intel or Nvidia’ without knowing if these particular architectures are best matched to your work.”
Staten said this happens more often than not when HPC becomes a corporate mandate as opposed to a designer- or engineer-driven one.
“If you have a few applications that have an HPC architecture to them, but you’re not squeezing every last second out of that application, then you want to make a holistic decision about your infrastructure,” he said.
“We’ve gotten to the point now where the general purpose processors coming out from AMD and Intel are extremely robust.”
For companies trying to determine what type of systems they need, Staten recommends building a two-by-two matrix chart.
“On the up axis you’re talking about performance,” he said. “Put ‘extreme’ at the top end of that and ‘standardize’ at the bottom end. Along the axis at the bottom, put ‘level of investment.’”
After building the chart, companies should plot out where they want their HPC apps to fit, said Staten.
“So, if you’re not pushing extremes, but you have an unlimited budget, then you’re probably a mismatch,” he said. “If you want the very best performance possible, but you’re not willing to spend a lot of money on it, that’s your other big mismatch. You want to make sure you’re not in those two quadrants before investing.”
IT shops that classify themselves in the upper right quadrant will be looking to heavily customize their apps.
“You need to look at unique architectures, you need to consider InfiniBand, and you need to consider GPUs,” Staten said. Companies in the lower left quadrant will be looking to find commercial off-the-shelf software wherever they possibly can to keep the costs down, he added.
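Staten’s chart can be reduced to a small lookup. The Python sketch below is one possible reading of his description, not a Forrester tool: it assumes level of investment increases from left to right along the bottom axis, treats both axes as simple yes/no answers, and uses a hypothetical function name, `hpc_quadrant`.

```python
def hpc_quadrant(extreme_performance, high_investment):
    """Place an HPC workload on the two-by-two chart.

    extreme_performance: True for the top of the vertical axis ("extreme"),
                         False for the bottom ("standardize").
    high_investment:     True for the right of the horizontal axis
                         (assumed here to mean a larger budget).
    """
    if extreme_performance and high_investment:
        # Upper right: heavy customization, unique architectures,
        # InfiniBand and GPUs are worth considering.
        return "customize"
    if not extreme_performance and not high_investment:
        # Lower left: commercial off-the-shelf software to keep costs down.
        return "off-the-shelf"
    if high_investment:
        # Lower right: a large budget without extreme needs.
        return "mismatch: budget without need"
    # Upper left: extreme needs without the budget to match.
    return "mismatch: need without budget"


if __name__ == "__main__":
    for perf in (True, False):
        for invest in (True, False):
            print(perf, invest, "->", hpc_quadrant(perf, invest))
```

A real assessment would use finer scales than two yes/no answers, but the lookup makes the two mismatch quadrants explicit before any money is spent.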
Staten also cautioned companies going down the open source path, saying that those IT shops had better ensure they are bringing in the right talent to do a lot of the integration work themselves. Companies that are bringing in an “off-the-shelf” HPC application for financial analysis, for example, will probably not need the talent to tweak and customize the software, he added.
Going forward, the differentiator in the software battles will be which vendor makes the most proficient use of multi-core systems, Claunch said. “Sometimes you get rid of one piece of software and find a different one that’s more capable,” he said, referencing Adobe Inc.’s Photoshop software suite as an application that took advantage of parallelism very early on in multi-core computing’s ascent.
In addition to carefully questioning and testing vendors on how their software works across clusters, IT shops will also want to be wary of software vendors that charge by the number of cores in a server.
“Just upgrading to a newer generation of machines will increase your software costs, even if you’re not getting any value,” Claunch said.
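The arithmetic behind Claunch’s warning is simple to sketch. The Python example below uses made-up figures: the $1,000 per-core price, the core counts and the function names are all assumptions chosen for illustration, not any vendor’s actual pricing.

```python
def per_core_license_cost(cores_in_server, price_per_core):
    """Licence bill when the vendor charges for every core in the box."""
    return cores_in_server * price_per_core


def cost_per_used_core(cores_in_server, cores_app_uses, price_per_core):
    """Licence cost divided by the cores the application can actually exploit."""
    used = min(cores_app_uses, cores_in_server)
    if used <= 0:
        raise ValueError("the application must use at least one core")
    return per_core_license_cost(cores_in_server, price_per_core) / used


if __name__ == "__main__":
    PRICE = 1000     # hypothetical price per core, in dollars
    APP_SCALES_TO = 8  # hypothetical: the software only scales to 8 cores
    for cores in (16, 48):
        print(
            cores, "cores:",
            per_core_license_cost(cores, PRICE), "total,",
            cost_per_used_core(cores, APP_SCALES_TO, PRICE), "per core used",
        )
```

With these assumed numbers, moving from a 16-core to a 48-core server triples the licence bill, while software that scales to only eight cores runs no faster.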