In the staid world of insurance companies, The Hartford is making a name for itself by lashing together 200 servers to create a powerful grid that’s tackling compute-intensive financial analytics.
“Grid has mainly been used for academia, but it is actually much more useful for corporate business,” says Chris Brown, director of advanced technologies at Hartford Life. He says The Hartford had been running into scalability issues with both the hardware and the software needed to handle the company’s growing needs.
The Hartford spent six months designing the grid network, grid-enabling its applications and deploying the Condor open source grid management software developed by the University of Wisconsin. The grid network went live in September.
Although Condor is free, Brown says The Hartford spent significant amounts of money on integration and software development, but he says the benefits are enormous.
The Hartford is the largest seller of variable annuities in the world. With a client base approaching 2 million, it needed computational horsepower to perform immense calculations to measure both behavior in the financial marketplace and market conditions themselves. The Condor-based system runs in excess of 100,000 analytical jobs each month.
“One of our main hedging calculations recently underwent a performance boost that reduced its runtime from about 10 hours to under 20 minutes,” Brown says. Grid computing “allows us to now do things in near real-time that were previously run overnight, creating some great new opportunities for use of the technology.”
Currently the grid consists of 200 rack-mounted, dual-processor servers, but Brown is now in the pilot stage of a project to add desktops to the grid.
While The Hartford estimates that it is saving millions of dollars through grid computing, there have been some unexpected costs, such as finding integrators and developers who were skilled in grid deployment.
Brown adds, “We are pretty proud of where we are with the adoption of the technology itself. We are seeing a big benefit from it.”
The Hartford is at the leading edge of grid computing’s transition from academia and research to enterprise data centers. “Most early adopters have long-term and far-reaching plans to extend their grid activities, from initial beachheads to multi-application and cross-organizational grids,” says William Fellows, an analyst at The 451 Group.
But he adds that widespread enterprise adoption is still a ways out, for a variety of reasons. “Across different vertical markets we hear similar barriers and challenges to increased adoption with software licensing, organizational and cultural issues, and data management being most prominent.”
License to kill
One of the inhibitors to grid going mainstream has been the licensing issue that has essentially plagued the technology from its inception. “For all of the potential benefits of grids, enterprise IT departments cannot afford to buy software licenses for every device in the grid, a necessity under current licensing schemes, since the grid, by nature, consumes resources dynamically,” Fellows says.
Some grid users have been resourceful and have skirted the restrictions by using in-house software, negotiating deals with vendors or paying premium licensing fees for a few choice applications only. But those are the exceptions rather than the norm. Fellows sees the problem growing over time. He says, “As they evolve into using grids as a more mainstream technology, the restrictions of current licensing will become greater obstacles.”
Fellows supports licensing models that are based on business objectives and take the nature of grids into account.
According to Kenneth Shankland of the Rutherford Appleton Laboratory in Oxfordshire, England, another factor holding back grid adoption is reluctance on the part of users.
Rutherford Appleton Laboratory is owned and operated by the Council for the Central Laboratory of the Research Councils (CCLRC), one of the largest consortiums for the support of science research in the world. Shankland is the group leader for data analysis in the ISIS Science and Diffraction Division of Rutherford Appleton Laboratory. ISIS supports an international community of around 1,600 scientists. In addition, there are countless support staff associated with the group, and their computers are ripe for harvesting CPU power via ISIS’ grid.
Shankland has been utilizing grid technology for a number of years. Currently, he has a grid that uses two servers, and he has purchased a license for up to 80 PC clients to be attached to the grid. Shankland says, “We kept the number of PCs quite small in both our initial and subsequent grid trials for two reasons. The first is that we did not want to unleash this technology across all PCs until we knew what maintenance would be required. But we now know that the maintenance issue is controlled by the software.”
He adds, “The second issue is the sociological aspect. People are suspicious of grid technology.” And the issue becomes even more apparent when you start bringing non-technical people into the equation. Shankland simply says, “Far and away the biggest stumbling block has been user hesitation.”
Shankland says that when his team approaches staff members by basically saying, “We are going to steal all of your unused CPU power, but you can do all of your normal stuff,” people tend to question what exactly is going on.
ISIS has been lucky thus far because it started by grid-enabling the PCs of colleagues who trusted that the team knew what it was doing and that nothing malicious would occur. However, Shankland admits that after grid-enabling software is installed on a PC, the first PC glitch is inevitably blamed on the grid itself or on the installed software.
Shankland takes the time to explain to staff that not only are they aiding science by allowing the grid to tap into their CPU power, but they too can tap into others’ CPU power to make their own work run faster. Once the concept is understood, half the battle is won.
Forrester Research analyst Frank Gillett adds that going to grid computing can mean a significant investment in software.
Grid computing means that you’re chopping up a large compute problem into manageable bites, but it also means that software has to be written to divide that task into smaller pieces and to put it all back together, even when the machines might not be running at the same time or on the same operating systems.
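To make the divide-and-reassemble idea concrete, here is a minimal sketch in Python. It is not drawn from any of the deployments described here: a local thread pool stands in for grid nodes, and the `split`, `simulate` and `run_on_grid` names are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, n_chunks):
    """Divide a list of work items into at most n_chunks roughly equal pieces."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def simulate(chunk):
    """Hypothetical stand-in for one grid job: a sum of squares."""
    return sum(x * x for x in chunk)

def run_on_grid(items, n_workers=4):
    """Scatter the chunks to workers, then gather and combine the partial results."""
    chunks = split(items, n_workers)
    # Assumption: a local thread pool plays the role of the grid's machines.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(simulate, chunks))
    return sum(partials)

total = run_on_grid(list(range(10)))
```

The combining step works here because a sum can be assembled from partial sums in any order. Problems whose pieces depend on one another need more elaborate coordination, which is part of the software investment Gillett describes.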
He suggests that companies should look for specific problems that can be addressed through a grid, rather than trying to deploy grid computing on everyday business applications. “Companies should dig around and see if they have any problems that grid can help with,” he says.
Grid computing has been one of the most significant forces in speeding up the process of drug discovery.
For example, Entelos, a biotechnology firm in Menlo Park, Calif., is unique in that it has no wet labs; all research is done on computers. The core technology of the firm is building large-scale models to support research and development at pharmaceutical companies.
In the late 1990s, the company found that design and discovery had become computationally intense, and running thousands of variations on individual desktops was becoming too time consuming. By early 2002, Entelos was running applications on clusters.
The initial cluster consisted of 100 single-processor, Pentium III 1-GHz, Compaq machines. Some of those original machines still are considered to be part of the grid, but many have been upgraded to dual Opteron machines and dual Pentium 4 machines. A queuing strategy is used to determine which machine each piece of work flows to.
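The article does not say which queuing strategy Entelos uses. As one illustration of how work might be steered across machines of uneven speed, the sketch below sends each job to whichever machine will be free soonest. The policy, the `assign_jobs` name and the relative speed figures are all assumptions, not a description of LSF or of the Entelos setup.

```python
import heapq

def assign_jobs(job_costs, machine_speeds):
    """Greedy dispatch: each job goes to the machine that becomes free first."""
    # Heap entries are (time_when_free, machine_name); all machines start idle.
    heap = [(0.0, name) for name in sorted(machine_speeds)]
    heapq.heapify(heap)
    placement = {}
    for job, cost in job_costs.items():
        free_at, name = heapq.heappop(heap)
        # A faster machine finishes the same job sooner.
        finish = free_at + cost / machine_speeds[name]
        placement[job] = name
        heapq.heappush(heap, (finish, name))
    return placement

# Hypothetical relative speeds: the newer machine is four times as fast.
machines = {"opteron": 4.0, "p3": 1.0}
jobs = {"a": 4, "b": 4, "c": 4}
placement = assign_jobs(jobs, machines)
```

In this example the faster machine picks up two of the three jobs, because it is free again before the slower one finishes its first.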
For grid software, Entelos chose Platform Computing’s Load Sharing Facility (LSF). “We have created our own simulation workflow software that runs on top of LSF, the PhysioLab Simulation Server, to hide the complexity of the grid and make it easier for scientists to monitor progress of large, complex simulation analyses on the server,” CTO Alex Bangs says.
Though Entelos experimented with having desktops as part of its grid, it has since removed those from the equation because the company found that the desktops added more problems instead of more horsepower.
Bangs says, “Grid does not work well with laptop users, unless they leave their laptops in the office all of the time. Mainly this is because the grid software on the market right now is not smart enough to recognize when someone is dialing in from a remote location.”
He adds, “We also realized that there were just not enough cycles to steal because users were running too many other applications.” Without that spare computational power to distribute, the grid simply does not work as it should.
The Entelos grid now consists of 180 processors. Since its first grid rollout back in 2002, Entelos has created a grid that now is roughly eight to 10 times more powerful than what it started with. That allows the firm to run simulations in a matter of hours or days, as opposed to weeks and months. For instance, just one drug trial could involve upwards of 13,000 simulations that could take up to two years running on one server. With the grid, those 13,000 simulations can be performed in about a week.
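A rough back-of-the-envelope check of those figures, sketched in Python. It treats "two years" as 104 weeks and "about a week" as exactly one week, and it assumes the single server is comparable to one grid processor; none of these simplifications comes from Entelos.

```python
SIMULATIONS = 13_000
GRID_PROCESSORS = 180
SERIAL_WEEKS = 2 * 52      # "up to two years" on one server
GRID_WEEKS = 1             # "about a week" on the grid
HOURS_PER_WEEK = 7 * 24

# Overall speedup implied by the two quoted run times.
speedup = SERIAL_WEEKS / GRID_WEEKS

# Average time one simulation takes on the single server.
hours_per_sim = SERIAL_WEEKS * HOURS_PER_WEEK / SIMULATIONS

# Best case if the work split perfectly across 180 equally fast processors
# (an assumption; the real grid mixes older and newer machines).
ideal_grid_days = SERIAL_WEEKS * HOURS_PER_WEEK / GRID_PROCESSORS / 24

print(f"speedup: about {speedup:.0f}x")
print(f"per simulation on one server: about {hours_per_sim:.2f} hours")
print(f"ideal grid run time: about {ideal_grid_days:.1f} days")
```

Under these assumptions the quoted times imply a speedup of roughly 100 times, and a perfectly parallel run would take about four days, which is in line with the "about a week" the company reports once scheduling overhead and uneven hardware are allowed for.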
Entelos has become so adept at using and refining its grid that the company has found itself becoming something of a solutions provider. Bangs notes that Entelos has, on occasion, gone out to business partners and delivered a grid solution.
Though still thought of mainly as a solution for science-based studies, grid is moving into banking and financial markets, and is adaptable to other market environments as well. Fellows says that eventually grid computing will become so integrated into systems and software stacks that it won’t exist as a stand-alone technology, but will be part of the way data centers are built and operate.
Stong-Michas is a freelance writer in Pennsylvania. She can be reached at firstname.lastname@example.org.