At my cloud computing consultancy, we’ve been approached several times in the past few weeks by companies that have put their apps up on Amazon’s cloud infrastructure and are now running into problems. Problems like:
1. Applications are installed on Amazon Machine Images and run just fine, but if the EC2 instance crashes or needs to be terminated, the app is out of commission until a new instance comes on line.
2. If an EC2 instance gets overloaded, there’s no way to add more resources to improve app performance.
3. No way exists to update the application without taking it completely offline.
4. Performance gets bottlenecked by the database, but there’s no manageable way to move to database replication.
In our discussions with these companies, their question is: “Shouldn’t this problem be solved by cloud computing? After all, the cloud offers resource elasticity, processing power on demand, huge scalability. So why is my application running into these problems?”
The challenge they’ve run into is that they treated cloud computing like Hosting 2.0, and now they’re suffering for it.
The shorthand response to them is “cloud scalability isn’t the same as application scalability, and unless you architect a cloud app, you aren’t going to garner the benefits of cloud computing.” In our workshops, we phrase this as “build cloud apps, not apps in the cloud.”
So what does “build cloud apps” mean, and how is it different from treating the cloud as Hosting 2.0?
Here are key principles in building a cloud application:
* Recognize that individual compute resources can, and do, fail. In Amazon, individual EC2 instances will occasionally experience poor performance, stop responding, or crash. At scale, resources fail. And this is true of all cloud providers. Google is well-known for its philosophy of building ultra-cheap computers with (literally) the disk drives velcro’d onto the naked motherboards (Google’s machines have no metal shell); when one of its computers fails, Google removes it and sends it off for recycling. With hundreds of thousands of machines running, failures are common, so Google architects its solutions to remain robust in the face of resource failure. Likewise, one should architect individual applications that run in cloud environments as though the individual resources (including virtual machines) will fail. So an application should be written to run on at least two EC2 instances.
* Understand that the potential for failure means that your application must run on at least two instances in EC2. This means application files need to be placed on both virtual machines or located in a central location both machines can access. It doesn’t mean that every application must be segregated onto its own instances — a single EC2 instance can support multiple applications; for example, a single instance can host a number of different web sites. It does mean that each application must be written so that it can span multiple instances.
* Write your application so that session management is handled properly. This either means that session affinity is handled by, for example, the load balancer that sits in front of the application, or that the application itself places session information in a shared location. This can be accomplished by placing session information in a database server that is shared among application servers, although this approach can end up bottlenecked by the load on the database server. A common fix for this is to move session information into a memcached layer which provides better performance. In any case, session information must somehow be available for whatever part of the application is going to require it.
* Ensure that additional compute resources can join and leave the application dynamically and gracefully. One key reason to use the cloud is to enable applications to dynamically access the resources they need, varying the amount of resource according to load. If human intervention is required to add or subtract resources, the bottleneck has moved from compute resources to human resources, which is not ideal. If the application is not written so that resource levels can vary dynamically, then one has to assign a fixed level of resource; this ends up reprising the old tradeoff between availability and investment: do I waste money on idle capacity, or lose users to overload?
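To make the session point above concrete, here is a minimal sketch of session state kept in a shared store rather than in any one server’s memory. The `SessionStore` class and the dict-backed cache are illustrative assumptions; in production the backend would be a shared cache such as memcached, but the get/set pattern is the same.

```python
import json
import uuid

class SessionStore:
    """Minimal session store with a memcached-style get/set interface.

    In production the backend would be a shared cache (e.g. memcached)
    so that any application server can read a session created by another;
    here an in-process dict stands in for the shared cache.
    """

    def __init__(self, backend=None):
        # The backend stands in for the shared cache layer.
        self.backend = backend if backend is not None else {}

    def create(self, data):
        # Generate an opaque session ID and store the serialized state.
        session_id = uuid.uuid4().hex
        self.backend[session_id] = json.dumps(data)
        return session_id

    def get(self, session_id):
        raw = self.backend.get(session_id)
        return json.loads(raw) if raw is not None else None

# Two "application servers" sharing one backend: a session created on
# server_a is visible to server_b, so the load balancer can route any
# request to any server without session affinity.
shared_cache = {}
server_a = SessionStore(shared_cache)
server_b = SessionStore(shared_cache)

sid = server_a.create({"user": "alice", "cart": ["book"]})
print(server_b.get(sid)["user"])  # alice
```

The essential property is that no request depends on landing on the server that created the session, which is what lets instances come and go freely.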
I don’t want to trivialize the move to “cloud apps.” Writing applications so that they can dynamically scale without human intervention is not trivial. For one thing, most software components assume manual, not automatic, administration, and follow an “update the config file and restart the server” approach. This is fine for a fairly static application topology, but a real pain in a dynamically changing application topology.
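One alternative to the “update the config file and restart the server” pattern is a component that re-reads its configuration on demand. The sketch below, with an assumed `BackendPool` class and a JSON file as the config source, shows the idea: new instances can be added to the pool just by rewriting the file, with no restart.

```python
import json
import os
import tempfile

class BackendPool:
    """Re-reads its server list from a config source on demand, so
    instances can join or leave without an edit-and-restart cycle."""

    def __init__(self, config_path):
        self.config_path = config_path
        self._mtime = None
        self._servers = []

    def servers(self):
        # Reload only when the file's modification time has changed.
        mtime = os.path.getmtime(self.config_path)
        if mtime != self._mtime:
            with open(self.config_path) as f:
                self._servers = json.load(f)
            self._mtime = mtime
        return self._servers

# Demo: the pool picks up a newly added backend without restarting.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    json.dump(["10.0.0.1"], f)

pool = BackendPool(path)
print(pool.servers())  # ['10.0.0.1']

with open(path, "w") as f:
    json.dump(["10.0.0.1", "10.0.0.2"], f)
os.utime(path, (0, 12345))  # force a distinct mtime for the demo

print(pool.servers())  # ['10.0.0.1', '10.0.0.2']
```

A file is the simplest possible config source; the same polling pattern applies when the list lives in a shared registry or database.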
Another issue is deciding how to handle files and objects common to multiple copies of an application. They can be placed on a networked file system, but performance is often an issue. For cloud environments that support SAN- or NAS-type functionality, the files can be centrally located, although that may impose latency issues. Copies of the files can be placed on each server, although that may cause a challenge in distribution and version control. The best approach is to have all the files placed in a central location (e.g., in S3 for Amazon-based applications) and have the virtual machine download the “official” files and install them on itself as it instantiates. Again, this is a bit out of the ordinary and not common to a non-dynamic environment. The usual approach in most environments is to emphasize hardware (and virtual machine) robustness and not plan for dynamic application topologies.
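The “download the official files at instantiation” approach can be sketched as a small bootstrap routine. The `bootstrap` function and the dict-backed “bucket” below are illustrative assumptions; on Amazon the fetch callable would pull objects from S3, but the structure of the start-up step is the same.

```python
import os
import tempfile

def bootstrap(manifest, fetch, install_dir):
    """Install the 'official' application files at instance start-up.

    manifest: iterable of file names to install
    fetch:    callable(name) -> bytes; in production this would pull
              each object from a central store such as S3
    """
    os.makedirs(install_dir, exist_ok=True)
    for name in manifest:
        path = os.path.join(install_dir, name)
        with open(path, "wb") as f:
            f.write(fetch(name))
    return sorted(os.listdir(install_dir))

# Demo with an in-memory "bucket" standing in for the central store.
bucket = {
    "app.py": b"print('hello')",
    "config.ini": b"[app]\nworkers=4\n",
}
install_dir = tempfile.mkdtemp()
installed = bootstrap(bucket, bucket.__getitem__, install_dir)
print(installed)  # ['app.py', 'config.ini']
```

Because every instance installs the same files from one authoritative location as it boots, distribution and version control stop being per-server problems.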
As I wrote a couple of weeks ago about the nascent “devops” movement, it’s not clear what percentage of applications will experience the need for dynamic topologies based on load. So not every application may need to be a “cloud app.” On the other hand, it’s often difficult to predict what loads an application will experience throughout its lifetime. In his presentation last night at the Hacker Dojo (see my blog post about this event here), Josh McKenty, chief architect of NASA’s Nebula cloud project, noted that NASA applications often have an odd user load: years of no traffic, with a short period (one to two days) of massive traffic when the mission does something spectacular (his example was the project that landed on the Moon to check for water). Because of the unpredictability of load and the odd load patterns that will be increasingly common to future applications, it’s likely that the design patterns associated with writing dynamic apps will eventually become standard practice; in other words, every application will be written so that it is robust in the face of highly dynamic loads. For those apps that experience those types of loads, well, they’re ready to respond; for those apps that don’t, the capability will remain in reserve, unexercised, available in the eventuality it’s required.
For architects and software engineers, learning those design patterns today is important because the applications being designed and written now will be in service for years and will, in all likelihood, end up running in cloud environments. This means that applications should be written with an eye toward being “cloud apps,” even if the current plans don’t call for them being operated in cloud environments.
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of “Virtualization for Dummies,” the best-selling book on virtualization to date.