Storage Virtualization: How, What and Why

Corporate marketers have discovered – and are busily hyping – the concept of ‘virtualization’. But the precise definition of ‘storage virtualization’ depends very much on who’s using the term. Because storage virtualization sets the foundation for the enterprise storage utility, it’s crucial for business IT planners to understand its first principles, and not be fooled by the hype.

Virtualization separates the representation of storage to the server operating system from actual physical storage. This division of physical storage devices from the logical storage space presented to users and applications turns storage into a generally available utility pool.

Virtualization does for storage what an operating system does for a computer system: it makes programming and operation simpler by automating resource management ‘housekeeping’. When this happens, users are said to be viewing the system’s resources ‘at a higher level of abstraction.’ In short, virtualization is the abstraction of storage.

This concept – and process – has great potential benefits for storage administration because it dramatically increases the amount of storage an administrator can manage in the same amount of time, or slashes the work necessary to manage the same amount of storage. But there’s one glitch: virtualization that will meet this goal of abstraction and serve as the key enabler for policy-based storage management at the highest strategic level is not yet widely available. Vendors recognize the importance of virtualization and have been sharing their roadmaps for product delivery. Most important to customers: the approach the vendors take, and when the solutions are delivered.

How Is It Done?

Storage virtualization hardware and/or software automatically maintain a set of tables (or other control structure) that maps the logical storage space that applications need to the physical storage actually available to the system. When logical-to-physical mapping occurs while applications are running – just as virtual memory operates in a PC – the storage virtualization is dynamic.

If the mapping is based on a simple rule, or algorithm, so that the storage available to an application is allocated only in advance and not while the application is running, the storage virtualization is static. Dynamic storage virtualization is to physical storage as virtual memory is to RAM: programmers do not need to think about memory requirements. Static virtual storage is to physical storage as memory overlays are to physical memory: users don’t have to think about program memory requirements as long as the programmers have.
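The distinction between dynamic and static mapping can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration only: the class `DynamicVolume`, the function `static_resolve`, and the idea of fixed-size "extents" are assumptions, not any vendor's implementation.

```python
class DynamicVolume:
    """Hypothetical logical volume: physical extents are bound on first use."""

    def __init__(self, logical_extents, free_physical):
        self.logical_extents = logical_extents     # size promised to the application
        self.free_physical = list(free_physical)   # shared pool of (device, extent) pairs
        self.table = {}                            # logical extent -> (device, extent)

    def resolve(self, logical_extent):
        """Return the physical location, allocating while the application runs."""
        if not 0 <= logical_extent < self.logical_extents:
            raise IndexError("logical extent out of range")
        if logical_extent not in self.table:
            if not self.free_physical:
                raise RuntimeError("physical pool exhausted")
            # Dynamic mapping: the table entry is created on demand.
            self.table[logical_extent] = self.free_physical.pop(0)
        return self.table[logical_extent]


def static_resolve(logical_extent, base_device, base_extent):
    """Static mapping: a fixed rule, decided before the application runs."""
    return (base_device, base_extent + logical_extent)


# The application sees 1,000 logical extents; only 3 physical ones exist so far.
pool = [("disk0", 0), ("disk0", 1), ("disk1", 0)]
volume = DynamicVolume(1000, pool)
print(volume.resolve(742), volume.resolve(3), volume.resolve(742))
print(static_resolve(3, "disk0", 100))
```

In the dynamic case the logical space can be far larger than the physical storage actually consumed, because table entries appear only when an extent is first touched. In the static case the rule never changes at run time, so the storage must be planned in advance.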

Vendors are implementing storage virtualization in different ways: through varied architectures, and through hardware, software, or mixed approaches.

Virtualization can be implemented as:

    Software that runs on servers, either entirely on a master server or distributed among servers to run cooperatively;

    In the fabric of a storage area network (SAN), either as a ‘SAN appliance’ or software on a switch in a SAN that uses metadata (i.e., data about the data in files) to present virtual volume space (i.e., logical unit numbers) or files, or within a domain within a storage network through a domain controller or router;

    At the level of a particular storage system, with the software value added in the SAN controller.

Note, however, that storage aggregation is not virtualization. Aggregation is the static mapping used to stripe data across a number of disk drives, or to mirror data to a second disk, as done by redundant arrays of independent disks (RAID). That process is nothing more than a rote application of a data placement and/or parity generation rule. Aggregation does not affect the amount of storage space presented to the application or users.
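To see why striping is a rote placement rule rather than virtualization, consider a minimal sketch of simple striping without parity. The function name `stripe_location` and its parameters are assumptions made for illustration, not a description of any particular RAID product.

```python
def stripe_location(logical_block, num_disks, stripe_blocks):
    """Map a logical block to (disk index, block on that disk) by fixed arithmetic.

    Hypothetical helper: blocks are laid out round-robin across the disks in
    units of `stripe_blocks` blocks, with no parity.
    """
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    disk = stripe_index % num_disks    # which disk holds this stripe unit
    row = stripe_index // num_disks    # how many full rows precede it
    return disk, row * stripe_blocks + offset


# Four disks, eight blocks per stripe unit.
for block in (0, 8, 35):
    print(block, stripe_location(block, num_disks=4, stripe_blocks=8))
```

No table is consulted and nothing is allocated at run time: the same logical block always lands in the same physical place, and the total space presented equals the space of the underlying disks.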

File intelligence and file sharing ability also should not be confused with virtualization. Network attached storage (NAS) filers have file intelligence but may not offer storage virtualization. Almost all virtualization work today is being done for SANs. Storage within a SAN is ‘seen’ by servers as basic block-level storage. SAN storage (except in rare cases) does not ‘know’ about files, and servers that run different operating systems connected to the same SAN ‘recognize’ only the portion of the storage that has been assigned to their zone.

Without virtualization, even servers attached to a SAN and running the same operating system must have ‘their’ storage preassigned and managed by administrators for it to be accessible.

Virtualization Design Tradeoffs

Server-hosted virtualization can be highly OS-dependent and can require difficult, constant server-to-server replication, but it may be relatively easy to adopt in operation. Virtualization implemented in the fabric can add a layer of performance-impacting complexity to a fabric’s basic switching tasks, but centrally administering a broad expanse of storage with the same tools and procedures may reduce administrative costs.

Implemented at the storage system level, virtualization can lock users in to a storage vendor and can cover only a narrow scope of storage (limiting potential administrative efficiencies), but it may be transparent to other operational changes.

Presently, storage virtualization is emerging piecemeal from myriad storage hardware, storage networking, and storage software vendors. At this time, diverse vendor solutions are not necessarily interoperable, yet interoperability will be crucial to preserving capital invested in storage. Also, the main stream of virtualization today applies to storage space, with only a trickle addressing virtual files (the objects actually being stored in the space created).

Why Is Storage Virtualization Needed?

Data growth at companies everywhere is creating an appetite for storage that is outstripping the ability of employees to manage storage. Storage virtualization will permit administrators to deal effectively with far more storage than would otherwise be possible.

Virtualization can turn all storage into a generally available utility pool and be the enabler for significant administrative cost reductions. The key metric is the amount of storage that a person can administer. Virtualization allows storage to be administered with the same tools and treated in a consistent manner, using policy-based (not case-by-case) storage management.

Policy management software will automatically handle tasks such as load balancing, adding storage, and migrating data. Virtualization lets policy management control heterogeneous storage, and present it to users, with the same characteristics and attributes. Pools of data created by virtualization will be accessed and managed by an organization’s policies rather than mandated by a static OS. The result is significant business value and cost justification for investing in virtualization solutions.
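As a rough illustration of policy-based (rather than case-by-case) management, the sketch below applies one capacity rule to every pool. The `Pool` class, the function `apply_capacity_policy`, the 80 percent threshold, and the 100 GB growth increment are all assumptions invented for this example, not features of any real policy product.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical virtualized storage pool."""
    name: str
    capacity_gb: int
    used_gb: int

    @property
    def utilization(self):
        return self.used_gb / self.capacity_gb


def apply_capacity_policy(pools, spare_gb, high_water=0.80, grow_gb=100):
    """Grow every pool that exceeds `high_water`, while spare capacity lasts.

    Returns the list of (pool name, GB added) actions and the remaining spare.
    """
    actions = []
    for pool in pools:
        while pool.utilization > high_water and spare_gb >= grow_gb:
            pool.capacity_gb += grow_gb   # add storage from the shared spare
            spare_gb -= grow_gb
            actions.append((pool.name, grow_gb))
    return actions, spare_gb


pools = [Pool("db", 500, 450), Pool("mail", 300, 120)]
actions, spare = apply_capacity_policy(pools, spare_gb=300)
print(actions, spare)
```

The administrator states the rule once, and the software applies it to every pool regardless of the hardware behind it. That is the sense in which the amount of storage one person can administer grows.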

Some day, applications and users will not have to care about where the storage is, the type of computer and operating system in use, or what kind of networking is used. That day may seem far off, but recent strides in storage virtualization are bringing it closer to reality.

Storage virtualization solutions being delivered today can significantly ease storage management burdens and enable wider use of commodity-priced storage components. Despite the hype, they should be evaluated knowledgeably.

Dan Tanner is Senior Analyst, Storage and Storage Management, for Aberdeen Group, Boston. He can be reached at