Network World (US)
When you enter the world of storage-area networks (SAN), you may want to check all you know about data networking at the door. SANs represent an offshoot of networking that, whether by design or happenstance, has become a world unto itself.
Even so, SAN switches, which power SANs just as Ethernet switches and IP routers drive traditional data networks, can still be quantitatively tested and compared. The latest wares from four leading SAN switch vendors are quite different.
The Network World Blue Ribbon Award goes to Brocade Communications Systems Inc. The SilkWorm 2400 and 2800 SAN switches proved to be truly plug-and-play, and exhibited the best overall performance of the products tested.
QLogic placed second. While not as easy to get up and running as Brocade's SilkWorm SAN switches and exhibiting so-so performance, QLogic's SANbox 8 and SANbox 16 HA showed strength in their SANsurfer management, which we considered the best of the lot.
(QLogic acquired Ancor earlier this year, and the switches we tested came from Ancor before the acquisition was completed. We are assured by QLogic that its current switch products are essentially the same as the Ancor switches we tested.)
Each vendor except Gadzoox Networks (that is, QLogic, Vixel and Brocade) provided two switch models, an eight-port and a 16-port version. Performance of the eight- and 16-port versions of each vendor's switch was extremely close, and scores for installation, manageability and features were about the same within each vendor's product line. That let us report performance figures and rate the switches by vendor, rather than individually, which is why there are only four, rather than seven, scores on the NetResults Scorecard.
Rating SAN switches
We evaluated the SAN switches in five categories:
— Performance, comprising a dozen measurements and metrics, including latency, through one switch as well as a multiswitch fabric; automatic failover, representing the case in which a switch or an inter-switch link (ISL) fails; throughput, which we tested with one to seven Windows NT servers, performing disk reads, writes, and then a mix of reads and writes across the SAN fabric; and overall stability.
— Management and administration, including intuitiveness and effectiveness of the management interfaces (graphical user interfaces as well as command-line interfaces); real-time monitoring capabilities; and additional management capabilities such as event, alarm or trap logging (files in which events, alarms and traps are stored) and the generation of management reports.
— Configuration, with criteria including support for multiswitch, mesh topologies; per-port frame buffers; different Fibre Channel classes of service; connection types, such as fabric (direct, switched) and older loop (shared-bandwidth) connections; port density; modularity; hot-swappability; and a secondary, back-up power supply (an important redundancy option).
— Features, including support for the various Fibre Channel physical interfaces; multiple ISLs for load sharing and/or failover; and zoning.
— Installation and ease of use, including the degree of plug and play for connecting storage systems and servers; and documentation, including interoperability details.
Loop vs. fabric
Vendors submitted four switches, two of each model, for testing. Four was the minimum number of switches we needed to set up a multiswitch fabric with alternate routes and then test failover of a link. Except for Gadzoox, the vendors dutifully complied: QLogic, Brocade and Vixel sent us a mix of their respective eight- and 16-port switches. Other switch vendors were invited to submit switches, but either did not respond to, or declined, our invitation.
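The idea of alternate routes through a multiswitch fabric can be sketched in a few lines of Python. This is only an illustration: the four-switch ring, the switch names and the `find_route` helper are assumptions made for the example, not the actual test-bed wiring or any vendor's routing algorithm.

```python
from collections import deque

def find_route(links, src, dst):
    """Breadth-first search for a switch-to-switch path over the given ISLs."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route left: the fabric is partitioned

# Hypothetical four-switch ring; each tuple is one inter-switch link (ISL).
ring = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw1")]

primary = find_route(ring, "sw1", "sw2")
# Simulate failure of the direct sw1-sw2 ISL and look for an alternate route.
surviving = [link for link in ring if link != ("sw1", "sw2")]
after_failure = find_route(surviving, "sw1", "sw2")

print("primary route:", primary)
print("route after ISL failure:", after_failure)
```

In this sketch, traffic between sw1 and sw2 still has a path the long way around the ring once the direct link is removed, which is the property a failover test exercises.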
Gadzoox submitted its Capellix 2000G, which the vendor describes as a loop switch. That means the ports on the switch support only loop-type device connections. A loop connection, short for Fibre Channel Arbitrated Loop (FCAL), is an older Fibre Channel connection arrangement in which multiple nodes share the bandwidth of a common transmission channel.
However, to interconnect multiple SAN switches, each switch has to support switched, or fabric, connections on at least some of its ports. The loop vs. fabric difference in SANs is akin to the hub vs. switch evolution of Ethernet. Before Ethernet switches came along, devices on the same Ethernet LAN shared the bandwidth of a common coaxial cable or hub.
The lack of support for fabric connections and multiswitch topologies hurt Gadzoox's scores in the configuration and features categories. Using only one switch, users cannot deploy high-availability, alternate-route topologies. A single-switch Gadzoox SAN would be limited to a maximum of 11 ports (the Capellix 2000G comes with eight ports, plus an option slot that accepts a three-port expansion module). Gadzoox says it has developed a Fabric Switch Module designed to plug in to another modular switch model, the Capellix 3000. Neither was available for our testing.
While performance and management aspects differed considerably, the SAN switches tested had many features in common. For example, the switches all featured Gigabit Interface Converter (GBIC) modularity for all ports. This lets the user readily swap the physical connector of the port. We often switched between copper DB-9 Fibre Channel ports and short-wavelength optical ports to test optical and copper configurations. The vendors all offered both GBIC types, as well as other Fibre Channel physical GBICs, such as those designed for long-wavelength/single-mode fiber transmission. We intermingled GBICs from each vendor with switches from each vendor (for example, a Gadzoox GBIC in a QLogic switch), and experienced no performance or compatibility problems. It appeared that as far as GBICs and GBIC ports are concerned, these devices really are plug-and-play.
All switches supported the same 1G bit/sec transmission rate on all ports, although Fibre Channel specifications have also been defined for 2G bit/sec transmission. A 4G bit/sec Fibre Channel specification is also reportedly now in the works.
All the switches featured a 10/100 autosensing Ethernet port for management access. Brocade, Vixel and Gadzoox all offered a console port, which is the way the IP address is typically entered for management access. In QLogic's case, an IP address is predefined, which we thought was somewhat awkward. The user has to keep track of this prespecified IP address, and go in later and change it to one that's appropriate for the user's particular IP network.
All the switches also supported zoning to roughly the same extent. Zoning is the SAN equivalent of a virtual LAN: ports and their attached nodes can be logically isolated from others. Zoning is used primarily for traffic-control purposes.
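As a rough illustration of the zoning concept, the sketch below models zones as named sets of attached devices and allows two devices to communicate only if they share a zone. The zone names, device names and the `can_communicate` helper are hypothetical; real switches configure and enforce zoning in vendor-specific ways that this example does not attempt to reproduce.

```python
def can_communicate(zones, node_a, node_b):
    """Two nodes may talk only if at least one zone contains both of them."""
    return any(node_a in members and node_b in members
               for members in zones.values())

# Hypothetical zone table: zone name -> set of attached nodes.
zones = {
    "zone_backup": {"server1", "tape1"},
    "zone_db": {"server2", "server3", "disk_array"},
}

print(can_communicate(zones, "server2", "disk_array"))  # same zone
print(can_communicate(zones, "server1", "disk_array"))  # different zones
```

Here server1 is isolated from the disk array even though both could be cabled to the same switch, which is the kind of logical separation zoning provides.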
In addition, all the switches supported the same two classes of Fibre Channel service, Class 2 and 3. Class 3, which is an unacknowledged, connectionless service, accounts for virtually all the traffic carried via SANs today. Class 2, also connectionless but with acknowledgements, is not widely used.
In the configuration category, we awarded the top rating to Brocade's SilkWorm SAN switches. The SilkWorm switches supported all the configuration criteria we looked for: multiswitch topologies, fabric and loop support, GBICs, console port and 10/100 autosensing Ethernet access. In addition, only Brocade offered a redundant power supply for both its eight- and 16-port switch models. QLogic supports redundant power, but only on its 16-port model, the SANbox 16 HA. Neither Gadzoox nor Vixel offered redundant power for the switches they submitted.
Frame buffering, in which the switch temporarily holds onto transiting data frames, is another issue we explored. It helps ensure that frames aren't lost or dropped when unusual events or traffic conditions occur. We checked to see how much buffering the switches implemented on a per-port basis. Gadzoox doesn't employ any frame buffers. QLogic's SANbox switches have eight frame buffers per port. Brocade's SilkWorm switches have 16 per port, and also employ a dynamic buffer pool, which allocates additional buffer memory to ports as needed. Vixel's 7100 switch has 32 frame buffers per port.
There was less disparity in our assessment of features. A major distinguishing factor was the vendor's documentation of interoperability. We asked the vendors to send us whatever documentation and notes they offer to customers regarding interoperability of their switches with other SAN switches, storage systems and host bus adapters (HBA), the SAN term for the Fibre Channel network interface card that goes into SAN-attached servers.
None of the vendors offered much in terms of working with other vendors' SAN switches. Brocade stated it does not claim any interoperability with any other SAN switches. QLogic, Vixel and Gadzoox were more ambiguous about interoperability. For storage systems and HBAs, Brocade had the most specific details, indicating that it works to assure customers of interoperability with specific HBAs and storage systems. QLogic offered some specific details.
Plug and play?
The main criteria in our scoring of installation and ease of use were how long it took to get the vendor's switches up and running, whether we could readily connect our storage system and HBAs, and the problems we encountered along the way.
We used the same QLogic HBAs for testing all the switches. Because SAN interoperability is still evolving, it is not clear whether a different HBA would have caused performance or compatibility issues. Thus, your mileage may vary if you use other HBAs or Just a Bunch of Disks (JBOD) with the switches tested.
Brocade's SilkWorm 2400 and 2800 were truly plug-and-play, and received the highest scores. The next-most plug-and-playable was Gadzoox's Capellix 2000G. Everything connected and worked on the one switch on the first try, although Gadzoox avoided the complexity of a multiswitch topology.
Vixel's 7100 and 7200 and, to a lesser extent, QLogic's SANbox switches, experienced considerable start-up problems. We aren't sure why, and it seemed the companies' tech support people weren't sure, either. (The vendors each had at least one tech support person on-site at the test lab at some point during the testing.) We believe the problems related to subtle interoperability issues between SAN switches, adapters and storage systems.
We were most impressed with QLogic's switch management. The SANsurfer Web/Java-based software was clean, intuitive and stable. An automatic topology map shows how the multiswitch fabric is interconnected, down to the port level. Traffic levels are accurately displayed in real time and there's a good, legible event log.
Brocade's Web Tools Java-based management was reliable and effective, but not as informative or feature rich as that of QLogic. There was no topology diagram, and we could not readily discern from the management interface what types of physical ports were on the switches. Reporting of traffic statistics was good, but there's no online help, which was sorely needed in a few areas.
Vixel's SAN InSite switch management, also Java-based, featured a good event log. The software involves multiple client and server pieces, which we thought was complicated. We evaluated a late beta version of the vendor's SAN InSite 2000 3.0, and we encountered more than a few bugs. In one case, a switch port was consistently reported as DB-9/copper when it was optical fiber. There are many nice features with the package, including excellent online help. However, the whole management interface operated erratically. At one point real-time traffic reporting stopped, and we were unable to re-establish or fix it.
Gadzoox's Java-based management utility, Ventana SANtools, was leaner than the others with regard to graphics and features. For example, there was no capability for real-time monitoring of traffic statistics. We also encountered some issues relating to the layout and navigation of the interface. There is some online help, but no option to search through the help file.
The first of our performance tests, measuring the latency of transiting data, turned out to be a nonissue. All the switches imposed less than 15 msec of delay on data passing through their multiswitch fabrics. QLogic's SANbox, Brocade's SilkWorm and Vixel's 7100 and 7200 ranged between 10 and 15 msec. The latency of Gadzoox's Capellix 2000G was less because there was just one switch to go through.
What about when the switch is being bombarded with data? We clocked the average elapsed time for seven NT servers to perform 10M-byte disk reads and writes, in random order, to a disk-storage system across the SAN switch fabric.
The average transaction times of the SilkWorm, Capellix 2000G and Vixel 7100 and 7200 switches, at 1.515, 1.512 and 1.536 seconds, respectively, per I/O transaction, were nearly identical. The SANbox switches took a little longer, 2.177 seconds on average, to move the same amount of data under the same conditions. This difference is significant and could impact performance if there is a lot of traffic flowing over the SAN.
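To put the transaction times above in proportion, the short calculation below expresses each average as a percentage slower than the fastest result. The timings are the ones reported in the text; the `percent_slower` helper and the dictionary layout are just conveniences for this example.

```python
# Average seconds per 10M-byte I/O transaction, as reported above.
avg_seconds = {
    "SilkWorm": 1.515,
    "Capellix 2000G": 1.512,
    "Vixel 7100/7200": 1.536,
    "SANbox": 2.177,
}

def percent_slower(seconds, baseline):
    """How much longer a transaction took than the baseline, in percent."""
    return (seconds / baseline - 1) * 100

fastest = min(avg_seconds.values())
for name, seconds in avg_seconds.items():
    print(f"{name}: {seconds:.3f} s, "
          f"{percent_slower(seconds, fastest):.1f}% slower than the fastest")
```

By this measure the first three products sit within about 2 percent of one another, while the SANbox average is roughly 44 percent longer than the fastest.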
For throughput, we measured the maximum rate on the Fibre Channel link that connected our target disk-storage system to the SAN switch fabric. We then launched one to seven NT servers, conducting first reads, then writes, and then mixed reads and writes, to the storage system across the SAN switch fabric. (In the case of Gadzoox's Capellix 2000G, the servers and disk storage were connected to the same single switch.)
For one server writing to the disk system, throughputs were nearly identical for all switches. They handled on average between 77.8 and 79.6M byte/sec, a difference that was statistically insignificant. The same was true with one server performing reads; average throughputs ranged from 81.6 to 85.1M byte/sec.
However, with seven servers performing disk reads at the same time, differences began to appear. The Capellix 2000G and Vixel 7100 and 7200 switches averaged 95.3 and 94.3M byte/sec, which nears the 100M byte/sec maximum capacity of Fibre Channel. The SANbox and SilkWorm switches lagged behind with 88.9 and 73.9M byte/sec average throughputs, respectively.
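The seven-server read results can be restated as link utilization against the 100M byte/sec figure cited above. The sketch below does only that arithmetic; the function names are illustrative, and the capacity constant is the nominal value from the text rather than a measured limit.

```python
LINK_CAPACITY_MBYTES = 100.0  # nominal Fibre Channel link capacity cited in the text

# Average throughput (M byte/sec) with seven servers reading concurrently.
seven_server_reads = {
    "Capellix 2000G": 95.3,
    "Vixel 7100/7200": 94.3,
    "SANbox": 88.9,
    "SilkWorm": 73.9,
}

def utilization_pct(throughput, capacity=LINK_CAPACITY_MBYTES):
    """Share of the nominal link capacity actually used, in percent."""
    return throughput / capacity * 100

def headroom_mbytes(throughput, capacity=LINK_CAPACITY_MBYTES):
    """Unused nominal capacity on the storage link, in M byte/sec."""
    return capacity - throughput

for name, rate in seven_server_reads.items():
    print(f"{name}: {utilization_pct(rate):.1f}% of link, "
          f"{headroom_mbytes(rate):.1f}M byte/sec unused")
```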
However, in other comparative throughput tests, with the servers performing disk writes and mixed reads and writes, SilkWorm emerged with the highest average throughputs. The Capellix 2000G came in second, and the Vixel 7100 and 7200 placed third. The SANbox consistently came in last. The concurrent reads-and-writes environment is closest to a real-world environment that users are likely to see.
Two other tests yielded surprising results: disconnection and reconnection of the target disk subsystem from the switch fabric with no traffic running, and fail-over of a switch in a multiswitch SAN fabric with full traffic running between multiple servers and the disk system.
The SilkWorm switches and the Capellix 2000G had no problems with the disconnection and reconnection of the disk system. However, Vixel's multiswitch fabric would not accept and recover from the topology change. Sometimes QLogic's SANbox switch fabric would accept the interruption, then reinitialize and coalesce, while other times it would not. Again, there was no traffic running during this test.
The fail-over test could not be conducted with the Capellix 2000G because the vendor did not support a multiswitch fabric. With full traffic running between seven servers and the disk system, the SilkWorm switch fabric automatically failed over every time, the interruption lasting between 8 and 12 seconds. SANbox also failed over reliably under a heavy load. Because SANbox automatically balances the traffic load on the available routes through its multiswitch fabric, the interruption was barely noticeable. We think this is a plus for the SANbox design.
With light traffic from one server, the Vixel 7100 and 7200 switch fabric would fail over reliably. However, under a heavy load from all the servers, Vixel's multiswitch fabric would not fail over.
With all performance metrics considered, Brocade's SilkWorm 2400 and 2800 were the clear winners in this category, followed by Gadzoox's Capellix 2000G.
Overall, Brocade's SilkWorm switches edged out the SAN switch competition, earning a total score of 8.4 on a 10-point scale. Mier Communications considers any product earning a total score of more than 80 percent a “recommended buy,” and that is certainly the case here.
Mier is president and founder of Mier Communications, a network consultancy and product test center in Princeton Junction, N.J. Percy is a test engineer at Mier Communications.
Copyright 2000 Network World (US), International Data Group Inc. All rights reserved.