Introduction
Organizations of all sizes are struggling to meet the conflicting challenges associated with information storage growth and complexity juxtaposed with global financial uncertainty. A growing number of IT managers are turning to virtualization and consolidation technologies to meet these challenges.
Background
ESG research indicates that a number of factors are driving IT decision makers toward more cost efficient storage solutions. As shown in Figure 1, accelerating data growth, storage system costs, and increasing complexity are cited as significant challenges by IT managers.[1]
In addition to the storage challenges listed in Figure 1, ESG research indicates that reduced operational costs and reductions in capital expenditures are also top priorities when making purchasing decisions.[2] Put it all together and it’s clear that IT managers are looking for modular, cost effective storage solutions that are both efficient and scalable.
Coraid EtherDrive SAN
Coraid EtherDrive products combine commodity hardware, lightweight Ethernet networking, and a scale-out virtual storage architecture that can grow from a single appliance to multi-petabyte installations. As seen in Figure 2, Coraid provides both cost/capacity optimized and performance optimized storage appliances supporting SATA, SAS, and SSD drives. Coraid systems support all standard RAID types including RAID 0, 1, 5, 6, and 10.
To make the offering as turnkey and simple to deploy as possible, Coraid also offers HBAs, servers, and replication appliances. All Coraid products communicate using the lightweight AoE (ATA over Ethernet) protocol and standard Ethernet switches, which provides secure storage networking for industry standard x86 servers.
Coraid EtherDrive SAN promises an impressive list of capabilities, including:
- Price-performance: Higher performance than comparable Fibre Channel configurations, at approximately 20% of the cost.
- Massive throughput: More than 1200 MB/sec of throughput per Coraid EtherDrive SRX-Series storage array shelf for large-block sequential workloads.
- Simple scalability: Ease of implementation and management of Coraid EtherDrive storage compared to Fibre Channel and iSCSI.
- Optimized for virtualization: VMware and Hyper-V see Coraid storage as local-attached disks, with no need for switch configuration or multi-pathing software.
ESG Lab’s testing was designed to explore Coraid’s EtherDrive SAN and the AoE protocol, paying special attention to ease of use and management, capacity and performance scalability, and integration and operation in virtualized environments.
ESG Lab Validation
ESG Lab performed hands-on evaluation and testing of Coraid’s EtherDrive SAN at Coraid’s Redwood Shores, CA headquarters. Testing was designed to demonstrate the ease of installing and configuring an EtherDrive SAN as well as the cost-effective performance and capacity scalability of the platform.
Background: Ethernet SAN
Coraid’s EtherDrive SAN utilizes the AoE protocol to present disk storage to servers across a standard Ethernet network. AoE is an extremely simple method for sharing disk drives through a network. The communication that would normally take place between a motherboard and an IDE disk drive is arranged into data packets and sent across the Ethernet. As can be seen in Figure 3, AoE is a simpler and more direct protocol than either iSCSI or Fibre Channel. AoE is not built on IP, TCP, or SCSI; packets are addressed to devices using their Ethernet MAC addresses and sent across the network with a minimum of overhead.
Fibre Channel and iSCSI are both based on SCSI, which is a complex protocol designed for a variety of devices (scanners, printers, etc.), in addition to disk drives. Because of this, they incur significant overhead when processing each packet. Both Fibre Channel and iSCSI run SCSI over high level networking protocols on top of a physical network infrastructure, consuming additional overhead and processing compared to AoE, which connects servers and storage directly across the physical Ethernet layer. The typical AoE packet contains just 48 bytes, plus the data payload, enabling “bare metal” performance and native Layer 2 multi-pathing. Fibre Channel and iSCSI first encapsulate the data in the SCSI command set and then wrap SCSI in a transport protocol.
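To put the overhead figure in perspective, the short Python sketch below computes what fraction of each packet carries data when 48 bytes of header accompany the payload, as stated above. The payload sizes used (1,024 and 8,192 bytes) are illustrative assumptions, not values measured in this report, and the function name is hypothetical.

```python
# Illustrative only: payload efficiency for a fixed per-packet header size.
AOE_HEADER_BYTES = 48  # per-packet figure quoted in this report


def payload_efficiency(payload_bytes: int, header_bytes: int = AOE_HEADER_BYTES) -> float:
    """Return the fraction of (header + payload) bytes that carry data."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return payload_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    for payload in (1024, 8192):  # assumed payload sizes, not from the report
        print(f"{payload:>5} B payload: {payload_efficiency(payload):.1%} payload")
```

The larger the payload carried per packet, the smaller the share consumed by the fixed header.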
Because they do not run over high-level networking protocols like IP, AoE packets (like Fibre Channel) are non-routable. While they can travel across the switches that make up an Ethernet LAN, routers cannot send them to another network, and devices outside an AoE device’s local network cannot communicate with it. This makes AoE packets intrinsically secure. Coraid enables remote access to EtherDrive SANs for administration via AoE tunneling, which is similar to VPN access to a corporate network over the internet.
Getting Started
ESG Lab testing was conducted on a pre-wired, rack-mounted environment consisting of multiple SR2421 and SRX3500 EtherDrive SAN disk shelves. The ESG Lab test bed, as presented in Figure 4, consisted of multiple industry-standard x86 servers with both 20Gbps Coraid HBAs and 1Gbps Ethernet NICs installed. Servers were running VMware ESX server with Red Hat Linux and Windows 2008 installed as guest operating systems as well as physical Linux and Windows 2008 installations. An industry standard Ethernet switch was used for SAN connectivity.[3]

ESG Lab Testing
ESG Lab testing began by powering on an SRX3500 EtherDrive SAN shelf, then logging into a Linux server. Coraid’s cec utility was used to scan for the new chassis using the AoE protocol. In less than a minute, the shelf was visible.
The next step was to name the shelf to make it easier to identify it in a large deployment. Shelf 3 was chosen as the name for these tests. Next, using just three commands, RAID groups were created (Coraid automatically creates one LUN per RAID group), hot spares were assigned, and the LUNs were brought online, as seen in Figure 5.
LUN masking, the means by which servers are given exclusive access to volumes in a SAN environment, is done by Ethernet MAC address using the “mask” command. ESG Lab did not use LUN masking in these tests.
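The idea behind MAC-based masking can be sketched as a small lookup table. The Python below is a conceptual model only: it does not reproduce the syntax or behavior of Coraid's mask command, the class and method names are hypothetical, and it assumes that a LUN with no mask entries is visible to every server.

```python
# Conceptual model of MAC-based LUN masking; not Coraid's implementation.
from collections import defaultdict


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and strip ':' or '-' separators."""
    cleaned = mac.lower().replace(":", "").replace("-", "")
    if len(cleaned) != 12 or any(c not in "0123456789abcdef" for c in cleaned):
        raise ValueError(f"not a MAC address: {mac!r}")
    return cleaned


class MaskTable:
    """Maps a LUN identifier to the set of MAC addresses allowed to use it."""

    def __init__(self) -> None:
        self._allowed = defaultdict(set)

    def allow(self, lun: str, mac: str) -> None:
        self._allowed[lun].add(normalize_mac(mac))

    def may_access(self, lun: str, mac: str) -> bool:
        allowed = self._allowed.get(lun)
        if not allowed:
            # Assumption: a LUN with no mask entries is open to all servers.
            return True
        return normalize_mac(mac) in allowed
```

Because AoE addresses devices by Ethernet MAC, the MAC address is the natural key for this kind of access check.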
On the Linux server, ls /dev/etherd showed all AoE devices on the network. The storage administrator has nothing else to do—no iSCSI mount, no NFS mount. The AoE LUNs look like local storage. Next, ESG Lab used mkfs to create and format a file system on each of the AoE LUNs.
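For scripted inventory, the same directory listing can be read programmatically. The Python sketch below assumes the naming used by the Linux aoe driver, in which block devices appear as e<shelf>.<slot> with an optional p<partition> suffix, alongside a few control files; that convention comes from the driver, not from this report, and the function names are hypothetical.

```python
# Sketch: list AoE block devices under /dev/etherd and split their names.
import os
import re

# Assumed naming: e<shelf>.<slot>, optionally followed by p<partition>.
_AOE_NAME = re.compile(r"^e(\d+)\.(\d+)(?:p(\d+))?$")


def parse_aoe_name(name):
    """Return (shelf, slot, partition) or None if the name is not a device."""
    match = _AOE_NAME.match(name)
    if match is None:
        return None  # e.g. a control file rather than a block device
    shelf, slot, part = match.groups()
    return (int(shelf), int(slot), int(part) if part is not None else None)


def list_aoe_devices(path="/dev/etherd"):
    """Return sorted (shelf, slot, partition, name) tuples found under path."""
    if not os.path.isdir(path):
        return []  # aoe driver not loaded, or not a Linux host
    found = []
    for name in os.listdir(path):
        parsed = parse_aoe_name(name)
        if parsed is not None:
            found.append(parsed + (name,))
    return sorted(found, key=lambda d: (d[0], d[1], -1 if d[2] is None else d[2]))


if __name__ == "__main__":
    for shelf, slot, part, name in list_aoe_devices():
        print(f"shelf {shelf} slot {slot} partition {part}: {name}")
```

On a host without the aoe driver loaded, list_aoe_devices() simply returns an empty list.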
Creating LUNs and presenting them for use on the network took less than one minute, while creating the file systems for use by the server took about another minute. In less than two minutes, and with just four simple commands, ESG Lab had configured, provisioned, and begun using Coraid EtherDrive storage.
Why This Matters
Storage deployments are growing in capacity and complexity within organizations of all sizes and IT managers are increasingly being asked to manage more storage capacity with stagnant, or shrinking, budgets and staffing. Coraid EtherDrive SAN is designed to address these challenges by providing simple to manage scale-out storage in a cost-efficient commodity package. ESG Lab was able to configure, provision, and start using Coraid networked storage in a Coraid SRX3500 system in less than two minutes from power on. ESG Lab found the ease of implementation and management of AoE-attached Coraid storage shockingly simple compared to Fibre Channel and iSCSI.
Disruptive Price-Performance
Coraid EtherDrive SAN storage is a modular disk storage system providing massive scale-out capacity and performance with granular, just-in-time scalability to industry standard, open systems environments. The Coraid solution scales by simply installing additional disks and shelves, allowing organizations to start small and scale capacity to petabytes. Using 2 TB SATA drives, users can scale to a petabyte of capacity and 100 GB/sec of raw storage bandwidth in just two racks.
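The drive count behind the petabyte figure is simple arithmetic, sketched below in Python. The calculation covers raw capacity only (no RAID or hot spare overhead) and uses decimal units (1 PB = 1,000 TB). The 24-drive shelf size is an assumption for illustration, chosen because the tests in this report used up to 24 drives in a shelf; the function names are hypothetical.

```python
# Raw-capacity arithmetic only; RAID and hot spare overhead are ignored.
import math


def drives_needed(target_tb: int, drive_tb: int) -> int:
    """Number of drives required to reach target_tb of raw capacity."""
    return math.ceil(target_tb / drive_tb)


def shelves_needed(drive_count: int, bays_per_shelf: int) -> int:
    """Number of shelves required to house drive_count drives."""
    return math.ceil(drive_count / bays_per_shelf)


if __name__ == "__main__":
    drives = drives_needed(target_tb=1000, drive_tb=2)       # 1 PB of 2 TB drives
    shelves = shelves_needed(drives, bays_per_shelf=24)      # assumed shelf size
    print(f"{drives} drives in {shelves} shelves")
```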
Performance in a storage environment is best measured with the metrics used by the applications organizations actually run. For an e-mail application, that measurement is the number of users or mailboxes a given system can support. For a streaming media application, the number of objects served concurrently that can be sustained during peak periods of activity is the measurement that matters most.
ESG Lab Testing
Performance was tested using the Iometer workload generator via simulated application workloads based on Microsoft Exchange and streaming media services. Tests were performed to verify a Coraid platform’s ability to deliver predictably scalable performance in a clustered scale-out environment over a standard Ethernet network. The Exchange workload is random in nature and very disk intensive.
Microsoft guidelines recommend a maximum of 1,000 Exchange users per core and less for a server performing multiple roles. This means that a quad-core server, doing nothing but Exchange, should support about 4,000 users.
Microsoft’s IOPS per mailbox guidance for Exchange 2007 is calculated based on the number of messages per mailbox, the user memory profile, the Outlook mode in which the mailboxes are operating, and whether any third party mobile devices are used. The baseline value provided by Microsoft is 0.32 IOPS per mailbox.[4] This means that a quad-core Exchange server with 4,000 Exchange users will, on average, drive 1,280 IOPS to the Exchange datastore. As can be seen in Figure 6, a single SRX3500 LUN was able to support enough transactional IO to support more than 4,500 Exchange users using just 12 SAS drives and scaled linearly to just over 9,000 users with 24 SAS drives.
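The sizing arithmetic above can be reproduced in a few lines of Python. The constants come from the Microsoft guidance quoted in this report; the function names are hypothetical, and the 1,440 IOPS value in the usage lines is simply 4,500 users multiplied by 0.32, not a measurement taken from Figure 6.

```python
# Exchange sizing arithmetic using the guidance quoted in this report.
from decimal import Decimal

IOPS_PER_MAILBOX = Decimal("0.32")  # Microsoft baseline cited above
USERS_PER_CORE = 1000               # Microsoft maximum cited above


def max_users_for_cores(cores: int) -> int:
    """Upper bound on users for a dedicated Exchange server."""
    return cores * USERS_PER_CORE


def required_iops(users: int) -> Decimal:
    """Average IOPS the given number of mailboxes drives to storage."""
    return users * IOPS_PER_MAILBOX


def supported_users(iops: int) -> int:
    """Number of mailboxes a given sustained IOPS level can support."""
    return int(Decimal(iops) / IOPS_PER_MAILBOX)


if __name__ == "__main__":
    users = max_users_for_cores(4)
    print(users, "users need", required_iops(users), "IOPS")
    print("1,440 IOPS supports", supported_users(1440), "users")
```

Decimal is used instead of floating point so that 0.32 is represented exactly.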
Next, streaming media performance was examined. This type of traffic is sequential in nature and uses larger block sizes than transactional workloads, putting more of a load on the storage network.
As Figure 7 shows, streaming media performance was excellent, delivering 826 MB/sec from just 6 SSD drives and more than 1,200 MB/sec from 24 SATA drives. Put into perspective, a single shelf was able to drive enough bandwidth to saturate a 10Gbps interface.
The maximum throughput recorded (1200+ MB/sec) was used to calculate the number of streams that could be delivered for a couple of well-known content types including standard definition and high definition broadcast video. Bit stream rates of 3.75 Mbps for standard definition broadcast video and 80 Mbps for high definition video were used to determine that a single SRX3500 has the bandwidth required to simultaneously stream 120 high definition broadcast videos or 2,560 standard definition broadcast videos as shown in Figure 8.
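The stream counts follow directly from the throughput and bit rates given above, as the Python sketch below shows. It assumes decimal units (1 MB/sec = 8,000 kbps) and rounds the measured throughput down to 1,200 MB/sec; the function name is hypothetical.

```python
# Convert storage throughput into concurrent video streams.
SD_STREAM_KBPS = 3750    # 3.75 Mbps standard definition rate used in this report
HD_STREAM_KBPS = 80000   # 80 Mbps high definition rate used in this report


def streams_supported(throughput_mb_per_sec: int, stream_kbps: int) -> int:
    """Whole number of streams that fit in the given throughput."""
    throughput_kbps = throughput_mb_per_sec * 8 * 1000  # MB/sec -> kbps (decimal)
    return throughput_kbps // stream_kbps


if __name__ == "__main__":
    shelf_mb_per_sec = 1200
    print("Gbps:", shelf_mb_per_sec * 8 / 1000)
    print("HD streams:", streams_supported(shelf_mb_per_sec, HD_STREAM_KBPS))
    print("SD streams:", streams_supported(shelf_mb_per_sec, SD_STREAM_KBPS))
```

Working in integer kilobits per second avoids floating point rounding on the 3.75 Mbps rate.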

What the Numbers Mean
- The system showed excellent disk response times for both random and sequential IO. The simulated Exchange disk IO response time was 20ms, while streaming media requests from SATA disk were satisfied in just 1ms.
- Microsoft stresses that, to ensure a positive user experience, the Exchange database LUN requires read and write response times of 20 milliseconds or less so that Exchange can service users’ client software quickly and efficiently. In this context, the SRX3500’s performance is right on target.
- A single SRX3500 has the raw bandwidth required to service 2,560 concurrent standard definition, broadcast-quality video streams.
Next, ESG Lab examined cost of acquisition for a petabyte of storage and SAN connectivity for various technologies. Each storage technology was configured to support the same class and quantities of storage, and SAN connectivity was calculated to support 200 physical servers with redundant connections. Table 2 summarizes the configuration built for each technology.
The cost of storage and SAN connectivity hardware was obtained from a combination of publicly available sources, including reseller websites, GSA pricing schedules, and online pricing available directly from vendors.
The cost was calculated for modular dual controller Fibre Channel SAN arrays from three major vendors. The cost of dual controller multi-protocol arrays from two major vendors and the cost of direct attached storage (DAS) solutions from two major vendors were also calculated. The solution with the lowest overall price in each category was used for the comparisons presented in this report.
The bottom line results are summarized in Figure 9. Note that the costs of iSCSI, multi-protocol, and FC SAN solutions are significantly higher than a comparable Coraid EtherDrive SAN system and that the base costs of a Coraid SAN solution are lower even than DAS.
Calculated costs are detailed in Table 3.

What the Numbers Mean
- Coraid EtherDrive SAN has the lowest cost of acquisition, by a wide margin.
- The relative cost of acquisition of alternative technologies ranges from roughly 1.4x for DAS to more than 5x for FC SAN.
- The FC SAN solution is so much more expensive in part due to the cost of acquiring FC SAN connectivity.
- DAS technology has a number of limitations that were not considered in this analysis. First and foremost, it is a dead-end when it comes to server virtualization. SAN attached storage is needed to take full advantage of the benefits of server virtualization. Storage capacity held captive within, or directly attached to, a server can’t be moved non-disruptively to another server for maintenance or better quality of service. SAN attached storage is also needed to achieve valuable disaster recovery capabilities that have recently become available from server virtualization vendors (e.g., VMware Site Recovery Manager). And finally, islands of DAS capacity typically lead to poor storage utilization. Poor storage utilization dramatically increases the overall cost of ownership.
- In addition to CAPEX, ESG Lab believes it is likely that Coraid EtherDrive’s simplified architecture and management would also yield OPEX savings over alternate technologies.
ESG Lab also compared price-performance for the Coraid EtherDrive SAN systems tested to publicly available results published for DAS and traditional Fibre Channel SAN systems. Price-performance was determined using a simple calculation of cost in dollars for a specific configuration divided by the number of MB/sec supported by that platform.
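The price-performance calculation described above amounts to a single division, sketched here in Python. The dollar figure in the usage lines is a made-up placeholder chosen for round numbers, not a price from this report, and the function name is hypothetical.

```python
# Price-performance: dollars per MB/sec of throughput (lower is better).
def price_performance(cost_dollars: float, throughput_mb_per_sec: float) -> float:
    """Return cost in dollars divided by throughput in MB/sec."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return cost_dollars / throughput_mb_per_sec


if __name__ == "__main__":
    # Hypothetical configuration: $12,000 delivering 1,200 MB/sec.
    print(f"${price_performance(12000, 1200):.2f} per MB/sec")
```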

Why This Matters
The metrics that matter when shopping for a high capacity, high performance storage solution are performance, price, and scalability. In other words, how many dollars will be needed to meet the performance and capacity needs of scale-out applications? ESG Lab has confirmed that each SRX3500 can deliver hundreds of MB/sec of throughput for bandwidth-intensive scale-out applications using cost-optimized, high capacity SAS, SATA, and SSD drives and users can scale up to a petabyte of high performance capacity in only two racks at a cost of storage and connectivity far below Fibre Channel, iSCSI, or even DAS.
Virtualization Optimized
Coraid EtherDrive SAN storage systems integrate with VMware using a simple driver that enables VMware to mount EtherDrive storage arrays as if they were local drives. A VMware administrator can provision and manage virtual machine storage without the need for FC SAN administration or iSCSI client configuration.
ESG Lab Testing
ESG Lab performed virtualization tests on a VMware ESX 4.0 environment with two physical servers and six virtual machines.
First, ESG Lab logged into the vSphere client and clicked on server 192.168.0.214. As seen in Figure 10, the Coraid EtherDrive HBA was visible in the list of storage adapters and volume 10, created using the steps in Figure 5, was visible and ready for use.
The volume was formatted and made available to virtual machines using the Add Storage wizard, shown in Figure 11.
Next, the volume was assigned to a virtual machine using the native VMware Add Hardware wizard. Once the addition was complete, the volume was visible to the Windows operating system on the virtual machine. Figure 12 shows the Windows Disk Administrator tool with the new drive circled in green.
Finally, ESG Lab examined availability, testing the synchronous mirroring capability of the Coraid EMX EtherDrive Mirror Appliance as well as the ability to physically move disk drives between chassis without disruption.
The availability test bed, depicted in Figure 13, consisted of three Coraid EtherDrive SR2421 shelves, one EMX Mirror Appliance, and one vSphere server, with one virtual machine running Windows Server 2008.
Two 12-disk RAID5 LUNs were created on two separate shelves and synchronously mirrored through the EMX appliance. Mirroring two volumes using the EMX appliance could not have been simpler. The mkmir command was used to select the source and target volumes to be mirrored. This single command pairs the volumes and starts the synchronization.
Next, the volume was assigned to a Windows Server 2008 VM on the vSphere server. Once the volumes were fully synchronized, an Iometer workload was started on the server, performing a mixed read/write workload against the volume, set to continue indefinitely. Power to the primary SR shelf hosting one side of the mirror was killed. Iometer continued reading and writing to the volume with no errors.
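The behavior observed in this test, with IO continuing after one side of the mirror loses power, is what synchronous mirroring is meant to provide. The toy Python model below illustrates the general technique only: every write goes to all online replicas before it is acknowledged, so either replica can serve reads alone. It is not a description of how the EMX appliance is implemented, it does not model resynchronization after a failed replica returns, and all class and method names are hypothetical.

```python
# Toy model of synchronous mirroring; not the EMX appliance's implementation.
class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self.blocks = {}  # logical block address -> data


class SyncMirror:
    def __init__(self, primary: Replica, secondary: Replica) -> None:
        self.replicas = [primary, secondary]

    def write(self, lba: int, data: bytes) -> int:
        """Write to every online replica, then acknowledge.

        Returns the number of replicas that hold the new data.
        """
        online = [r for r in self.replicas if r.online]
        if not online:
            raise OSError("no replica available")
        for replica in online:
            replica.blocks[lba] = data
        return len(online)

    def read(self, lba: int) -> bytes:
        """Read from the first online replica that holds the block."""
        for replica in self.replicas:
            if replica.online and lba in replica.blocks:
                return replica.blocks[lba]
        raise OSError(f"block {lba} unavailable")


if __name__ == "__main__":
    a, b = Replica("shelf-a"), Replica("shelf-b")
    mirror = SyncMirror(a, b)
    mirror.write(0, b"data")
    a.online = False              # simulate power loss on one shelf
    print(mirror.read(0))         # still served by the surviving replica
```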
Finally, a single eight-disk RAID5 LUN in a single chassis was used to test the online drive relocation capability of the Coraid architecture. The LUN was assigned to a Windows 2008 VM and an Iometer workload was started on the server, again performing a mixed read/write workload against the volume, set to continue indefinitely.
Power was killed to the chassis housing the eight-drive RAID 5 LUN. All eight disks were then physically relocated from the primary chassis to a spare chassis. The spare chassis was then renamed to have the same shelf number as the original chassis and the eight-disk LUN was placed online.
Total time for this physical failover was approximately three minutes. After the LUN was placed back online, the Iometer transactions resumed successfully with no further service interruption. Most, if not all, other architectures, including highly available Fibre Channel and iSCSI SANs, simply cannot take LUNs offline in a VMware environment while machines are running without bringing the server to a crashing halt.
Why This Matters
As virtual infrastructures grow, the requirement for storage space grows exponentially. According to ESG research, over half (54%) of current server virtualization users estimate their organization has experienced a net increase in total storage volume since their organization implemented a server virtualization solution.[5] The ability to take advantage of networked storage as if it were locally attached storage allows common storage functions to be performed quickly and easily, reducing wait times for storage needs. As virtualized environments grow, more critical applications find a home there. As more critical applications are placed on virtualized servers, the need for highly available networked storage becomes essential. ESG Lab was able to provision storage for virtual machines without the need for a storage administrator to complete the task. Likewise, the entire virtual storage infrastructure and the mappings to Coraid storage devices were visible through the vSphere client. The Coraid EMX Mirror appliance was able to synchronously mirror a live volume and provide seamless failover with no interruption in service. The ability to move disks between chassis live and online, while under load, was an eye opener; the support implications of simply relocating disks to a hot spare chassis are profound. Most, if not all, other architectures, including highly available Fibre Channel and iSCSI SANs, simply cannot take LUNs offline in a VMware environment while machines are running without bringing the server to a crashing halt.
ESG Lab Validation Highlights
- ESG Lab configured, provisioned, and was utilizing Coraid storage in less than two minutes from power on.
- The SRX3500 demonstrated the ability to support thousands of Exchange users using just 12 SAS drives.
- Coraid EtherDrive SAN was able to drive more than 1,200 MB/sec from a single appliance, enough to stream 2,560 standard definition, broadcast-quality video streams simultaneously.
- Commodity hardware and cost-efficient AoE connectivity enable a cost of acquisition far less than Fibre Channel, iSCSI, and even DAS.
- Coraid proved well-suited to virtualized environments, providing simple to provision SAN storage that looks to a VMware cluster like direct attached disk.
- The EMX Mirroring appliance provided synchronous data protection for volumes across shelves with no disruption to service.
- ESG Lab was able to remove drives that were actively being accessed and move them to a different chassis with only a momentary pause in IO and no errors.
Issues to Consider
- Coraid’s EtherDrive SAN is currently managed through a command line with no GUI. The system is incredibly simple to use and manage, with all necessary functions controlled through a few simple commands and logical, human readable addressing of shelves, disks, and LUNs. Coraid indicated plans to ship an upgraded management system in Q3 2010 with a GUI and REST API support.
- The Coraid EtherDrive SAN solution does not yet offer advanced storage virtualization functionality such as thin provisioning or storage tiering. The driving factor behind these features, reducing the cost of storage, does not necessarily affect Coraid as it does traditional SAN architectures, which typically sell for many multiples of Coraid’s acquisition cost. In addition, these features are increasingly available in software at the hypervisor or file system layer, further obviating the need for them as array-based features.
The Bigger Truth
With storage costs consuming at least 28% of IT budgets,[6] companies are under constant pressure to find ways to reduce costs. Taking a long hard look at reducing capital and operational costs in the storage environment makes sense and so today, more than ever, IT is investing in new technology with a clear focus on reducing storage costs.
The high capacity and performance requirements of scale-out applications including backup to disk, content delivery, server and desktop virtualization, clustered computing, rich media, and un-structured bulk storage are taxing the budgets and infrastructure of IT organizations. Traditional storage network infrastructure can provide the capacity, agility, and performance these applications need, albeit at a high cost of entry and daunting complexity. Rows of equipment are often needed to provide a petabyte of capacity and gigabytes per second of throughput. Data center managers are being pushed to the limit as administrators spend more and more time managing an ever-expanding SAN infrastructure.
ESG Lab found that the Coraid EtherDrive SAN storage system delivers shockingly simple deployment and management, with complete functionality delivered via a handful of easy to use commands and rock solid Ethernet SAN connectivity delivered via the extremely lightweight AoE protocol. The ease of management, deep scalability, and performance required for bandwidth-intensive scale-out applications are seamlessly extended to VMware environments as well.
ESG Lab testing has confirmed that Coraid’s architecture provides consistent levels of throughput—even during hardware faults. Sustained throughput in excess of 1,200 MB/sec was observed for large block sequential reads. Cost-efficiency was impressive, with acquisition costs as low as 20% of the costs of traditional SAN attached storage. ESG Lab also verified a very interesting recoverability and resiliency feature, whereby drives can be moved to a spare chassis while an application is running.
With EtherDrive SAN storage, Coraid has dramatically simplified storage for consolidated and virtualized environments while enhancing performance and providing incredible cost efficiency. While the speeds and feeds are impressive, ESG Lab is most impressed by the shocking simplicity of both the AoE protocol and the Coraid architecture, making management of petabytes a reasonable task. If your organization is struggling to keep up with exponential data growth while providing ever higher levels of performance and availability, ESG Lab recommends that you consider Coraid EtherDrive SAN storage as the foundation for your virtualized data center.
Appendix

[1] Source: ESG Research Brief, Enterprise Storage Priorities Emphasize Information and Infrastructure Efficiency, January 2009.
[2] Source: ESG Research Report, 2010 IT Spending Intentions Survey, January 2010.
[3] Configuration details can be found in the appendix.
[4] http://msexchangeteam.com/archive/2007/01/15/432207.aspx
[5] Source: ESG Research report, The Impact of Server Virtualization on Storage, December 2007.
[6] Source: ESG Research Report, Enterprise Storage Survey, November 2008.
ESG Lab Reports
The goal of ESG Lab reports is to educate IT professionals about emerging technologies and products in the storage, data management and information security industries. ESG Lab reports are not meant to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our objective is to go over some of the more valuable feature/functions of products, show how they can be used to solve real customer problems and identify any areas needing improvement. ESG Lab’s expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these products in production environments. This ESG Lab report was sponsored by Coraid.






An alert reader pointed out a technical error in Figure 3 of this report. IPSEC was incorrectly placed below IP in the iSCSI protocol stack. Figure 3 has been corrected and replaced. Thanks Stephen!