Enterprise Strategy Group | Getting to the bigger truth.TM

Market Reports: Virtualized Scale-Out NAS: Enabling Flexible, On-Demand Data Services
Published on Monday, November 30th, 2009 at 10:05 am
Categories: File-based Disk Storage Systems and File System Software | IT Infrastructure | Market Reports | Storage
Authors: Terri McClure
User-generated content, unstructured data, video surveillance, low-latency trade data, and life sciences are some of the fastest data growth types, but it seems no industry is safe from massive file data growth. Across the board, file formats are richer and file sizes are growing exponentially. This massive data growth is creating demand for new and innovative scale-out file storage solutions to economically scale bandwidth, manageability, and performance to previously unheard-of heights. Using traditional scale-up architectures in these types of environments is unrealistic. “Scale-out NAS” are systems designed from the ground up for economically dynamic scale and for supporting extremely high bandwidth applications.

New Economic Realities

The Information Evolution

The information we store today is very different from the information we stored a mere decade ago. File formats are richer, creating exponential growth in file sizes and data. Every endpoint has become a content creation and capture device. These devices have advanced to enable faster and more efficient business processes and are driving massive unstructured data growth of both business- and user-generated content. Nowhere has the impact been felt more than in the data center storage domain. Chip manufacturers are rendering multi-terabyte files, oil and gas exploration relies on 3-D models in the hundred-terabyte range, and health care, with high-definition and 4-D imaging, is creating files in the hundreds of megabytes. Online service providers are seeing tremendous growth from user-generated content. According to YouTube, more than ten hours of video are uploaded every minute of every day.[1] And over two billion photos and fourteen million videos are uploaded to Facebook each month.[2] User-generated content, unstructured data, video surveillance, low-latency trade data, and life sciences are some of the fastest data growth types, but it seems no industry is safe from massive file data growth. Across the board, file formats are richer and file sizes are growing exponentially.

Corporate computing environments, while lagging behind consumer markets, are slowly but steadily moving into the realm of Web 2.0.[3] Online communities, social networking, new media, collaboration, and other applications are pushing their way into the commercial computing world. Like it or not, IT is going to have to get ready for these rapidly evolving realities of today’s business. Tools such as SharePoint, blogs, wikis, streaming media, and a host of other digital content creation and management applications are enabling organizations to redefine themselves in near real-time.

This massive data growth is creating demand for new and innovative scale-out file storage solutions to economically scale bandwidth, manageability, and performance to previously unheard-of heights. Using traditional scale-up architectures in these types of environments is unrealistic. “Scale-out NAS” are systems designed from the ground up for economically dynamic scale and for supporting extremely high bandwidth applications. These systems have become standard fare in throughput- and performance-intensive environments, such as media and entertainment and high performance computing (HPC). But the enterprise adoption of commercial HPC applications and advances in digital media, combined with the advent of Web 2.0, bring the requirement for these types of systems squarely into corporate data centers.

Users Examine More Cost Effective Storage Approaches

Today, the challenge facing most enterprises is that file data growth is already out of control; the growth of file data has been outpacing e-mail- and database-driven growth for quite some time now. In fact, in a recent ESG survey, more than one in five medium-size businesses cited rapid growth in file-based content as one of their most pressing storage challenges.[4] ESG also estimates that file-based data will account for 70% of total archived capacity by 2012.[5] For commercial enterprises, the faster growth and new file characteristics enabled by advances in content capture devices, combined with the emergence of Web 2.0 applications in the data center, only exacerbate the problem. As such, ESG expects customers to continue to make new NAS purchases to accommodate this growth and to drive capital and operational cost efficiencies by consolidating sprawling file servers.

But consolidation and richer file data are not the only issues driving users to examine scale-out NAS solutions.  The global financial crisis has driven IT to examine every new purchase with an increased focus on finding opportunities to reduce both capital and operational expenses.  Technologies that reduce overall storage requirements or that drive higher levels of resource utilization are seeing a significant uptick in interest—and traditional ways of doing business are being re-examined to find ways to drive better efficiencies.

Scale-Out NAS Makes Economic Sense

Multi-dimensional scale is a core requirement of rich file-based storage architectures as well as other applications with similar requirements.  Scale-out, the ability to independently scale and tune bandwidth, processing, and storage capacity on the fly—all while managing the file system and single global namespace—is becoming the new backbone of file-based storage solutions.

Scale-out storage architectures are significantly different than the monolithic, scale-up storage architectures (e.g., traditional NAS or SAN systems) that were developed to meet distributed computing needs.

Scale-Up versus Scale-Out

Scale-up NAS is just as it sounds: it is designed to be monolithic, where lots of storage sits behind one or two NAS heads, and is designed to scale into the multi-TB range behind those NAS heads.  Once the limit on storage is hit, a new monolithic system is installed, with a new file system to manage.  There is no way to share the workload between the systems and migrating directories or files between systems means remapping and remounting for each and every client with access.  Those that have been through it know the pain of the process; it can be excruciating in a large enterprise environment with lots of clients and zero tolerance for downtime.

Performance in today’s monolithic scale-up systems is often scaled by adding a storage rack and more spindles to increase throughput and reduce latency. In addition to adding spindles, the extra drives are often short-stroked to reduce seek time and further increase throughput, creating an environment with very low storage utilization rates. This is an expensive proposition for serving throughput-intensive applications. To create an efficient file serving environment, storage capacity needs to be scaled independently of bandwidth and servers (scale-up), and I/O capacity and sequential file serving performance need to be scaled by adding nodes (scale-out).

The Economics of Scale-Out NAS

Scale-out NAS not only meets rich media performance requirements, it does so cost effectively. With independent scaling of storage capacity, processors, and bandwidth, users can grow when and as needed without buying racks and power supplies in advance of capacity or buying extra spindles to stripe files across.  Consequently, scale-out NAS provides “just-in-time” scalability.  And, with most scale-out systems, many low level storage management tasks are automated, such as expanding the file system when new physical capacity is added and load balancing performance across processors, significantly reducing management costs.

Adding processing power independently, as can be done with scale-out systems, saves more than floor and rack space. In addition to getting better performance, it significantly reduces power consumption relative to scale-up systems since processors typically use 95% less power than an additional disk shelf consumes.

This granular scaling capability provides a price/performance advantage as it allows users to start small and scale where needed.  And, since scale-out systems scale into the multi-petabyte range and are managed as a single entity under a global namespace, the systems can meet most users’ needs without paying the management penalty associated with deploying tens or hundreds of scale-up systems.

In late 2008, ESG conducted a survey of 504 North American and Western European IT professionals to assess data storage environments, including the adoption of scale-out NAS. Market drivers for early adopters included faster provisioning, improved scalability and performance, easier management, and the need to support specific fast-growing applications. Lower cost of infrastructure was literally last on the list of buying criteria. However, planned and potential users have vaulted lower cost into the top tier of purchasing criteria, second only to improved scalability, which is the crux of the technology (see Figure 1). For users evaluating new NAS solutions, initial cost has become a higher priority than the advanced features and functions of scale-out NAS systems, though scale-out systems provide cost advantages that compound over time.

Figure 1. Scale-out NAS Adoption and Planned Adoption Drivers

Scale-out NAS architectures have a number of cost advantages over scale-up solutions, ranging from start-up costs to managing technology refreshes, and most of the steps in between. Scale-out NAS carries a significantly lower infrastructure cost compared to scale-up systems for a number of reasons:

  • Low entry cost: The entry cost for scale-out systems varies depending on the minimum configurations supported.  Most systems start as small as two nodes and scale out from there.  This stands in significant contrast with enterprise-class scale-up storage systems, where you have to buy a big system and fill it with disk drives over time—powering, cooling, and taking up floor tiles well ahead of putting additional capacity online.
  • Just-in-time scalability: Performance or capacity nodes can be added as needed.  Because of the modular nature of scale-out systems, there is no need to buy (and power or cool) frames, power supplies, and mostly empty cabinets in advance of storage capacity.
  • Riding the commodity curve: Scale-out NAS systems typically use low-cost, high capacity, commodity SATA disk drives.  Because of their multidimensional scale and load balancing capability, the slower performance of SATA drives relative to Fibre Channel drives can be mitigated and is entirely suitable for the markets previously discussed, like HPC, life sciences, and Web 2.0.  The same cost advantages are typically found on the NAS head, where commodity processors are used.  Because of the granular scalability of scale-out systems, users don’t need to buy frames or processors far ahead of disks themselves, so they typically get better pricing as Intel processors and disk prices decline in cost over time.  Riding the Intel and high capacity disk commodity curves can add up to significant cost savings, especially at the scale seen in these types of environments.
  • Higher utilization rates: Better utilization means deferred purchases of new capacity.  Since all of the NAS heads in typical scale-out systems can address the entire pool of usable capacity in the cluster, there is no capacity locked away behind underutilized NAS heads—a common problem in scale-up systems.  It is not unusual to see utilization rates of 30% or less in scale-up systems and 60% or more in scale-out systems.  Some scale-out vendors report utilization rates greater than 80%.
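
To make the utilization point concrete, the short calculation below shows how much raw capacity has to be deployed to hold the same amount of data at different utilization rates. It is only an illustrative sketch: the 100 TB workload and the function name are assumptions made for this example, while the 30%, 60%, and 80% rates are the ones cited above.

```python
def raw_capacity_needed(stored_tb: float, utilization: float) -> float:
    """Raw capacity (TB) that must be deployed to hold `stored_tb` of data
    when only the fraction `utilization` of deployed capacity is in use."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return stored_tb / utilization


STORED_TB = 100.0  # hypothetical amount of file data, chosen for illustration

for label, rate in [
    ("scale-up at 30% utilization", 0.30),
    ("scale-out at 60% utilization", 0.60),
    ("scale-out at 80% utilization", 0.80),
]:
    print(f"{label}: about {raw_capacity_needed(STORED_TB, rate):.0f} TB deployed")
```

Under these assumptions, the same 100 TB of data needs roughly 333 TB of deployed capacity at 30% utilization, but only about 167 TB at 60% and 125 TB at 80%, which is where the deferred purchases come from.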

Relative to scale-up systems, operational savings can be achieved over time with scale-out systems thanks to:

  • Reduced change management planning cycles. When one file can be multiple terabytes in size, conventional three or six month change management planning cycles are no longer effective.  Requirements are unpredictable and time-to-provision is more important than ever.  The modular and easily scalable characteristics of scale-out NAS allow for extremely fast provisioning while the lengthy change management and provisioning process required for monolithic systems just isn’t fast enough to respond to today’s rich media and Web 2.0 demands.  Organizational agility demands that change management cycles be reduced—and scale-out NAS allows that to happen.
  • Non-disruptive technology refreshes.  With most scale-out systems, the process of managing technology refreshes is faster and easier than with monolithic scale-up NAS.  In a clustered NAS architecture, everything is redundant—the data paths, NAS heads, and the data itself.  Several scale-out vendors provide both forward and backward compatibility with new versions of hardware, firmware, and software so new versions can co-exist in the same namespace as older versions.  This provides users with the ability to do rolling upgrades, plugging new nodes into the system and unplugging nodes when they need to be retired.  Each scale-out vendor does this slightly differently and it is not a process that should be undertaken lightly, but it is a vast improvement over the lengthy process required to migrate terabytes of data off of a monolithic system and onto a new one.
  • Capacity scaling without scaling headcount. Essentially, it should be just as easy to manage a clustered storage system with 100 nodes as it is to manage one with two nodes.  Scale-out NAS systems enable this through a global namespace.  This is a virtual representation of a group of disparate physical file systems.  It sits between clients and the assorted file servers in a given environment and adds a layer of abstraction that divorces what the client sees as mount points from the physical server mount points.  It is a map that translates the virtual mount points to physical file servers and presents users with one consolidated view of the file server ecosystem.  It is the secret sauce that enables a single point of management and non-disruptive data migration.  Regardless of how big the cluster gets, it should still remain a single logical system to manage. The ease of management over the life of the storage system is even more valuable than scalable performance.
  • Automated, policy-based management. Removing the need for human intervention in low-level storage management functions is another way that scale-out NAS reduces management cost. Most scale-out file storage systems support deep levels of policy-based self management and healing. Most systems are also plug-and-play: add a storage or processor node and the system self-discovers and expands the file system or incorporates it into load balancing algorithms on the fly. There is typically no disruption of service and no requirement to plan data layouts, create LUNs, or migrate data. Many of these products are newer to the market and have been designed from the ground up to automate storage management processes. Scale-out systems typically absorb new processor, bandwidth, and storage capacity and then automatically re-balance and optimize across the newly added resources, with little or no human intervention. This is significantly different than managing these functions in scale-up NAS systems. Most scale-up systems have hot spot reporting and some have load balancing across drives within RAID groups, paths, or host bus adapters. Some of the load balancing is manual, some is automated, but scale-up systems do not have the capability to automatically balance loads across NAS heads, at least not without adding a virtualization appliance to mask the move from clients. For scale-up systems, balancing workloads across NAS systems is a fully manual process, one that takes significant time and effort to migrate file systems and directories and remap mount points.
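
The global namespace described in the list above is essentially a lookup table between what clients mount and where data physically lives. The following is a minimal sketch of that idea, not any vendor's implementation: the class, its methods, and the server and export names are all hypothetical, and the actual copying of data during a migration is not modeled.

```python
class GlobalNamespace:
    """Toy global namespace: maps virtual mount points to physical locations."""

    def __init__(self) -> None:
        # virtual mount point -> (physical server, physical export path)
        self._map: dict[str, tuple[str, str]] = {}

    def publish(self, virtual_path: str, server: str, export: str) -> None:
        """Expose a physical export under a client-visible virtual path."""
        self._map[virtual_path.rstrip("/")] = (server, export)

    def resolve(self, client_path: str) -> str:
        """Translate a client path to 'server:/export/...' using the
        longest virtual mount point that prefixes it."""
        best = None
        for vpath in self._map:
            if client_path == vpath or client_path.startswith(vpath + "/"):
                if best is None or len(vpath) > len(best):
                    best = vpath
        if best is None:
            raise KeyError(f"no mount point covers {client_path!r}")
        server, export = self._map[best]
        return f"{server}:{export}{client_path[len(best):]}"

    def migrate(self, virtual_path: str, new_server: str, new_export: str) -> None:
        """Repoint a virtual mount at new hardware; client paths are unchanged."""
        key = virtual_path.rstrip("/")
        if key not in self._map:
            raise KeyError(f"unknown mount point {virtual_path!r}")
        self._map[key] = (new_server, new_export)


# Hypothetical names, for illustration only.
ns = GlobalNamespace()
ns.publish("/corp/projects", "filer-a", "/vol/projects")
before = ns.resolve("/corp/projects/chip/design.gds")

ns.migrate("/corp/projects", "filer-b", "/vol/projects_new")
after = ns.resolve("/corp/projects/chip/design.gds")

print(before)
print(after)
```

The client keeps using `/corp/projects/chip/design.gds` before and after the move; only the mapping changes. That indirection is what lets administrators migrate or rebalance data without remapping and remounting every client.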

Based on the compelling economic benefits of deploying scale-out NAS solutions, it’s no surprise that ESG research indicates that users are applying scale-out NAS systems to new use cases.  While most scale-out systems are tuned to perform well for high bandwidth applications, a steadily increasing number of vendors are offering scale-out systems that can also be tuned to support the smaller transaction-oriented file serving requirements of today’s distributed computing environments.  In fact, 43% of scale-out NAS users surveyed by ESG indicated that the technology is used to support database and OLTP transactions.  Further proof that scale-out NAS is increasing its footprint in the general storage space is that even though only 11% of those surveyed indicated they use scale-out NAS systems today, 40% indicated they plan to deploy it within the next 12 months and while another 37% have no immediate plans to deploy scale-out NAS solutions, they are investigating the technology (see Figure 2).

Figure 2. Scale-Out NAS Adoption Plans and Interest


The Software Virtualization Layer

For the past several decades, data management and protection functionality has been migrating from being a server and application functionality to being a storage functionality.  But in the past few years, users have realized that there can be some drawbacks to that approach, such as vendor lock-in, specialty information stovepipes, limited scalability, and management challenges as each system is deployed and managed separately. On the other hand, there are a number of advantages to deploying scale-out as a software virtualization layer.  Migrating this functionality back up the stack and creating a virtual layer brings benefits that include:

  • Cross-platform tiering. Many vendors are discussing storage tiering, but they often mean “in the box” tiering that supports a variety of drive types within a single array. Considering that most data only stays active for about 30 days after it is created, it makes sense to migrate that data to denser, slower disk drives. But if the data was created in a Tier 1 storage system, there is a high price associated with buying the system, no matter what types of drives are included, because the system was designed from the ground up to meet demanding performance needs. Moving data within one of these systems from high-performance storage (disk or solid state) to slower, denser drives still carries the “array tax” that comes with the high-end enclosures, electronics, and operating systems. With a software virtualization layer, cross-platform tiering can be deployed to actually migrate data across hardware types and drives from a Tier 1 array to an array designed for bulk data stores, one without all the magic (and associated costs) required to support demanding storage requirements.
  • Resource pooling. A virtualized storage infrastructure allows users to create a shared resource pool that can flex resources on demand when and as needed. The storage pools can be dynamically configured to meet performance, capacity, or throughput optimization requirements by adding resources as required, and multiple resource pools optimized for different performance and protection profiles can live within the same namespace. Systems can be scaled up, down, or out as needed to meet business demands. Because the environment is virtualized, stovepipes are eliminated and storage is offered in a flexible deployment model to support business requirements. Because the shared pools can span across platforms, higher utilization rates can be attained for further savings on power, cooling, and floor space.
  • Infinite availability. Because the software masks the physical ties between clients that access data and where that data physically lives, users can deploy a true always-on architecture. Data can be transparently migrated to newer generations of hardware without disrupting data access. Scheduled maintenance can be performed without taking the system offline.
  • Reuse of existing assets. A software virtualization solution can be plugged into an existing SAN infrastructure and leverage legacy storage. This extends the life of existing investments as older, slower technology can be deployed for bulk data stores and newer, higher tiers can be brought into the resource pools to support growing file demands. The virtualized environment is an open environment that gives users choice and flexibility for the underlying hardware layer.

More and more functionality is migrating up the stack, adding greater flexibility and faster deployment of resources to meet business demands.  The virtualization layer brings IT a giant step forward in its march to provide on-demand IT services to the business.
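
As one illustration of what such a virtualization layer automates, the cross-platform tiering idea described above can be expressed as a simple policy: anything on the Tier 1 array that has not been accessed for about 30 days becomes a candidate to move to a bulk array. The sketch below assumes hypothetical tier names, record fields, and file paths, and it only plans the moves; it does not perform them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    tier: str              # "tier1" or "bulk" (hypothetical tier names)
    last_access: datetime


def plan_tiering(
    files: list[FileRecord],
    now: datetime,
    inactive_after: timedelta = timedelta(days=30),
) -> list[str]:
    """Return paths on tier1 whose last access is older than `inactive_after`."""
    return [
        f.path
        for f in files
        if f.tier == "tier1" and now - f.last_access > inactive_after
    ]


# Hypothetical catalog, for illustration only.
now = datetime(2009, 11, 30)
catalog = [
    FileRecord("/corp/video/launch.mov", "tier1", now - timedelta(days=45)),
    FileRecord("/corp/video/draft.mov", "tier1", now - timedelta(days=3)),
    FileRecord("/corp/archive/2008.tar", "bulk", now - timedelta(days=400)),
]

to_move = plan_tiering(catalog, now)
print(to_move)
```

Here only `/corp/video/launch.mov` qualifies: it is on Tier 1 and has been idle for more than 30 days. Because clients address files through the namespace rather than through a specific array, a move planned this way does not change the path they use.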

A Virtualized Scale-out Infrastructure is Foundational for Cloud Storage

Cloud storage requirements and challenges are strikingly similar to those found in the traditional data center, only magnified.  Traditional scale-up storage technologies won’t make the cut thanks to limited scalability of scale-up platforms—cloud storage providers, whether building public or private clouds, can’t afford to install hundreds of scale-up systems to meet capacity demands.  As the number of arrays under management grows, the storage environment becomes increasingly complex, harder to manage, and more costly to operate.  This brings negative consequences to the business: increased time to market, loss of productivity, and decreased flexibility.  Traditional storage technologies continue to excel in the areas they were designed to address—namely, transactional and distributed computing—but these solutions fall short at Internet scale.

When considering cloud storage infrastructures, a number of characteristics must be included to make it cost effective to deploy and they are the same characteristics that make scale-out systems attractive to data center managers: cloud storage needs to scale quickly and to tremendous capacities.  Cloud storage must be elastic, to quickly adapt the underlying infrastructure to changing subscriber demands, and automated, so that policies can be leveraged to make underlying infrastructure changes or place content on different storage tiers or in geographic locations quickly and without human intervention.

Today, the Internet has reached every corner of the world, effectively creating a flat global network with few, if any, barriers to connectivity.  The combination of WAN acceleration and ubiquitous network connectivity allows business to be conducted anywhere.  On the platform front, scale-out, commodity-based platforms that provide massive scalability, parallel data transfers, and economies of scale not just for hardware, but ease of use and management, are currently available.  This combination of network and storage technology advancements is enabling the cloud storage movement.

The Bigger Truth

The current macroeconomic climate has created an environment that has users examining every purchase with a new eye on how to reduce operational costs.  They want vendors they can trust that offer proven solutions and products that offer real value in terms of cost savings and enabling business agility.

New rich media content is being created for everything from research and development, to training, to marketing, and is becoming a mandatory component of everyday business.  Whether it’s blogs, video, or HD imaging, content is easier than ever to create—and management will become harder than ever without significant changes.  Keeping up with data growth driven by new types of applications, richer media types, and the ubiquity of content capture devices requires a new approach to keeping storage costs in check.

Enterprises that deploy scale-out NAS solutions can get more value, dollar-for-dollar, from their infrastructure investments. Scale-out NAS has a compelling value proposition relative to scale-up systems; its lower infrastructure costs, power efficiency, and management efficiencies should put scale-out solutions on the short list for anyone deploying new NAS capacity, and make them mandatory for those looking to deploy storage as a service via private or public clouds.

There are certainly trade-offs to be considered. Not all applications require scale-out solutions; there is still plenty of room for traditional big-iron monolithic NAS systems. Matching application performance profiles and business requirements to the proper storage platform is important, but IT managers have an opportunity to realize significant savings, including lower management costs, right-sizing, and scaling only in the required dimensions (capacity, processing, and/or bandwidth) by deploying scale-out NAS solutions. The operational savings associated with just-in-time scale (reduced power, cooling, and floor space requirements; reduced storage management headcount; and faster response to provisioning fire drills) can all add up to a more efficient and agile enterprise. Deploying a virtualized software layer can further free up resources and create cross-hardware-platform pools of resources that can be deployed more efficiently across the enterprise as needed and aligned more closely with business requirements.


[1] Source: YouTube Fact Sheet, http://www.youtube.com/t/fact_sheet, September 28, 2009.

[2] Source: Facebook Press Room, http://www.facebook.com/press/info.php?statistics, September 28, 2009.

[3] For more information on the Internet Era of computing, see ESG Market Report: Commercial Computing Market Dynamics, May 2009.

[4] Source: ESG Research Report, Medium-Size Business Server and Storage Priorities, June 2008.

[5] Source: ESG Research Report, 2007 File Archiving Survey, December 2007.


4 responses to "Virtualized Scale-Out NAS: Enabling Flexible, On-Demand Data Services "

  1. andy_sparkes from HP (IT Vendor) says:

    I agree with most of this report, as data growth ultimately requires a scale-out file system with integrated data management capabilities, in fact a single repository that can have different characters. These could be different levels of performance tiers but also different levels of data management applied to the tiers or directories of the file system. As we have to deal with the new world of petabytes, it's clear that the scale-up approach is no longer going to cut it. What I'm interested in understanding more is the velocity of this change. Some would probably quite rightly categorise scale-out NAS as niche. When do you think it will become mainstream? Your data above suggests that 38% intend to deploy in the next 12 months. Do you think this means that scale-out NAS becomes mainstream in 2010?

  2. SteveDuplessie says:

    My two cents: yes. Not that people will necessarily "know" they are actively buying scale-out NAS systems, but within a few years, you won't be able to buy systems that don't have this capability. HP is mainstreaming their NAS on Ibrix, which is scale-out by default. It's only a matter of time before NetApp moves in earnest to their scale-out platform. These two powerhouses alone can alter the market landscape in a huge way. Combine that with the likes of Isilon, etc. that are already doing it, and it's sort of easy to see how it can only accelerate. The entire new wave of NAS/file system/cloud companies is built on scale-out technologies - with really no one investing in scale-up any longer outside of traditional incumbents. Plus, at the end of the day, who doesn't want scale-out?

  3. TerriMcclure says:

    Thanks for the comment, Andy. We're still pretty early in mainstream adoption of scale-out technology, but there is greater awareness of scale-out technology today and it is growing thanks to some of the major vendors standing behind it. I also think that, as interesting as the 38% planned-adopters number is, the 37% of users who are interested says a lot, so I do think scale-out adoption will accelerate in 2010. We'll be digging into the topic with a follow-up survey on scale-out technologies in 1H 2010 - but I agree with Steve on this one: the shift to scale-out is inevitable; there are too many benefits to ignore.

  4. andy_sparkes from HP (IT Vendor) says:

    Didn't think I would get a response this late after the report was issued :). I think we are all agreed here, but having been in the storage industry for more years than I care to remember, I know the adoption rate in storage can be glacial. I am hoping you are right that 2010 is an important year for scale-out and that it will become the de facto mechanism for file serving.
