Enterprise Strategy Group | Getting to the bigger truth.TM

Briefs: Unstructured Data in 2010: Trends to Watch
Published on Thursday, February 11th, 2010 at 11:44 am
Categories: Briefs | File-based Disk Storage Systems and File System Software | IT Infrastructure | Storage
Author: Terri McClure
CIOs care about risk, cost, and business responsiveness but the complex, inflexible, and proprietary nature of legacy storage systems has not helped those causes. The biggest opportunity for storage savings comes from addressing unstructured (or file-based) data. Unstructured data growth shows no signs of abating: it will make up the bulk of data growth in the data center in 2010, driving IT to take a long hard look at unified storage platforms, scale-out NAS, and cloud storage services to alleviate the strain. ESG expects a number of significant events to unfold throughout 2010 as the scale-out market matures and users begin the journey towards cloud storage.

Overview

There is little question that 2009 was one of the toughest years in recent history for the IT industry, with far-reaching budgetary implications for technology vendors and IT end-users alike. While budgetary pressures won’t let up in 2010, ESG’s 2010 IT Spending Intentions Survey[1] finds that cautious optimism reigns among IT shops: most organizations are moving out of cost reduction mode, but are likely to characterize themselves as being in cost containment, not growth, mode. This should lead users to keep looking for ways to reduce operating costs by reducing storage infrastructure complexity. ESG expects that the combination of unstructured data growth, the increasing maturity of scale-out and unified storage platforms, and the emergence of cloud storage as a viable storage alternative will push users to start the long process of addressing unstructured data storage and transforming their environments into a more efficient services-oriented architecture.

File Storage Trends to Watch in 2010

  1. Continued interest in commodity-based scale-out platforms in the data center. Driven by the long-term aftershocks of the economic slowdown, there is both user pull and vendor push for these solutions.  On the user demand side, ESG research conducted in late 2008/early 2009 showed significant user interest in scale-out NAS solutions thanks to their scalability, business agility, and operational efficiencies.[2] With 2009 spending slowing to a near stop, interest in scale-out mostly stayed just that: interest.  In 2010, ESG expects that interest to translate into actual spending, aided by increased visibility from big-name vendors like EMC, HDS, HP, IBM, and NetApp as they continue to invest in scale-out offerings and validate commodity-based scale-out architectures for enterprise applications.
  2. Bifurcation continues, as both integrated vertical stacks (such as HP’s X9720, which integrates its IBRIX Fusion software, blade servers, and StorageWorks arrays; IBM SONAS; NetApp 7G series; and Isilon’s X, S and NL-series products) and horizontal services-based approaches (such as Bycast StorageGrid or EMC Atmos layered on commodity hardware) gain steam. Each approach has pros and cons: integrated systems are typically faster to deploy and easier to manage, requiring fewer storage or system administrators per terabyte of data. But these solutions are proprietary and create vendor lock-in across the stack, from storage arrays to NAS heads to file systems. Horizontal, layered approaches let users choose best-of-breed technologies for each layer and can be flexibly deployed. The tradeoff is that they are more complex to install and manage: they require a higher level of professional services support to scope, size, and deploy the solution, and typically require a higher ratio of storage or system administrators per terabyte than integrated solutions.
  3. Unified storage displaces specialty SAN and NAS for Tier 2+ applications. This is continued fallout from the global economic meltdown as users look to continue driving down operational costs.  Users are suffering from storage “complexity fatigue” as the need to deploy new specialty storage systems for different applications has become commonplace.  With unified storage, users can plan and manage storage as a flexible pool to support either block- or file-based data rather than planning for and managing separate block and file-based storage environments.  The flexibility to deploy resources where needed helps increase utilization as capacity isn’t locked away in the wrong type of storage, reducing the number of systems that need to be deployed.
  4. Unified storage continues to gain momentum as a back-end for virtual server and virtual desktop environments.  At the end of the day, encapsulated virtual server and virtual desktop images are just files.  Users have been experimenting with file back-ends since the beginning of the virtualization wave with good results.  Vendor alliances like the NetApp, VMware, and Cisco hookup in early 2010 will help accelerate this trend.  That said, ESG expects enterprise users to deploy VMs with block-based raw device mapping (RDM) for Tier 1 application performance reasons, while smaller IT shops and Tier 2 applications will use fully encapsulated VMs, which have the OS, application image and data in the VM and can use a NAS back-end.  Either way, unified storage gives users a choice of how they store virtual machine data, without having to forecast and buy separate SAN and NAS capacity.
  5. “Green” comes back into fashion. Organizations can significantly reduce operational costs by deploying a more energy-efficient infrastructure, but “green” had become something of a trendy buzzword by the end of 2008, so IT users and vendors alike were predisposed to tune out “green” messages once the global economy went into freefall. In 2010, users will come out of tactical cost reduction mode and look at the bigger picture to find ways to reduce the environmental impact of IT, which also helps reduce operating costs. More efficient, dense storage systems using high capacity disk drives will gain momentum to reduce data center footprint and power and cooling costs.
  6. Policy-based storage management gets more focus. Automated storage tiering got a lot of play in 2009 thanks to the big guns at EMC and its Fully Automated Storage Tiering (FAST) announcement. ESG expects to see more developments on this front from NAS vendors in 2010. One of the big inhibitors of tiering and information lifecycle management was, and continues to be, the difficulty of classifying data to determine its proper storage tier. That classification can’t be done within the storage array, which lends itself only to moving data based on access patterns. It can be done for unstructured data from the file system management layer by leveraging file metadata.
  7. Object storage continues to get buzz, but little traction. Vendors like Panasas and EMC (with both Atmos and Centera) offer object-based storage. Object-based systems store data in a variably-sized “container” that holds both the data and metadata. Presenting objects provides some distinct advantages: for example, the richer metadata that can be added and packaged with each object enables enhanced management capabilities. Panasas can also break files into multiple related objects and stripe them across nodes to increase performance throughput using parallel channels. The challenge with object storage, however, is similar to the challenge with vertically integrated systems: vendor lock-in. Once the object-based storage system is deployed, users have to buy from that vendor as long as that data exists. The alternative is a painful and lengthy migration. For a number of users, the ease of use, scalability, and overall efficiency of these systems make them worth the investment; EMC shipped a lot of Centera systems thanks to those very characteristics. For its part, Panasas has seen good traction in markets that require the massive throughput attained with its parallel architecture. Despite the potential benefits of object-based storage, alternatives like NFS and CIFS are already adopted, standards-based, and well-understood, and there is a comfort factor for IT in deploying tried and true technology. The benefits of moving to object-based systems are attractive, but may not be worth the lock-in penalty.
  8. Unstructured data builds public cloud traction. Local gateways that create an on-ramp to a cloud storage “tier” will be increasingly used to augment onsite capacity for non-critical data. Users remain concerned with regulatory compliance and security for much of their data, but there is a great deal of data that doesn’t require lock-down and audit, especially in industries that are not tightly regulated, making this information well-suited for the cloud. Education is a great example: a university could easily leverage cloud storage services for student home directories. Imagine the potential savings for a campus supporting tens of thousands of students. Long-term archive of non-critical information is also a good use for cloud storage and should drive businesses to cloud archive providers like Iron Mountain Digital that offer policy-based archive protection and management.
  9. Continued market consolidation is pretty much a given in 2010. We saw some big moves in 2009 with HP buying IBRIX and LSI buying OnStor. A number of smaller NAS vendors are still struggling with the hangover from slow spending in 2009 and investors are still tight with funding.  One trend we saw in 2009 was users reducing the number of vendors they do business with as a way to reduce costs.  You can bet these organizations generally were not throwing out their major IT vendors—another trend that didn’t help the little guys.  Dell is (and has been) rumored to be looking at a number of NAS vendors to augment its Windows-based offerings and, as this brief goes to press, confirmed it bought the assets of scale-out NAS vendor Exanet.  Both HDS and IBM rely on OEM relationships for the bulk of their NAS business, so it would not be surprising to see moves from either of these powerhouses.
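
To make trend 6 concrete, here is a minimal sketch of how a policy engine sitting at the file system management layer might propose a storage tier from file metadata alone. It does not represent any vendor's implementation. The tier names, the 30- and 180-day thresholds, the extension list, and the function names `classify` and `classify_tree` are all assumptions made for illustration; only the Python standard library is used.

```python
import os
import time
from pathlib import Path
from typing import Dict, Optional

# Hypothetical policy values, chosen only for illustration.
TIER1_MAX_AGE_DAYS = 30
TIER2_MAX_AGE_DAYS = 180
ARCHIVE_EXTENSIONS = {".bak", ".iso", ".zip"}

SECONDS_PER_DAY = 86400


def classify(extension: str, age_days: float) -> str:
    """Propose a tier from two pieces of file metadata: type and age."""
    # File type is checked first: backup and image files go straight to archive.
    if extension.lower() in ARCHIVE_EXTENSIONS:
        return "archive"
    if age_days <= TIER1_MAX_AGE_DAYS:
        return "tier1"
    if age_days <= TIER2_MAX_AGE_DAYS:
        return "tier2"
    return "archive"


def classify_tree(root: str, now: Optional[float] = None) -> Dict[str, str]:
    """Walk a directory tree and map each file path to a proposed tier."""
    now = time.time() if now is None else now
    placement = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()
            # Age is measured from the last modification time (st_mtime).
            age_days = (now - info.st_mtime) / SECONDS_PER_DAY
            placement[str(path)] = classify(path.suffix, age_days)
    return placement
```

Calling `classify_tree` on a share's mount point returns a path-to-tier mapping that a separate data mover could act on. A real policy would likely weigh richer metadata such as owner, department, or retention class; the sketch only shows that the classification decision can be made above the array, where that metadata is visible.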

The Bigger Truth

Unstructured, file-based data will continue to grow at a blistering pace in 2010 and IT will continue to struggle to manage it. Now that the budget shackles put in place since late 2008 have been loosened somewhat, users are backing away from tactical savings and budget cuts and looking for opportunities to spend capital dollars to realize long-term operational savings.

Managing data growth more efficiently is one area of “low-hanging fruit” by which CIOs can reduce IT costs and cycle times. The predominant NAS architectures in today’s data centers really have not changed much in the past 15 years. These systems were designed to support distributed computing environments and to scale to hundreds of disk drives. Today’s petabyte-sized environments are exceeding the limits of scale-up systems, giving rise to storage system sprawl and increasingly complex storage environments. In addition to increasing costs, this complexity introduces risk: complex environments and the manual processes that have developed around them can compromise the best-laid data protection strategy. They create an environment in which it is hard to identify which storage systems hold what data without the use of a vast spreadsheet. Any time such a level of manual tracking is involved in a deployment, the likelihood of human error increases, resulting in the possibility, even probability, that critical data is accidentally left unprotected.

IT managers are cautious and resistant to change, which is why change takes forever in IT, especially in storage. But make no mistake: we will see change in how we store data.  There is a good reason for caution as no one in IT wants to risk the corporate crown jewels by making major infrastructure changes—no matter how bad the current situation is—but everything eventually reaches a breaking point.  Large-scale users are constrained by availability of power and cooling while midsize enterprises are hitting the wall when it comes to floor space requirements. Complexity and its associated risk and operational costs must be addressed.  There is early evidence that the storage evolution is starting in the enterprise and will gain steam in 2010. Sure, Tier 1 applications will likely continue to have dedicated, scale-up systems optimized to support them, but for Tier 2+ applications, the journey to a new model has begun.


[1] Source: ESG Research Report, 2010 IT Spending Intentions Survey, January 2010.

[2] Source: ESG Research Brief, Scale-Out NAS Adoption & Market Drivers, February 2009.

