Intelligent data mobility delivered through data virtualization will allow IT professionals to specify service-level objectives (SLOs) such as performance, reliability, high availability, archiving and cost, and then let software automatically move data to the right storage in real time. Let's examine the problem of data immobility and how data placement through data virtualization will finally solve the common mismatch of compute and storage, resource sprawl and the cost of overprovisioning.
Data silo snowball rolls ever faster
The value and demands of data change over time, but today most data stays where it is first written. This is because applications are not storage aware. When an application is deployed, it is configured for the specific storage available at the time, whether DAS, SAN, NAS or tape, and whether that storage is fast (low latency) or slow, shared or protected. This one-to-one relationship creates data silos that make it a complex and time-consuming task to move data to faster or more cost-effective storage as business needs change. More importantly, moving that data typically means interrupting services.
Data silos have long been a problem, but the challenges facing modern businesses are making this a critical issue. To stay competitive, companies must now support multiple devices per employee, run real-time business analytics over all kinds of data, support development methodologies that accelerate time to market, and respond to quickly evolving compliance and regulatory requirements. This pressure is causing data needs to change faster than ever before, but companies today do not have the visibility or the time to size, procure and migrate data frequently enough to make the best use of their resources at all times. The scale of the challenge is growing quickly, and IT is in a race to keep up.
The data silo tax: Overprovisioning and migration migraines
As a result of the sprint to stay ahead of exponential data growth, enterprises today put extraordinary effort into initially allocating the right storage for each application and typically overprovision to defer migrating data as long as possible. When migration can't be deferred any longer, realigning data to resources is a manual process that consists primarily of putting smaller containers into bigger ones. This approach alleviates the symptoms without addressing the core problem: as business needs evolve, data can't easily be moved to more suitable storage.
Given that IT professionals are in high demand, it can be difficult for companies to even hire enough experts to keep their critical computing infrastructure running smoothly. Even when enough staff can be hired for the IT team, the labor and infrastructure costs of data silos are immense. IDC recently reported that 60% of all large IT project spending goes to migrations, just moving data to new storage. In addition, research from Schneider Electric found that actual capacity load at initial deployment of new storage resources is just 20% and peaks at a mere 60%. This means that 40% or more of data center infrastructure is commonly wasted, simply to defer the need to migrate data later on.
One might think that advances in modern data center technology would be making storage simpler. In fact, the opposite is true. The growth and diversity of today's data is making data center management more complex. For example, supporting today's real-time business analytics, sub-millisecond transactional response times, and potentially massive and unpredictable workload spikes requires higher performance and lower latency than shared storage can cost-effectively support. Consequently, many organizations are implementing shared-nothing architectures for mission-critical applications that place data on flash in a server, creating an entirely new (yet faster) silo.
Another example is cloud computing, which promises plenty of inexpensive capacity to help enterprises reduce costs, flexible access to resources to increase agility, and edge locations that enable globally distributed workforces. But cloud management tools don't offer the same visibility and control as on-premises tools. In addition, they don't integrate well with existing management software, making the cloud yet another new silo to contend with.
Intelligent data mobility
The ideal solution to the problem of data silos would give enterprises comprehensive visibility into all of their data, while transparently automating data movement by policy. In fact, new technologies are now leveraging data virtualization to offer just such a solution.
Data virtualization abstracts an application's logical view of data from the underlying storage hardware within a global data space, allowing enterprises to see and access all of their data from a single pane of glass. One approach to data virtualization works as follows:
In this architecture, the metadata, the information that describes the data, is kept separate from the data itself. Applications look up the location of the data from a metadata server and then access the data directly, instead of being tied to a dedicated storage device that only knows about the data stored on that particular device. This is similar to the way DNS servers translate a domain name such as google.com into the network address of the server hosting the web page.
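To make that lookup-then-access flow concrete, here is a minimal Python sketch of the pattern described above. The names (MetadataClient, resolve, read_block) and the example paths and tiers are hypothetical, not any vendor's actual API; they only illustrate how an application resolves a logical name to a physical location before reading the data directly.

class MetadataClient:
    """Knows where data lives, but never touches the data itself."""

    def __init__(self, catalog):
        # catalog maps a logical path to (storage_tier, physical_address)
        self.catalog = catalog

    def resolve(self, logical_path):
        # Analogous to a DNS lookup: translate a logical name
        # into the physical location that currently holds the data.
        return self.catalog[logical_path]


def read_block(storage_tier, physical_address):
    # Placeholder for a direct read against the resolved storage
    # (server-side flash, shared NAS, a cloud object store, etc.).
    return f"<bytes from {storage_tier}:{physical_address}>"


# Usage: the application asks "where is my data?" once, then reads it
# directly from whichever storage currently holds it.
metadata = MetadataClient({
    "/orders/current.db": ("server-flash", "nvme0n1/lun7"),
    "/orders/archive.db": ("cloud-object", "s3://bucket/archive.db"),
})

tier, address = metadata.resolve("/orders/current.db")
print(read_block(tier, address))

Because the application only ever holds the logical name, the metadata server can later point that name at a different physical location, which is what makes non-disruptive data movement possible.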
Once data is virtualized, it can live anywhere, allowing policy-driven, intelligent data mobility to automatically move data to the right storage as business needs evolve, without application interruption. And rather than requiring yet another new type of storage to introduce data virtualization into an enterprise's infrastructure, storage-agnostic solutions enable existing storage to be used at maximum efficiency.
Automatically place data on the right resource
By automating many complex storage management tasks, enterprises can finally stop managing storage and start managing data. IT professionals no longer need to spend time sizing storage capacity and performance based on highly educated guesses about the long-term needs of applications. With data virtualization, they can finally get ahead of the data snowball by creating policies that specify service-level objectives (SLOs) to best meet their customers' needs, and letting software automatically place data on the best resource to meet those objectives.
For example, instead of dedicating an entire storage device to a mission-critical application, IT can specify which data in the application requires high performance and availability. Then, software can automatically place the data on the right storage device to meet latency, performance, and availability requirements. At the same time, data the application is no longer accessing can be automatically archived to the cloud.
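As an illustration of how such a placement policy might be expressed, the following Python sketch picks the cheapest storage tier that still satisfies an SLO. The SLO fields, tier names and figures are invented for the example and do not describe any particular product; a real system would also weigh availability zones, data protection and many other factors.

from dataclasses import dataclass

@dataclass
class SLO:
    max_latency_ms: float      # required response time
    min_availability: float    # e.g. 0.9999
    max_cost_per_gb: float     # budget ceiling

@dataclass
class Tier:
    name: str
    latency_ms: float
    availability: float
    cost_per_gb: float

# Hypothetical tiers an enterprise might already own.
TIERS = [
    Tier("server-flash", 0.2,  0.999,  2.00),
    Tier("shared-san",   2.0,  0.9999, 0.60),
    Tier("cloud-object", 50.0, 0.9999, 0.02),
]

def place(slo: SLO) -> Tier:
    """Pick the cheapest tier that still meets the SLO."""
    candidates = [t for t in TIERS
                  if t.latency_ms <= slo.max_latency_ms
                  and t.availability >= slo.min_availability
                  and t.cost_per_gb <= slo.max_cost_per_gb]
    if not candidates:
        raise ValueError("No tier satisfies this SLO")
    return min(candidates, key=lambda t: t.cost_per_gb)

# Hot transactional data: low latency required, cost is secondary.
print(place(SLO(max_latency_ms=1.0, min_availability=0.999,
                max_cost_per_gb=5.0)).name)   # -> server-flash

# Cold data no longer being accessed: latency hardly matters.
print(place(SLO(max_latency_ms=100.0, min_availability=0.999,
                max_cost_per_gb=0.05)).name)  # -> cloud-object

The key point is that IT states the objective once; as the data's access pattern or the available tiers change, software can re-run the same evaluation and move the data accordingly.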
Once automated, intelligent data mobility removes the complexity of storage management, freeing IT to focus on delivering applications and services that meet the business needs of end users. This allows enterprises to directly align their technology investments with the services they provide, helping them stay ahead of the competition by getting to market faster, at lower cost, with more satisfied customers. Once we enter the era of self-driving data, we will all wonder how we ever lived without it.
Lance Smith is CEO of Primary Data. Follow Primary Data on Twitter at @Primary_Data.