Yesterday I received a letter in the mail at home that started off:
Dear Sir or Madam,
We are writing to let you know that computer tapes containing some of your personal information were lost while being transported to an off-site storage facility by our archive services vendor. While we have no reason to believe that this information has been accessed or used inappropriately, we deeply regret that this incident occurred....
So the first question I have is how does an archive vendor lose tapes? How hard can it be to take the tapes from your customer, put them in a secure truck and drive them to the storage facility? Isn't that your whole business model - you will pick up, transport and store these tapes safely and securely 100% of the time?
Now I understand that any activity with humans involved cannot be guaranteed to work 100% of the time. So what really happened? A bit more of an explanation would have been helpful, for example that the truck was in an accident and its contents were spilled into a river or all over the highway and could not all be recovered. Without more details I'm left wondering: did someone make off with the tapes by accident or on purpose? Or was this just sloppy work by the company?
Anyway, I hope this is a call to action for this company to do at least two things to prevent such an incident in the future.
1. Look into tape encryption such as LTO-4 offers. I would have been much more pleased if that second sentence read "While the tapes were physically lost, the data they contained cannot be accessed or read by anyone because the data on the tapes is securely encrypted with sophisticated technology requiring encryption keys to make the data readable. Our security policy ensures that these keys are always stored in or transported to physically separate locations from the computer tapes."
2. Consider the use of replication and electronic vaulting for moving data off-site for archiving. With new technologies such as deduplication and low-bandwidth replication, this company would perhaps be able to reduce the amount of data that is stored on tapes and physically transported to archive storage. Again, I don't know the specifics here, but as an example let's say this company had four sites that they were backing up data to tape and transporting those tapes to off-site archives. With replication and electronic vaulting, they could replicate data from three of their sites to just one site for backup to tapes and then only have to move tapes from the one site to archive storage, thereby reducing their risk exposure by 75%.
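The 75% figure in the four-site example is simple arithmetic, sketched below in Python. It assumes, as the example implicitly does, that every site shipping tapes carries an equal share of the transport risk. The function name is made up for illustration.

```python
def shipping_exposure_reduction(sites_before: int, sites_after: int) -> float:
    """Fraction by which the number of tape-shipping sites drops.

    Assumption: each site that ships tapes contributes equally to the risk.
    """
    if sites_before <= 0 or not 0 <= sites_after <= sites_before:
        raise ValueError("need sites_before > 0 and 0 <= sites_after <= sites_before")
    return (sites_before - sites_after) / sites_before


# Four sites shipping tapes, consolidated to one site via replication.
print(f"{shipping_exposure_reduction(4, 1):.0%}")  # prints 75%
```

If some sites ship far more tapes than others, the real reduction would differ, so a per-site weighting would be a more faithful model.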
If you're worried about how a similar incident could impact your company and what risks are involved, HP is here to help. We can work with you to significantly reduce your data security exposure from the desktop to your data center. On the storage side, we offer a FREE storage security risk assessment. For more details on HP's other data security options beyond storage, please check HP's Security web page.
By Jim Hankins
If you remember back in my HP Deduplication - Part 1 post when we announced our new deduplication products back in June, I said that the deduplication ratio you can expect from a product can vary based on a number of factors. We now can share with you deduplication test results from our D2D4000 Backup System conducted by a 3rd party, Binary Testing Ltd.
Binary Testing backed up and deduplicated data for file serving, SQL and Exchange environments with various data change rates over a simulated three-month backup period. The results can be found here: http://h71028.www7.hp.com/ERC/downloads/4AA2-0799ENW.pdf
Again, your mileage may vary but this report should give you some idea of what's possible if your business runs these types of applications.
- By Jim Hankins
Watch as "IT Guy" Al Jones sweats while dangled above a Piranha tank. He has to perform 3 tasks on the HP StorageWorks Virtual Library System before he becomes fish food.
Here's the link to the video: http://www.youtube.com/watch?v=aqN2Q-yZemQ
This is the second of the Impossible IT videos. If you remember the "runaway train" in the first one, then you MUST see this one.
For a refresh on the first Impossible IT video, go here: http://www.youtube.com/watch?v=HmlsT_rMkOc
- By Jim Hankins
As I stated in part 1, it's difficult to judge the merits of disk-based backup or virtual tape products with deduplication capabilities based solely on the deduplication ratio that a particular vendor might be touting for its product. The deduplication ratio a customer will see is dependent on a number of variables unique to that customer's backup environment. What I'd like to examine in part 2 are some other features that a customer should consider when comparing disk-based backup or virtual tape products from different vendors.
As I also mentioned in part 1, HP is introducing deduplication capabilities into our existing HP StorageWorks Virtual Library Systems as well as introducing two brand new products, the HP StorageWorks D2D2500 Backup System and the HP StorageWorks D2D4000 Backup System. These products are intended to meet different customer needs depending on the size of the customer's backup environment. Therefore, it's important to examine the features of these two different products versus other competitive products based on these different customer requirements.
For medium to large scale enterprise backup environments, the key features of our VLS9000 product are its grid-like scalability and appliance-based management.
Grid-like scalability simply means customers can extend the performance and capacity of the VLS9000 by adding nodes (one node standard, expandable to eight) or by adding capacity per node. Each node of the VLS9000 provides 600 MB/sec and 30 to 80 TB of capacity. Similar competitive products offer only single-node capability, which means it's difficult to grow performance and capacity as the customer's needs grow.
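To get a feel for what those per-node numbers add up to, here is a rough Python sketch. The per-node throughput, capacity range and eight-node limit come from the paragraph above. The assumption that totals scale perfectly linearly with node count is mine, for illustration only, and the function name is hypothetical.

```python
NODE_THROUGHPUT_MB_S = 600      # per-node figure quoted above
NODE_CAPACITY_TB = (30, 80)     # per-node capacity range quoted above
MAX_NODES = 8                   # one node standard, expandable to eight


def vls9000_estimate(nodes: int) -> dict:
    """Back-of-the-envelope totals, ASSUMING perfectly linear scaling."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    low, high = NODE_CAPACITY_TB
    return {
        "throughput_mb_s": nodes * NODE_THROUGHPUT_MB_S,
        "capacity_tb": (nodes * low, nodes * high),
    }


for n in (1, 4, 8):
    print(n, vls9000_estimate(n))
```

Under that linear assumption a full eight-node configuration works out to 4800 MB/sec and 240 to 640 TB. Real-world results would depend on the backup environment.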
All nodes and components of the VLS9000 can be managed as a single appliance. This allows the customer automatic management of all the hardware and software, including self-configuration, self-monitoring, self-diagnostics, performance load balancing, self-tuning and self-maintenance. Additionally, the entire capacity of the VLS9000 can be presented as a single virtual library. Similar competitive products come pre-configured from the factory, but any expansion or repair tasks can be tedious and complex, and if not done carefully can impact the system's performance and reliability.
For our brand new D2D2500 and D2D4000 Backup Systems, HP focused on delivering three key features for smaller backup environments. Those features are a very affordable price point, ease of deployment and management, and robust integration to physical tape.
One of HP's unique design advantages for our D2D2500/4000 products is our ability to leverage the economies of scale of our volume server and disk drive business. This allows us to build these new products and offer them at price points that are 30 to 40% below competitive product offerings in this customer segment.
The D2D2500/4000 also provides a fully functional graphical user interface (GUI) that allows the product to be easily set up and deployed. Ongoing management and configuration of the products is also performed via the easy-to-use GUI. Other competitive products provide only limited functionality from their user interface and some tasks require the customer to switch to a command line interface to complete important configuration steps. For some customers this can be a challenging and difficult way to set up and maintain the product.
Our D2D2500/4000 was also designed with tape integration fully in mind. Tape integration allows customers to copy data from the D2D2500/4000 directly to physical tape for offsite disaster recovery or for long-term archives. Tape integration is not available for many competing disk-based backup products, leaving the customer to piece together a disparate and perhaps non-interoperable solution for physical tape backup of their system.
And finally, one of the other exciting capabilities that deduplication enables is low bandwidth replication for disaster recovery. So for customers with multiple data centers or remote sites, deduplication and replication can be a powerful combination for a faster, more reliable way of moving backup data between sites. With deduplication, only the changed backup data needs to be sent between sites, allowing customers to use more affordable low bandwidth WAN connections. HP will be offering replication on our VLS, D2D2500 and D2D4000 products before the end of the year.
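A quick calculation shows why sending only changed data matters on a slow link. The Python sketch below is illustrative only: the 1000 GB backup size, 1% change rate and 10 Mbit/sec WAN link are hypothetical numbers I picked, and the estimate ignores protocol overhead, compression and link contention.

```python
def transfer_hours(data_gb: float, link_mbit_s: float) -> float:
    """Hours to send data_gb (decimal gigabytes) over a link_mbit_s link.

    Ignores protocol overhead, compression and link contention.
    """
    if link_mbit_s <= 0:
        raise ValueError("link speed must be positive")
    megabits = data_gb * 1000 * 8   # GB -> MB -> megabits
    return megabits / link_mbit_s / 3600


backup_gb = 1000        # hypothetical full backup size
change_rate = 0.01      # hypothetical: 1% of the data changed since last backup
link_mbit_s = 10        # hypothetical WAN link speed

full = transfer_hours(backup_gb, link_mbit_s)
changed_only = transfer_hours(backup_gb * change_rate, link_mbit_s)
print(f"full copy: {full:.1f} h, changed data only: {changed_only:.1f} h")
```

With these made-up inputs, a full copy would take roughly 222 hours while the changed data alone would take a little over 2 hours, which is the difference between an impractical and a practical nightly replication window.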
- By Jim Hankins
Earlier today, HP announced new deduplication capabilities for customers who are considering deploying disk-based backup or virtual tape as part of their data protection processes. Deduplication is one of the most talked about new technologies in the storage industry today as customers continue to look for innovative ways to protect the ever growing amounts of data in their IT environments.
However, in talking with customers we found that there are very different needs for disk-based backup and deduplication depending on whether the customer wanted to use the technology in a larger scale "data center" type installation or in a smaller scale "office" type installation. Because of these very different needs, HP is offering its customers two different deduplication technologies.
First, HP is making available by license accelerated deduplication for our HP StorageWorks Virtual Library Systems. Our VLS products with accelerated deduplication technology are uniquely scalable for large data centers where both high performance and high capacity are required.
Second, HP is introducing two brand new products, the HP StorageWorks D2D2500 Backup System and the HP StorageWorks D2D4000 Backup System, with dynamic deduplication technology. Dynamic deduplication was developed by HP specifically for smaller environments where low cost and ease-of-use are key customer needs. Dynamic deduplication is a built-in feature on the D2D2500 and D2D4000, rather than a licensed option.
For more information about the above products please see our announcement page at: www.hp.com/go/deduplication
One of the most frequent questions we heard from customers that we talked to about deduplication prior to our announcement was, "So what kind of deduplication ratios can I expect to get with HP's deduplication technologies?" We've done some internal testing that has shown it's possible to reach at least a 50:1 deduplication ratio, but the ratio that you will achieve in your environment depends on a number of variables. You may hear some other vendors quoting deduplication ratios that are much larger or smaller, but it all depends on a number of factors.
One of those factors is the type of data that the deduplication process is being applied against. Some data types lend themselves to being better candidates for deduplication than others. As an example, data from a PACS (Picture Archiving and Communication System) used in X-rays and other medical imaging will have very little duplicate data so the ratio would usually be quite low. In another example, a database, where there may be many records with empty fields or the same data in the same fields, would typically be a good candidate and could produce very high deduplication ratios.
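To make the data-type effect concrete, here is a small Python sketch of a generic deduplication approach: split the data into fixed-size chunks, hash each chunk, and keep only one copy per hash. This is not a description of HP's accelerated or dynamic deduplication. The chunk size, function name and synthetic test data are all assumptions chosen for illustration.

```python
import hashlib
import random


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Chunks seen divided by unique chunks, using fixed-size SHA-256 chunking."""
    if not data:
        raise ValueError("data must not be empty")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Stand-in for image-like data: every chunk is different.
noisy = bytes(rng.getrandbits(8) for _ in range(64 * 4096))
# Stand-in for a table full of identical or empty records.
repetitive = b"\x00" * (64 * 4096)

print("noisy data:", dedup_ratio(noisy))
print("repetitive data:", dedup_ratio(repetitive))
```

The noisy data has no repeated chunks, so nothing is saved, while the repetitive data collapses to a single stored chunk. Real data sets fall somewhere between these two extremes.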
Other factors to consider are your backup policy and the daily change rate of your data. Are you doing daily full and weekly full backups? Or are you doing daily incremental and weekly full backups? Is the daily change rate of your data 1%, 2% or even more? It is important to remember that the less your data changes the more benefit you'll see because over time the deduplication engine will see more and more of the same (duplicate) data during the backups.
Lastly, how you measure deduplication is important to the overall ratio. Are you measuring the deduplication ratio of just your last backup to the previous backup? Are you measuring the ratio over the aggregate of all backups stored? Or is the measurement somewhere in between?
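The Python sketch below shows how much the answer depends on change rate, number of backups and the way you measure. It uses a deliberately simplified model that I am assuming for illustration: every backup is a full backup of the same size, a fixed fraction of the data is new in each backup, new data never repeats, and there is no compression. The function names are hypothetical.

```python
def aggregate_ratio(backups: int, change_rate: float) -> float:
    """Data sent to the device divided by data stored, over all retained backups.

    Simplified model: the first full backup is stored whole, and each later
    full backup adds only its changed fraction.
    """
    if backups < 1 or not 0 < change_rate <= 1:
        raise ValueError("need backups >= 1 and 0 < change_rate <= 1")
    stored = 1 + (backups - 1) * change_rate   # in units of one full backup
    return backups / stored


def last_backup_ratio(change_rate: float) -> float:
    """Ratio for one later full backup measured only against the previous one."""
    if not 0 < change_rate <= 1:
        raise ValueError("need 0 < change_rate <= 1")
    return 1 / change_rate


for days in (1, 7, 30, 90):
    print(f"{days:>2} daily fulls at 1% change: {aggregate_ratio(days, 0.01):.1f}:1")
print(f"last backup only at 1% change: {last_backup_ratio(0.01):.1f}:1")
```

In this model, 90 daily full backups at a 1% change rate give roughly 48:1 in aggregate, while the same data measured as last backup versus previous backup gives 100:1. The first backup on its own gives no reduction at all, so two vendors can quote very different ratios for the same behavior.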
Another word of caution here: some might think that deduplication means that you can buy a smaller disk-based backup system, but be aware that it may take many backups over a long span of time to yield substantial deduplication ratios. Initially, the amount of storage you buy with your disk-based backup or virtual tape product needs to be sized correctly to reflect your existing backup tape rotation strategy and expected data change rate within your environment.
HP believes that the various deduplication technologies in the industry are going to deliver roughly the same ratios, so it's much more important to consider other features such as the scalability, cost and ease-of-use of competing technologies.
In part 2, I will take a look at these other features more closely.