Around the Storage Block Blog
Find out about all things data storage from Around the Storage Block at HP Communities.

StoreOnce and EMC Data Domain performance flap

By Calvin Zito, @HPStorageGuy

 

(If you'd rather listen to this as a podcast, you can download the file or subscribe to my podcast on iTunes: go to the iTunes store and search for "Around the Storage Block", or open this link and click the "View in iTunes" button under my picture on the page that opens.  I also have an embedded player at the bottom of the blog post.)





 

Last week, after we announced our latest enhancements to the HP StoreOnce family, a flap broke out on Twitter, in the press, and on blogs. We announced that with StoreOnce Catalyst (an API), the B6200 can back up 100TB/hour, which is over 3X faster than the latest EMC Data Domain product, announced a few weeks ago at EMC's own event.  I think the flap is little more than a diversionary tactic raised by EMC. Despite their market share, I think it’s pretty obvious that EMC is concerned about the major steps we've been taking over the last couple of years with HP StoreOnce.

 

"I vigorously object your Honor!"

 

Shortly after we made our announcement, Jack Clark - a reporter from ZDNet UK - asked EMC for comment.  I saw the conversation developing on Twitter and wasn't surprised - who doesn't like a vendor food fight?  The resulting article was titled "EMC: HP's 'ridiculous' 100TB dedup claim is bogus".  You can read it for yourself, but my summary of what Jack reported EMC said is this: EMC "vigorously" disputes HP's claim of leapfrogging the DD990. EMC said they found it "puzzling why HP put these ridiculous claims out".

 

Chris Mellor also had a follow-up article with more vigorous objections.  You can see his entire article, titled "Is HP pulling a fast one on deduplication".  EMC's objections in this story can be summed up in a quote from Mark Twomey - "I don't get how HP can call it scale-out when those are four separate dedupe pools. That [100TB/hour] number is from four 2-node systems, isn't it? Yes they have one manager, but it's still four systems. If I get a manager can I compare four Data Domains?" (BTW, if you listen to my bad Irish accent in my podcast, I can do that because I am half Irish.)

 

And with that, the prosecution (EMC) rests. 

 

Inside the StoreOnce B6200

 

To be as clear as possible, and hopefully head off the inevitable "he said, she said", let me answer the questions raised.  Here we go:

 

  • Is the B6200 a single system?  Yes it is.  It's purchased as a single system, ships as a single system, installs as a single system, and is managed as a single system. How anyone can say it doesn't pass the "single system sniff test" is perplexing. Or maybe I should just say that it's ridiculous and I vigorously dispute EMC's claim that it isn't a single system.
  • How can you call it a single system when it has four pools of storage?  The underlying technology that gives the B6200 its scale-out capability is the IBRIX file system.  We haven't talked much about that on my blog but that is what we introduced last November when we first announced the B6200.  That gives us the capability of scaling to 1000's of nodes and a 16PB single namespace.  Yes, the current configuration is four different pools in order to balance ingest performance and dedup efficiency. It's still a single system.

My question back is: who decided that the number of pools determines whether it's a single system or not?  That is arbitrary and absurd. It's ridiculous and I vigorously dispute it. EMC is trying to make an argument that in order to be a single system, global deduplication is required.  Sorry, not a good argument and downright silly. 
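For readers who want to see what the "pools" argument is about, here is a minimal sketch of the trade-off between one global deduplication pool and several independent pools. It is a toy model only: the fixed-size chunking, the round-robin assignment of backup streams to pools, and the names `chunk_ids` and `stored_chunks` are illustrative assumptions, not a description of how StoreOnce or Data Domain is implemented.

```python
import hashlib


def chunk_ids(data: bytes, size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


def stored_chunks(streams: list[bytes], pools: int) -> int:
    """Count chunks kept on disk when streams are spread round-robin
    across `pools` independent dedup pools (hypothetical placement policy)."""
    stores = [set() for _ in range(pools)]
    for n, stream in enumerate(streams):
        # Each pool only deduplicates against chunks it has already seen itself.
        stores[n % pools].update(chunk_ids(stream))
    return sum(len(store) for store in stores)


# Four made-up backup streams that share some 4-byte chunks.
streams = [b"AAAABBBBCCCC", b"AAAABBBBDDDD", b"AAAAEEEEFFFF", b"BBBBCCCCDDDD"]

one_pool = stored_chunks(streams, pools=1)
four_pools = stored_chunks(streams, pools=4)
print(one_pool, four_pools)
```

In this toy data set a single pool keeps fewer chunks, because duplicates are found across all streams, while four pools each keep their own copies but can ingest independently of one another. Where the right balance sits for real backup workloads is exactly what the two vendors disagree about.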

 

In Chris Mellor's article, EMC suggested that they could compare multiple DD990s if they could manage them as a single system. I say have at it.  But I dare say if EMC could manage multiple DD990s as a single system, they'd already be doing it.  Or maybe they wouldn't because of some inherent flaws in their architecture.

 

Let me make a few things clear that EMC failed to mention in their whining:

 

  • The B6200 is split into 4 high availability couplets - if a node fails, the B6200 forces an Autonomic Restart and the backup continues.  EMC has no such capability.  Even if they could manage multiple systems under a single management application (which as far as I know they can't), they don't have any high availability failover between systems.  So sure, pool together multiple DD990 systems and increase the failure rate of nightly and weekly backups.  That will go over well with customers who value their data, won't it?
  • When we announced our Converged Storage strategy at HP Discover last year, one of the key tenets was using modern scale-out architecture.  I mentioned this already by saying the B6200 is built on an underlying clustering engine capable of scaling to 1000's of nodes and 16PB in a single namespace.  Based on what I know today, Data Domain can only build single-controller systems. 
  • StoreOnce Catalyst and the B6200 form a Federated Deduplication architecture - a single deduplication engine that doesn't require you to "rehydrate" data as you move it around your enterprise.  EMC's offering is a collection of deduplication engines - Avamar, NetWorker, and Data Domain appliances - that weren't intended to work together. 

Summing it up



So we've covered a lot of ground on who said what - allow me to summarize it all:

 

  • A single multi-node StoreOnce B6200 backs up at 100TB/hour, more than 3X the throughput of a single Data Domain DD990.  The comparison we drew is factual and real.  Period!
  • The B6200 is built on a scale-out architecture - the DD990 isn't. 
  • The B6200 is a single system based on a clustering engine that can scale to 1000's of nodes and a 16PB namespace. 
  • Currently the B6200 is split into four couplets to deliver high availability (Autonomic Restart) and uses four deduplication pools to balance ingest performance and deduplication efficiency.
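The figures quoted in this post can be put side by side with a few lines of arithmetic. This is a rough sketch, not a benchmark: it assumes the 100TB/hour is spread evenly across the four couplets and across the two nodes in each couplet, which the post does not state, and it derives only the ceiling implied by "over 3X faster" rather than using any published DD990 number.

```python
# Figures taken from the post.
AGGREGATE_TB_PER_HOUR = 100   # B6200 with StoreOnce Catalyst
COUPLETS = 4                  # high availability couplets
CLAIMED_SPEEDUP = 3           # "over 3X faster" than the DD990

# Assumption: each couplet is a pair of nodes and load is split evenly.
NODES_PER_COUPLET = 2

per_couplet = AGGREGATE_TB_PER_HOUR / COUPLETS
per_node = per_couplet / NODES_PER_COUPLET

# "Over 3X" means the compared system must be below this throughput.
implied_competitor_ceiling = AGGREGATE_TB_PER_HOUR / CLAIMED_SPEEDUP

print(f"per couplet: {per_couplet:.1f} TB/hour")
print(f"per node:    {per_node:.1f} TB/hour")
print(f"implied DD990 ceiling: {implied_competitor_ceiling:.1f} TB/hour")
```

Under the even-split assumption the numbers work out to 25 TB/hour per couplet and 12.5 TB/hour per node, with the 3X claim implying a single DD990 below roughly 33.3 TB/hour.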

I have several blog posts talking about the HP StoreOnce announcement from last week - if you haven't heard the details, I highly suggest you check them out.   

 

EMC decided to attempt to recast our great results with HP StoreOnce - I'm happy to answer questions to clarify the differences as we understand them - so respectful and honest questions are gladly welcome. 

 

 

 

 

Comments
Chuck Hollis | ‎06-15-2012 02:16 PM

Hi Calvin

 

Bless you for waving the company flag here, but just about everyone in the industry felt a bit misled when they discovered your "single system" claim was actually for discrete pools of storage, and not a single dedupe pool.  

 

I think the car analogy I saw sums it up best:

 

"World's fastest Porsche!  1600 horsepower, 24 cylinders, 16 wheels.  All housed in a single garage and maintained by the same mechanic".  The "car" described here meets your criteria, doesn't it?

 

From a customer perspective, a single dedupe pool is a meaningful distinction from any aggregate of disparate parts.  People felt that HP was trying to pull a fast one to manufacture some excitement for your event.  At least, that's what it looked like from here.

 

-- Chuck

| ‎06-16-2012 09:32 PM

Hi Chuck - long time no see on the blog. 

 

We hid nothing and stand by all our claims.  As I said in the blog post, the StoreOnce B6200 is purchased as a single system, ships as a single system, installs as a single system, and is managed as a single system.  It even has failover (Autonomic Restart) as you'd expect in a single system.  We're on a path of Converged Storage based on a modern scale-out architecture - I think we need to agree to disagree on this one. 


Have a good weekend - Calvin

 

Johan123 | ‎06-18-2012 09:32 AM

All this from EMC, who quite happily benchmarked four completely independent VNX systems on SPEC SFS and presented them as a single system benchmark.

http://recoverymonkey.org/2011/02/24/emc-conclusively-proves-that-vnx-bottlenecks-nas-performance/

 

Or

 

How about EMC's very recent performance claims for their DD990? "We think we're roughly six times faster than our closest competitor on performance."  Marketing and selective language at its finest, you couldn't make this up... pot, kettle.


ARRITDOR | ‎06-19-2012 06:46 PM

As a customer, I have no problem with multi-"system"/node, scale-out architectures.  Isn't that really how storage tech is going?  In some ways, storage "tiering" is an indication of this.

 

However, we (customers) want to know how performance results are achieved, especially when claims are significantly higher compared to competitors.  Most people should be skeptical of seemingly unreal performance claims and want a more technical explanation and verification.

 

As for EMC products, both the Centera and Avamar have multiple nodes that are fault tolerant and sold as a single system with scale-out capability.  So they know what's going on, they just don't have this in the DD line (yet?).  Don't be surprised if they're working on it (especially now!).

 

About the Author
25+ years of experience around HP Storage. The go-to guy for news and views on all things storage.
The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.