
NetApp spins their SPC-1 results

By Calvin Zito, @HPStorageGuy

 

I just finished watching the exciting conclusion of the Euro2012 semifinal match between Spain and Portugal.  Call me weird, but I couldn't help thinking about NetApp and their new SPC-1 results as the match was nearing its end.  If you missed the game, it went to penalty kicks (PKs) to decide who would move on to the Euro2012 championship game.  Both goalies made excellent diving saves to start things off, but then Portugal drew first blood, scoring on their second PK.

 

What does the game have to do with NetApp SPC-1 results?

After Portugal scored first, I thought they should "pull a NetApp" and just leave the field and claim victory. That's what NetApp is doing with their latest SPC results. Here's what I mean.

In pitching the press on its new SPC-1 results, NetApp is claiming it has better results than HP 3PAR.  Then, quietly under its breath, it adds: if you redefine the SPC-1 rules and ignore the fact that HP 3PAR was nearly 2X faster than NetApp, there's a way to claim victory.
NetApp picked a latency number, compared the results of the two systems at that latency, and started running a victory lap.

 

Say what?

 

[Chart: NetApp and HP 3PAR SPC-1 results compared]
I gotta give it to NetApp for the creative marketing; if they had been coaching the Portugal team during the Euro2012 match today, I'm sure they would have pulled the team off the pitch after the second penalty kick and claimed victory. Unfortunately, neither result would stand the test of reason.  Check out the chart above with some of the data comparing the SPC-1 results from NetApp and HP 3PAR.  A few things jump out at me:



  1. HP 3PAR's result was 200,000 IOPS better than NetApp's - about 80% faster.  NetApp is trying to redefine the intended purpose of the SPC benchmark, and it's shenanigans like this that are probably a big reason why EMC still refuses to submit SPC results. I'm a bit surprised that the SPC allowed NetApp to do this.
  2. If NetApp could have beaten the 3PAR numbers outright, they would have - they can't.  Even with over 3TB of Flash Cache, they couldn't get close to the HP 3PAR IOPS.  So plan B is to find some other way to declare victory.
  3. NetApp also claimed a lower $ per IOPS and justified it by saying other suppliers discount by up to 50% compared to their submitted prices.  Unfortunately, that's not how the SPC-1 $ per IOPS works - again, I'm surprised that the SPC would approve of what NetApp has done here. The SPC certainly wouldn't allow them to publish these re-spun numbers in the official report, and I'd venture a guess that the SPC isn't very happy with the games they're playing in the press by recasting their results.
  4. In NetApp's results, their array's usable capacity was almost 50% less than we had with HP 3PAR.  I find great humor in this given the way my old friend Alex McDonald used to attack HP capacity utilization compared to FAS several years ago.
  5. Lastly, let's talk about the "clustered system" NetApp used vs. the HP 3PAR architecture.  NetApp used six nodes in their cluster, but it isn't anything like a 3PAR cluster.  Each controller in the NetApp cluster has its own assigned disks, and there's no load-balancing across the other controllers.  A NetApp controller can only access another controller's assigned pool of disks when that controller fails.  With HP 3PAR, data is wide-striped and load-balanced across all controllers - see the sketch below.
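
If you want to see the difference in miniature, here's a minimal sketch - hypothetical Python with made-up controller names and chunklet counts, not code from either vendor - contrasting wide striping with per-controller disk ownership:

```python
# Hypothetical sketch of the two data-placement models discussed above.
# Controller names and chunklet counts are illustrative only.

from collections import Counter

CONTROLLERS = ["ctrl-A", "ctrl-B", "ctrl-C", "ctrl-D"]

def wide_stripe(num_chunklets: int) -> Counter:
    """3PAR-style: a volume is chopped into chunklets spread round-robin
    across ALL controllers, so a hot volume's I/O lands on every one."""
    placement = Counter()
    for chunk in range(num_chunklets):
        placement[CONTROLLERS[chunk % len(CONTROLLERS)]] += 1
    return placement

def assigned_pool(num_chunklets: int, owner: str) -> Counter:
    """NetApp-style (as described above): a volume lives entirely in one
    controller's assigned disk pool; others only take over on failover."""
    return Counter({owner: num_chunklets})

if __name__ == "__main__":
    hot_volume = 1000  # chunklets of one busy volume
    print("wide-striped:", dict(wide_stripe(hot_volume)))
    # -> load spread evenly: 250 chunklets per controller
    print("assigned:    ", dict(assigned_pool(hot_volume, "ctrl-A")))
    # -> all 1000 chunklets (and their I/O) hit ctrl-A alone
```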

With deep apologies to Portugal, Spain won the Euro2012 semifinal match 4-2 on penalties; the match didn't end after the first PK was scored.  And with no apologies to NetApp: they don't have a better SPC-1 result than HP 3PAR.

 

I have a couple of blog posts on our HP 3PAR results from last year that you can review for background.  Here's also a link to the SPC-1 website where you can read the NetApp results.



 

Comments
RobertF(anon) | ‎06-28-2012 02:16 AM

I agree with you, they slant the numbers toward their arrays. But if you look at the SPC-1 latency charts, the P10000's latency increases dramatically as you push toward 450K IOPS - from ~3ms at 250K IOPS to about 14ms. Latency can dramatically impact application performance, depending on its IO patterns. Other all-flash arrays, as pointed out on The Register, have even lower and flatter latency curves than NetApp or 3PAR.

Jeff Katcher(anon) | ‎06-28-2012 04:26 PM

Lastly, let's talk about the "clustered system" NetApp used vs. the HP 3PAR architecture.  NetApp used six nodes in their cluster but it isn't anything like a 3PAR cluster.  Each controller in the NetApp cluster has its own disks assigned and there's no load-balancing across the other controllers.  A NetApp controller can only access another controller's assigned pool of disks when one controller fails.  With HP 3PAR, data is wide-striped and load balanced across all controllers. 

 

But isn't it ironic that the HP 3PAR SPC-1 configuration explicitly doesn't stripe across nodes, thus trying to use the 3PAR cluster like a pile of NetApps?  Nor is it thin-provisioned.

nate | ‎06-28-2012 07:39 PM

To be technically accurate, 3PAR controllers do have their own disks - each node pair is responsible for a set of disks that other nodes can't access directly (only indirectly through the backplane).

 

I too was kind of surprised at the latency on the P10k results. Compare them to the F400 results of 2009, which topped out at 8ms latency at full tilt.

 

Though 3PAR clustering is more of a true cluster, as the volumes and file systems are distributed over all of the resources, as opposed to NetApp where (as far as I know anyway) aggregates are confined to a single controller at a time (something I was really expecting them to solve with all the fuss they are making about 8.1.x - the high-level marketing makes it appear as if they did). Their new hybrid aggregates extend this design even further.

 

Another question I have is: why didn't NetApp scale up or out more for SPC-1? They posted 24-node results for SpecSFS; I'm curious why only 6 nodes for SPC-1. Is there some difference between the size limits of their cluster for block vs. file workloads? I don't know the details. I imagine it wouldn't have been difficult to run SPC-1 on that same big cluster at around the same time they ran SpecSFS.

 

NetApp has come a long way in the past few years with the help of their Flash Cache stuff, from a performance and latency perspective. My company was really close to going NetApp for our last VMware project (I prefer 3PAR, though I figured it wouldn't be bad having some more NetApp exposure), but in the end we didn't. Post-deployment I'm glad, too, since the workload ended up being 90% write - something that fancy flash cache wouldn't do much to help with. Hybrid aggregates may have helped, but that technology didn't exist at the time we purchased (and thus we would have had a lot of pain until it became available, assuming it could help much).

 

Though it seems NetApp has been losing steam to EMC recently in the NAS market (well, perhaps everyone is losing steam compared to EMC - except 3PAR, of course, in the markets where 3PAR competes best).

 

That's not to say that 3PAR doesn't have some caching up to do (get it? HA HA) with regards to flash.  I know you guys are working on it. I've been so busy recently; I'm still planning on going to Fremont to get some good information about what's going on down there, but I fear it may not happen until late next month with so much going on at work and a pending vacation.

 

I only hope that the support and sales folks behind the 3PAR platform can stay high quality - it doesn't appear as if HP is interested in retaining the quality on that front. I had some friends at 3PAR in Seattle turn stuff around for me in 72 hours - something the team in the Bay Area couldn't turn around in 6 months (literally! That team is no longer my team - I am without a team at the moment, and due to HP's territory shenanigans I can't get the team in Seattle assigned officially, though they still happily support me in a less official capacity). I can't put into words how frustrated I was with getting a simple quote to upgrade our 3PAR array recently. It wasn't until the VAR came back and said they were having trouble getting the info that I reached out to friends up north, who got it for me over the weekend (I sent the email at 5PM Friday, and they turned the configuration around by Sunday night - for something they aren't even going to get credit on! You can't pay people like that enough to keep customers happy).

 

HP isn't alone though, NetApp and Dell are doing similar things - not sure about EMC since I don't deal with many folks that deal in EMC. It seems these big players are feeling a lot of pressure and are reacting poorly to it. In the end the customer is the one that suffers.

Calvin Zito | 06-29-2012 06:41 AM

Robert - I spent some time today with one of our performance experts (he actually works on our SPC-1 submissions, among other things) and he didn't have any issues with the latency we submitted with our results. His experience says that under 15ms latency is acceptable to most if not all applications. He talked to me about how the latency budget allows the array and disk drives to reorder IOs.  I didn't completely follow all of that and plan to spend more time digging in.  Once I better understand what he's saying, I'll probably talk about it here.  Or maybe I should just do a podcast with him and let him explain it.
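
For what it's worth, here's a toy illustration of that reordering idea - hypothetical Python with random request addresses, not anything from the SPC-1 submission. Letting requests queue up trades per-IO latency for the chance to service them in address order, which slashes head travel:

```python
# Toy model: compare disk head travel when requests are serviced in
# arrival order (FIFO) vs. sorted by address (one elevator-style sweep).
# All numbers are made up for illustration.

import random

def seek_distance(order: list[int]) -> int:
    """Total head travel for servicing requests in the given order."""
    head, travel = 0, 0
    for lba in order:
        travel += abs(lba - head)
        head = lba
    return travel

random.seed(42)
requests = [random.randrange(1_000_000) for _ in range(64)]

fifo = seek_distance(requests)              # service in arrival order
elevator = seek_distance(sorted(requests))  # one sweep across the disk
print(f"FIFO travel:     {fifo:>10,}")
print(f"Elevator travel: {elevator:>10,}  ({fifo / elevator:.1f}x less)")
```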

 

Jeff - ummm, SPC-1 doesn't allow you to use thin-provisioned volumes.  But let's face it, that was just a diversion.  If I wasn't clear enough in my blog post, let me say it again: NetApp claiming it has won the top SPC-1 benchmark when the HP 3PAR result was 200,000 IOPS better is like getting 4th place in the Olympic 100m dash but claiming the gold medal because you had the best smile crossing the finish line.  450,212 vs. 250,039.  The first number is higher and better in the context of SPC-1 IOPS.
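
To spell out the arithmetic on those two published figures:

```python
# Quick arithmetic on the two SPC-1 IOPS numbers quoted above.
hp_3par = 450_212   # HP 3PAR P10000 SPC-1 IOPS
netapp = 250_039    # NetApp cluster SPC-1 IOPS

print(f"difference: {hp_3par - netapp:,} IOPS")   # 200,173 IOPS
print(f"ratio: {hp_3par / netapp:.2f}x")          # 1.80x - nearly 2X
```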

 

Nate - my knowledge of 3PAR doesn't compare to your real-world experience using it. I learned a bit more about the 3PAR architecture from the team today and am going to have a few more discussions to make sure I really understand, but all of this has given me a few ideas for topics.  As to the results, I do find it interesting that NetApp really pushed the limit of the SPC-1 rules. The unused storage ratio must be below 45% - and NetApp came in at 43.2%.  Add to that the 3TB of Flash Cache, and it starts to sound like this was really a cache test. As to HP 3PAR futures, good one on the "caching up" comment.  I'm probably going to come spend a week in Fremont in mid-July; if you make it over to get an NDA briefing, let me know.  I'd love to meet you in person.
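
Just to put a number on how close to the line that is (the precise ratio definition lives in the SPC-1 spec; these are only the two percentages from the discussion above):

```python
# How close NetApp's reported unused storage ratio came to the SPC-1 cap.
cap = 45.0        # SPC-1 limit on unused storage ratio, percent
reported = 43.2   # NetApp's reported ratio, percent

print(f"headroom: {cap - reported:.1f} percentage points")    # 1.8
print(f"share of allowed budget used: {reported / cap:.0%}")  # 96%
```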

Geert(anon) | ‎06-30-2012 06:21 PM

@nate,

 

Too bad you didn't get yourself educated on ONTAP's NVRAM/WAFL write optimization. You would have learned it does some neat tricks to optimize those writes, so it wouldn't have needed that many spindles (sometimes even a third of the spindles of other arrays, with RAID-6 enabled too!). Not only does it provide low latency on heavy random-write-intensive workloads, it also saves money by acquiring fewer spindles and using less power.
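
For readers who haven't seen this style of write optimization before, here's a minimal sketch of the general log-structured idea - journal writes in NVRAM, acknowledge immediately, then flush batches sequentially to free space. This is hypothetical Python illustrating the technique in the abstract, not ONTAP internals:

```python
import random

class LogStructuredWriter:
    """Toy model: random logical writes are journaled (NVRAM stand-in),
    acknowledged at once, then drained as sequential full-stripe writes."""

    def __init__(self, stripe_blocks: int = 8):
        self.journal: list[tuple[int, bytes]] = []  # (logical block, data)
        self.stripe_blocks = stripe_blocks
        self.next_free = 0                   # next free physical block
        self.block_map: dict[int, int] = {}  # logical -> physical remap

    def write(self, lba: int, data: bytes) -> None:
        # Ack as soon as the write is journaled; no disk seek on this path.
        self.journal.append((lba, data))
        if len(self.journal) >= self.stripe_blocks:
            self.flush()

    def flush(self) -> None:
        # Drain the journal as one sequential stripe, remapping each block.
        for lba, _ in self.journal:
            self.block_map[lba] = self.next_free
            self.next_free += 1
        print(f"{len(self.journal)} random writes -> 1 sequential stripe")
        self.journal.clear()

random.seed(1)
w = LogStructuredWriter()
for _ in range(16):                        # 16 scattered logical writes...
    w.write(random.randrange(10_000), b"x")
# ...hit disk as just 2 sequential stripe flushes
```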

JimG(anon) | ‎08-15-2012 05:20 PM

@Geert

But why bother with all that old WAFL rubbish - with all its architectural limitations - when 3PAR's ease of management, performance, and capacity utilization destroy NetApp anyway?

Calvin Zito | 08-15-2012 11:20 PM

@JimG - my SPC-1 expert down under (Paul Haverfield) looked at the complexity of NetApp's configuration. It's mind-bogglingly complex, and I don't think there's a customer on the planet who could configure what they did on their own. The measure of complexity with SPC-1 is the number of command lines needed for configuration.  Paul tells me the NetApp configuration took 1185 command lines to configure just 12 LUNs and present them to 6 hosts. Compared to our 3PAR SPC-1 results, that's 7X more complex - at about 55% of the performance.
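
The arithmetic implied by those figures (just the numbers quoted above):

```python
# Implied command-line counts behind the "7X more complex" claim.
netapp_cmds = 1185    # command lines reported for the NetApp config
complexity_ratio = 7  # factor quoted vs. the 3PAR config
print(f"implied 3PAR command lines: ~{netapp_cmds / complexity_ratio:.0f}")  # ~169
```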

Johnmh(anon) | ‎08-16-2012 09:43 AM

Calvin,

 

I don't think you even need to count the command lines to understand the complexity. Just look at the executive summary (http://goo.gl/4vlT1), page 7, and the "Priced Storage Configuration Diagram", then read the footnote:

 

"The two illustrated FAS6240 nodes and their components (controllers, disk shelves, disk drives, etc) was [sic] repeated 3 times to build the 6-node cluster"

 

The configuration is so complex that they decided not to lay out the whole thing in the diagram, because that would have made the complexity self-evident. Their marketing is nothing if not consistent.

About the Author
25+ years of experience around HP Storage. The go-to guy for news and views on all things storage.