Last month we announced a world-record SPC-2 result for the XP24000. At the same time we extended yet another challenge to EMC to join the rest of the world in publishing benchmarks. They continue to decline the offer, arguing “representativeness”. I thought I’d clear up the “representativeness” question.
EMC’s argument that this XP is too costly starts from the assumption that SPC-2 represents only a video streaming workload. To quote, “128 146GB drive pairs in your 1152 drive box? A pure video streaming workload?” We actually see a widely diverse set of workloads run on the XP. The power of having both SPC-1 and SPC-2 benchmark results is that together they provide audited data that applies to almost any workload mix a customer might have. But if one had to pick the most common workload, it would probably be database transaction processing by day, with backup and data mining workloads joining the transaction processing by night. SPC-2 models the backup and data mining aspects, while SPC-1 represents the transaction processing. SPC-2 is about a lot more than video streaming.
When people need bulletproof availability and high performance for transaction processing, they turn to high-end arrays like the XP24000. That is probably the most common use for a high-end array. Our data indicates that, on average, the number of disks in an initial XP purchase is right around the 265 in our SPC-2 configuration. Some of those purchases won’t have the levels of controllers in the SPC-2 configuration, but an increasing number use thin provisioning. In those cases customers often buy all the controllers they’ll need up front and delay the disk purchases, as you’d expect with thin provisioning. So the configuration and workloads look pretty representative.
Then consider how the benchmark is really used. A maximum number is key in assessing an array’s performance. Below that maximum you can adjust disks, controllers, and cache to get fairly linear performance changes. But when you reach an array’s limit, all you can do is add another array. So once you know an array’s maximum number, you know its whole range of performance. By maxing out the controllers we provide that top-end number, giving the most useful result. For sequential workloads like backup and data mining, maxing out the disk count isn’t necessary, whereas it generally is for random workloads like transaction processing.
Now let’s discuss how one might use the XP’s SPC-2 results. Let’s say you need a high-end array for transaction processing. The most common case we see requires backup and data mining operations at night in a limited time window. Since the XP’s SPC-2 result is twice that of the next closest high-end array, you can expect it to get the backup and data mining done with half the resources of the next fastest array. But with SPC-2 you can go further. You can look up the specific results for backup and data mining workloads, which are around 10GB/s for the XP24000. Knowing how much data you need to back up and mine, you can estimate how much of the system’s resources you’ll need to get those things done in your time window, and therefore what’s still left for transaction processing during that window. You can scale that for the size of array you need for transaction processing. And you can compare to other arrays that have posted results. All of this uses audited data, before you get sales reps involved.
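To make that estimate concrete, here is a rough back-of-the-envelope sketch in Python. The roughly 10GB/s figure is the one quoted above; everything else (the function names, the 72TB nightly data volume, and the 4-hour window) is a made-up illustration, and the arithmetic assumes throughput scales about linearly below the array’s maximum.

```python
# Back-of-the-envelope sizing sketch. All names and inputs here are
# hypothetical; only the ~10 GB/s figure comes from the post.

def required_throughput_gbps(data_tb: float, window_hours: float) -> float:
    """GB/s needed to move data_tb terabytes (decimal TB) in window_hours."""
    return data_tb * 1000 / (window_hours * 3600)


def fraction_of_array(data_tb: float, window_hours: float,
                      array_max_gbps: float = 10.0) -> float:
    """Share of the array's maximum sequential throughput the job consumes.

    Assumes performance scales roughly linearly below the maximum.
    """
    return required_throughput_gbps(data_tb, window_hours) / array_max_gbps


# Hypothetical example: 72 TB to back up and mine in a 4-hour window.
need = required_throughput_gbps(72, 4)
share = fraction_of_array(72, 4)
print(f"need {need:.1f} GB/s, about {share:.0%} of the array's maximum")
```

Whatever share is left over is, to a first approximation, what remains for transaction processing during the window. Swap in another array’s published SPC-2 number for `array_max_gbps` to run the same comparison.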
SPC benchmarks are all about empowering the storage customer. The XP24000’s SPC-2 result is relevant to the most common uses for high-end arrays, as well as to less common uses like video editing. The configuration we used looks pretty typical, with choices made to make the result most useful to customers. The cost is pretty typical for this kind of need. At HP we expect to continue providing this kind of useful data for customers. And our challenge to EMC to publish a benchmark result still stands, though they’ll probably continue inventing reasons not to.