Around the Storage Block Blog
Find out about all things data storage from Around the Storage Block at HP Communities.

Time to get real about disaster tolerant all-flash array solutions


By Priyadarshi Prasad (Pd), Product Line Manager, HP 3PAR Storage

 

Now that the initial euphoria about all-flash solutions is starting to settle down, the conversations are getting more real. All-flash solutions are coming under the microscope. And it’s time to ask some very real questions of all-flash solutions: Granted, you are fast, and you can be cheap if I believe your 6:1 (or is it 12:1 today?) dedupe ratio. But can you protect my data and applications when a disaster occurs? How much data would I lose? How fast can I come back online?

 

The Three D's of DR: data loss, downtime and distance

The world of disaster recovery (DR) is vast and deep. Every application has its own special requirement, but overall, it boils down to what I call the Three D’s of DR:

 

1. What is the cost of data loss? Recovery Point Objective, or RPO – If you want to lose absolutely no data in a DR event, the RPO of the solution has to be ZERO, meaning you need to go for a synchronous replication solution.

2. What is the cost of downtime? Recovery Time Objective, or RTO – When the cost of downtime is huge and every second matters, you need a solution that enables applications to come back online ASAP. This requires application integration such as VMware vMSC support.

3. What is the strike distance of the disaster you are protecting yourself against? Consider:

  • Local or Campus – array failure
  • Metro – disaster that affects a city, typically within a 100-mile radius
  • Country/Continent – outside of a 100-mile radius, typically a 1,000 to 3,000 mile radius

And the best DR solution?

Obviously, the best DR solution would be one that has zero RPO (no data loss), zero downtime and protects against a country/continent wide disaster. It is possible as long as you don’t care about application write latency. But wait, I thought flash was all about speed and latency? Welcome to the new world of conflicting priorities when designing a disaster-tolerant all-flash solution.
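
To see why zero RPO and continental distance conflict with low write latency, here is a rough back-of-the-envelope sketch. It assumes signals travel through fiber at roughly 200,000 km/s, that a synchronous write needs at least one round trip to the remote array before it is acknowledged, and that switch, protocol and array processing overhead are zero. The function name and the sample distances are illustrative only; they are not HP 3PAR figures.

```python
# Rough lower bound on the latency synchronous replication adds to each write.
# Assumptions: ~200,000 km/s propagation speed in fiber, exactly one round trip
# per write, and no switching, protocol, or array overhead. Real links are slower.

KM_PER_MILE = 1.609344
FIBER_KM_PER_SECOND = 200_000.0  # assumed propagation speed in optical fiber


def sync_write_overhead_ms(distance_miles: float) -> float:
    """Return the round-trip propagation delay, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_seconds = distance_km / FIBER_KM_PER_SECOND
    return 2 * one_way_seconds * 1000.0


if __name__ == "__main__":
    # Sample distances loosely matching the three "strike distance" tiers.
    for label, miles in [("Campus", 1), ("Metro", 100), ("Continent", 3000)]:
        overhead = sync_write_overhead_ms(miles)
        print(f"{label:<10} {miles:>5} mi  ~{overhead:.2f} ms added per write")
```

Under these assumptions, a 100-mile link adds about 1.6 ms to every write, while a 3,000-mile link adds roughly 48 ms, which wipes out the sub-millisecond advantage of flash for writes.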

 

To design a disaster-tolerant all-flash solution, we need to look at each of our applications and understand their RPO (data loss acceptability) and latency requirements. Here is a framework that can help us design the right DR for each application deployed on flash. 

 


Figure 1. Loss-Latency Framework

 

Because we are talking DR, the way to read the chart is to look at the x-axis and decide whether data loss is acceptable for an application or not. Let’s start with this case:

 

First case: Data loss is NOT an option

Applications dealing with sensitive data, such as financial applications, fall into this category. We have two choices:

  1. Go for the best latency possible, and still protect against an array failure – Because data loss is not an option, we have to rely on a synchronous replication solution, such as HP 3PAR Remote Copy Synchronous. The two arrays can be next to each other, or within the same data center. The small distance between the arrays keeps the replication latency overhead to a minimum. This gives customers a protected infrastructure while allowing applications to get close to sub-millisecond latencies. For VMware environments, you can also use HP 3PAR Peer Persistence to get completely automated and transparent failover and failback, which ensures zero downtime for applications.
  2. Go for good enough but consistent latency, and protect against a site-wide disaster – I say good enough, but the reality is that sub-2.5ms is great latency, especially compared with a disk-based environment where latencies are in the tens of milliseconds with little consistency. In addition, synchronous replication only impacts write latency; read latencies can continue to be sub-millisecond. Here again, HP 3PAR Remote Copy Synchronous is a perfect fit, along with Peer Persistence for VMware environments.

What’s more, in both of the above cases, you can implement a 3-data-center solution where the primary array replicates to a secondary array in Sync mode (hence zero data loss). At the same time, the primary array also replicates to a third array in Async mode. This solution ensures that if the primary array goes down for any reason, applications can continue to run on the secondary array in a protected manner (the secondary array enters a replication relationship with the third array). Additionally, in an event that affects both the primary and the secondary array, applications can still come online on the third (async) array. This solution is called 3PAR Synchronous Long Distance Remote Copy.
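
The failover behavior described above can be sketched as a toy model. This is only an illustration of the topology logic, not a 3PAR API: the class, its method names and the array labels "A", "B" and "C" are all hypothetical, and the secondary-to-third link is assumed to be asynchronous because the third array is the distant one.

```python
from dataclasses import dataclass, field


@dataclass
class ThreeSiteTopology:
    """Toy model of a sync + async three-array layout (hypothetical, not a 3PAR API)."""
    primary: str = "A"        # runs the applications in normal operation
    sync_target: str = "B"    # nearby array, synchronous copy (zero data loss)
    async_target: str = "C"   # distant array, asynchronous copy
    failed: set = field(default_factory=set)

    def fail(self, array: str) -> None:
        """Mark an array as lost."""
        self.failed.add(array)

    def active_array(self):
        """Pick where applications run, preferring the most up-to-date copy."""
        for candidate in (self.primary, self.sync_target, self.async_target):
            if candidate not in self.failed:
                return candidate
        return None

    def data_loss_possible(self) -> bool:
        """Only a fallback to the async copy can be missing recent writes."""
        return self.active_array() == self.async_target

    def replication_links(self):
        """Return (source, target, mode) tuples for the surviving arrays."""
        active = self.active_array()
        if active == self.primary:
            links = [(self.primary, self.sync_target, "sync"),
                     (self.primary, self.async_target, "async")]
        elif active == self.sync_target:
            # Secondary takes over and stays protected by the third array
            # (assumed async here, since the third array is the remote one).
            links = [(self.sync_target, self.async_target, "async")]
        else:
            links = []
        return [link for link in links if link[1] not in self.failed]


if __name__ == "__main__":
    topo = ThreeSiteTopology()
    topo.fail("A")
    print(topo.active_array(), topo.replication_links(), topo.data_loss_possible())
```

In this sketch, losing the primary moves applications to the synchronous copy with no data loss and one remaining replication link; losing both nearby arrays falls back to the asynchronous copy, where recent writes may be missing.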

 

Second case: Some data loss is OK

This case is applicable to a variety of applications where data loss, while undesirable, can be managed. There are again two choices available:

  1. Need the best latency – Applications that can take full advantage of the sub-millisecond latency delivered by all-flash arrays fall in this category. Examples are web-based applications where every microsecond saved results in greater business opportunity. Clearly, you can implement an asynchronous replication solution here.
  2. Need good but consistent latency – For a large number of applications, consistent and deterministic latency is more important than sub-millisecond latency. Large databases and health care applications are a few examples that fall in this category. Such applications can use all-flash arrays to meet consistent latency requirements. The beauty is that they have flexibility in configuring their DR setup: they can go for a zero data loss DR plan (with Synchronous Remote Copy) or a long-distance DR plan (with Asynchronous Remote Copy).
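
The two cases and their choices can be condensed into a small decision helper. This is a minimal sketch of the Loss-Latency Framework as described in this post; the enum, the function name and the returned strings are made up for illustration and do not correspond to any HP tool.

```python
from enum import Enum


class LatencyNeed(Enum):
    BEST = "best"              # application needs the lowest latency possible
    CONSISTENT = "consistent"  # good, predictable latency is enough


def recommend_replication(data_loss_ok: bool, latency: LatencyNeed) -> str:
    """Map the two framework questions to a replication approach."""
    if not data_loss_ok:
        # First case: zero RPO, so replication must be synchronous.
        if latency is LatencyNeed.BEST:
            return "synchronous, arrays side by side or in the same data center"
        return "synchronous, arrays at separate sites"
    # Second case: some data loss is acceptable.
    if latency is LatencyNeed.BEST:
        return "asynchronous"
    return "synchronous (zero data loss) or asynchronous (long distance)"


if __name__ == "__main__":
    for loss_ok in (False, True):
        for need in LatencyNeed:
            print(f"data loss ok={loss_ok!s:<5} latency={need.value:<10} -> "
                  f"{recommend_replication(loss_ok, need)}")
```

Keeping the decision per application, rather than per array, matches the point of the framework: different workloads on the same all-flash array can land in different quadrants.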

How does all this apply to you?

When evaluating all-flash arrays, don’t get sold on the argument that all-flash systems only require asynchronous replication. This is a simplistic argument, one that doesn’t withstand real-world scrutiny. It implies that you will always have to live with data loss, something that is unacceptable to a large number of applications that are deployed today using Synchronous Remote Copy.

 

HP 3PAR StoreServ 7450 All-Flash Array provides you with tremendous flexibility when looking to deploy all-flash systems for real-world applications that usually need to be disaster-tolerant. With its Synchronous Remote Copy, it ensures zero data loss. Its Asynchronous Periodic Remote Copy allows applications to get sub-millisecond latency while still remaining protected and enables a geographically separated DR solution. Furthermore, HP 3PAR Synchronous Long Distance brings both of the above options together, and ensures zero data loss with a Sync copy nearby and a long-distance DR solution with an async copy in a different geography (state, country or continent).

 

All of the above replication options are available natively within the 7450 all-flash array, without needing any external hardware or appliance. Few all-flash arrays provide so much DR flexibility, and provide it natively within the array.

 

Icing on the cake

With a common OS running within all HP 3PAR StoreServ arrays, you have the flexibility to invest in HP 3PAR StoreServ 7450 Storage all-flash arrays for your key applications and use a different 3PAR array (7200/7400/10400/10800) for DR. This flexibility directly reduces the upfront investment you need to make to get your applications on all-flash performance.

 

So go deploy your applications on flash. We have the performance, the cost, the scale AND the DR options available to take care of your real-world needs.

Comments
White Jason(anon) | ‎07-11-2014 02:47 PM

Great post PD!

 

I don't think people really understand how difficult it is to develop high quality DR and HA solutions, even for large companies. It is NOT easy. It takes a lot of resources (financial, engineering time, etc). Small startups simply do not have these resources and it shows. I spent some time at a storage startup and HA was put on the back burner simply because we did not have the resources. Even some well established companies still have not been able to develop and deploy DR or HA solutions. For example, EMC's XtremIO does not currently support native remote replication!

 

The fact that we have an all flash array is good. But the fact that it is built on the amazing 3PAR technology foundation (ASIC, Mesh Active, etc) makes our 7450 the best array on the market!

 

JW

 

 

Priyadarshi Prasad(anon) | ‎07-12-2014 12:00 AM

Thanks Jason - very well said!
