By Karl Dohm, HP Storage Architect
The Extensible NetApp blog (http://blogs.netapp.com/extensible_netapp) contains several posts describing WAFL. It sums WAFL up as an internal component which...
...provides mechanisms for building file-system semantics, it manages the on-disk format, it manages the free and allocated space, and provides a logical and physical volume manager.
The posts go on to argue that WAFL is not a file system, but rather an essential part of one. Fair enough; calling it a file system may be splitting hairs and technically incorrect, but given the amount of confusion across vendors in the industry, it is likely that this common misunderstanding originated in NetApp's own documentation.
Whether the details of the FAS internals are technically part of WAFL, part of ONTAP, or part of something else, the main point is that none of this is particularly relevant to the Storage Administrator. What matters most is performance, space efficiency, and ease of use.
From this point on I'm not going to split hairs between WAFL and the rest of the FAS internal software and firmware, because no one but NetApp architects really cares. For simplicity, let's call it all WAFL.
WAFL is a unique approach to organizing data on the spindles and controlling the flow of data to them. No doubt WAFL can come across as impressive in sales presentations, because it is very different from the approaches used by EMC, IBM, and HP. In this series of posts we will explore the other side of WAFL, highlighting some of the problems it creates for the Storage Administrator, none of which we expect NetApp to fully acknowledge.
Today, let's touch on fragmentation. Some in the industry call WAFL "fragmentation by design." I didn't coin the phrase, but like WAFL being called a file system, it's one of those things you tend to hear when the conversation turns to NetApp. The label strikes me as accurate because WAFL tries to do full-stripe writes whenever it can: it prioritizes writing non-sequential blocks into the same stripe over the read-modify-write operations associated with RAID-4 or RAID-DP parity calculation.
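To make the trade-off concrete, here is a toy back-of-the-envelope sketch. This is my own arithmetic, not NetApp code, and the 4+1 stripe geometry is purely illustrative; it just shows why an allocator would prefer filling a whole stripe over updating single blocks in place:

```python
# Toy I/O-cost model, not NetApp's actual allocator: compare disk
# operations for a one-block read-modify-write update vs. a full-stripe
# write on an assumed RAID-4-style layout (4 data disks + 1 parity).

def rmw_ios(blocks_updated, parity_disks=1):
    """Read-modify-write: read old data and old parity to recompute
    parity, then write new data and new parity."""
    reads = blocks_updated + parity_disks
    writes = blocks_updated + parity_disks
    return reads + writes

def full_stripe_ios(data_disks, parity_disks=1):
    """Full-stripe write: parity is computed from the new data alone,
    so no reads are needed -- just one write per disk in the stripe."""
    return data_disks + parity_disks

# Updating a single block in place on a 4+1 stripe:
print(rmw_ios(1))          # 4 I/Os (2 reads + 2 writes) for 1 block
# Writing an entire 4+1 stripe of possibly unrelated dirty blocks:
print(full_stripe_ios(4))  # 5 I/Os for 4 blocks, and no reads at all
```

The catch, of course, is that packing unrelated dirty blocks into the same stripe is exactly what scatters once-sequential data across the disks, which is the fragmentation this post is about.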
Translated to the world of the Storage Administrator, this means that the throughput of the FAS gradually degrades over time whenever the workload has a random component, and most real-world workloads do. Applications like Microsoft Exchange present a nightmare scenario for the FAS. The degradation can be significant, and throughput becomes unpredictable because it varies with the history of writes.
For those who question this assertion as somehow biased, try the following. Take a FAS system and create a new volume with a new LUN. Baseline the system by running a sequential read workload and measuring the result; notice that the number is already not very impressive. Next, run a few hours of random workload, say 8KB 50/50 read/write, which is similar to MS Exchange. Then run the sequential read load again and observe the new throughput. Chances are you will have some new questions for your NetApp sales rep.
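One way to script this experiment is with an open-source load generator such as fio. The job file below is only an illustrative sketch, not a NetApp-supplied benchmark: the device path, block sizes, and runtimes are assumptions you would adjust for your own LUN and test window.

```ini
; Illustrative fio job file; device path, sizes, and runtimes are
; assumptions. The stonewall option serializes the jobs, so compare
; the throughput of seq-baseline against seq-after.

[global]
filename=/dev/sdX        ; the LUN presented by the FAS -- adjust
direct=1
ioengine=libaio
time_based

[seq-baseline]
rw=read
bs=256k
runtime=300

[random-mix]
stonewall
rw=randrw                ; 8KB 50/50 read/write, Exchange-like
rwmixread=50
bs=8k
runtime=14400            ; "a few hours" of aging the layout

[seq-after]
stonewall
rw=read
bs=256k
runtime=300
```

Whatever tool you use, the key is that the sequential read is measured twice on the same LUN, once fresh and once after the random mix has aged the on-disk layout.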
Next time we will discuss reallocation, NetApp's answer to fragmentation, and explore how much it really helps the problem.