VXLAN: consider the network virtualization technology cost angle
By Vishwas Manral, Distinguished Technologist, HP Networking, Advanced Technology Group
At the recent VMworld 2011 in Las Vegas, VXLAN made headlines when VMware CTO Steve Herrod called it “a new vision and approach.” Just in case you were hiding in some remote corner of the world and missed it, you can read this: VMware, Cisco stretch virtual LANs across the heavens: VXLAN virtualizes Layer 3 networks.
This topic has generally been beaten to death by various bloggers. Most blog posts have rightly gone to great lengths to elaborate on the technology and its advantages. However, I see one critical aspect that has been totally neglected: cost. Let us examine the cost angle of the technology here. This should hopefully help you see some of the hard facts and make an informed decision about using the solution.
VXLAN is a technology that allows the server to do Layer-3 tunneling of server Ethernet traffic, even before it hits the first networking device. This contrasts with the Edge Virtual Bridging (EVB) approach many vendors have taken, which requires signaling between the server and the top-of-rack (ToR) switch, with all the tunneling actually being done on the switch. The two approaches sound similar but have very different cost dynamics.
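To make the server-side tunneling concrete, here is a minimal sketch in Python of the per-frame work a vSwitch takes on when it wraps an Ethernet frame in a VXLAN header. The 8-byte layout (a flags byte with the VNI-valid bit, a 24-bit network identifier, reserved bits elsewhere) follows the VXLAN specification as later published in RFC 7348, not anything stated in this post. The function names are hypothetical, and the outer Ethernet/IP/UDP headers that a real implementation must also build are left out.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field carries a valid value
VXLAN_HEADER_LEN = 8          # bytes


def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header to an inner Ethernet frame (hypothetical helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VNI in the top 24 bits, 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Split a VXLAN payload into (vni, inner_frame) (hypothetical helper)."""
    if len(payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", payload[:VXLAN_HEADER_LEN])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8, payload[VXLAN_HEADER_LEN:]
```

Every frame leaving or entering a VM goes through steps like these on the server CPU in the VXLAN model, whereas in the EVB model the equivalent work sits on the ToR switch.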
Examining economies of scale
In the VXLAN context, when the switch moves into the server (vSwitch), the server needs a bigger, costlier CPU to provide the networking functionality that may be required, such as switching, routing or security. (Think of all the ACLs and encryption that may need to be done on the vSwitch when the ToR becomes opaque to application traffic.)
As each vSwitch handles traffic only for that server, it needs to be provisioned with enough CPU cycles to take care of any traffic spikes that may occur. This means a lot of those CPU cycles stay idle most of the time, resulting in considerable additional costs. (Other interesting questions like scalability and management also need to be tackled but are probably best served with a separate blog post.)
However, if the switching, ACLs and encryption/decryption are all done in a common place, where the ToR handles traffic for all the servers attached to it, we gain the “economies of scale.”
Most current ToRs can already handle the traffic at peak rates. By putting the heavy functionality on the ToR hardware, we share that capacity across servers, gain economies of scale and hence optimize costs. This is exactly what the EVB approach intends to achieve, by passing the load onto the central switch that handles traffic on behalf of multiple servers.
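The provisioning argument can be illustrated with a small back-of-the-envelope model. This is only a sketch under assumptions that are not in the post: every server is either at a baseline load or at a peak load, servers burst independently with the same probability, and capacity is counted in arbitrary units. All function names and numbers are hypothetical. It compares provisioning every server for its own peak with provisioning one shared device for the combined load it would see with a chosen level of confidence.

```python
from math import comb


def dedicated_capacity(servers: int, peak: float) -> float:
    """Total capacity when every server is provisioned for its own peak."""
    return servers * peak


def shared_capacity(servers: int, peak: float, base: float,
                    burst_prob: float, coverage: float = 0.9999) -> float:
    """Capacity a shared device needs to cover the aggregate load.

    Assumes independent bursts: the number of servers bursting at once
    is binomial. Returns the load when the smallest number of servers
    whose cumulative probability reaches `coverage` are at peak.
    """
    cumulative = 0.0
    for bursting in range(servers + 1):
        cumulative += (comb(servers, bursting)
                       * burst_prob ** bursting
                       * (1 - burst_prob) ** (servers - bursting))
        if cumulative >= coverage:
            return bursting * peak + (servers - bursting) * base
    # Rounding kept the sum just below `coverage`: fall back to full peak.
    return servers * peak


if __name__ == "__main__":
    # Hypothetical rack: 40 servers, peak 10 units, baseline 1 unit,
    # each server bursting 5% of the time.
    print("dedicated:", dedicated_capacity(40, 10.0))
    print("shared:   ", shared_capacity(40, 10.0, 1.0, 0.05))
```

The gap between the two numbers is the idle capacity the post is describing. If bursts are strongly correlated across servers, the independence assumption breaks down and the gap shrinks, so treat this as an illustration of the reasoning rather than a sizing tool.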
The EVB idea is nothing new. We did something similar with enterprise WLAN architecture, where we moved from an architecture of many thick, costly APs to a model with a few costly controllers and a lot of light APs.
What to keep in mind: VXLAN vs. EVB
To summarize, with each individual server provisioned for peak traffic, the cost of the server goes up. In data centers where there are considerably more servers than ToRs, the extra cost is multiplied across every server and can add up considerably.
BTW, we should not forget about the ongoing associated power and cooling costs. The VXLAN solution can lead to considerably higher costs when compared to the EVB solution.
This is not to say VXLAN is never useful. There are certain advantages in the VXLAN solutions too. But before rushing in, vendors trying to use the solution should look at the cost they may have to incur with the increased server usage or increase in the number of servers that may result when the VXLAN solution is implemented.
What are your thoughts on the VXLAN topic?
There is a long answer and a short answer, Vishwas.
The short answer is: there is only one thing that abounds in current server architecture (and consequently in data centers) and that we can afford to "waste": CPU cycles.
If what you anticipate for VXLAN is true, it may be the solution for properly using those 12/24/64 cores per socket we are going to see down the road, rather than being their problem. And no, customers wouldn't be using those CPU cycles to support more VMs (IMO).
Massimo Re Ferre' (VMware)
The argument you present is one that is typically presented by hardware vendors trying to abstract network function from the hypervisor. I spent more than a year supporting Cisco's efforts around EVB during my 14 years with the company. Here are some items you may be overlooking:
There are still many vendor fights over 802.1Qbg and 802.1Qbh and whose version of virtual interface identification is going to be used. HP and Cisco are still fighting over those issues. In addition, regardless of the implementation, the control plane that manages those interfaces still sits within a single vendor and still locks customers into that vendor's solution. Also, this is yet again a hardware swap and a complex environment that vendors can use to drive up costs. Your CPU argument is incorrect as well, and if you study Intel optimizations, you can understand how some of this can lower the CPU impact. Your battle is with Intel, Moore's Law, and possibly a Fulcrum chip on an adapter, not an ACL and CPU.
Contrast network virtualization technologies. There are many, and much like the efforts around EVB, vendor positioning remains supreme relative to version.
VXLAN encapsulation is not exciting in itself. There are many VPN-like encapsulations, and GRE/L2/L3 tunneling protocols have been around a long time. They will all be supported, including NV-GRE, by the same vendors listed in the announcement around VXLAN. The fact that they tie functions to hardware multicast creates more complications, and vendors like Cisco are already playing down VXLAN in favor of OTV/LISP/FabricPath (read: expensive boxes).
None of this is focusing on what customers want. So what do I think that is? 1) Choice, 2) Lower Cost, 3) Combine 1 & 2 with very fast delivery, and they want to stop worrying about the "somewhere engineering" they have to do around the physical network. They want a cheap fast fabric, reduced protocols, and to not think about the physical fabric beyond bandwidth and multi-pathing. VXLAN does not address that goal.
Who does... well, check out the link above
As with all technologies, there are good and bad points to it. One thing it does well is push the functionality to the edge of the network, which allows it to scale very nicely with the number of servers deployed. The key IMO when looking at new technology is to pick one that is standards based and does not take you down the proprietary vendor lock-in path that many companies are promoting under the guise of a scalable data center 'fabric'. Providing a scalable fabric can be done with this really cool technology many of you may have heard about: it's called Ethernet.
Both EVB and VXLAN are valid solutions with pros and cons. I think what is much more important is the fact that they are in the standards process and are not a vendor's attempt to push a proprietary solution (e.g. Cisco FabricPath, Juniper Stratus, Brocade 'TRILL').
A quick point of clarification: unlike the title of the story you referenced, "VMware, Cisco stretch virtual LANs across the heavens: VXLAN virtualizes Layer 3 networks.", VXLAN has absolutely no dependencies on Cisco hardware or software, and the draft specification was written by a group of individuals from many companies, not just Cisco. As a matter of fact, Arista Networks won the VMworld award for best hardware virtualization with their tight integration with VMware and VXLAN.
Massimo, interesting points!!! We have discussed this on other forums too already.
A point you need to consider is that in older data centers CPU cycles were wasted (exactly the problem VMware is solving). With the proliferation of virtualization, however, the idea is that CPU capacity is not wasted (unless you are questioning the cloud scaling solutions) and that we can scale out horizontally, rather than just vertically. In such cases, if we used VXLAN with all the bells and whistles required, the number of virtual instances that we could run would be considerably lower (and considering the bursts of traffic once protocols run over the tunnels, the over-provisioning required would be considerable).
In the early 2000s the networking industry faced phenomenal growth, and let me tell you, there were numerous network meltdowns due to CPU usage. I worked with AT&T on those, so I am very well aware of what I am talking about. Have a look at an RFC (http://tools.ietf.org/html/rfc4222) I wrote along with the AT&T folks; if you search for the same set of authors you will find numerous other works.
Let me know if you have any other questions.
Let me know your views on the same.
Regarding vendor lock-in and standardization, we at HP wholly agree. Here are my views on the same: http://h30507.www3.hp.com/t5/HP-Networking/Wired-f
http://www.ietf.org/mail-archive/web/armd/current/
The part I disagree with is the framing: the blog is not about a general purpose CPU versus a Broadcom ASIC (though that can be an argument too). It is more about where to do network functionality, in the first hop or in the server itself, and the cost implications of that choice. I agree EVB has not yet been ratified, though I know it is in the process. However, the point here was a more generic one about where the role of networking best lies.
Your views as always are greatly appreciated.
All great points, Mark. Besides the points you mention, a question a customer needs to ask in such cases is "In what cases does a particular technology not work?" They can then choose the restrictions that they are ready to work with for the rest of the architecture lifecycle.
Regarding your sentiment about proprietary solutions and the issues with them, I totally agree. Here is a link to my thoughts on open innovation and tricks used by vendors:
http://h30507.www3.hp.com/t5/HP-Networking/Wired-f
Frank, BTW I agree with the requirements you state a customer needs: they want ease of management, automation, elasticity to handle success (traffic bursts), commodity costs and of course a lot of flexibility.
However, as is well known, if you make something generic you cannot do a lot of optimizations (no wonder ASIC solutions are cheaper than NPs). I agree it does not make sense to run particular software on only particular servers if they all have the same requirements. However, in the case of networking, where traffic proliferation is the norm, using a general purpose CPU would be considerably costlier.
With that said, I know there is a lot of network functionality which requires a lot of intelligence/tweaking and is already run as appliances or blades on a router chassis. I agree such functionality could be put on general purpose processors.
Another point to make is that some solutions work for certain cases, and the same solution may not be the answer for all problems. As an example, Zynga unveiled their own Z-cloud while moving away from the public one because they couldn’t realize the cost benefits. A lot of other companies are similarly looking hard at the cost factors rather than just accepting that a particular solution is more cost effective.
It is for this purpose I wanted to bring out the cost angle.





