By Chuck Klein
Now it was time for the bloggers to head to the Insight Software lab to see what HP does for managing power and cooling in data centers. John Schmitz, Ute Albert, and Tom Turicchi went over the management stack, from Systems Insight Manager (SIM) software all the way up to Insight Dynamics. This is the software stack that allows system administrators to install, configure, monitor, and plan for BladeSystem chassis in the data center.
Tom then gave a demonstration of how the Data Center Power Control part of Insight Control allows data center managers to plan, monitor, cap and control the amount of energy and cooling used by their infrastructure. Tom set up policies and rules to manage events that may happen in a data center, from utility brown-outs to loss of cooling units. He also went over how you can monitor energy usage for the data center all the way down to each blade. This would allow you to better plan for capacity and decide where to install new blades.
The attendees wanted to know what couldn't be managed, as they thought that list would be much shorter than a review of what the software could do. Tom explained that it presently managed only HP servers, and that scripts could be used to manage or shut down multi-tiered applications, network devices, and storage. Those devices did not have the iLO2 ASIC in them, and that was a foundational element that needed to be there.
Tom also gave a demo of how to set up the event manager to respond to utility policies and help save companies money. He used an example from PG&E in California. That's all for now.
I was going to add some more details to Chuck's post on how a blade server powers on, but I got sidetracked by a brilliant post from Mike Manos of Digital Realty on the real basics of what is going on with power in your datacenter.
What Mike is explaining, far better than I could, is how power gets used up and reserved in your datacenter by breaker sizes, redundancy and the natural tendency of facility management to be conservative when allocating power to servers, and as he says they have good reason to be. If they plug in a device that causes a breaker to trip, taking down multiple servers, it's their butts that are on the line.
He raised a good question about why the faceplate label, the label on the power supply that indicates the max power input, is so high that most facilities managers are comfortable de-rating it by 20% - 30%. Well, the reason is explained in part by my post on how configuration affects power consumption: the power supply is designed to deal with the maximum configured load. The range from a minimum configured load for a 2 socket server (e.g. 1 Low Power CPU, 1 or 2 DIMMs, 1 x SSD drive and no PCI cards) to a maximum configured load (e.g. 2 x 120W or 130W CPUs, 12 or 18 DIMMs, 8 x 15K RPM drives, 3 x PCI cards including a 200W graphics card) is huge, and that’s just one server. The example I use in the Configuration Matters post shows a difference of over 1kW across an enclosure. Talk to any power supply designer and you'll find out that they are just as conservative as any facility manager (and unappreciated) and for pretty much the same reasons. Who gets blamed when you run a high power program like Prime95 or Linpack and the server shuts down because the power supply couldn’t deliver enough juice?
That’s why HP came up with the common slot power supply design for rack mount servers. It allows you to size the power supply for the actual configuration you will be using rather than just stuffing a 1200W power supply in every server.
This has two great consequences:
It reduces the amount of trapped or stranded power by reducing the amount of power that the facility manager has to allocate to a given server.
It increases your power supply efficiency, reducing energy wasted. All power supplies have an efficiency curve that, for servers, has a low efficiency at low outputs and gets to peak efficiency at about 65% load (I originally wrote 35% - 50% and got corrected by one of the engineering team on this. Must remember in future to check my numbers). Remember most servers have redundant power supplies and in the HP case they load share, so the PSU can only ever exceed 50% load in the event of a redundancy failure.
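To put rough numbers on the load-sharing point, here is a minimal sketch. It assumes two supplies that share the load perfectly evenly, ignores conversion losses, and uses a made-up 400 W server draw; the function name is hypothetical and only the 750W and 1200W sizes come from this post.

```python
def psu_load_fraction(server_draw_w, psu_rating_w, working_psus=2):
    """Fraction of its rating each PSU delivers, assuming the working
    supplies split the server's draw evenly (an idealized assumption)."""
    if working_psus < 1:
        raise ValueError("need at least one working PSU")
    return server_draw_w / (working_psus * psu_rating_w)


# Hypothetical server drawing 400 W, tried against two PSU sizes.
draw_w = 400
for rating_w in (750, 1200):
    shared = psu_load_fraction(draw_w, rating_w, working_psus=2)
    alone = psu_load_fraction(draw_w, rating_w, working_psus=1)
    print(f"{rating_w} W PSU: {shared:.0%} of rating each when sharing, "
          f"{alone:.0%} if one supply fails")
```

The smaller supply runs at a higher fraction of its rating for the same server, which is the reason for sizing the PSU to the configuration.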
This does add complexity to your buying decision: now you have to pick the power supply you need based on your configuration. That's why we created the HP Power Advisor to help with that decision. Of course you can still just use a 750W or 1200W PSU for every server if you want to, but you won't be running as efficiently as you could.
One area though where I must respectfully disagree with Mike is in his comments on Power Capping. I agree that it is a technology that has huge potential in the datacenter to allow your facilities team to recover that trapped capacity, but I disagree that it is not ready for prime-time.
HP delivered our first version of power capping in 2007. This was relatively slow acting and was really only good for controlling the average power consumption of a server. This was great if you had a cooling issue in your datacenter and wanted to control the heat output of your servers, as heat is largely related to the average power of the server, but you couldn’t use it to protect circuit breakers.
In November 2008 HP introduced Dynamic Power Capping with circuit breaker protection. This is a hardware based solution that can respond to changes in power consumption in less than 500ms and because it’s a hardware solution it’s operating system and application independent. This is supported on all G6 servers, most blade servers and selected G5 rack-mount servers. When run on an HP Blade Enclosure you gain additional capabilities; the Onboard Administrator can manage the blade server caps to optimize the performance of the enclosure. It will change the blade level power caps so that busier blades get more power and less busy blades will get less power while maintaining the enclosure level power cap so you can protect your breakers.
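The idea of shifting blade-level caps under a fixed enclosure-level cap can be sketched in a few lines. This is an illustration only, not the Onboard Administrator's actual algorithm: the proportional rule, the per-blade floor, the function name and all wattages are assumptions made for the example.

```python
def allocate_blade_caps(enclosure_cap_w, demand_w, floor_w=0.0):
    """Split an enclosure-level cap across blades in proportion to demand.

    Every blade first gets floor_w. The remaining budget is then shared in
    proportion to how far each blade's demand exceeds that floor, so busier
    blades get more. The caps never add up to more than enclosure_cap_w.
    """
    n = len(demand_w)
    if n * floor_w > enclosure_cap_w:
        raise ValueError("enclosure cap is too small for the per-blade floor")
    extra_demand = [max(d - floor_w, 0.0) for d in demand_w]
    total_extra = sum(extra_demand)
    budget = enclosure_cap_w - n * floor_w
    if total_extra <= budget:
        # Enough headroom: nobody needs to be throttled.
        return [floor_w + e for e in extra_demand]
    scale = budget / total_extra
    return [floor_w + e * scale for e in extra_demand]


# Hypothetical enclosure: 1000 W cap, four blades with differing demand.
caps = allocate_blade_caps(1000, [400, 300, 100, 600], floor_w=100)
print([round(c) for c in caps], "total:", round(sum(caps)))
```

Re-running the allocation whenever demand changes keeps the enclosure total at or under the cap while moving headroom toward the busier blades.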
For a demonstration of this on the rack mount servers showing how we deliver circuit breaker protection, see this video with “Professor” Alan Goodrum, and for more information on Dynamic Power Capping go to http://www.hp.com/go/powercapping
I know a lot of you think technology marketing is full of crap <<or insert your own colorful descriptor>>. I know we can sound that way. It's one of my pet-peeves too.
I also know that some of you may hear a term like "Thermal Logic" and your "marketing-crap" sirens start to go off. So today I wanted to take a moment to explain in plain English the concept of Thermal Logic technology and to show you that it's not a make-believe idea, but a practical approach that HP is taking to address your bigger power and cooling issues in the data center.
It's a very simple idea really. Make the data center more energy efficient, simply by making it more intelligent.
That's it. No green-ovation, grandiose claims or a highbrow vision, just a statement of how the power and cooling problem must, and will, be addressed by HP.
Here's where that came from. Back in 2003-2006 (even earlier in the mind of Chandrakant Patel in HP Labs), when a lot of our current power and cooling technology was being created in the lab, intelligence was a common theme. Whether it was smarter fans, smarter power supplies, smarter drives, smarter CRACs, smarter reporting and metering, or smarter whatever, putting intelligence behind the problem of power usage came up again and again.
We described the problem as "you can't manage what you can't measure". If every component, system and data center understands its need for energy as well as the total supply of energy, it could take action to conserve every watt of power and every gram of cold air. What we find is that in most cases, every component, system and data center allocates more energy than it really needs and often wastes energy that isn't being used to do effective work.
Now, back to today.
Every, I repeat EVERY, technology vendor in the world today is building systems with more efficient parts. Big deal. This is basic 'bread and water' today and quite honestly, if your vendor isn't doing everything they can to squeeze every watt out of the basic components, you need to look elsewhere. HP, IBM and Dell all have access to the latest chips, drives, DIMMs, etc. I imagine Cisco is even figuring out who to call these days.
Every vendor is also able to show power savings with virtual machines. Big deal. Taking applications off of a bunch of outdated power hog servers and putting them on fewer, more efficient ones saves a bunch of energy. Again, is there a vendor that can't do that too?
99% of the claims that vendors make to differentiate themselves and claim "power efficiency" superiority are based on these 2 concepts: the latest systems with the latest, most efficient components versus last year's model, and comparisons based on using virtual machines. Even worse, it's done with a straight face and backed up with claims based on stacked benchmarks. Comparing today's lab queen design versus the last generation just isn't helpful to anyone.
The data center power problem is so much bigger than a benchmark at any one point in time. Power consumption is happening every second of every day over years. I know that's a lot of variables to consider: temperature, humidity, workload, usage and growth. That's why intelligence is so darn important. It's a big, complex problem. I only wish I had a magic benchmark with a magic number that could prove my claim definitively in every circumstance. I don't. Nor does anyone else.
Only HP, I repeat ONLY HP, is inventing energy-aware components, systems and data centers. And yes, we call it Thermal Logic technology. Last week, the ProLiant team announced their next generation servers and talked a lot about the concept of a 'sea of sensors'. Those sensors are the starting point to collect the data needed.
Here's another example to make this real for you: Dynamic Power Capping. I shared with you in the past a demonstration. Now that you're getting our unique point of view, I'd like to share with you the technology details behind it in this new whitepaper "HP Power Capping and HP Dynamic Power Capping for ProLiant servers"
Read this and you'll quickly see that Thermal Logic isn't IT marketing crap. It's a real answer to the real challenges every datacenter in the world is facing - the rising demand and cost for power and cooling.
Every time a competitor introduces a new product, we can't help but notice they suddenly get very interested in what HP is blogging during the weeks prior to their announcement. Then when the competitor announces, the story is very self-congratulatory: "we've figured out what the problem is with existing server and blade architectures". The implication is that volume adoption of blades is somehow being constrained by the lack of the very thing they have, and that everyone else is really stupid.
HP BladeSystem growth has hardly been constrained, with quarterly growth rates of 60% or 80% and over a million BladeSystem servers sold. So I have to wonder if maybe we already have figured out what many customers want: to save time, power, and money in an integrated infrastructure that is easy to use, simple to change, and can run nearly any workload.
Someone asked me today "will your strategy change?" I guess given the success we've had, we'll keep focusing on the big problems of customers - time, cost, change and energy. It sounds boring, it doesn't get a lot of buzz and twitter traffic, but it's why customers are moving to blade architectures.
Our platform was built and proven in a step-by-step approach: BladeSystem c-Class, Thermal Logic, Virtual Connect, Insight Dynamics, etc. Rather than proclaim at each step that we've solved all the industry's problems or have sparked a social movement in computing, we'll continue to focus on doing our job to provide solutions that simply work for customers and tackle their biggest business and data center issues.
NOTE: This blog post will self-destruct on December 31, 2009 should anyone feel the need to analyze our prognosticating skills in 2010.
1. Power as a resource is HOT. Power as a commodity is NOT. If you knew your old refrigerator in the garage was sucking fifty bucks in juice a month, you’d pitch it or replace it, right? The problem is, you have no idea how much its power costs you. In 2010, you’ll never think about power in the same way. It’s no longer just a spigot of electrons with a bill that goes to the suits upstairs. Power is a precious resource to your data center and a big part of your budget that stands in the way of growth in 2010. “You can’t manage what you can’t measure”, so 2009 is the time to start measuring your power usage in detail so you understand what you need, what you have and what you’re wasting.
2. TCO is HOT. TCO is NOT. Huh? TCO will be reprioritized in 2009. Take Cost Out is the new TCO. Okay, we don’t want to overplay this one. Of course you want to be as efficient as possible down the road once the 2009 storm passes. However, if you’re ever going to get there, you have to take cost out now. That’s going to mean you have to make tough choices and some big leaps forward in order to put in place an infrastructure that can deliver savings today and be ready for tomorrow. We believe very few stones will be left unturned in 2009 as businesses scour their data centers to find hidden pockets of cost – cables, steps in processes, HA, fibre channel, aging servers, wasted watts, unused A/C – nothing can hide from the new TCO. Those trying to limp through 2009 by patching up some aging technologies will find themselves in a world of hurt in 2010.
3. Knowing is HOT. Guessing is NOT. Whether you’re talking capacity planning for your apps and virtual machines, the power and breaker size you need per rack or the storage for your data explosion, using the old ‘rules of thumb’ for quarterly budgets isn't going to cut it in 2009. Getting better data out of every circuit board that you can then use to take informed action will be critical to justify growth and to help you squeeze the most cost out from your consolidation projects.
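Putting a dollar figure on a measured wattage is simple arithmetic once you have the measurement. A minimal sketch follows; the 300 W average draw and the $0.10 per kWh price are assumed values for illustration, not figures from this post, and the function name is hypothetical.

```python
def monthly_energy_cost(avg_watts, price_per_kwh, hours=730):
    """Cost of running a device around the clock for one month.

    730 hours is roughly one twelfth of a year (8760 / 12).
    """
    kwh = avg_watts / 1000 * hours
    return kwh * price_per_kwh


# Hypothetical server averaging 300 W at an assumed $0.10 per kWh.
cost = monthly_energy_cost(300, 0.10)
print(f"${cost:.2f} per month")
```

Swap in your own measured average and your utility rate. The measured average is the number most shops don't have yet.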
4. Packaged infrastructure is HOT. Piecemeal infrastructure is NOT. Sorry IBM, the mainframe isn’t part of this one. We’re talking about pooled and shared infrastructure based on industry standard components. We think you’ll see more packaged infrastructure solutions tailor-made to different applications and environments whether it’s a rack at a time for mega clusters or a unified communication platform for a small branch office. You already see it with ExDS, BladeSystems, PODs, NeoView, NonStop blades; the trend is probably already here but it’s going to really take off in 2009. The idea is simplified delivery, integration and expansion. You just won’t have the time in 2009 to try and figure out how to get widget A to talk to widget B.
5. Unified is HOT. Siloed is NOT. Whether it’s Cisco, Microsoft, IBM or us, the vision of unified infrastructure is clearly our shared goal. We just have different names for it. The only question is how do we make progress in 2009? We know one thing for sure: you can’t get there by forcing the perspective of one silo on another. Network packets won’t unify your infrastructure any more than processor architectures will. The only path to the unification you seek is from the top down, starting at business and application services. Understanding and managing in a unified way provides a different perspective on what tomorrow’s infrastructure looks like. When you recognize that, you see that the center of the universe can’t be found inside the network, the storage or the servers. It’s at the business level.
6. Performance per sq ft, per dollar per watt is HOT. Moore’s Law is NOT. The days of chasing the tail of processor performance are quaint, but the new global economic reality will create a whole new class of benchmarks to help you better compare your choices. Whether your issue is space, power or cost, you’ll be empowered in 2009 with a whole new set of much more relevant benchmarks to see where you stand. SPECPower, VMmark and others are just scratching the surface of what will be a renaissance in data center metrics.
7. DAS is HOT. SAN is NOT. Okay, okay. The SAN isn’t going anywhere. But there will be a lot more choices in 2009 that flip the economics of storage on their head and put server admins in more control of their storage needs. Last night I was browsing for some home storage backup and came across a deal for a 1TB home back-up for $149. Buying your first TB in a traditional SAN will set you back $30k to $50k. (Calvin Z's going to kill me). Basically, storage is delivered by drives. Shouldn’t you be able to pile up all the drives you have, DAS or otherwise, and carve up that capacity how you see fit? Check out some of the cool stuff we can do now with LeftHand’s innovations and you’ll see what we mean.
8. Virtual infrastructure is HOT. Virtual machines are NOT. Or said another way, “Virtualization is dead! Long live virtualization”. 2009 will shift priorities from optimizing server capacity with virtual machines to looking for new opportunities at the server edge to extend the savings and consolidation to the network, management and storage realms. Virtual infrastructure will be the new mantra and managing it, coordinating it and aligning it to the business will be the key. In 2009 more people will think differently about infrastructure, as a service and as something that you simply carve up, allocating capacity based on your demands. It aligns to you, not the other way around. With this in place, automation starts getting real too!
9. Dynamic Core Utilization is HOT. Multi-core apps running one application is NOT. This one almost fell to runner up status simply because it was a mouthful and a little geeky, but we needed 9 things. Seriously though, the flexibility to adjust core utilization to match a workload has been a long time coming for the x86 world. As 2009 starts to move us beyond quad-core processors, it makes no sense to continue the old, tried-and-true practice of one app per server.
The runners up:
Dynamic Power Capping is HOT. Power Face Plates are NOT. This one can be summarized above with “Knowing vs. Guessing”. The low hanging fruit for reducing power consumption is just about gone. It’s going to take more intelligence and coordination at the rack, row and datacenter level. The ability to reclaim data center capacity simply by allocating only the power you need makes too much sense to not make our list.
Converged Fabrics are HOT. Silo’d I/O traffic is NOT. Talk about an oldie but a goody, this one might start to make it over the hump this year. Aggregation of I/O to a single physical layer ushers in a whole new opportunity to simplify and cut big costs. See most of the bigger trends we mentioned above and you see that this one is key.
Industry standard gear for Telco is HOT. Telco-only gear is NOT. Ah, one of the last bastions of proprietary gear. This one has been predicted so often, it fell to runner-up. But the slow march continues and we think 2009 will speed things up a lot as more telcos start to see the benefits of gear like blades and standard rack servers on their balance sheet.
Battery back-up at the rack is HOT. UPS rooms are NOT. Nice idea that just doesn’t make as much sense as folks thought. Hogs of data center floor space, budget and that nasty little 10% loss in efficiency make this one at least worthy of the runner-up list.
Blades are HOT. Mainframes are NOT. Give me a break. We are the HP Blade Team. It just wouldn’t be an IT hot/not list without one little jab at poor Big Blue. ;-)
We’d love to hear from you. Tell us what you think about this list and share what’s on your HOT and NOT to-do list for 2009.