Mission Critical Computing Blog
Your source for the latest insights on HP Integrity, mission critical computing, and other relevant server and technology topics from the BCS team.

Measuring Uptime

How do you measure uptime? Is it becoming more important in your environment? Is downtime costing your company more or less today than it did a few years ago?


For many customers, the amount of downtime they experience is increasing, often because new systems are more complex. The cost of downtime is also increasing, usually because the business relies more heavily on IT systems. In short, for many customers, downtime is a bigger issue than it was in the past.


 


Coming from a business that spends a lot of time working with mission-critical customers, I've seen some interesting changes over the past few years, especially where uptime measurement is concerned.


 


I've seen that with virtualization, many workloads that each have lower uptime requirements are consolidated onto fewer platforms. Often, this means that the uptime requirements for the platform are actually higher than those of any individual workload, because a single platform outage now takes down every workload it hosts. However, virtualization also provides benefits, such as the ability to move workloads online, which allows maintenance to be completed without bringing down the application. That is a great way to reduce planned downtime.


 


I've also noticed that as systems get more complex and vendors build more availability into the applications, the overall uptime of the application increases. However, the uptime of an individual node in a cluster may not be as high as that of a single-node deployment of the same application. Why? The cluster delivers higher overall availability, but its added complexity can sacrifice the ease of management, configuration, and maintenance available in a single-node version, resulting in more downtime for each node.
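
As a rough illustration of that trade-off, here is a minimal Python sketch. It assumes node failures are independent and that the application is available whenever at least one node is up; both are simplifying assumptions, and the availability figures are hypothetical, not measurements.

```python
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one node is up.

    Assumes node outages are independent, which real clusters
    (shared storage, shared network, shared operators) rarely satisfy.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


# Hypothetical figures, chosen only to illustrate the point.
single_node = 0.999      # standalone, easy-to-manage deployment
clustered_node = 0.99    # each cluster node is down more often

two_node_cluster = cluster_availability(clustered_node, 2)

print(f"Standalone node:  {single_node:.4%}")
print(f"Each cluster node: {clustered_node:.4%}")
print(f"Two-node cluster:  {two_node_cluster:.4%}")
```

Under these assumptions, two nodes that are each only 99% available still give the application roughly 99.99% availability, which is better than the 99.9% standalone node even though every individual node is worse.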


 


So, how do you measure uptime in your environment? Do you measure it based on the uptime of the server? Does that change if you can move a virtual machine workload from one system to another to handle planned downtime?


 


Do you measure uptime based on OS availability? I can move my virtualized workload from one server to another, and the OS stays running. This is wonderful, and definitely helps reduce planned downtime. But if you are running a cluster of virtual machines, and the clustering only tracks whether a server is running (for unplanned downtime) or relies on the administrator to manually start an online migration (for planned downtime), it is hard to get OS-level or application-level availability measurements.


 


Do you measure uptime based on application availability? This is easy in a clustered environment when the cluster understands the applications, such as with HP Serviceguard. While this works well for mission-critical applications, it does take some effort to get that level of application integration. And then, how do you measure uptime on a multi-node solution, such as Oracle RAC? Do you measure the uptime of each node, any of the nodes, or all of the nodes?
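
To show how much the answer can change the number, here is a small Python sketch that computes all three views from the same outage log. The function names, the minute-level resolution, and the outage data are hypothetical; it does not reflect how Serviceguard, Oracle RAC, or any other product reports availability.

```python
def is_down(outages, minute):
    """True if the minute falls inside any (start, end) outage window."""
    return any(start <= minute < end for start, end in outages)


def uptime_views(outages_by_node, period_minutes):
    """Return per-node, any-node-up, and all-nodes-up uptime fractions."""
    up_minutes = {node: 0 for node in outages_by_node}
    any_up = 0
    all_up = 0
    for minute in range(period_minutes):
        status = {
            node: not is_down(outages, minute)
            for node, outages in outages_by_node.items()
        }
        for node, up in status.items():
            up_minutes[node] += up
        any_up += any(status.values())
        all_up += all(status.values())
    return {
        "per_node": {n: m / period_minutes for n, m in up_minutes.items()},
        "any_node": any_up / period_minutes,
        "all_nodes": all_up / period_minutes,
    }


# Hypothetical outage windows, in minutes from the start of the period.
outages = {
    "node1": [(100, 200)],
    "node2": [(150, 250)],
}
views = uptime_views(outages, period_minutes=1000)
print(views)
```

With these made-up windows, each node is up 90% of the time, at least one node is up 95% of the time, and all nodes are up only 85% of the time. The same log supports three different uptime claims.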


 


So, how do you measure the uptime of your environment, or do you use different measurements for different systems or parts of your environment? How do you navigate vendor uptime claims, especially since different solutions may offer similar claims (e.g., 99.9% uptime) but often measure different things (e.g., application uptime versus physical server or virtual machine uptime)? Do your uptime measurements include planned downtime for maintenance, or just unplanned downtime? Comments or thoughts on how this plays out in the real world are always appreciated.
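
For a sense of scale, the Python sketch below converts an uptime percentage into a yearly downtime budget and shows how counting planned maintenance changes the result. It assumes a 365-day year, and the downtime figures are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Yearly downtime budget implied by an uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


def measured_uptime_percent(unplanned_minutes: float,
                            planned_minutes: float = 0,
                            include_planned: bool = True) -> float:
    """Yearly uptime percentage, with or without planned downtime."""
    downtime = unplanned_minutes
    if include_planned:
        downtime += planned_minutes
    return 100 * (1 - downtime / MINUTES_PER_YEAR)


# Hypothetical year: 300 minutes unplanned, 480 minutes planned maintenance.
budget = allowed_downtime_minutes(99.9)
unplanned_only = measured_uptime_percent(300, 480, include_planned=False)
with_planned = measured_uptime_percent(300, 480, include_planned=True)

print(f"99.9% budget:      {budget:.1f} minutes per year")
print(f"Unplanned only:    {unplanned_only:.3f}%")
print(f"Including planned: {with_planned:.3f}%")
```

A 99.9% claim allows about 525.6 minutes, or just under nine hours, of downtime per year. In this example the same system meets 99.9% if only unplanned downtime counts and misses it once planned maintenance is included.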


 


Jacob

