Hyperscale Computing Blog
Learn more about relevant scale-out computing topics, including high-performance computing solutions from the data center to the cloud.

The new HP Apollo 6000 Server with more performance for your budget

Guest blog written by Stephen Howard, HP Servers, Hyperscale Business

The Apollo 6000 System delivers more performance for your budget, combining performance, density, and efficiency to provide rack-scale IT services at a lower TCO.

Interconnect technologies: InfiniBand, 1G Ethernet, 10G Ethernet

Interconnect is an integral part of any scale-out solution. The question often asked is something like "which interconnect should I use for my next scale-out project?" Well, a simple answer is "it depends". For many applications in the high performance computing (HPC) area, performance depends heavily on the latency and bandwidth of the interconnect fabric -- the majority of MPI-based parallel applications fall into this category. InfiniBand has proven to be the best choice for these applications. In the latest Top500 list (June 2009), I counted 151 entries with InfiniBand as the interconnect, up from 124 six months ago. Another interesting observation is which server platforms were used to build these clusters -- the HP BladeSystem c-Class is the most popular platform with 37 entries, followed by IBM's pSeries with 17 entries. In the last couple of years, we have also seen more and more commercial applications where InfiniBand is used as the transport interconnect to meet low-latency and scalability requirements; examples include the HP Exadata Storage Server, market data systems deployed in the financial services industry, and rendering applications.
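To make the latency and bandwidth point concrete, here is a minimal MPI ping-pong sketch of the kind typically used to characterize an interconnect. It assumes Python with the mpi4py package and an MPI launcher; the message sizes and repetition counts are arbitrary illustrative choices, not measurements from any system mentioned above.

```python
# Minimal MPI ping-pong sketch to gauge interconnect latency and bandwidth.
# Assumes mpi4py is installed and an MPI launcher is available, e.g.:
#   mpirun -np 2 python pingpong.py
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                              # exactly two ranks expected

for size in (8, 1024, 1024 * 1024):          # message sizes in bytes
    buf = bytearray(size)
    reps = 1000 if size <= 1024 else 50      # fewer reps for large messages
    comm.Barrier()
    start = time.perf_counter()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=peer)
            comm.Recv(buf, source=peer)
        else:
            comm.Recv(buf, source=peer)
            comm.Send(buf, dest=peer)
    elapsed = time.perf_counter() - start
    if rank == 0:
        one_way = elapsed / reps / 2                 # one-way time per message
        bw = size / one_way / 1e6                    # rough MB/s
        print(f"{size:>8} bytes: {one_way*1e6:8.1f} us one-way, ~{bw:8.1f} MB/s")
```

Running it over Gigabit Ethernet and then over InfiniBand on the same pair of nodes makes the latency gap obvious in a few seconds.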


Not every application needs InfiniBand. There are workloads, including some HPC workloads, whose performance is not sensitive to the latency of the underlying interconnect. For these applications, Gigabit Ethernet provides a very cost-effective solution. However, we see increased bandwidth requirements in many of these cases, partially driven by the adoption of server virtualization, or simply to match the higher I/O demands that come with the significant performance increases of multi-core servers. 10 Gigabit Ethernet is becoming a natural choice to meet this higher bandwidth requirement while allowing people to stay with a technology they are already familiar with.


So, to summarize, the choice of interconnect should be based on the performance requirements of your applications: for the majority of MPI parallel applications, as well as other workloads that require low latency and high bandwidth, InfiniBand is becoming the interconnect of choice, and it is also a technology that is very easy to use and manage. For the rest, Gigabit or 10 Gigabit Ethernet remains a cost-effective option.


 

What is your monitoring interval?

While a lot of systems do in fact run some sort of automated monitoring tool such as sar, Ganglia, or Cacti, I'm always surprised to hear that the monitoring intervals being used are on the order of 5-10 minutes. The positive note is that in a relatively small number of data samples you can see what your system has been doing over the course of days, weeks or even months. In fact, if you're a sar user whose default interval is 10 minutes, a day's activity is shown in 144 samples. However, what are you actually seeing? If your network activity shows occasional blips at 50% load, how would you know it wasn't running at 100% for 5 minutes and 0% for the other 5? When showing 10-minute averages you're certainly seeing some form of activity, and if you tend to run long workloads perhaps the numbers are representative of what is typically happening, but what value is that data when something goes wrong?
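As a toy illustration of how much a 10-minute average can hide, here is a small Python sketch (my own example, not the output of any particular tool) that builds an hour of per-second utilization samples containing a single 5-minute burst and then reports 10-minute averages:

```python
# Toy illustration of how 10-minute averages flatten short bursts.
# One hour of per-second "utilization" samples: a 5-minute burst at 100%,
# everything else idle.
samples = [100.0 if 600 <= t < 900 else 0.0 for t in range(3600)]

interval = 600  # seconds per reporting bucket (10 minutes)
for start in range(0, len(samples), interval):
    bucket = samples[start:start + interval]
    print(f"{start//60:3d}-{(start+interval)//60:3d} min: "
          f"avg {sum(bucket)/len(bucket):5.1f}%  peak {max(bucket):5.1f}%")

# The bucket containing the burst reports an average of 50%, even though
# the link was saturated for five solid minutes.
```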


I'd claim the only solution, assuming of course that you do want to be able to diagnose performance problems, is to collect your data at a much finer level of granularity, perhaps in the 1-10 second range. Admittedly you have the same problem at any interval, in that any events shorter than the interval can get averaged away, but your chances are significantly better than in the 10-minute case.
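For a sense of what 1-second collection looks like in practice, here is a bare-bones Python sketch that reads byte counters from /proc/net/dev each second and prints transfer rates. The interface name and output format are assumptions for illustration; real collectors track far more than one pair of counters.

```python
# Minimal sketch of fine-grained (1-second) network monitoring on Linux,
# reading byte counters from /proc/net/dev and printing per-second rates.
# The interface name "eth0" is an assumption; adjust for your system.
import time

def read_bytes(iface="eth0"):
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(iface + ":"):
                fields = line.split(":", 1)[1].split()
                return int(fields[0]), int(fields[8])   # rx_bytes, tx_bytes
    raise ValueError(f"interface {iface} not found")

prev_rx, prev_tx = read_bytes()
while True:
    time.sleep(1)
    rx, tx = read_bytes()
    print(f"rx {(rx - prev_rx)/1e6:8.2f} MB/s   tx {(tx - prev_tx)/1e6:8.2f} MB/s")
    prev_rx, prev_tx = rx, tx
```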


However, this immediately raises two questions, the first of which is "how much is this going to cost in terms of overhead?", and my answer is the classic response of "it depends". But seriously, in most cases you won't even notice it. Many monitoring tools use less than 1% of a CPU, and some even less than 0.1%, but to be sure this isn't a problem you can always run some applications with and without monitoring enabled and see what the effect is. If you're having intermittent problems, perhaps you enable even finer-grained monitoring until you can track them down.
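One way to run that with-and-without comparison, sketched in Python under the assumption of a Linux /proc filesystem: time the same CPU-bound toy workload while a once-per-second sampler thread is, or is not, running in the background. Both the workload and the sampler here are placeholders, not any particular monitoring tool.

```python
# Rough sketch of the "run it with and without monitoring" overhead check.
import threading, time

def sampler(stop, interval=1.0):
    # Cheap stand-in for a real collector: touch /proc once per interval.
    while not stop.is_set():
        with open("/proc/stat") as f:
            f.readline()
        stop.wait(interval)

def workload():
    # Placeholder CPU-bound job; substitute something representative.
    total = 0
    for i in range(20_000_000):
        total += i * i
    return total

def timed(with_monitor):
    stop = threading.Event()
    if with_monitor:
        threading.Thread(target=sampler, args=(stop,), daemon=True).start()
    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    stop.set()
    return elapsed

base = timed(False)
mon = timed(True)
print(f"without monitoring: {base:.2f}s  with: {mon:.2f}s  "
      f"overhead: {100*(mon-base)/base:+.1f}%")
```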


The second question brings us back to a previous post I wrote about cluster vs. local monitoring. When you monitor at this finer granularity on a large cluster, you're going to overwhelm any centralized data collector, and the only solution I can think of is to collect less data centrally. In other words, can you configure your environment such that not all the data collected on an individual system is reported to the centralized monitor? If so, you can continue to centrally monitor your cluster, but when you need more detailed data you can simply go to the individual systems on which it is being collected.
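Here is a rough Python sketch of that split, purely illustrative and not how any particular tool implements it: full-resolution samples are appended to a local log, while a once-a-minute summary is pushed to a central collector over UDP. The address, file path, and the choice of load average as the metric are all assumptions.

```python
# Sketch of "keep the detail local, send a summary upstream".
import os, socket, time

CENTRAL = ("127.0.0.1", 9999)   # stand-in for the central collector's address
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

window = []
with open("/var/tmp/node-detail.log", "a") as detail:
    while True:
        load1 = os.getloadavg()[0]                    # 1-minute load average
        detail.write(f"{time.time():.0f} {load1:.2f}\n")   # full detail, local only
        detail.flush()
        window.append(load1)
        if len(window) >= 60:                         # one summary per minute
            summary = (f"{socket.gethostname()} "
                       f"avg={sum(window)/len(window):.2f} max={max(window):.2f}")
            sock.sendto(summary.encode(), CENTRAL)    # small record goes central
            window.clear()
        time.sleep(1)
```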


To my knowledge most monitoring environments can't do this! Either they collect the data locally or they collect it centrally, but not both. However, there is a solution, at the core of which is an open source tool I wrote a number of years ago called collectl. Its primary focus is fine-grained local data collection in the 0.1% overhead range, but it can also send a subset of that data over the network to a centralized monitoring system, and in fact that is exactly what is done with HP's Cluster Management Utility (CMU). So with CMU you can centrally monitor multi-thousand-node clusters and still get at the details when you need them. More on CMU in a future post.


As Alanna said in a previous post of hers, I'll be at the HP Technical Forum next week doing a couple of presentations on collectl and will also be doing some demos of CMU in one of our booths.  If you'll be there and this topic interests you, be sure to attend one of my presentations or just track me down at the booth and we can discuss this in as much detail as you like.


-mark
