Eye on Blades Blog: Trends in Infrastructure
Get HP BladeSystem news, upcoming event information, technology trends, and product information to stay up to date with what is happening in the world of blades.

Redundant ROMs far, far away

Situation: A glitch is causing unexpected system reboots. After much testing, you identify the problem.  A firmware patch should prevent it from recurring.   Luckily, you've already got the tools that will let you remotely "flash", or update, your firmware.


Complication: If your system glitches while you're remotely updating firmware, you won't be able to connect to it remotely anymore.  Oh...and your system is on another planet.


That's the firmware problem facing a NASA team right now.  The Mars Reconnaissance Orbiter has seen unexpected reboots, and engineers believe they've got a patch that could fix it.  However, they're worried that a mistake or unexpected reboot during the patch process might leave the satellite so confused it will stop transmitting its data.



ProLiant engineers have actually grappled with this very same problem, though a little closer to home.


Before I explain that, an aside: there's a cool connection between HP engineering and Mars spacecraft. Lossless compression technology developed by HP Labs and used in HP's RGS software for workstations was used by NASA for transferring images from the Spirit Rover on Mars.


Here are two ways that ProLiant blades -- including the RGS-using ProLiant WS460c G6 workstation blade -- protect you from this "botched update" scenario:



1. Redundant ROMs - There are two ROM images stored on each blade.  One is a "primary" image, used to boot. The other is a "backup" image.  Here's a screenshot from RBSU showing the version numbers (dates, actually) of the primary and backup images on one blade.


 


When you flash a ROM, it actually overwrites the backup image, and then makes this image the new primary.  The original primary becomes the new backup. This hedges against both a new image being bad, and against the flash process failing to complete or corrupting the image.  (One reason a flash might fail: total loss of power during a flash.)
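To make the swap concrete, here's a toy model of the idea in Python. It is only a sketch: the class and method names (RomSlot, RedundantRom, flash, fail_after) are made up for illustration, and the CRC check is an assumption standing in for whatever validation the real ROM does. It is not HP's firmware code.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class RomSlot:
    data: bytes = b""
    checksum: int = 0

    def is_valid(self) -> bool:
        # A slot is usable only if it holds data whose checksum matches.
        return bool(self.data) and zlib.crc32(self.data) == self.checksum


class RedundantRom:
    """Toy two-image ROM (hypothetical model, not the real implementation)."""

    def __init__(self, initial_image: bytes):
        self.slots = [RomSlot(initial_image, zlib.crc32(initial_image)), RomSlot()]
        self.primary = 0  # index of the slot used to boot

    @property
    def backup(self) -> int:
        return 1 - self.primary

    def flash(self, new_image: bytes, fail_after: Optional[int] = None) -> bool:
        """Write new_image over the backup slot, then promote it to primary.

        fail_after simulates losing power after that many bytes were written.
        """
        target = self.slots[self.backup]
        target.data = new_image if fail_after is None else new_image[:fail_after]
        target.checksum = zlib.crc32(new_image)
        if not target.is_valid():
            return False  # the primary was never touched, so the system still boots
        self.primary = self.backup  # the old primary is now the backup
        return True
```

In this model an interrupted flash only ever damages the backup slot, which is the whole point: the image you booted from stays intact until the new one has been verified.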


By the way, if both ROM images are valid, you can select which one you want to use at boot  time from RBSU.  Here's a short video showing that:



There's also a manual way, described in the Maintenance and Service Guide, to force a boot to the redundant image by setting some physical DIP switches inside the blade itself.


2. Bootblock - There's actually a third, non-flashable section of a ProLiant ROM.  This "boot block" section includes a disaster-recovery feature that lets the server flash a new ROM image, even if both of the existing ROM images are corrupted.
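Putting the two protections together, the decision at power-on can be pictured roughly like this. The function name and the exact fallback order are my assumptions for illustration; the real boot block logic isn't spelled out here.

```python
def choose_boot_path(primary_valid: bool, backup_valid: bool) -> str:
    """Pick what to run at power-on (hypothetical ordering, for illustration)."""
    if primary_valid:
        return "primary"
    if backup_valid:
        return "backup"
    # Neither flashable image is usable, so only the non-flashable
    # boot block is left; it can load a fresh ROM image.
    return "boot-block recovery"
```

Because the boot block can't be flashed, a bad update can't damage it, so the last branch is always available.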


BIOS & firmware updates are often used to fix glitches, but HP (and presumably NASA) also uses them to add new features and enhancements. We post release notes that describe all the fixes and enhancements added to each version.  Here's a recent one added to the BL460c G6.


For example, one enhancement in this latest version is a "boot override menu" (see screenshot below), displayed by hitting F11 during boot. It lets you specify a "one time" override of the RBSU boot order, so you can boot to some other device.  After booting that one time, the system will fall back to its original boot order settings.
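The "one time" behavior is easy to picture as a tiny state machine. This is a sketch with invented names (BootConfig, set_one_time_override, next_boot_device), not anything taken from the ROM itself:

```python
from typing import List, Optional


class BootConfig:
    """Toy model of a persistent boot order plus a one-time override."""

    def __init__(self, boot_order: List[str]):
        self.boot_order = list(boot_order)
        self._one_time: Optional[str] = None

    def set_one_time_override(self, device: str) -> None:
        # Stands in for picking a device from the F11 menu.
        self._one_time = device

    def next_boot_device(self) -> str:
        if self._one_time is not None:
            # Use the override once, then forget it.
            device, self._one_time = self._one_time, None
            return device
        return self.boot_order[0]
```

The saved boot order is never modified; the override is consumed by the first boot that uses it.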


 

Tips to Reduce Processor Latency


For some financial and data-acquisition applications, it's more important to finish one calculation super-fast than a bunch of calculations slightly slower.  There's a group of HPC apps with a similar requirement:  two identical instructions need to have precisely the same latency, every time they're executed.
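If you want to see which of these two requirements you're actually missing, time each call on its own instead of timing the whole batch. Here's a minimal sketch in Python; the helper names (measure, latency_stats) are made up, and Python itself adds noise, so treat it as a way to show the metrics rather than a serious benchmark.

```python
import time
from typing import Callable, Dict, List


def latency_stats(samples_ns: List[int]) -> Dict[str, float]:
    """Summarize per-call latencies given in nanoseconds."""
    return {
        "mean": sum(samples_ns) / len(samples_ns),
        "worst": max(samples_ns),
        # Spread between the fastest and slowest call.
        "jitter": max(samples_ns) - min(samples_ns),
    }


def measure(fn: Callable[[], object], runs: int = 1000) -> List[int]:
    """Time each call to fn individually and return the samples."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples
```

A throughput-oriented app mostly cares about the mean; the first scenario above cares about the worst case, and the second cares about jitter.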


Real-Time Operating Systems (RTOS) can help address these two scenarios.  These OSes address latency in a number of ways; for example, by ditching the device polling and background cleanup tasks that standard OSes normally do.


However, some features of modern industry-standard servers can hurt low- and consistent-latency computing.  For example, low-power processor modes might save power, but any such processor throttling can increase latency.  Another example would be management routines that consume CPU cycles, such as routines built into the BIOS of ProLiant server blades that occasionally use CPU cycles to track resource utilization and monitor correctable memory errors in the memory controller.


If you face these situations and have already gone with an RTOS, HP's got some settings in our RBSU (ROM-Based Setup Utility) that can offer additional help.


Load up RBSU (accessed by pressing F9 while the system is booting), and change the following settings:
1) Set "ProLiant Power Regulator Mode" to "Static High Mode".
2) Disable processor C-state support.
3) If you are running an application that is single-threaded, set "Processor Core Disable" to "One Core Enabled".
4) On Intel Xeon 5500-based servers (like the BL460c G6), disable "QPI Power Management", and ensure "Intel Turbo Boost Technology" is set to "Enabled".
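If you manage more than a handful of blades, it can help to write the list above down as a checklist you can compare against what you've recorded for each server. The sketch below does only that. The function names are invented, the "Processor C-State Support" key is my paraphrase of step 2 rather than the exact RBSU label, and the current-settings dictionary is something you'd fill in by hand; nothing here talks to RBSU.

```python
from typing import Dict, List, Tuple


def recommended_settings(single_threaded: bool, xeon_5500: bool) -> Dict[str, str]:
    """Build the low-latency checklist from the four steps above."""
    settings = {
        "ProLiant Power Regulator Mode": "Static High Mode",
        "Processor C-State Support": "Disabled",  # paraphrased label (assumption)
    }
    if single_threaded:
        settings["Processor Core Disable"] = "One Core Enabled"
    if xeon_5500:
        settings["QPI Power Management"] = "Disabled"
        settings["Intel Turbo Boost Technology"] = "Enabled"
    return settings


def find_mismatches(
    current: Dict[str, str], recommended: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (setting, current value, recommended value) for each difference."""
    return [
        (name, current.get(name, "<unset>"), wanted)
        for name, wanted in recommended.items()
        if current.get(name) != wanted
    ]
```

For example, a BL460c G6 running a single-threaded app would use recommended_settings(single_threaded=True, xeon_5500=True) and get all five entries back.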


If you want to go even further, there's a way to disable some of those periodic BIOS checks on processor utilization and correctable errors. For most G5 and G6 server blades, HP has a tool called conrep (provided with the SmartStart Scripting Toolkit) that lets you control these settings.


In the BL280c G6, BL460c G6, and BL490c G6, you can also disable those things straight from RBSU.  Hit "Control-A" within the RBSU, and some additional options will appear in the "Service Options" menu.



