SLURM: a Simple Resource Manager no more!
SLURM, the "Highly Scalable Resource Manager" from Lawrence Livermore National Laboratory, has come a long way. When we started using it for some of the products in our Unified Cluster Portfolio, SLURM had only a simple FIFO scheduler and no job accounting, and it could support perhaps a thousand nodes.
SLURM also provided a very clean architecture which allowed HP to contribute the first versions of job accounting for Linux clusters, support for multi-threaded and hyperthreaded architectures, complex job scheduling such as gang scheduling, and fine-grained allocation of system resources ("consumable resources").
SLURM version 2.0 has just been released, and what a powerhouse it is! It supports heterogeneous clusters, up to 65,000 nodes, resource limits, and job prioritization. Moe Jette and Danny Auble, the primary authors of SLURM, discuss it on this podcast.
My compliments to the entire team!
SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
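
To make those three functions concrete, here is a toy Python model of a resource manager with a strict FIFO queue, roughly the kind of simple scheduler described for early SLURM. It is only a sketch: the names `Job`, `ToyResourceManager`, `submit`, and `finish` are invented for illustration and are not part of SLURM, and a real resource manager handles far more (non-exclusive access, time limits, monitoring, priorities).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job record: a name and the number of nodes requested.
    name: str
    nodes_needed: int
    allocated: list = field(default_factory=list)


class ToyResourceManager:
    """Toy model of allocate / run / queue; not SLURM's actual design."""

    def __init__(self, node_names):
        self.free_nodes = list(node_names)   # nodes nobody is using
        self.pending = deque()               # FIFO queue of waiting jobs
        self.running = {}                    # job name -> Job

    def submit(self, job):
        """Queue a job, then start whatever fits."""
        self.pending.append(job)
        self._schedule()

    def finish(self, name):
        """Release a finished job's nodes and re-run scheduling."""
        job = self.running.pop(name)
        self.free_nodes.extend(job.allocated)
        job.allocated = []
        self._schedule()

    def _schedule(self):
        # Strict FIFO: only the job at the head of the queue may start.
        while self.pending and self.pending[0].nodes_needed <= len(self.free_nodes):
            job = self.pending.popleft()
            # Exclusive allocation: each node goes to exactly one job.
            job.allocated = [self.free_nodes.pop(0) for _ in range(job.nodes_needed)]
            self.running[job.name] = job


rm = ToyResourceManager(["n1", "n2", "n3", "n4"])
rm.submit(Job("a", 3))   # starts at once on three nodes
rm.submit(Job("b", 2))   # waits: only one node is free
rm.submit(Job("c", 1))   # also waits, behind "b", under strict FIFO
rm.finish("a")           # frees three nodes; "b" and then "c" start
```

Note that job "c" would fit on the one free node while "a" is running, but strict FIFO holds it behind "b". Closing that gap is one motivation for the more complex scheduling and prioritization mentioned above.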





