Slurm

SLURM (Simple Linux Utility for Resource Management) is a job scheduler and resource management system commonly used in high-performance computing (HPC) environments. It allows users to submit and manage jobs on clusters or supercomputers. This guide provides a brief overview of SLURM and covers basic usage examples for sbatch and srun commands, along with common options for requesting resources such as memory, CPUs, and GPUs.

For more information on Slurm, please consult the Slurm documentation.

Introduction to SLURM

SLURM is a flexible and efficient framework for job submission and scheduling in HPC environments, enabling users to run parallel and distributed applications across multiple compute nodes in a coordinated manner. It serves as a job scheduler that queues, allocates, and launches both interactive and batch jobs.

Slurm provides two different types of jobs:

  • Interactive Jobs: Access a compute node like you would via ssh
  • Batch Jobs: Launch a job in the background

All jobs in the Slurm system are scheduled in a priority queue, meaning that heavy users may need to wait longer for their jobs to be launched. To manage jobs in the system, users can take advantage of several useful Slurm commands, including:

  • srun: Run a job in real time; useful for launching interactive jobs
  • sbatch: Queue a batch job to be launched when resources become available
  • scancel: Cancel a running or pending Slurm job
  • sacct: View the history of your recent jobs, including whether they completed or failed
  • sinfo: View the partitions, nodes, and generic resources (GRES) available on the Slurm cluster
  • squeue: View the state of queued and currently running jobs
  • sstat: View resource utilization of a particular running job

These commands provide users with powerful tools to manage their jobs on the BABEL cluster and ensure that their work is completed efficiently and effectively. We will discuss these commands in more detail later.
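
For example, to cancel a job or check the live resource usage of a running job (the job ID 12345 below is just a placeholder):

 $ scancel 12345                             # cancel job 12345
 $ sstat -j 12345 --format=AveCPU,MaxRSS     # CPU time and peak memory of running job 12345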

Submitting Jobs

Submitting Jobs with sbatch

To submit a batch job using sbatch, create a shell script (e.g., job_script.sh) that contains the necessary commands and configurations for your job. Then, use the following command to submit the job:

 $ sbatch job_script.sh

SLURM will assign a unique job ID to your job and enqueue it for execution. You can monitor the status of your job using various SLURM commands like squeue or sacct.
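
For example (12345 is a placeholder job ID):

 $ squeue -u $USER                                # your queued and running jobs
 $ sacct -j 12345 --format=JobID,State,Elapsed    # state and runtime of job 12345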

Submitting Jobs with srun

For interactive or non-batch jobs, you can use the srun command. It allows you to execute commands directly on compute nodes. Here’s an example:

 $ srun -n 4 ./my_program

The -n option specifies the number of tasks you want to run. In the example above, srun launches four tasks, each running the my_program executable.
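
A common use of srun is to open an interactive shell on a compute node. A minimal sketch (the partition name and resource amounts are placeholders to adapt to your cluster):

 $ srun --partition=<partition-name> --gres=gpu:1 --mem=8G --time=1:00:00 --pty bash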

Requesting Resources

SLURM provides options to request specific resources for your jobs, such as memory, CPUs, and GPUs.

Memory

To request a specific amount of memory for your job, use the --mem option with the desired value, either as an #SBATCH directive in your job script or on the sbatch/srun command line. For example, to request 8 GB of memory, add:

 #SBATCH --mem=8G

CPUs

SLURM allows you to request a specific number of CPUs for your job. Use the --cpus-per-task option to specify the number of CPUs needed. For example, to request 4 CPUs per task, use:

 #SBATCH --cpus-per-task=4

GPUs

If your job requires GPU resources, you can request them using the --gres option. For example, to request 2 GPUs, use:

 #SBATCH --gres=gpu:2

You can also request a specific GPU type with the --gres option. For example, to request 2 NVIDIA V100 GPUs, use:

 #SBATCH --gres=gpu:v100:2
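
Putting these together, a job script header that requests memory, CPUs, and GPUs might look like the following sketch (the available GPU type names depend on the cluster configuration; v100 is only an example):

 #!/bin/bash
 #SBATCH --mem=16G
 #SBATCH --cpus-per-task=4
 #SBATCH --gres=gpu:v100:2

 # Your job commands go here; nvidia-smi simply prints the GPUs visible to the job
 nvidia-smi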

Sample Jobs

SBATCH

To submit a batch job with sbatch, you need to create a job script file. Here's an example of a simple sbatch job script:

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=1:00:00

# Your job commands go here
echo "Hello, World!"

In this example, the job script starts by specifying the job name, output file, error file, partition, number of nodes, tasks per node, CPUs per task, and maximum runtime. You can modify these parameters based on your job requirements. The actual commands of your job follow the #SBATCH directives.

To submit the job, use the following command:

sbatch myjob.sh

This command submits the job script file "myjob.sh" to the Slurm scheduler.

SRUN

If you prefer running jobs directly with srun, without a batch job script, you can use the following command:

 srun --job-name=myjob --partition=compute --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --time=1:00:00 echo "Hello, World!"


In this example, the job name, partition, number of nodes, tasks per node, CPUs per task, and maximum runtime are specified as command-line options. The "echo" command is the actual job command that will be executed.

Feel free to modify the parameters and the job command according to your specific needs.

Tips and Tricks

Job Options

Here are some additional SLURM directives which may make life with SLURM more pleasant (a combined example follows the lists below):

  • --output=/home/<username>/path/output_report-%j.out: Specifies the path and filename pattern for the job’s standard output and error logs. %j is a placeholder that will be replaced with the job ID.
  • --mail-type=END: Specifies the email notification types for job events. In this case, it is set to receive an email when the job ends.
  • --mail-user=<username>@andrew.cmu.edu: Specifies the email address where the job-related emails will be sent.

Here are other useful arguments for srun/sbatch:

  • -w, --nodelist: Request specific nodes to run the job on (-x, --exclude does the opposite)
  • -t, --time: Time limit (D-HH:MM:SS; a limit of 0 means no time limit) [explained more below in interactive jobs]
  • -o, --output: Output log files. It's a good idea to flush output frequently so the log stays up to date.
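
A job script header combining these options might look like the following sketch (paths, the username, and node names are placeholders):

 #!/bin/bash
 #SBATCH --job-name=myjob
 #SBATCH --output=/home/<username>/logs/output_report-%j.out
 #SBATCH --time=1-00:00:00
 #SBATCH --nodelist=<node-name>
 #SBATCH --mail-type=END
 #SBATCH --mail-user=<username>@andrew.cmu.edu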

List Partitions

To get a list of available partitions in the cluster, you can use the following command:

sinfo

This command displays information about the partitions, including their names, node counts, and node states. It provides an overview of the available partitions that you can specify in your job submission.
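
For example, to restrict the output to a single partition or get a node-oriented listing (the partition name is a placeholder):

 $ sinfo -p <partition-name>    # summary of a single partition
 $ sinfo -N -l                  # long, node-oriented listing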

List Nodes

To get a list of all nodes in the cluster, you can use the following command:

scontrol show nodes


This command provides detailed information about each node, including its name, state, CPU and memory information, and any associated partitions.

Node Details

To get specific information about a particular node, you can use the following command, replacing "node-name" with the actual name of the node:

scontrol show node node-name

This command displays detailed information about the specified node, including its state, CPU and memory information, and any associated partitions.

Partition Details

To get detailed information about a specific partition, you can use the following command, replacing "partition-name" with the actual name of the partition:

scontrol show partition partition-name

This command provides information about the specified partition, including its name, node range, state, and other properties.

By using these commands, you can gather essential information about the partitions and nodes in the Slurm cluster, which can be useful for job submission and understanding the cluster's current status.

Job Array Scheduling

Job array scheduling can be useful when you want to run many similar jobs but limit how many of them use resources at the same time. The example below tells Slurm to set up a queue of 50 jobs with only 10 active at any one time:

 #!/bin/bash
 #SBATCH --array=1-50%10
 
 #SBATCH [other options]
 
 # Your job commands here

Explanation of the options:

--array=1-50%10: This option defines the job array range from 1 to 50. The %10 specifies that a maximum of 10 tasks from the array can be running concurrently at any one time. Slurm will automatically manage the scheduling and execution of the array tasks based on the specified limit.

Note that the %10 throttle in the --array specification is Slurm's built-in mechanism for limiting concurrency; no separate sbatch option is needed.

By using the --array option with the specified range and the %10 syntax, you can control the number of tasks running concurrently within the job array. Slurm will handle the scheduling and ensure that only the specified maximum number of tasks are active at any given time.
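
Inside each array task, the environment variable SLURM_ARRAY_TASK_ID identifies which task is running; it is typically used to select a different input per task. A minimal sketch (the script and input-file names are only illustrations):

 #!/bin/bash
 #SBATCH --array=1-50%10
 
 # Each task processes a different input file, selected by its array index.
 python process.py --input data_${SLURM_ARRAY_TASK_ID}.txt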

Submit your job within Python

Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster. It wraps job submission and provides access to results, logs, and more.

These are just a few common options for resource requests in SLURM. SLURM provides many more options for fine-grained resource management, job scheduling, and parallel computing. You can refer to the SLURM documentation for detailed information on all available options.

See Also

For more information on Slurm please consult the Slurm documentation, which provides detailed tutorials and resources for working with Slurm in an HPC environment.

Happy computing with SLURM!