BABEL
==Submitting Jobs==

=== Resource Scheduler ===

[[Slurm]] 20.11.9 is used for job scheduling. There are two main ways to request resources:

* '''Interactive:''' Use <code>srun</code> for jobs that need direct interaction with the running task, often after using <code>salloc</code> to start an interactive session.
* '''Batch:''' Use <code>sbatch</code> for jobs that can run without user interaction, typically longer or resource-intensive tasks. These are submitted to the Slurm queue for scheduled execution.

Here's an overview of our main partitions:

* '''debug'''
<pre>
Purpose: Quick, short jobs for testing and debugging.
Max Time: 12 hours
Default Time: 1 hour
Max GPUs: 2
Max CPUs: 64
QoS: debug_qos
Limitations: No array jobs
</pre>
* '''general'''
<pre>
Purpose: General, standard computing tasks.
Max Time: 2 days
Default Time: 6 hours
Max GPUs: 8
Max CPUs: 128
QoS: normal
Limitations: No interactive sessions | sbatch only
</pre>
* '''preempt'''
<pre>
Purpose: Long-running jobs that can be preempted for higher priority tasks.
Max Time: 31 days
Default Time: 3 hours
Max GPUs: 24
Max CPUs: 256
QoS: preempt_qos
Limitations: No interactive sessions | sbatch only
</pre>
* '''cpu'''
<pre>
Purpose: CPU-only computing tasks.
Max Time: 2 days
Default Time: 6 hours
Max GPUs: 0
Max CPUs: 128
QoS: cpu_qos
Limitations: No interactive sessions | sbatch only
</pre>
* '''array'''
<pre>
Purpose: Array jobs for parallel task execution.
Max Time: 12 days
Default Time: 6 hours
Max GPUs: 8
Max CPUs: 256
QoS: array_qos
Limitations: No interactive sessions | sbatch only
</pre>

=== Partition Table ===

The rows below are keyed by the QoS name attached to each partition (for example, <code>normal</code> is the QoS of the '''general''' partition).

<pre>
Name        MaxTRESPU    MaxJobsPU MaxSubmitPU MaxTRES  MinTRES     Preempt
----------- ------------ --------- ----------- -------- ----------- ----------------------
normal      gres/gpu=8   10        50          cpu=128  gres/gpu=1  array_qos,preempt_qos
preempt_qos gres/gpu=24  24        100         cpu=256  gres/gpu=1
debug_qos   gres/gpu=2   10        12          cpu=64               preempt_qos
cpu_qos     gres/gpu=0   10        50          cpu=128              preempt_qos
array_qos   gres/gpu=8   100       10000       cpu=256              preempt_qos
</pre>

=== Viewing Partition Details ===

To explore the full configuration of all partitions, use the <code>scontrol</code> command:

* <code>scontrol show part</code>: Displays detailed information about all available partitions.

For specifics on a particular partition, include its name:

* <code>scontrol show part <partition_name></code>: Shows detailed settings for the specified partition (e.g., <code>scontrol show part debug</code>).

For detailed information on how to use these partitions, see the [[Slurm]] documentation page.

=== Viewing QoS Details ===

Each partition is associated with a specific QoS (e.g., <code>debug_qos</code>, <code>normal</code>, <code>preempt_qos</code>), which defines rules such as maximum resource usage and preemption behavior. To view the QoS information associated with your user account:

 sacctmgr show user $USER withassoc format=User,Account,DefaultQOS,QOS%4
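
As an illustration of the batch workflow described under Resource Scheduler, the following is a minimal sketch of a job script for the '''general''' partition. The job name, time limit, CPU count, memory request, and the <code>describe_job</code> helper are assumptions chosen for the example, not site defaults. The single-GPU request is picked to stay within the limits listed for the <code>normal</code> QoS (at least 1 GPU, at most 8).

```shell
#!/bin/bash
# Hypothetical example job script; adjust every value to your own workload.
#SBATCH --job-name=example_job      # assumed name, shown in squeue output
#SBATCH --partition=general         # sbatch-only partition, max 2 days
#SBATCH --time=04:00:00             # assumed limit, below the 2-day maximum
#SBATCH --gres=gpu:1                # one GPU (assumed request)
#SBATCH --cpus-per-task=4           # assumed CPU count, well under 128
#SBATCH --mem=16G                   # assumed memory request
#SBATCH --output=%x-%j.out          # log file named <job-name>-<job-id>.out

# Hypothetical helper: print where the job landed. The SLURM_* variables are
# set by Slurm inside a job; outside a job they fall back to "none".
describe_job() {
  echo "job=${SLURM_JOB_ID:-none} partition=${SLURM_JOB_PARTITION:-none} host=${HOSTNAME:-unknown}"
}

describe_job

# Replace the line below with the actual workload.
# python train.py
```

Assuming the script is saved as <code>example_job.sh</code> (a hypothetical file name), it would be submitted with <code>sbatch example_job.sh</code> and monitored with <code>squeue -u $USER</code>.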