FAQ
== FAQ on Babel Resource Allocation and Best Practices ==

=== '''Q: How are jobs prioritized by the scheduler?''' ===
Generally, all users have equal priority. The debug, general, and long partitions do not have user- or group-specific priority, but the partitions themselves are ranked in priority from high to low. User fairshare is also factored into scheduling. In some cases, some users may have higher-priority access; for example, research groups who have donated nodes may request dedicated partitions for priority access to (a subset of) those nodes, or may request a dedicated node reserved exclusively for their research group.

=== '''Q: Do you have advice for long-running jobs?''' ===
# Make sure your code saves checkpoints frequently so that it can recover from being preempted.
# Post on the <code>#babel-babble</code> Slack channel first to alert other users.
# Consider running on the <code>long</code> partition.

=== '''Q: What should I do if I notice another user's jobs/files are disrupting usage of the cluster for others?''' ===
Please message the <code>babble-babel</code> channel, tagging the user with the problematic job as well as <code>@help-babel</code>. Remember to '''communicate with respect'''; most errors are honest mistakes.

=== '''Q: What should I do if a model requires more compute resources?''' ===
Try requesting more GPUs when you start the shell session on the assigned compute node.

=== '''Q: What does it mean if I get an error message saying 'Unable to contact Slurm controller'?''' ===
Something has gone seriously wrong: Slurm commands cannot reach the scheduler's controller. Contact the system administrators to resolve this problem.

=== '''Q: How does this relate to front-end and back-end development?''' ===
Deploy both the front-end and back-end servers on the same compute node to avoid port-forwarding issues.

=== '''Q: I have other questions which aren't answered here.''' ===
Reach out on the <code>babble-babel</code> Slack channel, tagging <code>@help-babel</code>.

If you discover an answer which may be useful to others, please feel free to add to this FAQ.
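
As an illustration of the checkpointing advice for long-running jobs above, here is a minimal sketch in Python. It is not a cluster-provided tool: the file name <code>checkpoint.json</code>, the function names, and the toy "work" (summing step indices) are all hypothetical, and a real training job would save model and optimizer state instead. The idea it shows is to write the checkpoint to a temporary file and rename it into place, so a preemption in the middle of a save cannot leave a half-written checkpoint, and to resume from the last saved step on restart.

```python
import json
import os
import tempfile

CKPT_PATH = "checkpoint.json"  # hypothetical path; pick a location on shared storage


def save_checkpoint(state, path=CKPT_PATH):
    """Write the state to a temp file, then atomically rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem


def load_checkpoint(path=CKPT_PATH):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(num_steps, path=CKPT_PATH, save_every=10, stop_after=None):
    """Toy job: resumes from the checkpoint and saves every `save_every` steps.

    `stop_after` simulates a preemption after that many steps in this run;
    work done since the last save is lost, as it would be on a real preemption.
    """
    state = load_checkpoint(path)
    executed = 0
    while state["step"] < num_steps:
        if stop_after is not None and executed >= stop_after:
            return state  # "preempted": exit without saving
        state["total"] += state["step"]  # stand-in for one unit of real work
        state["step"] += 1
        executed += 1
        if state["step"] % save_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state
```

Calling <code>run(25)</code> again after an interrupted run picks up from the most recent multiple of <code>save_every</code> rather than from step 0. A smaller <code>save_every</code> loses less work on preemption at the cost of more frequent writes.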