Better error messages when submitting jobs
September 18th, 2018 at 9:53 PM
Improvement
Scheduler
Sherlock now offers a better and more complete explanation when a job submission is rejected by the scheduler.
What does it look like?
In the most common cases, jobs that don’t meet the requirements of the partition they’re submitted to will now be rejected with a more detailed message.
For instance, submitting a job to the gpu partition without requesting a GPU will look like this:
$ srun -p gpu --pty bash
srun: error:
=============================================================================
ERROR: missing GPU request, job not submitted
=============================================================================
Jobs submitted to the gpu partition must explicitly request GPUs, by using
the --gres option.
-----------------------------------------------------------------------------
srun: error: Unable to allocate resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
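As the message says, the fix for this particular error is to add a --gres request to the command, for instance `srun -p gpu --gres gpu:1 --pty bash` (the count of 1 is only an example). For scripts that build submission commands, the same condition can be checked before submitting. The sketch below is a client-side approximation: `has_gres` is a hypothetical helper, not part of Sherlock or Slurm, and it only looks for the --gres option named in the error message, not for everything the scheduler may verify.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sherlock or Slurm): returns 0 if the
# given srun/sbatch arguments include a --gres option, 1 otherwise.
has_gres() {
    for arg in "$@"; do
        case "$arg" in
            --gres|--gres=*) return 0 ;;
        esac
    done
    return 1
}

# Example: check an argument list that requests one GPU (illustrative count).
if has_gres -p gpu --gres gpu:1 --pty bash; then
    echo "GPU request found"
else
    echo "missing GPU request"
fi
```

The check only inspects the arguments and never calls the scheduler, so it can run anywhere; the scheduler's own message remains the authoritative answer.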
We hope this will make things easier when a job is rejected at submission time, and help clarify some of the errors sent back by the scheduler.
Don’t hesitate to contact us if you have any feedback or suggestions.