ruse
is a command-line tool, developed by Jan Moren, that facilitates measuring processes’ resource usage. It periodically samples the resource use of a process and its sub-processes, and can help users find out how many resources to allocate to their jobs. It will determine the actual memory, execution time and cores that individual programs or MPI applications need, so they can be requested in job submission options.
ruse
will make it easier to write job resource requests, and will allow users to get a better understanding of their applications’ behavior, to take better advantage of Sherlock’s capabilities.
As usual, if you have any question or comment, please don’t hesitate to reach out at [email protected].
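For instance, a job’s payload can simply be wrapped with ruse, which records peak usage while the program runs. The program name, job figures and exact output columns below are purely illustrative (they may vary between ruse versions), but a session could look like this:

```console
$ ruse ./my_simulation input.dat
$ cat my_simulation-123456.ruse
Time:    02:34:12
Memory:  12.1 GB
Cores:   8
```

Those measurements then translate directly into submission options, with a bit of headroom, e.g. `#SBATCH --time=3:00:00`, `#SBATCH --mem=13G` and `#SBATCH --cpus-per-task=8`.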
A new version of the sh_dev
tool has been released; it leverages a recently added Slurm feature.
Slurm 20.11 introduced a new “interactive step”, designed to be used with salloc
to automatically launch a terminal on an allocated compute node. This new type of job step resolves a number of problems with the previous interactive job approaches, both in terms of accounting and resource allocation.
In previous versions, launching an interactive job with srun --pty bash
would create a step 0 that consumed resources, especially Generic Resources (GRES, i.e. GPUs). Among other things, it made it impossible to use srun
within that allocation to launch subsequent steps. Any attempt would result in a “step creation temporarily disabled” error message.
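In practice, the pseudo-terminal shell itself occupied the job’s resources, so any additional step would just block. A typical session with the old approach could look like this (job ID and node name are illustrative):

```console
$ srun --pty bash
[kilian@sh02-01n10 ~]$ srun hostname
srun: Job 25753000 step creation temporarily disabled, retrying
```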
Now, with this new feature, you can use salloc
to directly open a shell on a compute node. The new interactive step won’t consume any of the allocated resources, so you’ll be able to start additional steps with srun
within your allocation. sh_dev
(aka sdev
) has been updated to use interactive steps.
sh_dev
On the surface, nothing changes: you can continue to use sh_dev
exactly like before, to start an interactive session on one of the compute nodes dedicated to that task (the default), or on a node in any partition (which is particularly popular among node owners). You’ll be able to use the same options, with the same features (including X11 forwarding).
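As a reminder, a dev session is just one command away, with or without options. The option letters and values below are hypothetical examples, shown only to illustrate that existing invocations keep working unchanged:

```console
# default interactive session on a dedicated dev node
$ sh_dev

# example: requesting more cores, a longer time limit, or another partition
$ sh_dev -c 2 -t 2:00:00 -p normal
```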
Under the hood, though, you’ll be leveraging the new interactive step automatically.
salloc
If you use salloc
on a regular basis, the main change is that the resulting shell will open on the first allocated node, instead of the node you ran salloc
on:
[kilian@sh01-ln01 login ~]$ salloc
salloc: job 25753490 has been allocated resources
salloc: Granted job allocation 25753490
salloc: Nodes sh02-01n46 are ready for job
[kilian@sh02-01n46 ~] (job 25753490) $
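And since the interactive step doesn’t consume any of the allocated resources, job steps can now be started from within that shell without hitting the old “step creation” error (output illustrative):

```console
[kilian@sh02-01n46 ~] (job 25753490) $ srun hostname
sh02-01n46
[kilian@sh02-01n46 ~] (job 25753490) $ exit
salloc: Relinquishing job allocation 25753490
```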
If you want to keep that initial shell on the submission host, you can simply specify a command as an argument, and the resulting command will continue to be executed as the calling user on the calling host:
[kilian@sh01-ln01 login ~]$ salloc bash
salloc: job 25752889 has been allocated resources
salloc: Granted job allocation 25752889
salloc: Nodes sh02-01n46 are ready for job
[kilian@sh01-ln01 login ~] (job 25752889) $
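In that mode, commands typed in the shell still run on the submission host, and srun can be used to dispatch work to the allocated node (output illustrative):

```console
[kilian@sh01-ln01 login ~] (job 25752889) $ hostname
sh01-ln01
[kilian@sh01-ln01 login ~] (job 25752889) $ srun hostname
sh02-01n46
```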
srun
If you’re used to running srun --pty bash
to get a shell on a compute node, you can continue to do so (as long as you don’t intend to run additional steps within the allocation).
But you can also just type salloc
, get a more usable shell, and save 60% in keystrokes!
Happy computing! And as usual, please feel free to reach out if you have comments or questions.