A new tool to help optimize job resource requirements
It’s not always easy to determine the right amount of resources to request for a computing job. Making sure that the application will have enough resources to run properly, but avoiding over-requests that would make the jobs spend too much...
ClusterShell on Sherlock
Ever wondered how your jobs were doing while they were running? Keeping an eye on a log file is nice, but what if you could quickly gather process lists, usage metrics and other data points from all the nodes your multi-node jobs are running...
Job #1, again!
This is not the first time: we’ve been through this already (not so long ago, actually), but today the Slurm job ID counter was reset and went from job #67043327 back to job #1.
Keep up to date with software updates
To help users stay on top of software changes on Sherlock, we’ve recently introduced a new software updates RSS feed. It’s available from the Sherlock software list page, and you can directly add it to your RSS reader of choice. And if...
A new interactive step in Slurm
A new version of the sh_dev tool has been released that leverages a recently added Slurm feature. Slurm 20.11 introduced a new “interactive step”, designed to be used with salloc to automatically launch a terminal on an allocated compute...
3.3 PFlops: Sherlock hits expansion milestone
Sherlock is a traditional High-Performance Computing cluster in many respects. But unlike most similarly-sized clusters, where hardware is purchased all at once and refreshed every few years, it is in constant evolution. Almost like a...
Tracking NFS problems down to the SFP level
When NFS problems turn out to be... not NFS problems at all.
Sherlock facts
Ever wondered how many compute nodes Sherlock is made of? Or how many users are using it? Or how many InfiniBand cables link it all together? Well, wonder no more: head to the Sherlock facts page and see for yourself! > hint: there are...
New Sherlock on-boarding sessions
One of the most requested improvements to Sherlock services that came out of our recent user survey was more documentation and more training. This is why, to help new users get familiar with Sherlock's computing environment...