Sherlock changelog

Storage quota units change: TB to TiB

by Kilian Cavalotti, Technical Lead & Architect, HPC
Following in Oak’s footsteps, we’re excited to announce that Sherlock is adopting a new unit of measure for file system quotas. Starting today, we’re transitioning from terabytes (TB) to tebibytes (TiB) for all storage allocations on
Improvement
Data
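
For readers unsure how the two units relate, here is a minimal sketch of the arithmetic. It relies only on the standard definitions (1 TB = 10^12 bytes, 1 TiB = 2^40 bytes). The helper names are hypothetical, and the excerpt above does not say how existing quota values are carried over, so this shows the unit conversion only.

```python
# Standard unit definitions; not specific to Sherlock.
TB_BYTES = 10**12   # terabyte: decimal (SI) prefix
TIB_BYTES = 2**40   # tebibyte: binary (IEC) prefix


def tb_to_tib(tb: float) -> float:
    """Convert a size in terabytes to tebibytes (hypothetical helper)."""
    return tb * TB_BYTES / TIB_BYTES


def tib_to_tb(tib: float) -> float:
    """Convert a size in tebibytes to terabytes (hypothetical helper)."""
    return tib * TIB_BYTES / TB_BYTES


if __name__ == "__main__":
    # One tebibyte is roughly 10% larger than one terabyte.
    print(f"1 TiB = {tib_to_tb(1):.4f} TB")
    print(f"1 TB  = {tb_to_tib(1):.4f} TiB")
```

The same number of units therefore represents about 10% more bytes when expressed in TiB than in TB.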

Sherlock 4.0 is coming!

by Kilian Cavalotti, Technical Lead & Architect, HPC
New
Hardware
We are thrilled to announce that the next generation of Stanford's High-Performance Computing cluster is just around the corner. Mark your calendars for August 29, as we prepare to unveil Sherlock 4.0! Building on the success of previous

Sherlock goes full flash

by Stéphane Thiell & Kilian Cavalotti, Research Computing Team
Data
Hardware
Improvement
What could be more frustrating than anxiously waiting for your computing job to finish? Slow I/O that makes it take even longer is certainly high on the list. But not anymore! Fir, Sherlock’s scratch file system, has just undergone a major

Instant lightweight GPU instances are now available

by Kilian Cavalotti, Technical Lead & Architect, HPC
New
Hardware
We know that getting access to GPUs on Sherlock can be difficult and feel a little frustrating at times. Which is why we are excited to announce the immediate availability of our new instant lightweight GPU instances!

A new tool to help optimize job resource requirements

by Kilian Cavalotti, Technical Lead & Architect, HPC
It’s not always easy to determine the right amount of resources to request for a computing job. Making sure that the application will have enough resources to run properly, but avoiding over-requests that would make the jobs spend too much
Documentation
Scheduler
Improvement

ClusterShell on Sherlock

by Kilian Cavalotti, Technical Lead & Architect, HPC
Software
New
Ever wondered how your jobs were doing while they were running? Keeping an eye on a log file is nice, but what if you could quickly gather process lists, usage metrics and other data points from all the nodes your multi-node jobs are running

Job #1, again!

by Kilian Cavalotti, Technical Lead & Architect, HPC
This is not the first time: we’ve been through this already (not so long ago, actually), but today the Slurm job ID counter was reset and went from job #67043327 back to job #1.
Event
Scheduler

A new interactive step in Slurm

by Kilian Cavalotti, Technical Lead & Architect, HPC
A new version of the sh_dev tool has been released that leverages a recently added Slurm feature. Slurm 20.11 introduced a new “interactive step”, designed to be used with salloc to automatically launch a terminal on an allocated compute
Improvement
Scheduler

Tracking NFS problems down to the SFP level

by Kilian Cavalotti
Blog
Data
Hardware
When NFS problems turn out to be... not NFS problems at all.