timestamp: 1679706261451
A new tool to help optimize job resource requirements
by Kilian Cavalotti, Technical Lead & Architect, HPC
It’s not always easy to determine the right amount of resources to request for a computing job. Making sure that the application will have enough resources to run properly, but avoiding over-requests that would make the jobs spend too much...
timestamp: 1677204000000
SRCF is expanding
by Kilian Cavalotti, Technical Lead & Architect, HPC
tags: Maintenance
In order to bring up a new building that will increase data center capacity, a full SRCF power shutdown is planned for late June 2023. It’s expected to last about a week, and Sherlock will be unavailable during that time.
timestamp: 1670036242756
ClusterShell on Sherlock
by Kilian Cavalotti, Technical Lead & Architect, HPC
tags: Software, New
Ever wondered how your jobs were doing while they were running? Keeping an eye on a log file is nice, but what if you could quickly gather process lists, usage metrics and other data points from all the nodes your multi-node jobs are running...
timestamp: 1667700685989
Job #1, again!
by Kilian Cavalotti, Technical Lead & Architect, HPC
This is not the first time; we’ve been through this already (not so long ago, actually). But today, the Slurm job ID counter was reset and went from job #67043327 back to job #1.
timestamp: 1635528575955
Keep up to date with software updates
by Kilian Cavalotti, Technical Lead & Architect, HPC
To help users stay on top of software changes on Sherlock, we’ve recently introduced a new software updates RSS feed. It’s available from the Sherlock software list page, and you can directly add it to your RSS reader of choice. And if...
timestamp: 1617408000000
3.3 PFlops: Sherlock hits expansion milestone
by Kilian Cavalotti, Technical Lead & Architect, High Performance Computing
tags: Hardware, Event
Sherlock is a traditional High-Performance Computing cluster in many respects. But unlike most similarly sized clusters, where hardware is purchased all at once and refreshed every few years, it is in constant evolution. Almost like a...
timestamp: 1612549200000
Tracking NFS problems down to the SFP level
by Kilian Cavalotti
tags: Blog, Data, Hardware
When NFS problems turn out to be... not NFS problems at all.
timestamp: 1600452000001
New GPU options in the Sherlock catalog
by Kilian Cavalotti
Today, we're introducing the latest generation of GPU accelerators in the Sherlock catalog: the NVIDIA A100 Tensor Core GPU. Each A100 GPU features 9.7 TFlops of double-precision (FP64) performance, up to 312 TFlops for deep-learning...
timestamp: 1589822580001
New Sherlock on-boarding sessions
by Kilian Cavalotti
One of the most requested improvements to Sherlock services that came out of our recent user survey was more documentation and more training. This is why, to help new users get familiar with Sherlock's computing environment...