timestamp 1679706261451
A new tool to help optimize job resource requirements
by Kilian Cavalotti, Technical Lead & Architect, HPC
It’s not always easy to determine the right amount of resources to request for a computing job. Making sure that the application will have enough resources to run properly, but avoiding over-requests that would make the jobs spend too much
timestamp 1670036242756
ClusterShell on Sherlock
by Kilian Cavalotti, Technical Lead & Architect, HPC
Software, New
Ever wondered how your jobs were doing while they were running? Keeping an eye on a log file is nice, but what if you could quickly gather process lists, usage metrics and other data points from all the nodes your multi-node jobs are running
timestamp 1667700685989
Job #1, again!
by Kilian Cavalotti, Technical Lead & Architect, HPC
This is not the first time: we’ve been through this already (not so long ago, actually). But today, the Slurm job id counter was reset and went from job #67043327 back to job #1.
timestamp 1635528575955
Keep up to date with software updates
by Kilian Cavalotti, Technical Lead & Architect, HPC
To help users stay on top of software changes on Sherlock, we’ve recently introduced a new software updates RSS feed. It’s available from the Sherlock software list page, and you can directly add it to your RSS reader of choice. And if
timestamp 1617408000000
3.3 PFlops: Sherlock hits expansion milestone
by Kilian Cavalotti, Technical Lead & Architect, High Performance Computing
Hardware, Event
Sherlock is a traditional High-Performance Computing cluster in many respects. But unlike most similarly sized clusters, where hardware is purchased all at once and refreshed every few years, it is in constant evolution. Almost like a
timestamp 1612549200000
Tracking NFS problems down to the SFP level
by Kilian Cavalotti
Blog, Data, Hardware
When NFS problems turn out to be... not NFS problems at all.
timestamp 1589227740001
Job #1
by Kilian Cavalotti
If you've been submitting jobs on Sherlock over the last couple of days, you probably noticed something different about your job ids... They lost a couple of digits! If you submitted a job last week, its job id was likely in the 67,000...