Sherlock changelog

An update about our plans to retire Sherlock 2.0

We wanted to share an important update about the future of some of Sherlock’s oldest compute nodes, in light of recent and worsening political and economic conditions.

As many of you know, we had planned to retire the oldest generation of nodes, the Sherlock 2.0-generation nodes, by September 30, 2025.

Sherlock 2.0 launched in 2017, and all of its infrastructure (management servers, networking equipment, and compute nodes) has now largely exceeded its expected 4-year lifetime. We're very proud to have kept that computing infrastructure running for roughly twice its planned lifetime. Now, 8 years later, we're at a point where it poses a significant number of challenges, and the risks associated with hardware reliability, operating costs, and computing efficiency are starting to outweigh the benefits.

To minimize those risks, we were planning to retire this generation of the cluster by the end of this year. Our primary motivations were to (a) improve energy efficiency to reduce operating costs, and (b) ensure the ongoing stability and reliability of the platform (aging hardware is more prone to failures and more difficult to repair, which could lead to larger, more impactful outages).

However, given the ongoing large-scale reductions in government funding and the unpredictable nature of the current political climate, we realize that retiring computing resources from the cluster would only add to the difficult situation that many Stanford researchers already face.

This is why we’ve decided to postpone the Sherlock 2.0 retirement plan for the foreseeable future. We know how critical access to compute resources is for research, and our priority is to keep Sherlock a useful and valuable resource to the Stanford community.

We have taken additional measures to prolong the usable life of Sherlock 2.0 and further extend the use of these compute nodes. To minimize the risks associated with running older hardware, we've started sourcing additional spare parts and equipment so we can respond quickly in case of hardware failures, and we are implementing other contingency plans to work around what can't be fixed or replaced.

Our mission at Stanford Research Computing is to support research, enable discovery, and help advance science. Thank you for your understanding and continued partnership, and for being part of the Sherlock community. We're here to help, and we will continue to provide updates as the situation evolves.