urn:noticeable:projects:bYyIewUV308AvkMztxixSherlock changelogwww.sherlock.stanford.edu2023-11-16T02:21:28.317ZCopyright © SherlockNoticeablehttps://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/newspages/GtmOI32wuOUPBTrHaeki/01h55ta3gs1vmdhtqqtjmk7m4z-header-logo.pnghttps://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/newspages/GtmOI32wuOUPBTrHaeki/01h55ta3gs1vmdhtqqtjmk7m4z-header-logo.png#8c1515urn:noticeable:publications:tkzeo34ezqhztdmSbO5B2023-11-16T02:00:00Z2023-11-16T02:21:28.317ZA brand new Sherlock OnDemand experienceStanford Research Computing is proud to unveil Sherlock OnDemand 3.0, a cutting-edge enhancement to its computing and data storage resources, revolutionizing user interaction and efficiency.<p>Following a long tradition of <a href="https://news.sherlock.stanford.edu/publications/sherlock-on-demand?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock OnDemand">announces</a> and <a href="https://news.sherlock.stanford.edu/publications/sherlock-goes-container-native?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock goes container native">releases</a> during the <a href="https://supercomputing.org/?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="SuperComputing conference">SuperComputing</a> conference, and while <a href="https://sc23.supercomputing.org/?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="SC23">SC23</a> is underway in Denver CO, <strong>Stanford Research Computing is proud to unveil Sherlock OnDemand 3.0,</strong> a cutting-edge enhancement to its computing and data storage resources, revolutionizing user interaction and efficiency. <br><br><strong>The upgraded Sherlock OnDemand is available immediately at </strong><a href="https://ondemand.sherlock.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock OnDemand"><strong>https://ondemand.sherlock.stanford.edu</strong></a> </p><p></p><figure><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/tkzeo34ezqhztdmSbO5B/01hfaqwynskjp4v9s198vs7ppg-image.png" alt="" loading="lazy" title=""></figure><p></p><p><span style="color: var(--text-primary);">This new release brings a host of transformative changes. 
A lot happened under the hood, but the visible changes are significant as well.</span></p><p><strong><span style="color: var(--text-primary);">Infrastructure upgrades:</span></strong></p><ul><li><p><strong><span style="color: var(--tw-prose-bold);">A new URL:</span></strong> Sherlock OnDemand is now accessible at <a href="https://ondemand.sherlock.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank"><span style="color: rgba(41,100,170,var(--tw-text-opacity));">https://ondemand.sherlock.stanford.edu</span></a>, in line<span style="color: rgb(15, 15, 15);"> with our other instances, for a more homogeneous </span>user experience across Research Computing systems. The previous URL will still work for a time, and redirections will be progressively deployed to ease the transition.</p></li><li><p><strong><span style="color: var(--tw-prose-bold);">New engine, same feel:</span></strong> a lot of internal components have undergone substantial updates, but the familiar interface remains intact, ensuring a seamless transition for existing users.</p></li><li><p><strong><span style="color: var(--tw-prose-bold);">Streamlined authentication:</span></strong> Sherlock OnDemand now uses <a href="https://openid.net/?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="OpenID">OIDC</a> via the Stanford central Identity Provider instead of SAML, resulting in a lighter, more robust configuration for enhanced security.</p></li><li><p><strong><span style="color: var(--tw-prose-bold);">Enhanced Performance:</span></strong> expect a more responsive interface and improved reliability with the eradication of 422 HTTP errors.</p></li></ul><h2><strong><span style="color: var(--text-primary);">User-centric features:</span></strong></h2><ul><li><p><strong><span style="color: var(--tw-prose-bold);">Expanded file access:</span></strong> all your <a href="https://uit.stanford.edu/service/oak-storage?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Oak">Oak</a> groups, are now conveniently listed in the embedded file browser for easier and more comprehensive access to your data. 
And if you have <code>rclone</code> remotes already configured on Sherlock, you’ll find them there as well!</p></li><li><p><strong><span style="color: var(--tw-prose-bold);">Effortless support tickets:</span></strong> you can now send support tickets directly from the OnDemand interface, which will automatically include contextual information about your interactive sessions, to simplify issue resolution.</p></li><li><p><strong><span style="color: var(--tw-prose-bold);">New interactive apps:</span></strong> In addition to the existing apps, VS Code server, MATLAB, and JupyterLab join the platform, offering expanded functionalities, like the ability to load and unload modules directly within JupyterLab.<br><em>Yes, you read that right: we now have <strong>VS Code</strong> and <strong>MATLAB</strong> in Sherlock OnDemand!</em><br>The RStudio app has also been rebuilt from the ground up, providing a much better and more reliable experience.</p><p style="text-align: center;"></p><figure><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/tkzeo34ezqhztdmSbO5B/01hfaxb83938p5dwqfxj532jp6-image.png" alt="" loading="lazy" title=""></figure><p></p></li><li><p><strong><span style="color: var(--tw-prose-bold);">Customizable working directories:</span></strong> users can now select a working directory across all interactive apps, for easier customization of their work environment.</p></li></ul><p><span style="color: var(--text-primary);">For more details and guidance on using the new features, check out the updated documentation at </span><a href="https://www.sherlock.stanford.edu/docs/user-guide/ondemand/.?utm_source=noticeable&amp;utm_campaign=sherlock.a-brand-new-sherlock-ondemand-experience&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.tkzeo34ezqhztdmSbO5B&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank"><span style="color: var(--text-primary);">https://www.sherlock.stanford.edu/docs/user-guide/ondemand/.</span></a><span style="color: var(--text-primary);"><br></span><strong><span style="color: var(--text-primary);"><br>This update delivers a brand new computing experience, designed to empower you in your work. </span></strong><span style="color: var(--text-primary);">Sherlock OnDemand 3.0 marks a significant milestone in optimizing user access to computing resources, lowering the barrier to entry for new users, and empowering researchers with an unparalleled computing environment. We're excited to see how it will enhance your productivity and efficiency, so dive into this transformative experience today and elevate your computing endeavors to new heights with Sherlock OnDemand 3.0!<br><br>And as usual, if you have any question, comment or suggestion, don’t hesitate to reach out at </span><a href="mailto:[email protected]" rel="noopener nofollow" target="_blank" title="support"><span style="color: var(--text-primary);">[email protected]</span></a><span style="color: var(--text-primary);">. </span></p>Kilian Cavalotti[email protected]urn:noticeable:publications:yYBxYUSUYLiw2D6qzR0S2023-05-12T22:30:44.259Z2023-05-12T22:32:58.168ZFinal hours announced for the June 2023 SRCF downtimeAs previously announced, the Stanford Research Computing Facility (SRCF), where Sherlock is hosted, will be powered off during the last week of June, in order to safely bring up power to the new SRCF2 datacenter.
Sherlock will not be<p>As <a href="https://news.sherlock.stanford.edu/publications/srcf-is-expanding?utm_source=noticeable&amp;utm_campaign=sherlock.final-hours-announced-for-the-june-2023-srcf-downtime&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.yYBxYUSUYLiw2D6qzR0S&amp;utm_medium=newspage" target="_blank" title="SRCF is expanding">previously announced</a>, the Stanford Research Computing Facility (SRCF), where Sherlock is hosted, will be powered off during the last week of June, in order to safely bring up power to the new SRCF2 datacenter.</p><blockquote><p><strong>Sherlock will not be available for login, to submit jobs or to access files</strong> from <strong>Saturday June 24th, 2023 at 00:00 PST</strong> to <strong>Monday July 3rd, 2023 at 18:00 PST.</strong></p></blockquote><p>Jobs will stop running and access to login nodes will be closed at 00:00 PST on Saturday, June 24th, to allow sufficient time for shutdown and pre-downtime maintenance tasks on the cluster, before the power actually goes out. If everything goes according to plan, and barring issues or delays with power availability, access will be restored on Monday, July 3rd at 18:00 PST.</p><p>We will use this opportunity to perform necessary maintenance operations on Sherlock that can’t be done while jobs are running, which will avoid having to schedule a whole separate downtime. Sherlock will go offline in advance of the actual electrical shutdown to ensure that all equipment is properly powered off and minimize the risks of disruption and failures when power is restored.<br><br>A reservation will be set in the scheduler for the duration of the downtime: if you submit a job on Sherlock and the time you request exceeds the time remaining until the start of the downtime, your job will be queued until the maintenance is over, and the <code>squeue</code> command will report a status of <code>ReqNodeNotAvailable</code> (“Required Node Not Available”).</p><p><em>The hours leading up to a downtime are an excellent time to submit shorter, smaller jobs that can complete before the maintenance begins: as the queues drain there will be many nodes available, and your wait time may be shorter than usual.<br><br></em>As previously mentioned, in anticipation of this week-long downtime, we encourage all users to plan their work accordingly, and ensure that they have contingency plans in place for their computing and data accessibility needs during that time. 
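<p><em>For reference, here is a minimal sketch of how to check the maintenance reservation and size jobs so they can finish before it; the script name and time limit below are placeholders to adapt to your own work:</em></p><pre><code class="hljs language-shell">$ # list upcoming reservations, including the maintenance window
$ scontrol show reservation

$ # request a time limit short enough for the job to complete before the downtime starts
$ sbatch --time=02:00:00 my_short_job.sh

$ # pending jobs that cannot fit before the reservation will show the reason
$ # (e.g. ReqNodeNotAvailable) in the last column
$ squeue -u $USER
</code></pre>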
<strong>If you have important data that you need to be able to access while Sherlock is down, we strongly recommend that you start transferring your data to off-site storage systems ahead of time, to avoid last-minute complications.</strong> Similarly, if you have deadlines around the time of the shutdown that require computation results, make sure to anticipate those and submit your jobs to the scheduler as early as possible.<br><br>We understand that this shutdown will have a significant impact on users who rely on Sherlock for their computing and data processing needs, and we appreciate your cooperation and understanding as we work to improve our Research Computing infrastructure.<br><br>For help transferring data, or for any questions or concerns, please do not hesitate to reach out to <a href="mailto:[email protected]" rel="noopener nofollow" target="_blank">[email protected]</a>.</p>Kilian Cavalotti[email protected]urn:noticeable:publications:MARmnxM2JHvznq8MaK6q2022-12-14T17:27:18.657Z2022-12-14T17:27:26.687ZMore free compute on Sherlock!We’re thrilled to announce that the free and generally available normal partition on Sherlock is getting an upgrade! With the addition of 24 brand new SH3_CBASE.1 compute nodes, each featuring one AMD EPYC 7543 Milan 32-core CPU and 256 GB<p>We’re thrilled to announce that the free and generally available <code>normal</code> partition on Sherlock is getting an upgrade!<br><br>With the addition of 24 brand new <a href="https://www.sherlock.stanford.edu/docs/orders/?h=cbase&amp;utm_source=noticeable&amp;utm_campaign=sherlock.more-free-compute-on-sherlock&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.MARmnxM2JHvznq8MaK6q&amp;utm_medium=newspage#configurations" rel="noopener nofollow" target="_blank" title="Sherlock node configurations">SH3_CBASE.1</a> compute nodes, each featuring one <a href="https://www.amd.com/en/products/cpu/amd-epyc-7543?utm_source=noticeable&amp;utm_campaign=sherlock.more-free-compute-on-sherlock&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.MARmnxM2JHvznq8MaK6q&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="AMD EPYC 7543">AMD EPYC 7543</a> Milan 32-core CPU and 256 GB of RAM, Sherlock users now have 768 more CPU cores at their disposal. Those new nodes will complement the existing 154 compute nodes and 4,032 cores in that partition, for a <strong>new total of 178 nodes and 4,800 CPU cores.</strong><br><br>The <code>normal</code> partition is Sherlock’s shared pool of compute nodes, which is available <a href="https://www.sherlock.stanford.edu/?utm_source=noticeable&amp;utm_campaign=sherlock.more-free-compute-on-sherlock&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.MARmnxM2JHvznq8MaK6q&amp;utm_medium=newspage#how-much-does-it-cost" rel="noopener nofollow" target="_blank" title="Sherlock cost">free of charge</a> to all Stanford Faculty members and their research teams, to support their wide range of computing needs.
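<p><em>If you'd like to see, or specifically target, the new nodes, here is a minimal sketch; the <code>CPU_SKU:7543</code> feature tag and the script name are assumptions based on Sherlock's usual feature naming, so check <code>sh_node_feat</code> for the exact values on the cluster:</em></p><pre><code class="hljs language-shell">$ # summarize the nodes in the normal partition: node names, CPUs, memory, features
$ sinfo -p normal -o "%20N %5c %8m %40f"

$ # list all the node features available in the partition
$ sh_node_feat -p normal

$ # target the new Milan nodes with a feature constraint (feature tag assumed)
$ sbatch -p normal -C CPU_SKU:7543 my_job.sh
</code></pre>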
<br><br>In addition to this free set of computing resources, Faculty can supplement these shared nodes by <a href="https://www.sherlock.stanford.edu/docs/orders/?utm_source=noticeable&amp;utm_campaign=sherlock.more-free-compute-on-sherlock&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.MARmnxM2JHvznq8MaK6q&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Purchasing Sherlock compute nodes">purchasing additional compute nodes</a>, and become Sherlock owners. By investing in the cluster, PI groups not only receive exclusive access to the nodes they purchased, but also get access to all of the other owner compute nodes when they're not in use, thus giving them access to the <a href="https://www.sherlock.stanford.edu/docs/tech/facts/?utm_source=noticeable&amp;utm_campaign=sherlock.more-free-compute-on-sherlock&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.MARmnxM2JHvznq8MaK6q&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock facts">whole breadth of Sherlock resources</a>, currently over over 1,500 compute nodes, 46,000 CPU cores and close to 4 PFLOPS of computing power.<br><br>We hope that this new expansion of the <code>normal</code> partition, made possible thanks to additional funding provided by the University Budget Group as part of the FY23 budget cycle, will help support the ever-increasing computing needs of the Stanford research community, and enable even more breakthroughs and discoveries.<br><br>As usual, if you have any question or comment, please don’t hesitate to reach out at <a href="mailto:[email protected]" rel="noopener" target="_blank">[email protected]</a>.<br><br><br><br></p>Kilian Cavalotti[email protected]urn:noticeable:publications:Hdh5qDe3icyS6vJXdQpt2021-11-30T17:00:00Z2021-11-30T18:27:25.812ZFrom Rome to Milan, a Sherlock catalog updateIt’s been almost a year and a half since we first introduced Sherlock 3.0 and its major new features: brand new CPU model and manufacturer, 2x faster interconnect, much larger and faster node-local storage, and more! We’ve now reached an<p>It’s been almost a year and a half since we first <a href="https://news.sherlock.stanford.edu/publications/sherlock-3-0-is-here?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock 3.0">introduced Sherlock 3.0</a> and its major new features: brand new CPU model and manufacturer, 2x faster interconnect, much larger and faster node-local storage, and more! We’ve now reached an inflexion point in Sherlock’s current generation and it’s time to update the hardware configurations available for purchase in the <a href="https://www.sherlock.stanford.edu/catalog?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="Sherlock catalog">Sherlock catalog</a>.<br><br>So today, <strong>we’re introducing a new Sherlock catalog refresh</strong>, a Sherlock 3.5 of sorts.</p><h1>The new catalog</h1><p>So, what changes? 
What stays the same?<br>In a nutshell, you’ll continue to be able to purchase the existing node types that you’re already familiar with:</p><p><strong>CPU configurations:</strong></p><ul><li><p><code>CBASE</code>: base configuration ($)</p></li><li><p><code>CPERF</code>: high core-count configuration ($$)</p></li><li><p><code>CBIGMEM</code>: large-memory configuration ($$$$)</p></li></ul><p><strong>GPU configurations</strong></p><ul><li><p><code>G4FP32</code>: base GPU configuration ($$)</p></li><li><p><code>G4TF64</code>: HPC GPU configuration ($$$)</p></li><li><p><code>G8TF64</code>: best-in-class GPU configuration ($$$$)</p></li></ul><p>But they now come with better and faster components!<br><br><em>To avoid confusion, the configuration names in the catalog will be suffixed with a index to indicate the generational refresh, but will keep the same global denomination. For instance, the previous <code>SH3_CBASE</code> configuration is now replaced with a <code>SH3_CBASE.1</code> configuration that still offers 32 CPU cores and 256 GB of RAM.</em></p><h2>A new CPU generation</h2><p>The main change in the existing configuration is the introduction of the new <a href="https://www.amd.com/en/processors/epyc-7003-series?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="AMD EPYC™ 7003 Series Processors">AMD 3rd Gen EPYC Milan</a> CPUs. In addition to the advantages of the previous Rome CPUs, this new generation brings:</p><ul><li><p>a new micro-architecture (Zen3)</p></li><li><p>a ~20% performance increase in instructions completed per clock cycle (IPC)</p></li><li><p>enhanced memory performance, with a unified 32 MB L3 cache</p></li><li><p>improved CPU clock speeds</p></li></ul><p></p><figure><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/Hdh5qDe3icyS6vJXdQpt/01h55ta3gsmvbh16hzp4z34xt9-image.jpg" alt="" loading="lazy" title=""></figure><p></p><p>More specifically, for Sherlock, the following CPU models are now used:</p><table><tbody><tr><th data-colwidth="105"><p>Model</p></th><th><p>Sherlock 3.0 (Rome)</p></th><th><p>Sherlock 3.5 (Milan)</p></th></tr><tr><td data-colwidth="105"><p><code>CBASE</code></p></td><td><p>1× 7502 (32-core, 2.50GHz)</p></td><td><p>1× 7543 (32-core, 2.75GHz)</p></td></tr><tr><td data-colwidth="105"><p><code>CPERF</code></p></td><td><p>2× 7742 (64-core, 2.25GHz)</p></td><td><p>2× 7763 (64-core, 2.45GHz)</p></td></tr><tr><td data-colwidth="105"><p><code>CBIGMEM</code></p></td><td><p>2× 7502 (32-core, 2.50GHz)</p></td><td><p>2× 7543 (32-core, 2.75GHz)</p></td></tr><tr><td data-colwidth="105"><p><code>G4FP32</code></p></td><td><p>1× 7502 (32-core, 2.50GHz)</p></td><td><p>1× 7543 (32-core, 2.75GHz)</p></td></tr><tr><td data-colwidth="105"><p><code>G4TF64</code></p></td><td><p>2× 7502 (32-core, 2.50GHz)</p></td><td><p>2× 7543 (32-core, 2.75GHz)</p></td></tr><tr><td data-colwidth="105"><p><code>G8TF64</code></p></td><td><p>2× 7742 (64-core, 2.25GHz)</p></td><td><p>2× 7763 (64-core, 2.45GHz)</p></td></tr></tbody></table><p>In addition to IPC and L3 cache improvements, the new CPUs also bring a frequency boost that will provide a substantial performance improvement.<br></p><h2>New GPU options</h2><p>On the GPU front, the two main changes are the re-introduction of the <code>G4FP32</code> model, and the doubling of GPU memory 
all across the board.<br><br>GPU memory is quickly becoming the constraining factor for training deep-learning models that keep increasing in size. Having large amounts of GPU memory is now key for running medical imaging workflows, computer vision models, or anything that requires processing large images.</p><p>The entry-level <code>G4FP32</code> model is back in the catalog, with a new <a href="https://www.nvidia.com/en-us/data-center/a40/" rel="noopener nofollow" target="_blank" title="NVIDIA A40">NVIDIA A40 GPU</a> in an updated <code>SH3_G4FP32.2</code> configuration. The A40 GPU not only provides higher performance than the previous model it replaces, but it also comes with twice as much GPU memory, with a whopping 48GB of GDDR6.<br></p><figure><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/Hdh5qDe3icyS6vJXdQpt/01h55ta3gsf1ph3715608pjxhk-image.png" alt="" loading="lazy" title=""></figure><p></p><p>The higher-end <code>G4TF64</code> and <code>G8TF64</code> models have also been updated with newer AMD CPUs, as well as updated versions of the <a href="https://www.nvidia.com/en-us/data-center/a100/?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="NVIDIA A100">NVIDIA A100 GPU</a>, now each featuring a massive 80GB of HBM2e memory.<br></p><figure><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/Hdh5qDe3icyS6vJXdQpt/01h55ta3gsx86d48cxj1sx0zf1-image.png" alt="" loading="lazy" title=""></figure><p></p><h1>Get yours today!</h1><p>For more details and pricing, please check out the <a href="https://www.sherlock.stanford.edu/catalog?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener" target="_blank">Sherlock catalog</a> <em>(SUNet ID required)</em>.<br><br>If you’re interested in <a href="https://www.sherlock.stanford.edu/docs/orders/?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener nofollow" target="_blank" title="purchasing process">getting your own compute nodes</a> on Sherlock, all the new configurations are available for purchase today, and can be ordered online through the <a href="https://www.sherlock.stanford.edu/order?utm_source=noticeable&amp;utm_campaign=sherlock.from-rome-to-milan-a-sherlock-catalog-update&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.Hdh5qDe3icyS6vJXdQpt&amp;utm_medium=newspage" rel="noopener" target="_blank">Sherlock order form</a> <em>(SUNet ID required)</em>.<br><br>As usual, please don’t hesitate to <a href="mailto:[email protected]" rel="noopener" target="_blank">reach out</a> if you have any questions!</p>Kilian Cavalotti[email protected]urn:noticeable:publications:NGxi6lYLPRYFL9aZSN8O2020-11-05T01:49:00.001Z2020-11-05T02:26:57.373ZSH3_G4FP32 nodes are back in the catalog!A new GPU option is available in the Sherlock catalog... again!
After a period of unavailability and a transition between GPU generations, where previous models were retired while new ones were not available yet, we're pleased to...<p><strong>A new GPU option is available in the <a href="https://www.sherlock.stanford.edu/catalog?utm_source=noticeable&amp;utm_campaign=sherlock.sh-3-g-4-fp-32-nodes-are-back-in-the-catalog&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.NGxi6lYLPRYFL9aZSN8O&amp;utm_medium=newspage" target="_blank" rel="noopener">Sherlock catalog</a>… again!</strong></p> <p>After a period of unavailability and a transition between GPU generations, where previous models were retired while new ones were not available yet, we’re pleased to announce that the entry-level GPU node configuration is now back in the catalog. With a vengeance!</p> <p>Built around the same platform as the previous <code>SH3_G4FP32</code> generation, the new <strong><code>SH3_G4FP32.1</code></strong> model features:</p> <ul> <li>32 CPU cores</li> <li>256 GB of memory</li> <li>2TB of local NVMe scratch space</li> <li>4x <a href="https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3090/?utm_source=noticeable&amp;utm_campaign=sherlock.sh-3-g-4-fp-32-nodes-are-back-in-the-catalog&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.NGxi6lYLPRYFL9aZSN8O&amp;utm_medium=newspage" target="_blank" rel="noopener">GeForce RTX 3090</a> GPUs, each featuring 24GB of GPU memory</li> <li>a 200GB/s Infiniband HDR interface</li> </ul> <p><img src="https://storage.noticeable.io/projects/bYyIewUV308AvkMztxix/publications/NGxi6lYLPRYFL9aZSN8O/01h55ta3gsa6gy01b3kh67raaf-image.png" alt="rtx3090.png"></p> <p>Particularly well-suited for applications that don’t require full double-precision computations (FP64), the top-of-the-line RTX 3090 GPU is based on the latest NVIDIA <a href="https://www.nvidia.com/en-us/data-center/nvidia-ampere-gpu-architecture/?utm_source=noticeable&amp;utm_campaign=sherlock.sh-3-g-4-fp-32-nodes-are-back-in-the-catalog&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.NGxi6lYLPRYFL9aZSN8O&amp;utm_medium=newspage" target="_blank" rel="noopener">Ampere</a> architecture and provides what’s probably the best performance/cost ratio on the market today for those use cases, and delivers almost twice the performance of the previous generations on many ML/AI workloads, as well as a significant boost for Molecular Dynamics and CryoEM applications.</p> <p>For more details and pricing, please check out the <a href="https://www.sherlock.stanford.edu/catalog?utm_source=noticeable&amp;utm_campaign=sherlock.sh-3-g-4-fp-32-nodes-are-back-in-the-catalog&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.NGxi6lYLPRYFL9aZSN8O&amp;utm_medium=newspage" target="_blank" rel="noopener">Sherlock catalog</a> <em>(SUNet ID required)</em>, and if you’re interested in purchasing your own compute nodes for Sherlock, the new <code>SH3_G4FP32.1</code> configuration is available for purchase today, and can be ordered online though the <a href="https://www.sherlock.stanford.edu/order?utm_source=noticeable&amp;utm_campaign=sherlock.sh-3-g-4-fp-32-nodes-are-back-in-the-catalog&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.NGxi6lYLPRYFL9aZSN8O&amp;utm_medium=newspage" target="_blank" rel="noopener">Sherlock order form</a> <em>(SUNet ID required)</em>.</p> Kilian Cavalotti[email 
protected]urn:noticeable:publications:GiqpTJFx844GaHTJX1kP2020-08-20T00:42:00.001Z2020-08-20T00:44:03.173ZSherlock 3.0 is here!It's been a long, long, way too long of a wait, but despite a global pandemic, heatwaves, thunderstorms, power shutoffs, fires and smoke, it's finally here! Today, we're very excited to announce the immediate availability of Sherlock 3...<p>It’s been a long, long, way too long of a wait, but despite a global pandemic, heatwaves, thunderstorms, power shutoffs, fires and smoke, it’s finally here!</p> <p><strong>Today, we’re very excited to announce the immediate availability of <em>Sherlock 3.0</em>, the third generation of the Sherlock cluster.</strong></p> <h2>What is Sherlock 3.0?</h2> <p>First, let’s take a quick step back for context.</p> <p>The Sherlock cluster is built around core Infiniband fabrics, which connect compute nodes together and allow them to work as a single entity. As we expand Sherlock over time, more compute nodes are added to the cluster, and when a core fabric reaches capacity, a new one needs to be spun up. This is usually a good opportunity to refresh the compute node hardware characteristics, as well as continue expanding and renewing ancillary equipment and services, such as login nodes, DTNs, storage systems, etc. The collection of compute and service nodes connected to the same Infiniband fabric constitutes a sort of island, or <em>generation</em>, that could live on its own, but is actually an integral part of the greater, unified Sherlock cluster.</p> <p>So far, since its inception in 2014, Sherlock has grown over two generations of nodes: the first one built around an FDR (56Gb/s) Infiniband fabric, and the second one, started in 2017, around an EDR (100Gb/s) fabric.</p> <p>Late last year, that last EDR fabric reached capacity, and after a long and multifactorial hiatus, today, we’re introducing the third generation of Sherlock, architectured around a new Infiniband fabric, and a completely refreshed compute node offering.</p> <h2>What does it look like?</h2> <p>Sherlock still looks like a bunch of black boxes with tiny lights, stuffed in racks 6ft high, and with an insane number of cables going everywhere.</p> <p>But in more technical details, Sherlock 3.0 features:</p> <ul> <li><p><strong>a new, faster interconnect | Infiniband HDR, 200Gb/s</strong><br> The new interconnect provides more bandwidth and lower latency to all the new nodes on Sherlock, for either inter-node communication in large parallel MPI applications, or for accessing the <code>$SCRATCH</code> and <code>$OAK</code> parallel file systems.<br> <em>Sherlock is one of the first HPC clusters in the world to provide 200Gb/s to the nodes.</em></p></li> <li><p><strong>new and faster processors | AMD 2nd generation EPYC (Rome) CPUs</strong><br> To take advantage of the doubled inter-node bandwidth, a brand new generation of CPUs was required, to provide enough internal bandwidth between the CPUs and the network interfaces. 
The <a href="https://www.amd.com/en/processors/epyc-7002-series?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage" target="_blank" rel="noopener">AMD Rome CPUs</a> are actually the first (and currently still the only) x86 CPU model to provide PCIe Gen4 connectivity, that enables faster local and remote I/O, and that can unlock 200Gb/s network speeds.<br> Those CPUs are also faster, draw less power, and provide more cores per socket than the ones found in the previous generations of Sherlock nodes, with a minimum of 32 CPU cores per node.</p></li> <li><p><strong>more (and faster) internal storage | 2TB NVMe per node</strong><br> Sherlock 3.0 nodes now each feature a minimum of 2TB of local NVMe storage (over 10x the previous amount), for applications that are particularly sensitive to IOPS rates.</p></li> <li><p><strong>refreshed <code>$HOME</code> storage</strong><br> More nodes means more computing power, but it also means more strain on the shared infrastructure. To absorb it, we’ve also refreshed and expanded the storage cluster that supports the <code>$HOME</code> and <code>$GROUP_HOME</code> storage spaces, to provide higher bandwidth, more IOPS, and better availability.</p></li> <li><p><strong>more (and faster) login and DTN nodes</strong><br> Sherlock 3.0 also feature 8 brand new login nodes, that are part of the <code>login.sherlock.stanford.edu</code> login pool, and each feature a pair of AMD 7502 CPUs (for a total of 64 cores) and 512 GB of RAM. As well as a new pair of dedicated <a href="https://www.sherlock.stanford.edu/docs/storage/data-transfer/?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#data-transfer-nodes-dtns" target="_blank" rel="noopener">Data Transfer Nodes (DTNs)</a></p></li> <li><p><strong>refreshed and improved infrastructure</strong><br> The list would be too long to go through exhaustively, but between additional service nodes to better scale the distributed cluster management infrastructure, improved Ethernet topology between the racks, and a refreshed hardware framework for the job scheduler, all the aspects of Sherlock have been rethought and improved.</p></li> </ul> <h2>What does it change for me?</h2> <p>In terms of habits and workflows: nothing. You don’t have to change anything and can continue to use Sherlock exactly the way you’ve been using it so far.</p> <p>Sherlock is still a single cluster, with the same:</p> <ul> <li>single point of entry at <code>login.sherlock.stanford.edu</code>,</li> <li>single and ubiquitous data storage space (you can still access all of your data on all the file systems, from all the nodes in the cluster),</li> <li>single application stack (you can load the same module and run the same software on all Sherlock nodes).</li> </ul> <p>But it now features a third island, with a new family of compute nodes.</p> <p>One thing you’ll probably notice pretty quickly is that your pending times in queue for the <code>normal</code>, <code>bigmem</code> and <code>gpu</code> partitions have been dropping. 
Considerably.</p> <p>This is because, thanks to the generous sponsorship of the <a href="https://provost.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage" target="_blank" rel="noopener">Stanford Provost</a>, we’ve been able to add the following resources to Sherlock’s public partitions:</p> <table> <thead> <tr><th>partition</th><th>#nodes</th><th>node specs</th></tr> </thead> <tbody> <tr><td><code>normal</code></td><td>72</td><td>32-core (1x <a href="https://www.amd.com/en/products/cpu/amd-epyc-7502?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#product-specs" target="_blank" rel="noopener">7502</a>) w/ 256GB RAM</td></tr> <tr><td><code>normal</code></td><td>2</td><td>128-core (2x <a href="https://www.amd.com/en/products/cpu/amd-epyc-7742?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#product-specs" target="_blank" rel="noopener">7742</a>) w/ 1TB RAM</td></tr> <tr><td><code>bigmem</code></td><td>1</td><td>64-core (2x <a href="https://www.amd.com/en/products/cpu/amd-epyc-7502?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#product-specs" target="_blank" rel="noopener">7502</a>) w/ 4TB RAM</td></tr> <tr><td><code>gpu</code></td><td>16</td><td>32-core (1x <a href="https://www.amd.com/en/products/cpu/amd-epyc-7502P?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#product-specs" target="_blank" rel="noopener">7502P</a>) w/ 256GB RAM and 4x <a href="https://www.nvidia.com/en-us/geforce/graphics-cards/rtx-2080-ti/?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#specs" target="_blank" rel="noopener">RTX 2080 Ti</a> GPUs</td></tr> <tr><td><code>gpu</code></td><td>2</td><td>32-core (1x <a href="https://www.amd.com/en/products/cpu/amd-epyc-7502P?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#product-specs" target="_blank" rel="noopener">7502P</a>) w/ 256GB RAM and 4x <a href="https://www.nvidia.com/en-us/data-center/v100/?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#specs" target="_blank" rel="noopener">V100S</a> GPUs</td></tr> <tr><td><strong>Total</strong></td><td><strong>93</strong></td><td><strong>3,200 cores, 30TB RAM, 72 GPUs</strong></td></tr> </tbody> </table> <p>Those new Sherlock 3.0 nodes are adding over twice the existing computing power available for free to every Sherlock user in the <code>normal</code>, <code>bigmem</code> and <code>gpu</code> partitions.</p> <h3>How can I use 
the new nodes?</h3> <p>It’s easy! You can keep submitting your jobs as usual, and the scheduler will automatically try to pick the new nodes that satisfy your request requirements if they’re available.</p> <p>If you want to target the new nodes specifically, take a look at the output of <code>sh_node_feat</code>: all the new nodes have features defined that allow the scheduler to specifically select them when your job requests particular constraints.</p> <p>For instance, if you want to select nodes:</p> <ul> <li>with HDR IB connectivity, you can use <code>-C IB:HDR</code></li> <li>with AMD Rome CPUs, you can use <code>-C CPU_GEN:RME</code></li> <li>with 7742 CPUs, you can use <code>-C CPU_SKU:7742</code></li> <li>with Turing GPUs, you can use <code>-C GPU_GEN:TUR</code></li> </ul> <h2>Can I get more of it?</h2> <p>Absolutely! And we’re ready to take orders today.</p> <p>If you’re interested in getting your own compute nodes on Sherlock, we’ve assembled a catalog of select configurations that you can choose from, and worked very hard with our vendors to maintain comparable price ranges with our previous generation offerings.</p> <p>You’ll find the detailed configuration and pricing in the <em>Sherlock Compute Nodes Catalog</em>, and we’ve also prepared an <em>Order Form</em> that you can use to provide the required information to purchase those nodes</p> <ul> <li><p><strong>Sherlock catalog</strong><br> <a href="http://www.sherlock.stanford.edu/docs/overview/orders/catalog?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage" target="_blank" rel="noopener">http://www.sherlock.stanford.edu/docs/overview/orders/catalog</a></p></li> <li><p><strong>Order form</strong><br> <a href="http://www.sherlock.stanford.edu/docs/overview/orders/form?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage" target="_blank" rel="noopener">http://www.sherlock.stanford.edu/docs/overview/orders/form</a></p></li> </ul> <p>For complete details about the purchasing process, please take a look at<br> <a href="https://www.sherlock.stanford.edu/docs/overview/orders/?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage" target="_blank" rel="noopener">https://www.sherlock.stanford.edu/docs/overview/orders/</a> and as usual,<br> please let us know if you have any questions.</p> <hr> <p>Finally, we wanted to sincerely thank every one of you for your patience while we were working on bringing up this new cluster generation, in an unexpectedly complicated global context. 
We know it’s been a very long wait, but hopefully it will have been worthwhile.</p> <p>Happy computing and don’t hesitate to <a href="mailto:[email protected]" target="_blank" rel="noopener">reach out</a>!</p> <p><em>Oh, and <a href="https://www.sherlock.stanford.edu/docs/overview/introduction/?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-3-0-is-here&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.GiqpTJFx844GaHTJX1kP&amp;utm_medium=newspage#user-community" target="_blank" rel="noopener">Sherlock is on Slack now</a>, so feel free to come join us there too!</em></p> Kilian Cavalotti[email protected]urn:noticeable:publications:g1onSmTqpRkG7fxCZ0iB2020-03-16T17:06:00.001Z2020-03-16T17:25:58.581ZSherlock joins the fight against COVID-19At SRCC, we've been monitoring the COVID-19 situation very closely, and we are following all of the University guidance and recommendations regarding social distancing and workplace adaptations. While we're continuing to operate our...<p>At <a href="https://srcc.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-joins-the-fight-against-covid-19&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.g1onSmTqpRkG7fxCZ0iB&amp;utm_medium=newspage" target="_blank" rel="noopener">SRCC</a>, we’ve been monitoring the COVID-19 situation very closely, and we are following all of the <a href="https://healthalerts.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-joins-the-fight-against-covid-19&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.g1onSmTqpRkG7fxCZ0iB&amp;utm_medium=newspage" target="_blank" rel="noopener">University guidance and recommendations</a><sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> regarding social distancing and workplace adaptations. 
While we’re continuing to operate our services normally<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup>, we’ve implemented changes to minimize health risks: we’ve moved our office hours sessions online, and started online consultation appointments.</p> <p>But we wanted to do more, mobilize our forces and try to contribute with the resources we have, at our own level.</p> <p>This is why, starting immediately, <strong>we’ll be dedicating a portion of Sherlock’s resources to research projects related to COVID-19.</strong> If you’re working on such a project and need additional resources on Sherlock, please contact us at <a href="mailto:[email protected]" target="_blank" rel="noopener">[email protected]</a>, with a quick description of your work, and we’ll coordinate access to those resources for you and your research team.</p> <p>We would also like to call on Sherlock owners: if you have dedicated compute nodes on Sherlock, and would like to contribute some of those resources to this essential computing effort, please reach out!</p> <p>Our mission is to support research, and with this initiative, we hope Sherlock will help time-critical projects to get to results faster, and contribute to the fight against this coronavirus.</p> <hr class="footnotes-sep"> <section class="footnotes"> <ol class="footnotes-list"> <li id="fn1" class="footnote-item"><p>We strongly encourage everyone to regularly consult and follow recommendations provided on <a href="https://healthalerts.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-joins-the-fight-against-covid-19&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.g1onSmTqpRkG7fxCZ0iB&amp;utm_medium=newspage" target="_blank" rel="noopener">https://healthalerts.stanford.edu</a>, especially the latest <a href="https://healthalerts.stanford.edu/2020/03/14/guidance-for-the-research-environment?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-joins-the-fight-against-covid-19&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.g1onSmTqpRkG7fxCZ0iB&amp;utm_medium=newspage" target="_blank" rel="noopener">Guidance for the research environment</a>. <a href="#fnref1" class="footnote-backref">↩</a></p> </li> <li id="fn2" class="footnote-item"><p>for more information about SRCC’s continuity of support, please see <a href="https://srcc.stanford.edu/news/srcc-continuity-support-during-covid-19?utm_source=noticeable&amp;utm_campaign=sherlock.sherlock-joins-the-fight-against-covid-19&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.g1onSmTqpRkG7fxCZ0iB&amp;utm_medium=newspage" target="_blank" rel="noopener">https://srcc.stanford.edu/news/srcc-continuity-support-during-covid-19</a>. <a href="#fnref2" class="footnote-backref">↩</a></p> </li> </ol> </section> Kilian Cavalotti[email protected]urn:noticeable:publications:IARfWlFT8IPjMRyh74t12019-12-03T23:30:00.001Z2019-12-03T23:33:14.769ZMore scratch space for everyone!Today, we're super excited to announce several major changes to the /scratch filesystem on Sherlock. What's /scratch already? /scratch is Sherlock's temporary, parallel and high-performance filesystem. It's available from all the...<p>Today, we’re super excited to announce several major changes to the <code>/scratch</code> filesystem on Sherlock.</p> <h2>What’s <code>/scratch</code> already?</h2> <p><code>/scratch</code> is Sherlock’s temporary, parallel and high-performance filesystem. 
It’s available from all the compute nodes in the cluster, and is aimed at storing temporary data, like raw job output, intermediate files, or unprocessed results.</p> <p>All the details about <code>/scratch</code> can be found in the Sherlock documentation, at <a href="https://www.sherlock.stanford.edu/docs/storage/filesystems/?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage#scratch" target="_blank" rel="noopener">https://www.sherlock.stanford.edu/docs/storage/filesystems/#scratch</a></p> <h2>A brand new storage system</h2> <p>So first of all, Sherlock’s <code>/scratch</code> now uses a brand new underlying storage system: it’s newer, faster and better that the old system in many ways that are described in much more details <a href="https://news.sherlock.stanford.edu/posts/a-new-scratch?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage" target="_blank" rel="noopener">in this other post</a>.</p> <p>But to sum it up, using newer and faster hardware, the new <code>/scratch</code> storage system is twice as large, dramatically accelerate small files access and metadata operations, and enables new filesystem features for better overall performance.</p> <p>If you’d like to take advantage of that new system and are wondering what you need to benefit from its improved performance, the answer is pretty simple: nothing! Your data is already there: if you’re using <code>$SCRATCH</code> or <code>$GROUP_SCRATCH</code> today, you don’t have to do anything, you’re already using the new storage system.</p> <p>How did that happen? You can read all about it in <a href="https://news.sherlock.stanford.edu/posts/a-new-scratch-is-here?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage" target="_blank" rel="noopener">that post I mentioned above</a>.</p> <h2>More space for everyone!</h2> <p>Now, some things don’t change, but others do. We’re really excited to announce that starting today, every user on Sherlock gets <del>twice</del> <del>thrice</del> 🎉✨ <strong>five times</strong>✨🎈 the amount of storage that was previously offered.</p> <p>Yep, that’s right, starting today, <strong>every user</strong> on Sherlock gets <strong>100TB</strong> in <code>$SCRATCH</code>. And because sharing is caring, <strong>each group</strong> gets an additional <strong>100TB</strong> to share data in <code>$GROUP_SCRATCH</code>.</p> <p>But wait, there’s more.</p> <p>Because we know ownership-based user and group quotas were confusing at times, we’re moving away from them and are adopting a new, directory-based quota system. That means that all the files that are under a given <code>$SCRATCH</code> directory, and only them, will be accounted for in the quota usage, no matter which user and group owns them. It will makes finding files that count towards a given quota much easier.</p> <p>Previously, with ownership-based accounting, a user with data in both her own <code>$SCRATCH</code> folder and in <code>$GROUP_SCRATCH</code> would see the sum of all those files’ size counted against both her user quota and the group quota. 
Plus, the group quota was <em>de facto</em> acting as a cap for all the users in the same group, which was penalizing for groups with more members.</p> <p>Now, data in a user’s <code>$SCRATCH</code> and <code>$GROUP_SCRATCH</code> are accounted for independently, and they’re cumulative. Meaning that no matter how many members a group counts, each user will be able to use the same amount of storage, and won’t be impacted by what others in the group use.</p> <p>Here what things looks like, more visually (and to scale!):<br> <img src="https://docs.google.com/drawings/d/e/2PACX-1vQrfxu9oTcz6Ilbta3X1BTF9fGMlQLul77ftTpbRFvwLGnrwhNlIjRUvqDcfYiSA80ARN6rtCkU1lFW/pub?w=1317&amp;h=881" alt="scratch quotas"></p> <ul> <li>before, individual ownership-based user quota (in blue) were limited by the overarching group quota (in purple).</li> <li>now, each user can use up to their quota limit, without being impacted by others, <em>and</em> an additional 100TB is available for the group to share data among group members.</li> </ul> <p>So not only individual quota values have been increased, but the change in quota type also means that the cumulative usable space in <code>/scratch</code> by each group will be much larger than before.</p> <h2>A new retention period</h2> <p>With that increase in space, we’re also updating the retention period on <code>/scratch</code> to 90 days. And because we don’t want to affect files that have been created less than 3 months ago, this change will not take effect immediately.</p> <p><strong>Starting Feb.3, 2020, all files stored in <code>/scratch</code> that have not been modified in the last 90 days will automatically be deleted from the filesystem.</strong></p> <p>This is in alignment with the vast majority of other computing centers, and a way to emphasize the temporary nature of the filesystem: <code>/scratch</code> is really designed to store temporary data, and provide high-performance throughput for parallel I/O.</p> <p>For long-term storage of research data, we always recommend using <a href="https://www.sherlock.stanford.edu/docs/storage/filesystems/?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage#oak" target="_blank" rel="noopener">Oak</a>, which is also directly available from all compute nodes on Sherlock (you’ll find all the details about Oak at <a href="https://oak-storage.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage" target="_blank" rel="noopener">https://oak-storage.stanford.edu</a>). Data can freely be moved between <code>/scratch</code> and Oak at very high throughput rates. We can suggest optimized solutions for this, so please don’t hesitate to reach out if you have any question.</p> <h2>TL;DR</h2> <p>Today, we’re announcing:</p> <ol> <li>a brand new storage system for <code>/scratch</code> on Sherlock</li> <li>a quota increase to 100TB for each user in <code>$SCRATCH</code> and each group in <code>$GROUP_SCRATCH</code></li> <li>the move to directory-based quotas for easier accounting of space utilization, and for allowing each user to reach their <code>$SCRATCH</code> quota</li> <li>a new 90-day retention period for all files in <code>/scratch</code>, starting Feb. 
3, 2020</li> </ol> <p>All those changes have been reflected in the documentation at <a href="https://www.sherlock.stanford.edu/docs/storage/filesystems/?utm_source=noticeable&amp;utm_campaign=sherlock.more-scratch-space-for-everyone&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.IARfWlFT8IPjMRyh74t1&amp;utm_medium=newspage" target="_blank" rel="noopener">https://www.sherlock.stanford.edu/docs/storage/filesystems/</a></p> <p>We hope those changes will enable more possibilities for computing on Sherlock, by allowing storage of larger datasets and running larger simulations.</p> <p>As usual, if you have any question or comment, please don’t hesitate to let us know at <a href="mailto:[email protected]" target="_blank" rel="noopener">[email protected]</a>.</p> Kilian Cavalotti[email protected]urn:noticeable:publications:rW4rSZPoLiD3lXmUAPAA2019-10-21T22:24:00.001Z2019-10-21T22:44:51.528ZCall for scientific awesomeness!Every fall for the last 30 years, the HPC community has gathered at SC, the International Conference for High Performance Computing Networking, Storage, and Analysis, and will meet again this year, for the SC19 conference, in Denver, CO...<p>Every fall for the last 30 years, the HPC community has gathered at <a href="https://http://supercomputing.org?utm_source=noticeable&amp;utm_campaign=sherlock.call-for-scientific-awesomeness&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.rW4rSZPoLiD3lXmUAPAA&amp;utm_medium=newspage" target="_blank" rel="noopener">SC</a>, the International Conference for High Performance Computing Networking, Storage, and Analysis, and will meet again this year, for the <a href="https://sc19.supercomputing.org?utm_source=noticeable&amp;utm_campaign=sherlock.call-for-scientific-awesomeness&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.rW4rSZPoLiD3lXmUAPAA&amp;utm_medium=newspage" target="_blank" rel="noopener">SC19 conference</a>, in Denver, CO (Nov. 17-22).</p> <p><a href="https:/srcc.stanford.edu" target="_blank" rel="noopener">Stanford Research Computing</a> and <a href="https://slac.stanford.edu?utm_source=noticeable&amp;utm_campaign=sherlock.call-for-scientific-awesomeness&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.rW4rSZPoLiD3lXmUAPAA&amp;utm_medium=newspage" target="_blank" rel="noopener">SLAC</a> will be at the conference, to exchange with our peers at other institutions, meet vendors and present our activities in support of the Stanford/SLAC research community.</p> <p>This is always an excellent opportunity to present real-world utilization of our computing resources and showcase amazing research projects that those resources enable.</p> <p>If you’d like to have your work featured in the Stanford/SLAC booth, and would be willing to provide visual content (pictures or videos), please feel free contact us at <a href="mailto:[email protected]" target="_blank" rel="noopener">[email protected]</a> with a few words about your research projects that were enabled by having computing and/or storage platforms at Stanford available to you. 
We’ll be very happy to showcase your work during this international conference and show the world what you all do.</p> Kilian Cavalotti[email protected]urn:noticeable:publications:VcpyH1Z8ZMI2suBVz9TT2019-03-20T00:21:00.001Z2019-03-20T17:26:34.022ZOut with the old, in with the new!It's been a little over five years since Sherlock first entered production and ran its first jobs. A lot of time has passed, and with it, Sherlock grew ten-fold, its user base exploded and technology trends evolved significantly. The...<p>It’s been a little over five years since Sherlock first entered production and ran its first jobs. A lot of time has passed, and with it, Sherlock grew ten-fold, its user base exploded and technology trends evolved significantly.</p> <h2>The old</h2> <p>The original Sherlock nodes are still running jobs, but they’re incrementally moving towards irrelevance as they lack the features, the performance and the power efficiency of the most recently added compute nodes.</p> <p>This is why we’re starting the process of gradually retiring the original Sherlock nodes. This will mainly affect the Sherlock public tiers in the <code>normal</code>, <code>gpu</code> and <code>bigmem</code> partition, but also some of the earliest owner partitions (we’ve started contacting the owners of those nodes).</p> <p>The most noticeable effect will be that the size of those partitions will start shrinking, and over time, those nodes and their 5-year old <a href="https://en.wikipedia.org/wiki/Ivy_Bridge_(microarchitecture)?utm_source=noticeable&amp;utm_campaign=sherlock.out-with-the-old-in-with-the-new&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.VcpyH1Z8ZMI2suBVz9TT&amp;utm_medium=newspage" target="_blank" rel="noopener">Ivy Bridge</a> generation CPUs will progressively disappear.</p> <p>To compensate for these retirements, we’re also introducing new nodes to the <code>normal</code> partition.</p> <h2>The new</h2> <p>The new nodes are based on the latest Intel <a href="https://en.wikipedia.org/wiki/Skylake_(microarchitecture)?utm_source=noticeable&amp;utm_campaign=sherlock.out-with-the-old-in-with-the-new&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.VcpyH1Z8ZMI2suBVz9TT&amp;utm_medium=newspage" target="_blank" rel="noopener">Skylake</a> microarchitecture and have the following specifications:</p> <ul> <li>dual-socket <a href="https://ark.intel.com/content/www/us/en/ark/products/120473/intel-xeon-gold-5118-processor-16-5m-cache-2-30-ghz.html?utm_source=noticeable&amp;utm_campaign=sherlock.out-with-the-old-in-with-the-new&amp;utm_content=publication+link&amp;utm_id=bYyIewUV308AvkMztxix.GtmOI32wuOUPBTrHaeki.VcpyH1Z8ZMI2suBVz9TT&amp;utm_medium=newspage" target="_blank" rel="noopener">Intel Xeon Gold 5118</a> (2x 12-core, 24 core per node)</li> <li>192GB of memory (RAM)</li> <li>200GB local SSD (for <code>$L_SCRATCH</code>)</li> <li>EDR Infiniband (100Gbps) connectivity to the Sherlock fabric.</li> </ul> <p>You’ll be able to request those nodes specifically by using job submission constraints and adding the following flag to your job submission options: <code>-C "CPU_GEN:SKX"</code></p> <p>To see the list of all the available node features and characteristics that can be requested in the <code>normal</code> partition, you can run:</p> <pre><code class="hljs language-shell"><span class="hljs-meta">$</span><span class="bash"> sh_node_feat -p normal</span> CPU_FRQ:2.30GHz CPU_FRQ:2.40GHz CPU_FRQ:2.60GHz CPU_GEN:BDW CPU_GEN:HSW 
CPU_GEN:IVB CPU_GEN:SKX CPU_SKU:5118 CPU_SKU:E5-2640v3 CPU_SKU:E5-2640v4 CPU_SKU:E5-2650v2 </code></pre> <p>And as always, please don’t hesitate to let us know if you have any question, by reaching out to <a href="mailto:[email protected]" target="_blank" rel="noopener">[email protected]</a>.</p> Kilian Cavalotti[email protected]
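<p><em>As a minimal illustration of the constraint flag described above (the script name and resource values are placeholders, not recommendations):</em></p><pre><code class="hljs language-shell">$ # submit a batch job restricted to the new Skylake nodes in the normal partition
$ sbatch -p normal -C "CPU_GEN:SKX" -c 4 --time=01:00:00 my_job.sh

$ # or start an interactive session on one of them
$ srun -p normal -C "CPU_GEN:SKX" --pty bash
</code></pre>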