More scratch space for everyone!

Today, we’re super excited to announce several major changes to the /scratch filesystem on Sherlock.

What’s /scratch, again?

/scratch is Sherlock’s temporary, parallel and high-performance filesystem. It’s available from all the compute nodes in the cluster, and is aimed at storing temporary data, like raw job output, intermediate files, or unprocessed results.

All the details about /scratch can be found in the Sherlock documentation, at https://www.sherlock.stanford.edu/docs/storage/filesystems/#scratch

A brand new storage system

So first of all, Sherlock’s /scratch now uses a brand new underlying storage system: it’s newer, faster and better than the old system in many ways that are described in much more detail in this other post.

But to sum it up, using newer and faster hardware, the new /scratch storage system is twice as large, dramatically accelerates small-file access and metadata operations, and enables new filesystem features for better overall performance.

If you’d like to take advantage of that new system and are wondering what you need to benefit from its improved performance, the answer is pretty simple: nothing! Your data is already there: if you’re using $SCRATCH or $GROUP_SCRATCH today, you don’t have to do anything, you’re already using the new storage system.

How did that happen? You can read all about it in that post I mentioned above.

More space for everyone!

Now, some things don’t change, but others do. We’re really excited to announce that starting today, every user on Sherlock gets not twice, not thrice, but 🎉✨ five times ✨🎈 the amount of storage that was previously offered.

Yep, that’s right, starting today, every user on Sherlock gets 100TB in $SCRATCH. And because sharing is caring, each group gets an additional 100TB to share data in $GROUP_SCRATCH.

But wait, there’s more.

Because we know ownership-based user and group quotas were confusing at times, we’re moving away from them and are adopting a new, directory-based quota system. That means that all the files that are under a given $SCRATCH directory, and only those, will be accounted for in the quota usage, no matter which user and group owns them. It will make finding files that count towards a given quota much easier.

Previously, with ownership-based accounting, a user with data in both her own $SCRATCH folder and in $GROUP_SCRATCH would see the combined size of all those files counted against both her user quota and the group quota. Plus, the group quota was de facto acting as a cap for all the users in the same group, which penalized groups with more members.

Now, data in a user’s $SCRATCH and in $GROUP_SCRATCH are accounted for independently, and the two quotas are cumulative. This means that no matter how many members a group has, each user will be able to use the same amount of storage, and won’t be impacted by what others in the group use.
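To make the difference between the two accounting models more concrete, here is a minimal Python sketch. It is only an illustration, not how the filesystem actually computes quotas: the file records, user names, group name and sizes below are all made up for the example.

```python
from collections import defaultdict

# Hypothetical file records: (directory, owner, group, size in TB).
# None of these names or sizes come from Sherlock; they are illustrative only.
FILES = [
    ("SCRATCH/alice", "alice", "lab", 30),
    ("GROUP_SCRATCH/lab", "alice", "lab", 20),
    ("SCRATCH/bob", "bob", "lab", 40),
    ("GROUP_SCRATCH/lab", "bob", "lab", 10),
]


def ownership_usage(files):
    """Old model: each file counts against its owner's user quota
    AND its group's quota, wherever the file is stored."""
    per_user = defaultdict(int)
    per_group = defaultdict(int)
    for _directory, owner, group, size in files:
        per_user[owner] += size
        per_group[group] += size
    return dict(per_user), dict(per_group)


def directory_usage(files):
    """New model: each file counts only against the quota of the
    directory it lives in, whoever owns it."""
    per_directory = defaultdict(int)
    for directory, _owner, _group, size in files:
        per_directory[directory] += size
    return dict(per_directory)


if __name__ == "__main__":
    print(ownership_usage(FILES))
    print(directory_usage(FILES))
```

In the ownership-based model, alice’s 20TB in the shared directory count against her own user quota and against the group quota. In the directory-based model, they only count against the shared directory’s quota.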

Here’s what things look like, more visually (and to scale!):
[Figure: scratch quotas]

  • before, individual ownership-based user quotas (in blue) were limited by the overarching group quota (in purple).
  • now, each user can use up to their quota limit, without being impacted by others, and an additional 100TB is available for the group to share data among group members.

So not only have individual quota values been increased, but the change in quota type also means that the cumulative usable space in /scratch for each group will be much larger than before.
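As a quick back-of-the-envelope check, the cumulative space available to a group under the new quotas can be sketched as below. The function name is hypothetical, and the only inputs are the two 100TB values announced above.

```python
USER_QUOTA_TB = 100   # per-user $SCRATCH quota announced in this post
GROUP_QUOTA_TB = 100  # per-group $GROUP_SCRATCH quota announced in this post


def group_capacity_tb(members):
    """Cumulative /scratch space usable by a group with `members` users:
    one $SCRATCH quota per user, plus one shared $GROUP_SCRATCH quota."""
    if members < 0:
        raise ValueError("members must be non-negative")
    return members * USER_QUOTA_TB + GROUP_QUOTA_TB


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, "members:", group_capacity_tb(n), "TB")
```

Since user quotas no longer share a common group cap, the total grows linearly with the number of group members.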

A new retention period

With that increase in space, we’re also updating the retention period on /scratch to 90 days. And because we don’t want to affect files that were created less than 3 months ago, this change will not take effect immediately.

Starting Feb. 3, 2020, all files stored in /scratch that have not been modified in the last 90 days will automatically be deleted from the filesystem.
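If you’d like to get a rough idea of which of your files would fall under that rule, here is a minimal Python sketch. It assumes that a file’s modification time (mtime) is what matters, as described above. The actual purge mechanism may work differently, so treat the result as an estimate only. The function name is hypothetical.

```python
import os
import time

RETENTION_DAYS = 90  # retention period announced in this post
SECONDS_PER_DAY = 86400


def stale_files(root, days=RETENTION_DAYS, now=None):
    """Yield paths under `root` whose modification time is more than
    `days` days in the past. This is only an approximation of the
    retention rule; the real purge policy may use other criteria."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable, skip it
            if mtime < cutoff:
                yield path
```

For instance, iterating over stale_files(os.environ["SCRATCH"]) would list the files in your own $SCRATCH directory that have not been modified in the last 90 days. On a very large directory tree, walking every file can take a while and puts load on the filesystem, so it’s better to run this sparingly.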

This is in alignment with the vast majority of other computing centers, and a way to emphasize the temporary nature of the filesystem: /scratch is really designed to store temporary data, and provide high-performance throughput for parallel I/O.

For long-term storage of research data, we always recommend using Oak, which is also directly available from all compute nodes on Sherlock (you’ll find all the details about Oak at https://oak-storage.stanford.edu). Data can freely be moved between /scratch and Oak at very high throughput rates. We can suggest optimized solutions for this, so please don’t hesitate to reach out if you have any questions.

TL;DR

Today, we’re announcing:

  1. a brand new storage system for /scratch on Sherlock
  2. a quota increase to 100TB for each user in $SCRATCH and each group in $GROUP_SCRATCH
  3. the move to directory-based quotas for easier accounting of space utilization, and for allowing each user to reach their $SCRATCH quota
  4. a new 90-day retention period for all files in /scratch, starting Feb. 3, 2020

All those changes have been reflected in the documentation at https://www.sherlock.stanford.edu/docs/storage/filesystems/

We hope those changes will enable more possibilities for computing on Sherlock, by allowing storage of larger datasets and running larger simulations.

As usual, if you have any questions or comments, please don’t hesitate to let us know at srcc-support@stanford.edu.