What is a Resource?

"Resources" is the generic name for datasets, models, and any other files that can be made available to a run. Spell keeps these organized for you in a remote filesystem.

Viewing your resources

To look through your available resources, use spell ls or go to web.spell.run/resources:

$ spell ls
-        -              public
-        -              uploads
-        -              runs

Initially, you’ll see three directories: public, uploads, and runs. Each of these resource types is described below.

Public

The public directory contains pre-packaged versions of commonly used datasets and a few pre-trained models. If there’s something you’d like to see included, please let us know using the spell feedback command or by emailing us at support@spell.run.

Within public, resources are organized into directories by dataset category:

$ spell ls public
-        -              audio
-        -              face
-        -              finance
-        -              image
-        -              models
-        -              social
-        -              text
-        -              tutorial
-        -              video
$ spell ls public/video
-        -              bdd100k (Berkeley DeepDrive...

To learn more about how to use public datasets and models in your Spell runs and workspaces, read about Mounting Resources below.

Uploads

For the best performance, we recommend uploading your datasets to Spell before using them in a run or Jupyter workspace. It is possible to make a dataset available to a run by including it in your Git repository, but this is not recommended.

Note

Datasets over 1GB must be uploaded to Spell. Large datasets cannot be mounted to a run by including them in your Git repository.
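
To check a local dataset against the 1GB limit before deciding how to ship it, the directory's total size can be measured first. The following is a minimal sketch, not part of the Spell CLI: the helper names directory_size and must_upload are hypothetical, and it assumes "1GB" means 10^9 bytes, which the note above does not specify.

```python
import os

# Assumption: "1GB" is read as 10**9 bytes; the docs do not say which definition applies.
ONE_GB = 10**9


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks so their targets are not counted
                total += os.path.getsize(full)
    return total


def must_upload(path, limit=ONE_GB):
    """Return True when the directory is larger than the limit (hypothetical helper)."""
    return directory_size(path) > limit
```

For example, must_upload("new_upload") returns False for a directory holding only a few small text files.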

To upload a dataset, use the spell upload command. It takes one required parameter, the name of the local directory that contains your dataset, and an optional --name argument for renaming the dataset on upload. The full command is spell upload directory-with-dataset [--name new-name].

An example:

$ mkdir new_upload
$ echo "creating a new_upload" > new_upload/file1 && echo "running on spell" > new_upload/file2
$ spell upload new_upload
Total upload size: 40B
Uploading to uploads/new_upload  [####################################]  100%
Upload of new_upload (/new_upload) to 'uploads/new_upload' complete.
$ spell ls uploads/new_upload
23       May 11 16:01   file1
17       May 11 16:01   file2

The dataset is now in your uploads folder, which you can view using spell ls uploads or by going to web.spell.run/resources/uploads.

Runs

The runs directory keeps the saved outputs from all of your runs, listed by run ID. If a run doesn’t generate any outputs, or if it was removed using spell rm, it won’t be listed.

$ spell ls runs
-        Jan 31 17:02   1
-        Jan 31 17:27   4

To view resources within a run, use spell ls with the run's path. This returns a table with the size of each resource, the date it was created, and the name of the directory or file.

$ spell ls runs/4
34       Jan 31 17:06   images
210      Jan 31 17:20   data
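
If you want to process a listing in a script, the three-column table can be split into fields. The sketch below is an illustration only: the function name parse_listing is hypothetical, and the line format (a plain integer size or "-", then a date such as "Jan 31 17:06" or "-", then the name) is inferred from the sample output on this page and may not cover every listing spell ls can produce.

```python
import re

# Pattern inferred from the sample listings: size, then a date or "-", then the name.
_LINE = re.compile(
    r"^\s*(?P<size>-|\d+)\s+"
    r"(?P<date>-|[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2})\s+"
    r"(?P<name>.+?)\s*$"
)


def parse_listing(text):
    """Parse listing text into dicts with size, date, and name keys."""
    entries = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip prompts, blank lines, and anything not shaped like a row
        size = match.group("size")
        date = match.group("date")
        entries.append({
            "size": None if size == "-" else int(size),  # "-" means no size shown
            "date": None if date == "-" else " ".join(date.split()),
            "name": match.group("name"),
        })
    return entries
```

Rows that show "-" for the size or date come back with None in that field, so directories such as public can be told apart from files with a known size.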

S3/GS buckets

Public buckets

All users can mount public S3/GS buckets into a run. For an example of how to do this, see Mounting Resources below.

Private buckets

Organizations on the Teams plan are also able to mount private S3/GS buckets. See Cluster Bucket Management.

Mounting Resources

You can mount any of your resources into a run or Jupyter workspace.

To mount a resource, pass its path to the -m option of spell run. Optionally, you can change the name used to reference the resource inside the run by appending an alias after a colon, as in path:new-name.

$ spell run -m uploads/my-dataset:dataset "python main.py"

In the above command, we mounted the dataset at the path uploads/my-dataset and gave it the alias dataset as the reference for our run.
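
As an illustration of what main.py might do with the mounted data, here is a minimal sketch that lists every file under the mount. It assumes the alias dataset resolves to a directory relative to the run's working directory, which this page does not state, so the path can also be passed as the first argument. The function names are hypothetical.

```python
import os


def list_dataset_files(mount_dir):
    """Return sorted paths, relative to mount_dir, of every file under it."""
    found = []
    for root, _dirs, files in os.walk(mount_dir):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), mount_dir))
    return sorted(found)


def main(argv):
    # Assumption: the alias "dataset" is visible relative to the working
    # directory; pass another path as the first argument if it is not.
    mount_dir = argv[1] if len(argv) > 1 else "dataset"
    if not os.path.isdir(mount_dir):
        print(f"mount not found: {mount_dir}")
        return 1
    for rel_path in list_dataset_files(mount_dir):
        print(rel_path)
    return 0
```

To use this as a script, call main(sys.argv) from its entry point. Taking the directory as an argument keeps the script independent of where the mount actually appears.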

In this example, we mount the public dataset at public/audio/css10 and give it the alias audio-data.

$ spell run -m public/audio/css10:audio-data "python main.py"

In the next example, we mount monthly rainfall figures from the public SILO climate dataset, which is hosted in a public S3 bucket, and give them the alias data.

$ spell run -m s3://silo-open-data/annual/monthly_rain:data "python main.py"
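
When runs are launched from a script, the path:alias pair is easy to get wrong, particularly for bucket URLs that already contain a colon. The sketch below only builds the argument list for the commands shown above and does not run anything. The helper names are hypothetical, and it uses just the spell run -m form that appears on this page.

```python
import shlex


def mount_spec(resource_path, alias=None):
    """Format a mount as path or path:alias, matching the commands above."""
    if not resource_path:
        raise ValueError("resource_path must not be empty")
    return f"{resource_path}:{alias}" if alias else resource_path


def spell_run_command(resource_path, alias, command):
    """Return the argument list for: spell run -m <path>:<alias> "<command>"."""
    return ["spell", "run", "-m", mount_spec(resource_path, alias), command]


def as_shell(args):
    """Render an argument list as a single shell-quoted string for display."""
    return shlex.join(args)
```

The returned list can be handed to subprocess.run as-is, which avoids quoting problems with the command string.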

Downloading Resources

Any resource can also be downloaded with the spell cp command. For example:

$ spell cp runs/4
✔ Copied 10 files