What is a Resource?
Resources are the generic name for datasets, models, or any other files that can be made available to a run. Spell keeps these organized for you in a remote filesystem.
Viewing your resources
To look through your available resources, use spell ls or go to web.spell.run/resources:
$ spell ls
- - public
- - uploads
- - runs
Initially, you’ll see three directories: public, uploads, and runs. Read more about each of these resource types below.
The public directory contains pre-packaged versions of commonly used datasets and a few pre-trained models. If there’s something you’d like to see included, please let us know using spell feedback or by emailing us at firstname.lastname@example.org. See here for how to use Spell feedback.
Within public, the resources are organized into directories by dataset category.
$ spell ls public
- - audio
- - face
- - finance
- - image
- - models
- - social
- - text
- - tutorial
- - video
$ spell ls public/video
- - bdd100k (Berkeley DeepDrive...
To learn more about how to use public datasets and models in your Spell runs and workspaces, read about Mounting Resources below.
In order to get the best performance from Spell, we recommend uploading your datasets before using them within a run or Jupyter workspace. While it is possible to mount a dataset to a run by including it in your Git repository (more on how Spell uses Git repositories), this is not recommended.
Datasets over 1GB must be uploaded to Spell. Large datasets cannot be mounted to a run by including them in your Git repository.
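The size rule above can be checked locally before you decide how to ship a dataset. The following is a minimal sketch, not part of the Spell CLI: the function names and the LIMIT_KB value are hypothetical, and it assumes that "1GB" means 1024 * 1024 KB. Note that du reports disk usage, which only approximates the size Spell would measure.

```shell
# Threshold in KB. Assumes "1GB" means 1024 * 1024 KB (an assumption, not a documented Spell value).
LIMIT_KB=1048576

# over_limit_kb SIZE_KB: succeeds when SIZE_KB exceeds the threshold.
over_limit_kb() {
    [ "$1" -gt "$LIMIT_KB" ]
}

# dir_size_kb DIR: prints the disk usage of DIR in KB, as reported by du.
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# must_upload DIR: succeeds when DIR is too large to include in a Git repository.
must_upload() {
    over_limit_kb "$(dir_size_kb "$1")"
}
```

If must_upload succeeds for a directory, including it in your Git repository is not an option and it has to go through spell upload. Smaller datasets are still best uploaded, as recommended above.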
To upload a dataset, use the spell upload command. It takes one required parameter, the name of the local directory that contains your dataset, and an optional --name argument for renaming the dataset. The full command is spell upload directory-with-dataset [--name new-name].
$ mkdir new_upload
$ echo "creating a new_upload" > new_upload/file1 && echo "running on spell" > new_upload/file2
$ spell upload new_upload
Total upload size: 40B
Uploading to uploads/new_upload
[####################################] 100%
Upload of new_upload (/new_upload) to 'uploads/new_upload' complete.
$ spell ls uploads/new_upload
23 May 11 16:01 file1
17 May 11 16:01 file2
The dataset is now in your uploads folder, which you can view using spell ls uploads or by going to web.spell.run/resources/uploads.
The runs directory keeps the saved outputs from all of your runs, listed by run ID. If a run doesn’t generate any outputs or if it was removed using spell rm, it won’t be listed.
$ spell ls runs
- Jan 31 17:02 1
- Jan 31 17:27 4
To view resources within a run, use spell ls with the path. This will return a table with the size of the resource, the date created, and the name of the directory or file.
$ spell ls runs/4
34 Jan 31 17:06 images
210 Jan 31 17:20 data
All users can mount public S3/GS buckets into a run. For an example of how to do this, go to Mounting Resources below.
Organizations on the Teams plan are also able to mount private S3/GS buckets. See Cluster Bucket Management.
Mounting Resources
You can mount any of your resources into a run or Jupyter workspace.
To use the mount option (-m), you must specify the path to your dataset. Optionally, you can also change the name you use to reference the dataset by specifying an alias after a colon:
$ spell run -m uploads/my-dataset:dataset "python main.py"
In the above command, we mounted the dataset at the path uploads/my-dataset and gave it the alias dataset as the reference for our run.
In this example, we mount the public dataset at public/audio/css10 and give it the alias audio-data:
$ spell run -m public/audio/css10:audio-data "python main.py"
In the next example, we mount monthly rainfall figures from the public SILO climate dataset and give them the alias data:
$ spell run -m s3://silo-open-data/annual/monthly_rain:data "python main.py"
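All three examples above pass -m a value of the form path:alias. If you build these commands in a script, a small helper can compose that value for you. This is only a sketch; mount_spec is a hypothetical name and not part of the Spell CLI.

```shell
# mount_spec PATH [ALIAS]: prints the value to pass to spell run -m.
# Hypothetical helper. With no alias it prints the path unchanged.
mount_spec() {
    if [ -n "${2:-}" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}
```

For instance, spell run -m "$(mount_spec uploads/my-dataset dataset)" "python main.py" expands to the first command shown above.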
Any resource can also be downloaded with the spell cp command. For example:
$ spell cp runs/4
✔ Copied 10 files