What is a Resource?
Resource is the generic name for a dataset, model, or any other file that can be made available to a run. Spell keeps these organized for you in a remote filesystem.
Viewing your resources
To look through your available resources, use spell ls or go to web.spell.run/resources:
$ spell ls
- - public
- - uploads
- - runs
Initially, you’ll see three directories: public, uploads, and runs. Read more about each of these resource types below.
Public
The public directory contains pre-packaged versions of commonly used datasets and a few pre-trained models. We try to keep the most commonly used datasets and models in public. If there’s something you’d like to see included, please let us know using spell feedback or by emailing us at support@spell.run. See here for how to use Spell feedback.
Within public, the resources are organized into directories by dataset category.
$ spell ls public
- - audio
- - face
- - finance
- - healthcare
- - image
- - models
- - social
- - text
- - tutorial
- - video
$ spell ls public/video
- - bdd100k (Berkeley DeepDrive...
To learn more about how to use public datasets and models in your Spell runs and workspaces, read about Mounting Resources below.
Uploads
To get the best performance from Spell, we recommend uploading your datasets before using them in a run or Jupyter workspace. While it is possible to mount a dataset to a run by including it in your Git repository (more on how Spell uses Git repositories), this is not recommended.
Note
Datasets over 1GB must be uploaded to Spell. Large datasets cannot be mounted to a run by including them in your Git repository.
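As a rough pre-flight check for the 1GB rule above, the following sketch measures a local dataset directory before you decide how to ship it. The needs_upload function is hypothetical and not part of the Spell CLI, and the threshold assumes that 1GB means 1,048,576 KiB, which is not stated here.

```shell
# Hypothetical helper, not a Spell command: report whether a local dataset
# directory is over the 1GB limit mentioned in the note.
# Assumption: 1GB is treated as 1024 * 1024 KiB.
needs_upload() {
  dir="$1"
  limit_kb=1048576
  # du -sk prints the total size in 1024-byte units, followed by the path.
  size_kb=$(du -sk "$dir" | awk '{print $1}')
  if [ "$size_kb" -gt "$limit_kb" ]; then
    echo "required"      # too large to ship inside a Git repository
  else
    echo "recommended"   # small enough for Git, but uploading is still advised
  fi
}
```

Calling needs_upload new_upload prints required when the directory is larger than the threshold and recommended otherwise.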
To upload a dataset, use the spell upload command. Its one required parameter is the name of the local directory that contains your dataset. An optional --name argument renames the dataset on upload. The full command is spell upload directory-with-dataset [--name new-name].
An example:
$ mkdir new_upload
$ echo "creating a new_upload" > new_upload/file1 && echo "running on spell" > new_upload/file2
$ spell upload new_upload
Total upload size: 40B
Uploading to uploads/new_upload [####################################] 100%
Upload of new_upload (/new_upload) to 'uploads/new_upload' complete.
$ spell ls uploads/new_upload
23 May 11 16:01 file1
17 May 11 16:01 file2
The dataset is now in your uploads folder, which you can view using spell ls uploads or by going to web.spell.run/resources/uploads.
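The example above uploads under the default name. A minimal sketch of the --name variant follows. Here upload_cmd is a hypothetical helper that only prints the command so you can review it before running it yourself, and my-renamed-dataset is an illustrative name.

```shell
# Hypothetical helper, not a Spell command: print the spell upload command
# for a directory, adding --name only when a new name is given.
upload_cmd() {
  dir="$1"
  name="${2:-}"
  if [ -n "$name" ]; then
    printf 'spell upload %s --name %s\n' "$dir" "$name"
  else
    printf 'spell upload %s\n' "$dir"
  fi
}

# Prints: spell upload new_upload --name my-renamed-dataset
upload_cmd new_upload my-renamed-dataset
```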
Runs
The runs
directory keeps the saved outputs from all of your runs, listed by run id. If a run doesn’t generate any outputs or if it was removed using spell rm
, it won’t be listed.
$ spell ls runs
- Jan 31 17:02 1
- Jan 31 17:27 4
To view resources within a run, use spell ls with the path. This returns a table with the size of each resource, the date it was created, and the name of the directory or file.
$ spell ls runs/4
34 Jan 31 17:06 images
210 Jan 31 17:20 data
S3/GS buckets
Public buckets
All users can mount public S3/GS buckets into a run. For an example of how to do this, go to Mounting Resources below.
Private buckets
Organizations on the Teams plan are also able to mount private S3/GS buckets. See Cluster Bucket Management.
Mounting Resources
You can mount any of your resources into a run or Jupyter workspace.
To use the mount option (-m), you must specify the path to your dataset. Optionally, you can also change the name you use to reference the dataset by appending a colon and an alias to the path, as in path:new-name.
$ spell run -m uploads/my-dataset:dataset "python main.py"
In the above command, we mounted the dataset at the path uploads/my-dataset and gave it the alias dataset as the reference for our run.
In the next example, we mount the public dataset at public/audio/css10 and give it the alias audio-data.
$ spell run -m public/audio/css10:audio-data "python main.py"
In the last example, we mount monthly rainfall figures from the public SILO climate dataset, which lives in a public S3 bucket, and give them the alias data.
$ spell run -m s3://silo-open-data/annual/monthly_rain:data "python main.py"
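The three commands above share one pattern: a resource path, optionally followed by a colon and an alias. As a sketch, the hypothetical mount_arg helper below (not part of the Spell CLI) assembles that value and prints the full command instead of running it.

```shell
# Hypothetical helper, not a Spell command: build the value passed to -m
# from a resource path and an optional alias.
mount_arg() {
  path="$1"
  alias_name="${2:-}"
  if [ -n "$alias_name" ]; then
    # The alias is appended after a colon, so paths that already contain
    # a colon (such as s3:// URLs) are left untouched.
    printf '%s:%s\n' "$path" "$alias_name"
  else
    printf '%s\n' "$path"
  fi
}

# Print, rather than execute, the command from the first example.
echo "spell run -m $(mount_arg uploads/my-dataset dataset) \"python main.py\""
```

Printing the command first makes it easy to check the path and alias before starting a run.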
Downloading Resources
Any resource can also be downloaded with the spell cp command. For example:
$ spell cp runs/4
✔ Copied 10 files