Metrics

Runs on Spell can record metrics over the course of their execution. Spell will display these metrics on the Run page in the Spell web console.

Hardware metrics

Every Spell run tracks CPU, memory, network, and (if available) GPU metrics. The data is logged every 10 seconds, so runs that complete in less than 10 seconds might not have any data. These metrics are available on the Run page in the web console.

Image of the Spell console run page with hardware metrics.

Framework metrics

Spell will capture certain framework metrics automatically.

If your run uses Keras, some common metrics like Loss and Accuracy will be tracked automatically.

If your run uses TensorFlow, any metrics that are logged via tf.summary will be captured automatically. Note that you must write the summaries to a file using tf.summary.FileWriter in order for Spell to read the metrics. The official TensorFlow documentation includes a good example of using the summary API.

Image of the Spell console run page with tensorflow metrics.

Custom user metrics

To submit a custom user metric from within a run, import spell.metrics and call send_metric().

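import time

import spell.metrics as metrics

for i in range(10):
    # Numeric metrics: name, value, and an explicit x value.
    metrics.send_metric("My Linear Metric", i, i)
    metrics.send_metric("My Squared Metric", i**2, i)
    # Text metric: the x value is omitted, so the default is used.
    metrics.send_metric("My Text Metric", "Some text at index {}".format(i))
    # Stay within the limit of one value per second per metric name.
    time.sleep(1)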

The first argument is the name of the metric. It can be any string, and all calls to send_metric() with the same name within a run are graphed or logged together on the Run page. The second argument is the value of the metric, which can be either a number or a string. The third argument is an optional x value that must be strictly increasing for a given metric name; it defaults to one more than the previous x value. A metric submitted with a smaller x value than an earlier submission for the same name gets the default x value instead.
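The x value rule can be modelled locally to see which x a given submission would end up with. The sketch below is not Spell code: resolve_x is a hypothetical helper, and it assumes that an x equal to the previous one also falls back to the default, which the description above does not state explicitly.

```python
from typing import Optional


def resolve_x(previous_x: float, requested_x: Optional[float] = None) -> float:
    """Return the x value a submission would get under the rule described above.

    Hypothetical local model for illustration; not part of the Spell API.
    """
    # The default is one more than the previous x value for this metric name.
    default_x = previous_x + 1
    if requested_x is None:
        return default_x
    # Assumption: x must be strictly greater than the previous x value;
    # anything else is replaced by the default.
    if requested_x <= previous_x:
        return default_x
    return requested_x
```

For example, starting from a previous x of 0, the requests (omitted, 5, 3, omitted) would resolve to 1, 5, 6, and 7 under this model: the request for 3 arrives after 5 and so receives the default instead.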

Click here for the Python API documentation on metrics.

Image of the Spell console run page with custom user metrics.

Note

There is a limit of 50 unique metric names per run, and 1 value per second per metric name.
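One way to stay within these limits is to guard calls on the client side. The following is a minimal sketch, not part of the Spell API: ThrottledSender is a hypothetical wrapper that takes the send function as an argument, and its behavior of silently skipping calls that would exceed a limit is a design assumption.

```python
import time
from typing import Any, Callable, Dict, Optional


class ThrottledSender:
    """Hypothetical client-side guard for the per-run metric limits."""

    def __init__(
        self,
        send: Callable[..., Any],
        max_names: int = 50,        # unique metric names per run
        min_interval: float = 1.0,  # seconds between values for one name
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._max_names = max_names
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent: Dict[str, float] = {}

    def send(self, name: str, value: Any, index: Optional[float] = None) -> bool:
        """Forward the metric if allowed; return False if it was skipped."""
        now = self._clock()
        last = self._last_sent.get(name)
        if last is None and len(self._last_sent) >= self._max_names:
            # A new name would exceed the unique-name limit.
            return False
        if last is not None and now - last < self._min_interval:
            # This name already received a value within the interval.
            return False
        if index is None:
            self._send(name, value)
        else:
            self._send(name, value, index)
        self._last_sent[name] = now
        return True
```

Inside a run, this could be used as sender = ThrottledSender(metrics.send_metric), followed by sender.send("loss", 0.42). Skipped calls return False, so the caller can decide whether to retry later or drop the value.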

(Advanced) Getting metrics using the Python API

In addition to reading metrics on the Run Details page on the Spell website, you can use the Python API to read and interact with metrics.

import spell.client
import csv

client = spell.client.from_environment()
my_run = client.runs.get(123)
metrics = my_run.metrics("Important Metric 2")

with open("important_metric.csv", "w", newline='') as csv_file:
    csv.writer(csv_file).writerows(metrics)

This code writes a new file, important_metric.csv, with three columns: timestamp, index, and value.
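The exported file can be read back with the standard library. This is a minimal sketch, assuming the rows appear in the order timestamp, index, value with no header row; load_metric_rows and numeric_values are hypothetical helpers, not part of the Spell API. All fields are kept as text because the timestamp format is not specified here.

```python
import csv
from typing import Iterable, List, NamedTuple


class MetricRow(NamedTuple):
    timestamp: str  # kept as text; the exact format is not assumed
    index: str
    value: str


def load_metric_rows(lines: Iterable[str]) -> List[MetricRow]:
    """Parse CSV lines into rows, skipping any line without three fields."""
    return [MetricRow(*row) for row in csv.reader(lines) if len(row) == 3]


def numeric_values(rows: Iterable[MetricRow]) -> List[float]:
    """Return the values that parse as numbers, ignoring text values."""
    values = []
    for row in rows:
        try:
            values.append(float(row.value))
        except ValueError:
            # Text metrics are skipped rather than treated as errors.
            continue
    return values
```

To use it, open the file with open("important_metric.csv", newline="") and pass the file object to load_metric_rows; numeric_values(rows) then returns the plottable values.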