Quickstart

Run your first experiment on VESSL

1. Sign up and create a Project

To run your first experiment on VESSL AI, first sign up for a free account and create an organization. An organization is a shared working environment where you can find team assets such as datasets, models, and experiments.

While we are on the web console, let's also add a project called "mnist". As you will see later, a project serves as a central repository equipped with a dashboard and visualizations for all of your experiments.

2. Install VESSL AI Client

VESSL AI comes with a powerful CLI and Python SDK for managing ML assets and workflows. Install the VESSL AI client on your local machine using pip install. In this guide, we will be using an example from our GitHub, so git clone the repository. Then access the project we created using vessl configure, your first VESSL AI CLI command. The command will guide you to a page that grants CLI access and sets your default organization and project.

# Install the VESSL client and clone the example repository
pip install vessl
git clone https://github.com/vessl-ai/examples
cd examples

# Grant CLI access and set your default organization and project
vessl configure \
  --organization quickstart \
  --project mnist
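
If you prefer to stay in Python, the SDK exposes the same configuration step through its Utilities API. A minimal sketch, assuming vessl.configure accepts organization_name and project_name keyword arguments (check the configure reference for the exact signature):

import vessl

# Set the default organization and project for subsequent SDK calls
# (keyword argument names are assumptions based on the Utilities API reference)
vessl.configure(organization_name="quickstart", project_name="mnist")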

3. Run an Experiment

Now that we have specified the project and obtained CLI access, you will run your first experiment on VESSL AI. On a local machine, this is as simple as running a Python script.

# Install requirements and run the VESSL experiment from your local machine
pip install -r mnist/keras/requirements.txt && python mnist/keras/main.py --output-path=output --checkpoint-path=output/checkpoint --save-model --save-image

You can also run experiments on VESSL AI's managed clusters by using the vessl run command, which uploads your current directory and runs the given command on the cluster asynchronously.

vessl run "pip install -r mnist/keras/requirements.txt && python mnist/keras/main.py --save-model --save-image"

To specify detailed options (e.g. volume mounts) in one line, you can use the vessl experiment create command instead of vessl run. The CLI prompts for any options you leave out:

$ vessl experiment create --upload-local-file .:/root/local

[?] Cluster: aws-apne2-prod1
 > aws-apne2-prod1

[?] Resource: v1.cpu-0.mem-1
 > v1.cpu-0.mem-1
   v1.cpu-2.mem-6
   v1.cpu-2.mem-6.spot
   v1.cpu-4.mem-13
   v1.cpu-4.mem-13.spot

[?] Image URL: public.ecr.aws/vessl/kernels:py36.full-cpu
 > public.ecr.aws/vessl/kernels:py36.full-cpu
   public.ecr.aws/vessl/kernels:py37.full-cpu
   public.ecr.aws/vessl/kernels:py36.full-cpu.jupyter
   public.ecr.aws/vessl/kernels:py37.full-cpu.jupyter
   tensorflow/tensorflow:1.14.0-py3
   tensorflow/tensorflow:1.15.5-py3
   tensorflow/tensorflow:2.0.4-py3
   tensorflow/tensorflow:2.2.1-py3
   
[?] Command: cd local && pip install -r mnist/keras/requirements.txt && python mnist/keras/main.py --save-model --save-image

You should specify the --upload-local-file option to upload your current directory. If you want to link a GitHub repo instead of uploading files from your local machine, see this.

Alternatively, you can pass every option in a single command:

vessl experiment create \
  --cluster aws-apne2-prod1 \
  --resource v1.cpu-0.mem-1 \
  --image-url public.ecr.aws/vessl/kernels:py36.full-cpu \
  --upload-local-file .:/root/local \
  --command 'cd local && pip install -r mnist/keras/requirements.txt && python mnist/keras/main.py --save-model --save-image'
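
If you would rather create the experiment from Python, the Experiment API exposes the same operation. A rough sketch, where the keyword argument names are assumptions taken from the Experiment API reference rather than a verified signature:

import vessl

vessl.configure(organization_name="quickstart", project_name="mnist")

# Create an experiment on the managed cluster, mirroring the CLI command above
# (parameter names are assumptions based on the Experiment API reference)
experiment = vessl.create_experiment(
    cluster_name="aws-apne2-prod1",
    kernel_resource_spec_name="v1.cpu-0.mem-1",
    kernel_image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
    start_command="pip install -r mnist/keras/requirements.txt && python mnist/keras/main.py --save-model --save-image",
)

This sketch omits the local file upload; for mounting code or data from the SDK, see the Experiment API reference.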

Once the command completes, you will be given a link to Experiments. The experiment page stores logs, visualizations, and files specific to the experiment. The metrics and images you see there are produced by calling the vessl.init() and vessl.log() functions from our Python SDK, which you can use in your own code simply by importing the library, as shown in the example code below.

import vessl

# Initialize a new experiment via the VESSL SDK
vessl.init(organization="quickstart", project="mnist")

# Train function that logs metrics to VESSL
def train(model, device, train_loader, optimizer, epoch, start_epoch):
    model.train()
    loss = 0
    for batch_idx, (data, label) in enumerate(train_loader):
        ...

    # Logging loss metrics to VESSL
    vessl.log(
        step=epoch + start_epoch + 1,
        payload={'loss': loss.item()}
    )
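
vessl.log is not limited to scalars; it can also log media such as images. A small sketch, assuming vessl.Image accepts raw image data and a caption as described in the SDK reference:

import numpy as np
import vessl

vessl.init(organization="quickstart", project="mnist")

# Log a sample digit as an image alongside a scalar metric
# (the vessl.Image(data, caption=...) call follows the SDK reference; treat it as an assumption)
sample = np.random.rand(28, 28)  # stand-in for an MNIST digit
vessl.log(payload={
    "examples": [vessl.Image(sample, caption="sample digit")],
    "loss": 0.42,
})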

4. Track and visualize experiments

When you click the project name on the navigation bar, you will be guided back to the project page. Under each tab, you can explore VESSL's main features:

  • Experiments – unified dashboard for tracking experiments

  • Tracking – visualization of model performance and system metrics (see the SDK sketch after this list)

  • Sweeps – scalable hyperparameter optimization

  • Models – a repository for versioned models
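
Beyond vessl.log, the SDK's hyperparameter and progress utilities also feed the tracking dashboard. A minimal sketch, assuming vessl.hp.update takes a dict of hyperparameters and vessl.progress takes a completion ratio between 0 and 1, as listed in the Utilities API:

import vessl

vessl.init(organization="quickstart", project="mnist")

# Record the run's hyperparameters so they show up on the tracking dashboard
# (the dict-style call is an assumption based on the vessl.hp.update reference)
vessl.hp.update({"learning_rate": 0.01, "batch_size": 128})

# Report overall progress as a fraction between 0 and 1
vessl.progress(0.5)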

5. Develop state-of-the-art models on VESSL AI

Let's try building a model with the resources and datasets of your choice. Under Datasets, you can mount and manage datasets from local or cloud storage.

Let's move over to Workspaces where you can configure a custom environment for Jupyter Notebooks with SSH. You can use either VESSL AI's managed cluster with spot instance support or your own custom clusters.

Launch a Jupyter Notebook. Here, you will find an example Notebook which introduces how you can integrate local experiments with VESSL AI to empower your research workflow.
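
The local-experiment pattern shown in that notebook boils down to the init / log / finish utilities from the Python SDK. A rough sketch of the loop, with vessl.finish assumed to close out the local experiment as described in the Utilities API:

import vessl

# Start tracking a local experiment against the mnist project
vessl.init(organization="quickstart", project="mnist")

for epoch in range(5):
    # ... train for one epoch, then report its metrics
    vessl.log(step=epoch, payload={"loss": 1.0 / (epoch + 1)})

# Mark the local experiment as finished
vessl.finish()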

Next Step

Now that you are familiar with the overall workflow of VESSL AI, explore additional features available on our platform and start building!

  • Use our Sweep to automate model tuning.

  • Use distributed experiments to take full advantage of your GPUs.

  • Explore organization settings to set up and manage on-cloud or on-premise clusters.
