
SDK-driven Workflow


Last updated 3 years ago

This guide covers the process of creating an image classification model with the MNIST dataset using the VESSL Client SDK. Once again, we will

  • Build a baseline machine learning model with Experiments.

  • Optimize hyperparameters using Sweep.

  • Update and store models on the Model Registry.

You can also follow along with the same guide in the notebook we created on Google Colab.

Requirements

To follow this guide, you should first have the following set up.

  • Organization: a dedicated organization for you or your team

  • Project: a space for your machine learning model and mounted datasets

  • VESSL Client: Python SDK and CLI to manage ML workflows and resources on VESSL

If you have not created an Organization or a Project, first follow the instructions in the end-to-end guides.

1. Experiment — Build a baseline model

1-1. Configure your default organization and project

Let's start by configuring the client with the default organization and project we created earlier. This is done by executing vessl.configure().

import vessl

organization_name = "YOUR_ORGANIZATION_NAME"
project_name = "YOUR_PROJECT_NAME"
vessl.configure(
    organization_name=organization_name, 
    project_name=project_name
)
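
If you prefer not to hard-code the organization and project names, one option is to read them from environment variables and pass them to vessl.configure(). The sketch below is only an illustration: the variable names TUTORIAL_ORG_NAME and TUTORIAL_PROJECT_NAME and the helper vessl_config_from_env are hypothetical and chosen for this example. Only the organization_name and project_name keyword arguments come from the snippet above.

```python
import os


def vessl_config_from_env(environ=None):
    """Build keyword arguments for vessl.configure() from environment variables.

    TUTORIAL_ORG_NAME and TUTORIAL_PROJECT_NAME are hypothetical names
    picked for this sketch.
    """
    env = os.environ if environ is None else environ
    required = ("TUTORIAL_ORG_NAME", "TUTORIAL_PROJECT_NAME")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "organization_name": env["TUTORIAL_ORG_NAME"],
        "project_name": env["TUTORIAL_PROJECT_NAME"],
    }


# Usage (requires the vessl package and valid credentials):
# import vessl
# vessl.configure(**vessl_config_from_env())
```

Failing early with the list of missing variables keeps a misconfigured notebook from silently targeting the wrong project.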

1-2. Create and mount a dataset

dataset = vessl.create_dataset(
  dataset_name="vessl-mnist", 
  is_public=True,
  external_path="s3://savvihub-public-apne2/mnist"
)

dataset

1-3. Create a machine learning experiment

github_repo = "https://github.com/vessl-ai/examples.git"
experiment = vessl.create_experiment(
    cluster_name="aws-apne2-prod1",
    kernel_resource_spec_name="v1.cpu-4.mem-13",
    kernel_image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
    dataset_mounts=[f"/input:{dataset.name}"],
    start_command=f"git clone {github_repo} && pip install -r examples/mnist/keras/requirements.txt && python examples/mnist/keras/main.py --save-model --save-image",
)

1-4. View experiment results

experiment = vessl.read_experiment(
    experiment_number=experiment.name
)

The metrics summary of the experiment is stored as a Python dictionary. You can check the latest metrics using metrics_summary.latest as follows.

experiment.metrics_summary.latest["accuracy"].value
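
Because an experiment may take a few minutes to finish, it can be convenient to poll vessl.read_experiment() until the run is done before reading its metrics. The helper below is a generic sketch, not part of the VESSL SDK: wait_for_experiment is a hypothetical name, and the status attribute and the "completed"/"failed" values in the usage comment are assumptions that should be checked against the object your SDK version returns.

```python
import time


def wait_for_experiment(read_fn, is_done, poll_seconds=30.0,
                        timeout_seconds=3600.0, sleep=time.sleep):
    """Call read_fn() until is_done(result) is true or the timeout expires.

    read_fn: zero-argument callable returning the latest experiment object.
    is_done: callable deciding whether that object is in a final state.
    """
    waited = 0.0
    while True:
        result = read_fn()
        if is_done(result):
            return result
        if waited >= timeout_seconds:
            raise TimeoutError(f"experiment not finished after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage (assumed field name and status values, verify before relying on them):
# experiment = wait_for_experiment(
#     read_fn=lambda: vessl.read_experiment(experiment_number=experiment.name),
#     is_done=lambda e: e.status in ("completed", "failed"),
# )
```

Passing the reader and the completion check as callables keeps the helper independent of any particular SDK response shape.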

1-5. Create a model

model_repository = vessl.create_model_repository(
    name="tutorial-mnist",
)

Then, run vessl.create_model() with the name and ID of the destination repository and experiment we just created.

model = vessl.create_model(
  repository_name=model_repository.name, 
  experiment_id=experiment.id,
  model_name="v0.0.1",
)

2. Sweep — Optimize hyperparameters

sweep_objective = vessl.SweepObjective(
    type="maximize",        # objective type (either minimize or maximize the metric)
    goal="0.99",            # target metric value
    metric="val_accuracy",  # target metric name as defined and logged using `vessl.log()`
)

Next, define the search space of parameters. In this example, the optimizer is a categorical type and the option values are listed as an array. The batch_size is an int value and the search space is set using max, min, and step.

parameters = [
  vessl.SweepParameter(
    name="optimizer", 
    type="categorical",  # int, double, categorical
    range=vessl.SweepParameterRange(
      list=["adam", "sgd", "adadelta"]
    )
  ), 
  vessl.SweepParameter(
    name="batch_size",
    type="int",  # int, double, categorical
    range=vessl.SweepParameterRange(
      max="256",
      min="64",
      step="8",
    )
  )
]
sweep = vessl.create_sweep(
    objective=sweep_objective,
    max_experiment_count=4,
    parallel_experiment_count=2,
    max_failed_experiment_count=2,
    algorithm="random",  # grid, random, bayesian 
    parameters=parameters,
    dataset_mounts=[f"/input:{dataset.name}"],
    cluster_name=experiment.kernel_cluster.name,                     # same as the experiment  
    kernel_resource_spec_name=experiment.kernel_resource_spec.name,  # same as the experiment
    kernel_image_url=experiment.kernel_image.image_url,              # same as the experiment
    start_command=experiment.start_command,                          # same as the experiment
)

You can get the details of the sweep by calling the variable or by visiting the web console.
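
To get a feel for the size of the search space defined above, the following plain-Python sketch enumerates it and draws four random combinations, mirroring max_experiment_count=4. It does not call VESSL and is not how VESSL implements its random algorithm. It assumes that max is inclusive and that trials are drawn uniformly without replacement, and the helper names are hypothetical.

```python
import itertools
import random


def int_values(minimum, maximum, step):
    """List the integer candidates from minimum to maximum (assumed inclusive)."""
    return list(range(minimum, maximum + 1, step))


def sample_trials(grid, count, seed=0):
    """Pick `count` distinct combinations uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(grid, count)


optimizers = ["adam", "sgd", "adadelta"]  # categorical parameter
batch_sizes = int_values(64, 256, 8)      # int parameter: 64, 72, ..., 256

# Every (optimizer, batch_size) pair that a full grid search would visit.
grid = list(itertools.product(optimizers, batch_sizes))
trials = sample_trials(grid, count=4)

print(len(batch_sizes), len(grid))  # 25 candidate batch sizes, 75 combinations
for optimizer, batch_size in trials:
    print(optimizer, batch_size)
```

With 75 combinations and a budget of four experiments, a random sweep covers only a small part of the space, which is worth keeping in mind when choosing max_experiment_count.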

3. Model Registry — Update and store the best model

best_experiment = vessl.get_best_sweep_experiment(sweep_name=sweep.name)
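# Read the best experiment rather than the baseline one
# (assumes the returned object exposes `.name` like `experiment` above).
best_experiment = vessl.read_experiment(experiment_number=best_experiment.name)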
model = vessl.create_model(
  repository_name="tutorial-mnist", 
  experiment_id=best_experiment.id,
  model_name="v0.0.2",
)
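
Conceptually, picking the best sweep experiment means comparing the logged objective metric across trials. The sketch below shows that selection in plain Python on made-up records. The trial names, the metric values, and the best_by_metric helper are hypothetical and serve only to illustrate the idea; in practice, vessl.get_best_sweep_experiment() performs this step for you.

```python
def best_by_metric(results, metric, objective="maximize"):
    """Return the record with the best value of `metric`.

    results: list of dicts shaped like {"name": ..., "metrics": {...}}
             (a hypothetical shape used only in this sketch).
    """
    if objective not in ("maximize", "minimize"):
        raise ValueError("objective must be 'maximize' or 'minimize'")
    scored = [r for r in results if metric in r["metrics"]]
    if not scored:
        raise ValueError(f"no result logged the metric {metric!r}")
    pick = max if objective == "maximize" else min
    return pick(scored, key=lambda r: r["metrics"][metric])


# Made-up example records, not real sweep output.
results = [
    {"name": "trial-a", "metrics": {"val_accuracy": 0.981}},
    {"name": "trial-b", "metrics": {"val_accuracy": 0.989}},
    {"name": "trial-c", "metrics": {}},  # e.g. a failed trial with no metric
]

best = best_by_metric(results, "val_accuracy", objective="maximize")
print(best["name"])  # trial-b
```

Trials that never logged the metric are skipped, so a failed run cannot be selected as the best one.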

You can view the performance of your model by using vessl.read_model() and specifying the model repository followed by the model number.

vessl.read_model(
    repository_name="tutorial-mnist",
    model_number="2",
)

We have looked at the overall workflow of using the VESSL Client SDK. You can repeat the same process using the client CLI or through the web UI. Now, try this guide with your own code and dataset.

You can always re-configure your organization and project by calling vessl.configure() at any time.

To create a dataset on VESSL, run vessl.create_dataset(). Let's create a dataset from the public AWS S3 dataset we have prepared: s3://savvihub-public-apne2/mnist. You can check that your dataset was created successfully by evaluating the dataset variable.

To create an experiment, use vessl.create_experiment(). Let's run an experiment using VESSL's managed clusters. First, specify the cluster and resource options. Then, specify the image URL; in this case, we are pulling a Docker image from VESSL's Amazon ECR Public Gallery. Next, mount the dataset we created previously. Finally, specify the start command that will be executed in the experiment container. Here, we use the MNIST Keras example from our GitHub repository.

Note that you can also integrate a GitHub repository with your project so you don't have to git clone every time you create an experiment. For more information about these features, please refer to the project repository & project dataset page.

The experiment may take a few minutes to complete. You can get the details of the experiment, including its status, by using vessl.read_experiment().

In VESSL, you can create a model from a completed experiment. First, create a model repository using vessl.create_model_repository() and specify the repository name.

So far, we have run a single machine learning experiment and saved it as a model inside a model repository. In this section, we will use a sweep to find the optimal hyperparameter values.

First, configure sweep_objective with the target metric name and target value. Note that the metric must be logged to VESSL using vessl.log().

Start the hyperparameter search using vessl.create_sweep(). You can see in the code that the cluster, resource, image, dataset, and command options are set the same way as in the vessl.create_experiment() call explained above.

Now that we have run several experiments using a sweep, let's find the optimal experiment. vessl.get_best_sweep_experiment() returns the information of the experiment with the best value of the metric set in sweep_objective. In this example, it returns the details of the experiment with the maximum val_accuracy.

Using the output of best_experiment, let's create a v0.0.2 model with vessl.create_model().
