Serve

Deploying a server to host ML models in a production environment requires careful planning to ensure the models run smoothly, stay available, and can handle increased demand. This can be particularly challenging for ML engineers or small backend teams who may not be deeply familiar with complex backend setups.

VESSL Serve is an essential tool for deploying models developed in VESSL, or even your own custom models, as inference servers. VESSL Serve not only makes inference easy but also offers features like:

  • Tracking activity such as logs, system metrics, and model performance metrics

  • Automatically scaling replicas up or down based on demand (resource usage)

  • Splitting traffic across model versions for easier canary testing

  • Rolling out a new model version to production without downtime

VESSL Serve simplifies the process of setting up ML services that are reliable, adaptable, and able to handle varying workloads.

Quickstart

Follow our quickstart guide to get started with VESSL Serve.

Use VESSL Serve on Web Console

Navigate to the [Servings] section in the global navigation bar, where you will find an inventory of all the servings you have previously created.

For an in-depth guide on creating a new serving, refer to the [Serve Web Workflow] section.

Use VESSL Serve with YAML interface

Alternatively, you can define serving instances declaratively using YAML manifests. This approach lets you manage the versions and configuration settings of your serving instances under version control, such as Git.

message: VESSL Serve example
image: quay.io/vessl-ai/kernels:py38-202308150329  # container image for the service
resources:
  name: v1.cpu-2.mem-6  # resource spec preset (2 CPUs, 6 GB of memory)
run: vessl model serve mnist-example 19 --install-reqs  # command that starts the inference server
autoscaling:
  min: 1  # keep between 1 and 3 replicas
  max: 3
  metric: cpu  # scale on CPU utilization, targeting 60%
  target: 60
ports:
  - port: 8000  # expose the server over HTTP on port 8000
    name: fastapi
    type: http
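
The run command in the manifest is expected to start an HTTP server on the declared port. For orientation, here is a minimal sketch of what such a server might look like, written with FastAPI (the framework the port name hints at). The endpoint paths and input schema below are illustrative assumptions, not the actual interface of the mnist-example model.

# A minimal sketch of an inference server matching the manifest above:
# an HTTP app listening on port 8000. The request schema is an assumption.
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    pixels: List[float]  # a 28x28 MNIST image flattened to 784 floats (assumed)

@app.get("/")
def health():
    # Simple liveness endpoint
    return {"status": "ok"}

@app.post("/predict")
def predict(request: PredictRequest):
    # A real server would load the trained model once at startup and run
    # inference here; a fixed dummy prediction keeps the sketch self-contained.
    return {"prediction": 0}

Saved as server.py and started with uvicorn server:app --host 0.0.0.0 --port 8000 (or by the run command in the manifest), this app would receive the traffic VESSL Serve routes to the fastapi port defined above.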

For more details on how to create a serving, see the [Serve YAML Workflow] section.

Use VESSL Serve with Python SDK

See the [Serving API] section in the API Reference for more details.
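
As a rough illustration, a Python workflow might look like the sketch below. vessl.configure is part of the documented Utilities API, but the serving call and its parameters are hypothetical placeholders; the authoritative function names and signatures are in the [Serving API] reference.

import vessl

# Authenticate and select a default organization and project
# (vessl.configure is part of the documented Utilities API).
vessl.configure(organization_name="my-org", project_name="my-project")

# HYPOTHETICAL: the function name and parameters below are placeholders used
# only to illustrate the shape of the workflow; consult the Serving API
# reference for the real interface.
serving = vessl.read_serving(serving_name="mnist-example-serve")
print(serving.status)  # assumed attribute exposing the serving's current state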
