Deploying a model

Now that the model is accessible in storage and saved in the portable ONNX format, you can use an OpenShift AI model server to deploy it as an API.

OpenShift AI now has two options for model serving:

  • multi-model serving platform

  • single-model serving platform

Review the descriptions available in the interface.

Procedure

  1. Go to the OpenShift AI dashboard.

  2. Navigate to Models and model servers.

  3. Under Single model serving platform, click Deploy model.

  4. In the form:

    1. Fill out the Model Name with the value fraud.

    2. Select the Serving runtime, OpenVINO Model Server.

    3. Select the Model framework, onnx - 1.

    4. Set the Model server replicas to 1.

    5. Select the Model Server size, Lab Custom Small.

    6. Select the Existing data connection, My Storage.

    7. Enter the path to your uploaded model: models/fraud.

      Note that the path does not include 1/model.onnx. The OpenVINO Model Server expects a versioned layout, with the integer version number as a subdirectory (for example, models/fraud/1/model.onnx), so you enter only the path up to the model directory.
  5. Click Deploy.

  6. Wait for the model to deploy and for the Status to show a green checkmark.

    This might take a little while if the cluster is particularly busy, but it should take less than two minutes.
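Once the deployment is ready, the model server exposes the model over a REST inference API. As a rough sketch of what a request body looks like, the snippet below builds a minimal KServe v2-style inference payload in Python. The input tensor name (`dense_input`), the feature values, and the shape are placeholders for illustration only; check your deployed model's metadata for the actual input name and shape.

```python
import json

def build_v2_request(input_name, data, datatype="FP32"):
    """Build a minimal KServe v2 REST inference payload.

    Assumes a single input tensor holding one row of features;
    the tensor name and values are hypothetical placeholders.
    """
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(data)],  # one row, len(data) features
                "datatype": datatype,
                "data": data,
            }
        ]
    }

# Example: one hypothetical transaction with five feature values.
payload = build_v2_request("dense_input", [0.3, 1.2, 0.0, 0.0, 1.0])
print(json.dumps(payload, indent=2))
```

A payload like this would typically be POSTed to the deployed model's inference endpoint (shown on the model's row in the dashboard once deployment succeeds), under a path such as `/v2/models/<model-name>/infer`.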