Deploying a model

Now that the model is accessible in storage, you can deploy it as an API.

Procedure

1. In the OpenShift AI dashboard, navigate to Models and model servers.
2. Click Deploy model.
3. In the form:
   - Fill out the Model name with the value flan-t5-small.
   - Select the Serving runtime: Text Generation Inference Service.
   - Select the Model framework: pytorch.
   - Set the Model server replicas to 1.
   - Select the Model server size: Lab Custom Small.
   - Select the Existing data connection: My Storage.
   - Enter the path to your uploaded model: models/flan-t5-small.
4. Click Deploy.
5. Wait for the model to deploy and for the Status to show a green checkmark. This will probably take two or three minutes.

At this point, the model should be served, and we now just need to confirm that it responds to queries.
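As a rough sketch of what such a confirmation query could look like, the snippet below builds a text-generation request body in Python. The endpoint URL, route path, and JSON schema here are assumptions for illustration only; the real inference URL comes from the deployed model's details in the dashboard, and the exact request format depends on the serving runtime.

```python
import json

# Hypothetical endpoint; copy the actual inference URL from the
# deployed model's details in the OpenShift AI dashboard.
INFERENCE_URL = "https://flan-t5-small-myproject.apps.example.com"

# Assumed request shape for a text-generation query; check your
# serving runtime's documentation for the real JSON schema.
payload = {
    "model_id": "flan-t5-small",
    "inputs": "Translate to French: Hello, world!",
    "parameters": {"max_new_tokens": 50},
}

# Serialize the body that would be POSTed to the endpoint.
body = json.dumps(payload)
print(body)

# To actually send the request (requires the `requests` package):
# import requests
# resp = requests.post(INFERENCE_URL, json=payload, timeout=30)
# print(resp.json())
```

If the model is up, a request like this should return generated text rather than an HTTP error; a connection failure or 404 usually means the route or path is wrong.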