Storing data with connections
Add connections to workbenches if you want to connect your project to data inputs and object storage buckets. A connection is a resource that contains the configuration parameters needed to connect to a data source or data sink, such as an AWS S3 object storage bucket.
For this workshop, you need two S3-compatible object storage buckets, such as Ceph, Minio, or AWS S3. You can use your own storage buckets or run a provided script that creates the following local Minio storage buckets for you:
-
My Storage - Use this bucket for storing your models and data. You can reuse this bucket and its connection for your notebooks and model servers.
-
Pipelines Artifacts - Use this bucket as storage for your pipeline artifacts. A pipeline artifacts bucket is required when you create a pipeline server. For this workshop, create this bucket to separate it from the first storage bucket for clarity.
Also, you must create a connection to each storage bucket. You have two options for this workshop, depending on whether you want to use your own storage buckets or use a script to create local Minio storage buckets:
-
If you want to use your own S3-compatible object storage buckets, create connections to them as described in Creating connections to your own S3-compatible object storage.
-
If you want to run a script that installs local MinIO storage buckets and creates connections to them, follow the steps in Running a script to install local object storage buckets and create connections.
While it is possible for you to use one storage bucket for both purposes (storing models and data as well as storing pipeline artifacts), this tutorial follows best practice and uses separate storage buckets for each purpose. |