Storing data with data connections
Add data connections to workbenches if you want to connect your project to data inputs and object storage buckets. A data connection is a resource that contains the configuration parameters needed to connect to an object storage bucket.
For this workshop, you need two S3-compatible object storage buckets, such as Ceph, Minio, or AWS S3. You can use your own storage buckets or run a provided script that creates the following local Minio storage buckets for you:
-
My Storage - Use this bucket for storing your models and data. You can reuse this bucket and its connection for your notebooks and model servers.
-
Pipelines Artifacts - Use this bucket as storage for your pipeline artifacts. A pipeline artifacts bucket is required when you create a pipeline server. For this workshop, create this bucket to separate it from the first storage bucket for clarity.
Also, you must create a data connection to each storage bucket. You have two options for this workshop, depending on whether you want to use your own storage buckets or use a script to create local Minio storage buckets:
-
If you want to use your own S3-compatible object storage buckets, create data connections to them as described in Creating data connections to your own S3-compatible object storage.
-
If you want to run a script that installs local Minio storage buckets and creates data connections to them, follow the steps in Running a script to install local object storage buckets and create data connections.
While it is possible for you to use one storage bucket for both purposes (storing models and data as well as storing pipeline artifacts), this tutorial follows best practice and uses separate storage buckets for each purpose. |