Storing data with connections

Add connections to workbenches if you want to connect your project to data inputs and object storage buckets. A connection is a resource that contains the configuration parameters needed to connect to a data source or data sink, such as an AWS S3 object storage bucket.

For this workshop, you need two S3-compatible object storage buckets, such as Ceph, Minio, or AWS S3. You can use your own storage buckets or run a provided script that creates the following local Minio storage buckets for you:

  • My Storage - Use this bucket for storing your models and data. You can reuse this bucket and its connection for your notebooks and model servers.

  • Pipelines Artifacts - Use this bucket as storage for your pipeline artifacts. A pipeline artifacts bucket is required when you create a pipeline server. For this workshop, create this bucket to separate it from the first storage bucket for clarity.

Also, you must create a connection to each storage bucket. You have two options for this workshop, depending on whether you want to use your own storage buckets or use a script to create local Minio storage buckets:

While it is possible for you to use one storage bucket for both purposes (storing models and data as well as storing pipeline artifacts), this tutorial follows best practice and uses separate storage buckets for each purpose.