Install

In this chapter, we will install Kubeflow on an Amazon EKS cluster. If you don’t have an EKS cluster, follow the instructions in the getting started guide and then launch your cluster as described in the launch using eksctl chapter.

Ensure you have an OIDC provider attached to the cluster

First, check whether the EKS cluster has an OIDC provider attached by running the following command:

eksctl get cluster eksworkshop-eksctl -o json | jq -r '.[0].Identity'

If the output of the previous command is blank, create an OIDC provider and associate it with your EKS cluster using the following command. CLUSTER_NAME and CLUSTER_REGION must be set to your cluster's name and AWS region:

eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER_NAME} \
--region ${CLUSTER_REGION} --approve
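
The check and the association can be combined so that the association only runs when it is needed. The following is a minimal sketch, not part of the workshop: the function names identity_is_blank and ensure_oidc_provider are hypothetical, and it assumes that jq prints either an empty string or the text null when no identity is attached.

```shell
# Hypothetical helper: succeeds when the jq output indicates no identity.
# Assumption: "blank" means an empty string or the literal text "null".
identity_is_blank() {
  case "$1" in
    ""|null) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical wrapper around the two commands shown above.
# Usage: ensure_oidc_provider "${CLUSTER_NAME}" "${CLUSTER_REGION}"
ensure_oidc_provider() {
  cluster="$1"
  region="$2"
  identity=$(eksctl get cluster "$cluster" --region "$region" -o json | jq -r '.[0].Identity')
  if identity_is_blank "$identity"; then
    eksctl utils associate-iam-oidc-provider --cluster "$cluster" --region "$region" --approve
  else
    echo "Identity already present for cluster $cluster; nothing to do."
  fi
}
```

Defining the functions has no side effects; nothing is changed on the cluster until ensure_oidc_provider is called.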

Increase cluster size

We need more resources to complete the Kubeflow chapter of the EKS Workshop. First, we’ll increase the size of our cluster to 8 nodes:

export NODEGROUP_NAME=$(eksctl get nodegroups --cluster eksworkshop-eksctl -o json | jq -r '.[0].Name')
eksctl scale nodegroup --cluster eksworkshop-eksctl --name $NODEGROUP_NAME --nodes 8 --nodes-max 8

Scaling the nodegroup will take 2 - 3 minutes.
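
To confirm that the scale-up has finished, you can count the nodes that report Ready. This is a sketch under assumptions: count_ready_nodes and wait_for_nodes are hypothetical helper names, the 10-second polling interval is arbitrary, and the parsing assumes the default column layout of kubectl get nodes --no-headers, where the second column is STATUS.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints how many nodes have a STATUS of exactly "Ready".
# Nodes shown as "NotReady" or "Ready,SchedulingDisabled" are not counted.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Hypothetical helper: polls every 10 seconds until at least $1 nodes are Ready.
# Usage: wait_for_nodes 8
wait_for_nodes() {
  want="$1"
  while [ "$(kubectl get nodes --no-headers | count_ready_nodes)" -lt "$want" ]; do
    echo "Waiting for $want Ready nodes..."
    sleep 10
  done
  echo "At least $want nodes are Ready."
}
```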

Install Kubeflow on Amazon EKS

Install Kustomize

Warning: Kubeflow is not compatible with the latest kustomize 4.x versions because of changes in the order in which resources are sorted and printed. See kubernetes-sigs/kustomize#3794 and kubeflow/manifests#1797. We know that this is not ideal and are working with the upstream kustomize team to add support for the latest versions of kustomize as soon as we can.

curl --silent --location "https://github.com/kubernetes-sigs/kustomize/releases/download/v3.2.0/kustomize_3.2.0_linux_amd64" -o /tmp/kustomize
sudo chmod +x /tmp/kustomize && sudo mv -v /tmp/kustomize /usr/local/bin
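
Before continuing, it can help to confirm that the kustomize binary on your PATH is the 3.2.0 build downloaded above rather than a newer 4.x install. The sketch below assumes that the output of kustomize version contains the version number as plain text; is_expected_kustomize is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when the given version string mentions 3.2.0.
# Assumption: `kustomize version` prints the version number somewhere in its output.
is_expected_kustomize() {
  case "$1" in
    *3.2.0*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Usage:
#   is_expected_kustomize "$(kustomize version)" || echo "Unexpected kustomize version" >&2
```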

Clone the repository

Clone the awslabs/kubeflow-manifests and the kubeflow/manifests repositories and check out the release branches of your choosing.

Substitute the values of KUBEFLOW_RELEASE_VERSION (e.g. v1.5.1) and AWS_RELEASE_VERSION (e.g. v1.5.1-aws-b1.0.0) below with the tag or branch you want to use. Read more about releases and versioning if you are unsure about what these values should be.

export KUBEFLOW_RELEASE_VERSION=v1.5.1
export AWS_RELEASE_VERSION=v1.5.1-aws-b1.0.0
git clone https://github.com/awslabs/kubeflow-manifests.git && cd kubeflow-manifests
git checkout ${AWS_RELEASE_VERSION}
git clone --branch ${KUBEFLOW_RELEASE_VERSION} https://github.com/kubeflow/manifests.git upstream

Build Manifests and install Kubeflow

There are two options for installing the official Kubeflow components and common services with kustomize:

  1. Single-command installation of all components under apps and common
  2. Multi-command, individual components installation for apps and common

Option 1 targets ease of deployment for end users. This lab uses this option.
Option 2 targets customization and the ability to pick and choose individual components.

Warning: In both options, we use a default email (user@example.com) and password (12341234). For any production Kubeflow deployment, you should change the default password by following the relevant section.


NOTE

kubectl apply commands may fail on the first try. This is inherent in how Kubernetes and kubectl work (e.g., a CR can only be created after its CRD becomes ready). The solution is to re-run the command until it succeeds. For the single-command installation, we have included a bash one-liner that retries the command.


Install with a single command

You can install all Kubeflow official components (residing under apps) and all common services (residing under common) using the following command:

while ! kustomize build deployments/vanilla | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 30; done
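
The loop above retries indefinitely. If you prefer to give up after a fixed number of attempts, a bounded variant could look like the following sketch. The retry function, its argument order, the apply_vanilla wrapper, and the attempt count in the usage line are illustrative assumptions, not part of the Kubeflow manifests.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s"
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical wrapper for the single-command installation, so that the
# whole pipeline can be passed to retry as one command.
apply_vanilla() {
  kustomize build deployments/vanilla | kubectl apply -f -
}

# Usage: retry 10 30 apply_vanilla
```

As in the original one-liner, success is judged by the exit status of kubectl apply, the last command in the pipeline.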

Run the following command to check the status:

kubectl -n kubeflow get all

Installing Kubeflow and its toolset may take 2 - 3 minutes. A few pods may initially show an Error or CrashLoopBackOff status. Give them some time; they will recover on their own and reach the Running state.
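
Rather than re-reading the full listing, you can count the pods that have not yet settled. This is a sketch: count_unsettled_pods is a hypothetical name, and it assumes the default layout of kubectl get pods --no-headers, where the third column is STATUS. It looks only at the STATUS column, so a pod that is Running but not yet fully ready still counts as settled.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin and
# prints how many pods have a STATUS other than Running or Completed.
count_unsettled_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage:
#   kubectl -n kubeflow get pods --no-headers | count_unsettled_pods
```

A result of 0 suggests that the pods in the namespace have left the Error and CrashLoopBackOff states.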

Once everything is installed successfully, you can access the Kubeflow Central Dashboard by logging into your cluster.
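
One common way to reach the dashboard from a workstation is to port-forward the Istio ingress gateway. The sketch below assumes that the vanilla deployment exposes a service named istio-ingressgateway in the istio-system namespace, as the upstream Kubeflow manifests do; the function name forward_dashboard and the default local port 8080 are illustrative choices.

```shell
# Hypothetical helper: forwards a local port to the Istio ingress gateway.
# Assumption: the service istio-ingressgateway exists in namespace istio-system.
# Usage: forward_dashboard [local_port]   (defaults to 8080)
forward_dashboard() {
  port="${1:-8080}"
  echo "Forwarding http://localhost:${port} to the Kubeflow dashboard (Ctrl+C to stop)" >&2
  kubectl port-forward svc/istio-ingressgateway -n istio-system "${port}:80"
}
```

With the tunnel open, browse to http://localhost:8080 and sign in with the default email and password mentioned earlier.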

You can now start experimenting and running your end-to-end ML workflows with Kubeflow!