Enabling Container Insights Using AWS Distro for OpenTelemetry

In this lab exercise, we'll explore how to enable CloudWatch Container Insights metrics with the ADOT Collector for an EKS cluster.

First, let's create the resources that grant the ADOT collector the permissions it needs. We'll start with the ClusterRole that gives the collector permissions to access the Kubernetes API:

~/environment/eks-workshop/modules/observability/container-insights/adot/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-ci-role
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "endpoints"]
    verbs: ["list", "watch", "get"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "watch", "get"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["nodes/stats", "configmaps", "events"]
    verbs: ["create", "get"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["otel-container-insight-clusterleader"]
    verbs: ["get", "update", "create"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create", "get", "update"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    resourceNames: ["otel-container-insight-clusterleader"]
    verbs: ["get", "update", "create"]
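Once the resources in this lab have been applied, one way to spot-check a few of the ClusterRole rules is to ask the API server what the collector's ServiceAccount is allowed to do. The following is a minimal sketch, not part of the workshop. It assumes the ServiceAccount is adot-collector-ci in the other namespace, that a ClusterRoleBinding (not shown in this text) ties it to otel-ci-role, and that your own user is allowed to impersonate ServiceAccounts. The function name check_collector_rbac is hypothetical.

```shell
# Assumption: the collector's ServiceAccount lives in the "other" namespace.
SA_SUBJECT="system:serviceaccount:other:adot-collector-ci"

# check_collector_rbac prints one line per verb/resource pair, e.g. "list pods: yes".
check_collector_rbac() {
  local pair verb resource answer
  for pair in "list pods" "watch nodes" "get endpoints" \
              "list replicasets.apps" "list jobs.batch"; do
    verb=${pair%% *}       # text before the first space
    resource=${pair#* }    # text after the first space
    # `kubectl auth can-i` prints "yes" or "no"; --as impersonates the ServiceAccount.
    answer=$(kubectl auth can-i "$verb" "$resource" --as="$SA_SUBJECT" 2>/dev/null)
    echo "$verb $resource: ${answer:-unknown}"
  done
}
```

Calling check_collector_rbac against the cluster should report yes for each pair if the binding is in place; a no or unknown suggests the ClusterRoleBinding or the impersonation permission is missing.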

We'll use the managed IAM policy CloudWatchAgentServerPolicy to provide the collector with the IAM permissions it needs via IAM Roles for Service Accounts:

~$aws iam list-attached-role-policies \
--role-name eks-workshop-adot-collector-ci | jq .
{
  "AttachedPolicies": [
    {
      "PolicyName": "CloudWatchAgentServerPolicy",
      "PolicyArn": "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
    }
  ]
}

This IAM role is associated with the collector's ServiceAccount through the eks.amazonaws.com/role-arn annotation:

~/environment/eks-workshop/modules/observability/container-insights/adot/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: adot-collector-ci
  annotations:
    eks.amazonaws.com/role-arn: ${ADOT_IAM_ROLE_CI}

Create the resources:

~$kubectl kustomize ~/environment/eks-workshop/modules/observability/container-insights/adot \
| envsubst | kubectl apply -f-
~$kubectl rollout status -n other daemonset/adot-container-ci-collector --timeout=120s
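The pipeline above relies on envsubst to fill in ${ADOT_IAM_ROLE_CI} in the ServiceAccount annotation. envsubst replaces an unset variable with an empty string without raising an error, so a quick pre-flight check can be useful. This is a minimal sketch under that assumption; the function name require_role_arn is hypothetical and not part of the workshop.

```shell
# require_role_arn succeeds only if ADOT_IAM_ROLE_CI looks like an IAM role ARN.
require_role_arn() {
  case "${ADOT_IAM_ROLE_CI:-}" in
    arn:aws*:iam::*:role/*)
      echo "ADOT_IAM_ROLE_CI looks like a role ARN"
      ;;
    "")
      echo "ADOT_IAM_ROLE_CI is not set; envsubst would render an empty annotation" >&2
      return 1
      ;;
    *)
      echo "ADOT_IAM_ROLE_CI is set but does not look like an IAM role ARN" >&2
      return 1
      ;;
  esac
}
```

Running `require_role_arn && kubectl kustomize ... | envsubst | kubectl apply -f-` would then skip the apply when the variable is missing.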

The specification for the collector is too long to show here, but you can view it like so:

~$kubectl -n other get opentelemetrycollector adot-container-ci

Let's break this down into sections to get a better understanding of what has been deployed. This is the OpenTelemetry collector configuration:

~$kubectl -n other get opentelemetrycollector adot-container-ci -o jsonpath='{.spec.config}'

This configures an OpenTelemetry pipeline, which the collector configuration describes in terms of receivers, processors, and exporters.

This collector is also configured to run as a DaemonSet with a collector agent running on each node:

~$kubectl -n other get opentelemetrycollector adot-container-ci -o jsonpath='{.spec.mode}{"\n"}'

We can confirm this by listing the running ADOT collector Pods that collect Container Insights metrics:

~$kubectl get pods -n other
NAME                               READY   STATUS    RESTARTS   AGE
adot-container-ci-collector-5lp5g  1/1     Running   0          15s
adot-container-ci-collector-ctvgs  1/1     Running   0          15s
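Because the collector runs as a DaemonSet, the number of collector Pods would normally equal the number of nodes. The sketch below compares the two counts. It assumes the Pod name prefix adot-container-ci-collector- seen in the output above, and that no taints or node selectors keep the collector off some nodes. The function names are hypothetical.

```shell
# count_lines prints the number of non-empty lines read from stdin.
count_lines() {
  grep -c . || true   # grep exits non-zero when the count is 0; ignore that
}

# compare_collectors_to_nodes reports whether there is one collector Pod per node.
compare_collectors_to_nodes() {
  local nodes pods
  nodes=$(kubectl get nodes --no-headers 2>/dev/null | count_lines)
  pods=$(kubectl get pods -n other --no-headers 2>/dev/null \
    | grep '^adot-container-ci-collector-' | count_lines)
  if [ "$nodes" -eq "$pods" ]; then
    echo "OK: $pods collector Pods for $nodes nodes"
  else
    echo "MISMATCH: $pods collector Pods for $nodes nodes"
  fi
}
```

A mismatch is not necessarily an error; it may only mean that some nodes are tainted or that the rollout has not finished yet.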

If the output shows a collector Pod in the Running state for each node, as above, the collector is running and collecting metrics from the cluster. The collector creates a log group named /aws/containerinsights/cluster-name/performance, where cluster-name is the name of the EKS cluster, and sends the metric data to it as performance log events in embedded metric format (EMF).
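To make the log group name and the event format more concrete, here is a minimal sketch. It assumes the cluster is named eks-workshop and that the log group name starts with a leading slash, as Container Insights log groups do. The sample event is hand-written to follow the general EMF layout; its metric name, dimensions, and values are illustrative placeholders, not output captured from the collector, and the real events may carry different fields. Both function names are hypothetical.

```shell
# performance_log_group prints the log group name for the cluster given as $1.
performance_log_group() {
  printf '/aws/containerinsights/%s/performance\n' "$1"
}

# sample_emf_event prints an illustrative event in the EMF layout: an "_aws"
# metadata object that names the metrics and dimensions, plus their values
# as top-level keys. All values here are placeholders.
sample_emf_event() {
  cat <<'EOF'
{
  "_aws": {
    "Timestamp": 1700000000000,
    "CloudWatchMetrics": [
      {
        "Namespace": "ContainerInsights",
        "Dimensions": [["ClusterName", "Namespace", "PodName"]],
        "Metrics": [{ "Name": "pod_cpu_utilization", "Unit": "Percent" }]
      }
    ]
  },
  "ClusterName": "eks-workshop",
  "Namespace": "other",
  "PodName": "example-pod",
  "pod_cpu_utilization": 1.5
}
EOF
}

# Assumption: the cluster name is "eks-workshop".
performance_log_group "eks-workshop"
```

The printed name can be passed to `aws logs describe-log-groups --log-group-name-prefix` to check that the collector has created the log group.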