Scraping metrics using AWS Distro for OpenTelemetry

In this lab we'll store metrics in an Amazon Managed Service for Prometheus workspace that has already been created for you. You should be able to see it in the console:

https://console.aws.amazon.com/prometheus/home#/workspaces

To view the workspace, click the All Workspaces tab in the left control panel. Select the workspace whose name starts with eks-workshop to see its tabs, such as rules management and alert manager.

To gather metrics from the Amazon EKS cluster, we'll deploy an OpenTelemetryCollector custom resource. The ADOT operator running on the EKS cluster detects when this resource is created or changed, and in response performs the following actions:

  • Verifies that all the connections required for these creation, update, or deletion requests to the Kubernetes API server are available.
  • Deploys ADOT collector instances as specified in the OpenTelemetryCollector resource configuration.

Now, let's create the resources that give the ADOT collector the permissions it needs. We'll start with the ClusterRole that gives the collector permissions to access the Kubernetes API:

~/environment/eks-workshop/modules/observability/oss-metrics/adot/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-prometheus-role
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
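
Once the resources in this lab have been applied, one way to sanity-check these rules is to ask the API server whether the collector's identity is allowed to perform them. The helper below is a minimal sketch, not part of the workshop: the function name `check_collector_rbac` is hypothetical, and it assumes the ClusterRole is bound to the `adot-collector` ServiceAccount in the `other` namespace and that your own user is allowed to impersonate ServiceAccounts.

```shell
# Hypothetical helper: checks the core-API rules of the ClusterRole above by
# impersonating the collector's ServiceAccount with `kubectl auth can-i`.
# Assumption: the role is bound to ServiceAccount adot-collector in namespace other.
check_collector_rbac() {
  local sa="system:serviceaccount:other:adot-collector"
  local failed=0
  local resource verb
  for resource in nodes services endpoints pods; do
    for verb in get list watch; do
      # `kubectl auth can-i` exits 0 when the action is allowed, non-zero otherwise.
      if kubectl auth can-i "$verb" "$resource" --as="$sa" >/dev/null 2>&1; then
        echo "ok:     $verb $resource"
      else
        echo "denied: $verb $resource"
        failed=1
      fi
    done
  done
  return "$failed"
}
```

Running `check_collector_rbac` prints one line per verb and resource, and returns a non-zero status if any check is denied. The `nodes/proxy` subresource and the `/metrics` non-resource URL are left out to keep the sketch short.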

We'll use the managed IAM policy AmazonPrometheusRemoteWriteAccess to provide the collector with the IAM permissions it needs via IAM Roles for Service Accounts:

~$aws iam list-attached-role-policies \
--role-name $EKS_CLUSTER_NAME-adot-collector | jq .
{
  "AttachedPolicies": [
    {
      "PolicyName": "AmazonPrometheusRemoteWriteAccess",
      "PolicyArn": "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess"
    }
  ]
}
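
If you would rather script this check than read the JSON by eye, the same command can be narrowed with the AWS CLI's `--query` and `--output text` options. This is a sketch only; the function name `has_remote_write_policy` is hypothetical and not part of the workshop.

```shell
# Hypothetical helper: succeeds when the managed policy
# AmazonPrometheusRemoteWriteAccess is attached to the given IAM role.
has_remote_write_policy() {
  local role_name="$1"
  # --output text prints the selected ARNs separated by tabs on one line,
  # so split them onto separate lines and look for an exact match.
  aws iam list-attached-role-policies \
    --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyArn' \
    --output text |
    tr '\t' '\n' |
    grep -qx 'arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess'
}
```

For example, `has_remote_write_policy "$EKS_CLUSTER_NAME-adot-collector" && echo "policy attached"` prints the message only when the policy is present on the role.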

This IAM role is attached to the collector's ServiceAccount through an annotation:

~/environment/eks-workshop/modules/observability/oss-metrics/adot/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: adot-collector
  annotations:
    eks.amazonaws.com/role-arn: ${ADOT_IAM_ROLE}

Create the resources:

~$kubectl kustomize ~/environment/eks-workshop/modules/observability/oss-metrics/adot \
| envsubst | kubectl apply -f-
~$kubectl rollout status -n other deployment/adot-collector --timeout=120s
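
The `envsubst` step in this pipeline is what replaces the `${ADOT_IAM_ROLE}` placeholder in the ServiceAccount manifest with the value of the environment variable before the result reaches `kubectl apply`. The sketch below isolates that step on a copy of the manifest; the function name `render_service_account` is hypothetical, and the role ARN in the usage note is a made-up placeholder rather than a real role.

```shell
# Hypothetical helper: renders the ServiceAccount manifest the way the
# `envsubst` stage of the pipeline above would, without touching the cluster.
render_service_account() {
  # The quoted 'EOF' stops the shell from expanding ${ADOT_IAM_ROLE} itself,
  # so the substitution is done by envsubst from the exported environment.
  envsubst <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: adot-collector
  annotations:
    eks.amazonaws.com/role-arn: ${ADOT_IAM_ROLE}
EOF
}
```

With `export ADOT_IAM_ROLE="arn:aws:iam::111122223333:role/example-adot-collector"` set (an illustrative value), `render_service_account` prints the manifest with the annotation filled in. If the variable is unset, `envsubst` substitutes an empty string, which would leave the annotation blank, so it is worth confirming the variable is exported before applying.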

The specification for the collector is too long to show here, but you can view it like so:

~$kubectl -n other get opentelemetrycollector adot -o yaml

Let's break this down into sections to get a better understanding of what has been deployed. This is the OpenTelemetry collector configuration:

~$kubectl -n other get opentelemetrycollector adot -o jsonpath='{.spec.config}' | yq

This configures an OpenTelemetry pipeline with the following structure:

[Diagram: structure of the OpenTelemetry pipeline]

This collector is also configured to run as a Deployment with one collector agent running:

~$kubectl -n other get opentelemetrycollector adot -o jsonpath='{.spec.mode}{"\n"}'
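
The `spec.mode` field tells the operator which kind of workload to create for the collector. The OpenTelemetry operator accepts `deployment`, `daemonset`, `statefulset` and `sidecar`. As a rough illustration, the hypothetical helper below (not part of the workshop) turns the value returned by the command above into a one-line description.

```shell
# Hypothetical helper: maps an OpenTelemetryCollector spec.mode value to a
# short description of the workload the operator creates for it.
describe_collector_mode() {
  case "$1" in
    deployment)  echo "Deployment: a fixed number of collector replicas" ;;
    daemonset)   echo "DaemonSet: one collector Pod on each node" ;;
    statefulset) echo "StatefulSet: replicas with stable identities" ;;
    sidecar)     echo "sidecar: a collector container injected into application Pods" ;;
    *)           echo "unrecognized mode: $1" >&2; return 1 ;;
  esac
}
```

It can be combined with the command above as `describe_collector_mode "$(kubectl -n other get opentelemetrycollector adot -o jsonpath='{.spec.mode}')"`.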

We can confirm that by inspecting the ADOT collector Pods that are running:

~$kubectl get pods -n other
NAME                              READY   STATUS    RESTARTS   AGE
adot-collector-6f6b8867f6-lpjb7   1/1     Running   2          11d