EKS Unchained with eBPF and Bottlerocket
How to install Cilium with kube-proxy replacement in EKS with managed node groups using eksctl.
Why?
Without getting too much into detail, Cilium is a highly scalable Kubernetes CNI. It makes the Linux kernel aware of Kubernetes primitives via eBPF. You can find an introduction to the concepts here. Cilium offers a lot in terms of performance, as seen in tests done by Alibaba.
In addition, it offers reliability with the introduction of Maglev, a load balancing algorithm that uses consistent hashing to spread connections across backends with minimal disruption when the backend pool changes. You may have read about it in a recent Cloudflare blog post. A simple picture outlining the core differences between traditional load balancing and Maglev is shown below.
Don’t forget security! Cilium can detect and block network attacks, and its identity-aware network policies help prevent them in the first place.
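To give a taste of what that looks like in practice, here is a minimal sketch of a CiliumNetworkPolicy. The name and labels are made up for illustration; it simply says that only pods labeled app: frontend may reach pods labeled app: backend, and only on TCP port 80.
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend # hypothetical name for illustration
spec:
  endpointSelector:
    matchLabels:
      app: backend # pods this policy protects
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend # only this identity may connect
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP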
Why Bottlerocket?
Since Cilium relies on the kernel for its eBPF datapath, we want the newest stable kernel we can get. Upon looking through the AMIs available for EKS optimized Amazon Linux, I noticed that the latest available kernel was 4.14.209. That won’t even support kube-proxy replacement…ouch. After further reading, it seems the holdup in moving the Amazon Linux kernel to 5.4.x is an upstream bug. So I began sorting through the AMIs usable in managed nodegroups, hoping to find a better, and ideally not hacky, solution. After a bit I came across Bottlerocket, which lists the following as a core component:
Minimal OS that includes the Linux kernel (5.4), system software, and containerd as the container runtime.
Good kernel and internal backing from AWS, that’ll do!
How?
So, this walkthrough requires some tools. Go ahead and install helm3, kubectl and eksctl. Take a look at my redacted eksctl template below:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: YOUR_CLUSTER_NAME
  region: YOUR_REGION
  version: "1.18"
vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: false
  id: YOUR_VPC_ID
  cidr: "VPC_CIDR"
  subnets:
    public:
      us-east-1a: # Change to your zones
        id: YOUR_SUBNET_ID
        cidr: "YOUR_CIDR"
      us-east-1b:
        id: YOUR_SUBNET_ID
        cidr: "YOUR_CIDR"
    private:
      us-east-1a:
        id: YOUR_SUBNET_ID
        cidr: "YOUR_CIDR"
      us-east-1b:
        id: YOUR_SUBNET_ID
        cidr: "YOUR_CIDR"
nodeGroups:
  - name: bottlerocket
    amiFamily: Bottlerocket
    instanceType: m5.large
    desiredCapacity: 2
    volumeSize: 120
    privateNetworking: true
    iam: # Tune the policies to your liking
      withAddonPolicies:
        imageBuilder: true
        autoScaler: true
        appMesh: true
        albIngress: true
        cloudWatch: true
        xRay: true
cloudWatch:
  clusterLogging:
    enableTypes: ["audit", "authenticator", "controllerManager", "api", "scheduler"]
secretsEncryption:
  keyARN: "YOUR_KMS_KEY_ARN"
Fill out a cluster config that better suits your needs. Then use eksctl to bring up a cluster without the nodegroups.
eksctl create cluster -f cluster-config.yml --without-nodegroup
This will take some time, so grab a coffee. Once you return to your terminal and find it finished, go ahead and delete the aws-node and kube-proxy daemonsets so they don’t conflict with Cilium.
kubectl delete ds -n kube-system aws-node
kubectl delete ds -n kube-system kube-proxy
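If you want to double-check that both are gone before moving on, listing the daemonsets in kube-system should no longer show aws-node or kube-proxy:
kubectl get ds -n kube-system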
After a successful deletion, grab the API server IP address. This is used when installing Cilium.
export API_SERVER_IP=$(kubectl get ep kubernetes -o jsonpath='{$.subsets[0].addresses[0].ip}')
Go ahead and echo that back as a sanity check.
echo $API_SERVER_IP
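The install below pulls the chart from the Cilium Helm repository, so add it first if you haven’t already:
helm repo add cilium https://helm.cilium.io/
helm repo update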
Use that with the Cilium install.
helm install cilium cilium/cilium --version=1.9.1 \
  --namespace kube-system \
  --set eni=true \
  --set ipam.mode=eni \
  --set egressMasqueradeInterfaces=eth0 \
  --set loadBalancer.algorithm=maglev \
  --set hubble.enabled=true \
  --set hubble.listenAddress=":4244" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set kubeProxyReplacement="strict" \
  --set k8sServiceHost=$API_SERVER_IP \
  --set k8sServicePort=443
That’s a pretty big chunk of helm install so…
I’ll break that install down a bit.
--set eni=true --set ipam.mode=eni --set egressMasqueradeInterfaces=eth0
This tells Cilium to take over IP address management via the EC2 API (ENI IPAM mode) and sets eth0 as the interface used for egress masquerading.
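Once the agent is running on the nodes, one way to sanity check that ENI IPAM took effect is to peek at the agent’s status output (the exact wording varies between Cilium versions):
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i ipam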
--set loadBalancer.algorithm=maglev
This instructs Cilium to use the Maglev load balancing algorithm for backend selection.
--set hubble.enabled=true --set hubble.listenAddress=":4244" --set hubble.relay.enabled=true --set hubble.ui.enabled=true --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}"
This enables Hubble, the Hubble relay and UI, and metrics collection.
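Once everything is up, the UI is reachable with a simple port-forward; the service name and port below are what the chart created for me, so adjust if yours differ:
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
Then browse to http://localhost:12000.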
--set kubeProxyReplacement="strict"
This flag enables the full kube-proxy replacement features of Cilium based on your kernel version, and tells the cilium-agent to bail out if the conditions for initialization aren’t met.
--set k8sServiceHost=$API_SERVER_IP --set k8sServicePort=443
This tells Cilium how to reach the Kubernetes API server directly, since we deleted kube-proxy in an earlier step.
Alrighty then!
After a successful helm install message, go ahead and bring up the node groups.
eksctl create nodegroup -f cluster-config.yml
Watch the pods to ensure the install was successful.
kubectl get pods -A --watch
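Once the Cilium and Hubble pods settle into Running, a quick final check is to confirm the nodes went Ready and that the agent is happy with the datapath; the status output should also report kube-proxy replacement running in Strict mode (the exact formatting depends on your Cilium version):
kubectl get nodes -o wide
kubectl -n kube-system exec ds/cilium -- cilium status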