Zero to Mastery: Helm and Kubernetes with AKS Cluster
Welcome to this comprehensive learning guide designed to take you from a complete novice to a master of Helm and Kubernetes, specifically within the Azure Kubernetes Service (AKS) environment. This document will walk you through the essential concepts, practical examples, and advanced techniques required to successfully deploy, manage, and scale your applications from development to production.
1. Introduction to Helm and Kubernetes with AKS
What is Kubernetes?
Kubernetes (often abbreviated as K8s) is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes provides a platform for running and managing these containers in a highly available and resilient manner.
What is Helm?
Helm is the package manager for Kubernetes. Just like you use package managers like apt on Ubuntu or yum on CentOS to install software, Helm helps you install and manage applications on Kubernetes. Helm packages are called “Charts,” and they contain all the necessary resources and configurations to deploy an application or service to a Kubernetes cluster.
What is Azure Kubernetes Service (AKS)?
Azure Kubernetes Service (AKS) is a managed Kubernetes offering from Microsoft Azure. AKS simplifies the deployment, management, and scaling of Kubernetes clusters in the cloud. With AKS, Azure handles the underlying infrastructure (like control plane management and automatic upgrades), allowing you to focus on your applications rather than the operational overhead of Kubernetes itself. AKS integrates seamlessly with other Azure services for monitoring, security, and networking.
Why learn Helm and Kubernetes with AKS?
Learning Helm and Kubernetes with AKS offers significant benefits for modern application development and operations:
- Simplified Deployment and Management: Kubernetes automates many tasks related to container orchestration, while Helm streamlines the packaging and deployment of complex applications. AKS further simplifies Kubernetes management by handling the control plane.
- Scalability: Kubernetes allows you to scale your applications up or down based on demand, ensuring optimal resource utilization and performance.
- Portability: Kubernetes workloads can run on various cloud providers or on-premises, offering flexibility and avoiding vendor lock-in.
- Resilience: Kubernetes automatically handles failures by restarting containers, rescheduling pods, and ensuring application availability.
- Infrastructure as Code (IaC): Helm Charts enable you to define your application deployments as code, facilitating version control, collaboration, and consistent deployments across environments.
- Azure Ecosystem Integration: AKS integrates deeply with Azure services like Azure Container Registry (ACR), Azure Monitor, Azure Key Vault, and Azure DevOps (including GitHub Actions), providing a powerful and cohesive platform for cloud-native development.
- Industry Relevance: Kubernetes has become the de facto standard for container orchestration, and proficiency in AKS and Helm is highly sought after in the cloud and DevOps job markets.
Setting up your development environment
Before diving into Helm and Kubernetes, let’s set up your local machine and Azure environment.
Prerequisites:
- Azure Account: You’ll need an active Azure subscription. If you don’t have one, you can sign up for a free Azure account.
- Azure CLI: The Azure command-line interface (CLI) is essential for interacting with Azure resources.
- kubectl: The Kubernetes command-line tool (kubectl) is used to run commands against Kubernetes clusters.
- Helm CLI: The Helm client is used to manage Helm Charts.
- Git: For version control of your code and configurations.
- Code Editor: Visual Studio Code is highly recommended for Kubernetes and Helm development due to its rich extensions.
Step-by-step setup:
Log in to Azure CLI:
az login
Follow the browser prompts to authenticate.
Set your Azure subscription: If you have multiple subscriptions, set the one you want to use:
az account set --subscription "Your Subscription Name or ID"
Install/Verify kubectl and helm: Ensure you have the latest versions.
kubectl version --client
helm version
If they are not installed, refer to the installation links above.
Create an Azure Resource Group: A resource group is a logical container for your Azure resources.
az group create --name myAKSResourceGroup --location eastus
You can choose a different location closer to you.
Create an AKS Cluster: This command creates a basic AKS cluster. For production, you would add more configuration (e.g., availability zones, network settings).
az aks create --resource-group myAKSResourceGroup --name myAKSCluster --node-count 2 --generate-ssh-keys --enable-managed-identity
This command creates a cluster with two nodes and enables a managed identity, which is a best practice for AKS.
Get AKS cluster credentials: Configure kubectl to connect to your new AKS cluster.
az aks get-credentials --resource-group myAKSResourceGroup --name myAKSCluster
Now you can interact with your AKS cluster using kubectl. Test it:
kubectl get nodes
You should see your two nodes listed.
2. Core Concepts and Fundamentals
This section will introduce you to the fundamental building blocks of Kubernetes and Helm.
2.1 Kubernetes Core Concepts
2.1.1 Pods
A Pod is the smallest deployable unit in Kubernetes. It represents a single instance of a running process in your cluster. Pods typically contain one or more containers (e.g., a Docker container). All containers within a Pod share the same network namespace, IP address, and storage volumes.
Key characteristics:
- Ephemeral: Pods are designed to be short-lived. If a Pod crashes or a node fails, Kubernetes will automatically create a new Pod.
- Single application instance: Generally, you run a single main application process per Pod, though sidecar containers (e.g., a logging agent or a proxy) can share the Pod.
- Shared resources: Containers in a Pod share the same network namespace and can communicate via localhost. They can also share storage volumes.
Code Example: Basic Nginx Pod
Let's create a simple Pod running an Nginx web server.
# nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-nginx-pod
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
To deploy this Pod:
kubectl apply -f nginx-pod.yaml
Check the status:
kubectl get pods
You should see my-nginx-pod in the Running state.
To access the Nginx web page, you’d typically need a Service (covered later). For now, let’s clean up.
kubectl delete -f nginx-pod.yaml
Exercise/Mini-Challenge:
- Create a Pod named my-busybox-pod using the busybox image.
- Have the busybox container run a command that prints "Hello from BusyBox!" to standard output and then exits.
- Check the logs of the Pod after it runs.
2.1.2 Deployments
Deployments are a higher-level abstraction in Kubernetes used to manage the lifecycle of Pods. They provide declarative updates for Pods and ReplicaSets, allowing you to define how many replicas of your application should be running and how to update them (e.g., rolling updates).
Key characteristics:
- Manage ReplicaSets: A Deployment manages ReplicaSets, which in turn ensure a specified number of Pod replicas are always running.
- Rolling Updates: Deployments enable zero-downtime application updates by gradually replacing old Pods with new ones.
- Rollbacks: If an update causes issues, you can easily roll back to a previous stable version.
- Desired State: You define the desired state of your application (e.g., 3 replicas of Nginx version 1.25), and Kubernetes works to maintain that state.
Code Example: Nginx Deployment
Let's create a Deployment for our Nginx application, ensuring 3 replicas are always running.
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
Deploy the application:
kubectl apply -f nginx-deployment.yaml
Check the status:
kubectl get deployments
kubectl get pods -l app=nginx
You should see 3 Nginx Pods running.
Now, let’s perform a rolling update by changing the Nginx image version.
# nginx-deployment-updated.yaml (modify image to nginx:1.25.3)
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:1.25.3 # Updated image version
ports:
- containerPort: 80
Apply the update:
kubectl apply -f nginx-deployment-updated.yaml
Watch the rollout:
kubectl rollout status deployment/nginx-deployment
You’ll see the old Pods being terminated and new ones with nginx:1.25.3 being created.
Clean up:
kubectl delete -f nginx-deployment.yaml
Exercise/Mini-Challenge:
- Create a Deployment named my-web-app with 5 replicas using the httpd:2.4 image.
- Verify that 5 Pods are running.
- Scale the deployment down to 2 replicas using kubectl scale.
- Perform a rolling update to httpd:2.4.58-alpine.
- After the update, try to roll back to the previous version (command hints follow this list).
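The scaling and rollback steps above use commands not shown earlier in this section; a minimal sketch (the container name my-web-app is an assumption based on the challenge):
kubectl scale deployment my-web-app --replicas=2                          # scale down to 2 replicas
kubectl set image deployment/my-web-app my-web-app=httpd:2.4.58-alpine   # rolling update (container name assumed)
kubectl rollout history deployment/my-web-app                            # inspect revisions
kubectl rollout undo deployment/my-web-app                               # roll back to the previous revision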
2.1.3 Services
Services in Kubernetes enable network access to a set of Pods. Since Pods are ephemeral and their IP addresses can change, Services provide a stable network endpoint for your applications. They act as an abstraction layer, allowing other applications or external users to communicate with your Pods without knowing their individual IP addresses.
Key types of Services:
- ClusterIP (Default): Exposes the Service on an internal IP in the cluster. It’s only reachable from within the cluster.
- NodePort: Exposes the Service on each Node’s IP at a static port (the NodePort). Makes the Service accessible from outside the cluster.
- LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer (e.g., Azure Load Balancer for AKS). This is the standard way to expose public-facing applications.
- ExternalName: Maps the Service to a DNS name, not to a selector. Used for external services.
Code Example: Nginx Deployment with LoadBalancer Service
Let’s expose our Nginx application to the internet using a LoadBalancer Service.
# nginx-deployment-service.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
type: LoadBalancer
Deploy the resources:
kubectl apply -f nginx-deployment-service.yaml
Get the external IP address of the Service (it might take a few minutes for the LoadBalancer to provision):
kubectl get service nginx-service
Look for the EXTERNAL-IP column. Once it’s assigned, you can access your Nginx application through that IP in your web browser.
Clean up:
kubectl delete -f nginx-deployment-service.yaml
Exercise/Mini-Challenge:
- Create a Deployment for a simple "hello-world" web application (you can use an image like gcr.io/google-samples/node-hello:1.0).
- Create a NodePort Service to expose this application on port 30080.
- Access the application from your local machine using one of your AKS node's IP addresses and the NodePort.
- Change the Service type to LoadBalancer and observe the external IP.
2.1.4 Namespaces
Namespaces provide a mechanism for isolating groups of resources within a single Kubernetes cluster. They are crucial for organizing resources, managing access control, and preventing naming collisions in multi-tenant or large clusters.
Key characteristics:
- Resource Scoping: Resources in one namespace are logically isolated from resources in other namespaces.
- Access Control: You can apply Role-Based Access Control (RBAC) policies to specific namespaces, limiting who can access or modify resources within them.
- Default Namespaces: Kubernetes comes with default namespaces: default, kube-system (for Kubernetes system components), kube-public, and kube-node-lease.
Code Example: Deploying into a Custom Namespace
First, create a new namespace:
kubectl create namespace dev-environment
Now, let’s deploy our Nginx Pod into this new namespace.
# nginx-pod-dev.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-nginx-dev-pod
namespace: dev-environment # Specify the namespace
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
Deploy the Pod:
kubectl apply -f nginx-pod-dev.yaml
Check pods in the dev-environment namespace:
kubectl get pods --namespace dev-environment
# or
kubectl get pods -n dev-environment
To see pods across all namespaces:
kubectl get pods --all-namespaces
Clean up:
kubectl delete -f nginx-pod-dev.yaml --namespace dev-environment
kubectl delete namespace dev-environment
Exercise/Mini-Challenge:
- Create two namespaces: team-alpha and team-beta.
- Deploy a simple Deployment with 3 replicas of nginx into team-alpha.
- Deploy a simple Deployment with 2 replicas of httpd into team-beta.
- Verify that you only see the nginx pods when listing resources in team-alpha and the httpd pods in team-beta.
- Try to delete a resource in team-alpha while your current context is set to team-beta (without specifying -n team-alpha). Observe the error (see the context-switch hint after this list).
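The last step assumes your kubectl context's default namespace points at team-beta; a minimal sketch of how to switch it (namespace names come from the challenge):
kubectl config set-context --current --namespace=team-beta
kubectl config view --minify | grep namespace:   # confirm the active namespace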
2.1.5 ConfigMaps and Secrets
ConfigMaps and Secrets are Kubernetes objects used to store configuration data and sensitive information, respectively. They allow you to decouple configuration from your application code, making your applications more portable and easier to manage.
ConfigMaps
ConfigMaps store non-confidential data in key-value pairs. They are useful for storing environment variables, command-line arguments, or configuration files.
Key characteristics:
- Non-sensitive data: Intended for general configuration, not sensitive information.
- Flexible consumption: Can be consumed as environment variables, command-line arguments, or files mounted into Pods.
Code Example: ConfigMap for Nginx configuration
Let's use a ConfigMap to provide a custom Nginx configuration.
# nginx-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-custom-config
data:
nginx.conf: |
server {
listen 80;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
location /healthz {
return 200 'OK';
add_header Content-Type text/plain;
}
}
Create the ConfigMap:
kubectl apply -f nginx-configmap.yaml
Now, deploy an Nginx Pod that uses this ConfigMap. We’ll mount the nginx.conf as a file.
# nginx-pod-with-configmap.yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx-config-pod
labels:
app: nginx-config
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
volumeMounts:
- name: nginx-config-volume
mountPath: /etc/nginx/conf.d/
readOnly: true
volumes:
- name: nginx-config-volume
configMap:
name: nginx-custom-config
items:
- key: nginx.conf
path: default.conf # Mounts nginx.conf from ConfigMap as default.conf in the container
Deploy the Pod:
kubectl apply -f nginx-pod-with-configmap.yaml
You can now verify that the custom health endpoint /healthz works if you expose the Pod with a service.
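If you prefer not to create a Service yet, a quick way to check the custom endpoint is kubectl port-forward; a minimal sketch:
kubectl port-forward pod/nginx-config-pod 8080:80
# In a second terminal:
curl http://localhost:8080/healthz   # should return OK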
Clean up:
kubectl delete -f nginx-pod-with-configmap.yaml
kubectl delete -f nginx-configmap.yaml
Secrets
Secrets are similar to ConfigMaps but are designed for storing sensitive information like passwords, API keys, and certificates. Kubernetes provides mechanisms to keep Secrets secure (though they are not encrypted at rest by default in all Kubernetes distributions; AKS offers encryption at rest for etcd).
Key characteristics:
- Sensitive data: Used for passwords, tokens, keys, etc.
- Base64 encoded: By default, Secrets are base64 encoded when stored in etcd, but this is not encryption. Anyone with API access can decode them.
- Mount as files or environment variables: Can be mounted as files into Pods or injected as environment variables. Mounting as files is generally preferred.
Code Example: Secret for a database password
Let's create a Secret for a mock database password.
# Create base64 encoded strings
echo -n 'mysecretpassword' | base64
# Output will be something like: bXlzZWNyZXRwYXNzd29yZA==
# db-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: my-db-secret
type: Opaque # General-purpose Secret type
data:
DB_PASSWORD: bXlzZWNyZXRwYXNzd29yZA== # Base64 encoded 'mysecretpassword'
Create the Secret:
kubectl apply -f db-secret.yaml
Now, let’s deploy a Pod that consumes this Secret as an environment variable.
# app-pod-with-secret.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-app-secret-pod
spec:
containers:
- name: my-app-container
image: busybox:latest
command: ["sh", "-c", "echo 'Application started...'; echo 'DB Password: '$DB_PASSWORD; sleep 3600"]
env:
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: my-db-secret
key: DB_PASSWORD
Deploy the Pod:
kubectl apply -f app-pod-with-secret.yaml
Check the logs to see the environment variable being used:
kubectl logs my-app-secret-pod
You should see “DB Password: mysecretpassword”.
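Because mounting Secrets as files is generally preferred over environment variables, here is a minimal sketch of the volume-based alternative (the Pod name and mount path are illustrative; it reuses the my-db-secret Secret from above):
# app-pod-with-secret-volume.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app-secret-file-pod
spec:
  volumes:
  - name: db-secret-volume
    secret:
      secretName: my-db-secret # each key becomes a file in the mount path
  containers:
  - name: my-app-container
    image: busybox:latest
    command: ["sh", "-c", "cat /etc/secrets/DB_PASSWORD; sleep 3600"]
    volumeMounts:
    - name: db-secret-volume
      mountPath: /etc/secrets
      readOnly: true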
Clean up:
kubectl delete -f app-pod-with-secret.yaml
kubectl delete -f db-secret.yaml
Exercise/Mini-Challenge (ConfigMaps and Secrets):
- Create a ConfigMap named app-settings with two keys: API_URL set to https://api.example.com and DEBUG_MODE set to true.
- Create an Opaque Secret named api-token-secret with a key API_TOKEN containing a base64-encoded mock API token (e.g., my-super-secret-token).
- Create a Pod that uses the alpine/git image, consumes app-settings as environment variables, and mounts api-token-secret as a file at /etc/secrets/token.
- Inside the Pod, cat the mounted secret file and print the environment variables.
2.2 Helm Core Concepts
2.2.1 Charts
A Helm Chart is a collection of files that describe a related set of Kubernetes resources. Think of it as a package for your Kubernetes application. Charts can deploy anything from a simple web app to a complex microservices architecture.
Structure of a Helm Chart: A typical Helm Chart has the following directory structure:
mychart/
Chart.yaml # A YAML file containing information about the chart
values.yaml # The default values for this chart's templates
charts/ # A directory containing any dependent charts (subcharts)
templates/ # A directory of templates that, when combined with values, generate Kubernetes manifest files
templates/NOTES.txt # A short plain text document describing the chart's deployment
Chart.yaml: This file contains metadata about the Chart, such as its name, version, and API version.
# mychart/Chart.yaml
apiVersion: v2
name: mywebapp
description: A Helm chart for my web application
version: 0.1.0 # Chart version
appVersion: "1.0.0" # Version of the application it deploys
values.yaml: This file defines the default configuration values for your application. These values can be overridden during deployment.
# mychart/values.yaml
replicaCount: 1
image:
repository: nginx
tag: latest
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
templates/: This directory contains Kubernetes manifest files (e.g., deployment.yaml, service.yaml) that are templated using Go template syntax. Helm processes these templates with the values provided.
Code Example: Creating a Simple Helm Chart
Let's create a basic Helm Chart for our Nginx application.
Create a new chart:
helm create mynginxapp
This command generates a boilerplate chart structure.
Explore the generated files: Navigate into the mynginxapp directory and inspect Chart.yaml, values.yaml, and the files in templates/.
Customize values.yaml (optional for this example): For this basic deployment, the default values.yaml should be fine. It will deploy an Nginx container.
Install the chart:
helm install my-nginx-release mynginxapp/
my-nginx-release is the name given to this specific deployment of the chart.
Verify the deployment:
kubectl get pods -l app.kubernetes.io/instance=my-nginx-release
kubectl get svc -l app.kubernetes.io/instance=my-nginx-release
Upgrade the chart (e.g., change the replica count): Edit mynginxapp/values.yaml and change replicaCount: 1 to replicaCount: 3. Then upgrade the release:
helm upgrade my-nginx-release mynginxapp/
Verify the new replica count:
kubectl get pods -l app.kubernetes.io/instance=my-nginx-release
Rollback (if needed): If an upgrade goes wrong, you can roll back to a previous revision.
helm history my-nginx-release # Note the REVISION number of the previous stable release (e.g., 1)
helm rollback my-nginx-release 1
Uninstall the chart:
helm uninstall my-nginx-release
This removes all resources associated with the release.
Exercise/Mini-Challenge:
- Create a new Helm Chart named my-backend-app.
- Modify its values.yaml to include an image.tag for ubuntu:latest, plus command and args to run a simple sleep 3600 command.
- Install the chart.
- Upgrade the chart to change the image.tag to ubuntu:22.04.
- List all installed Helm releases.
2.2.2 Templates and Values
Helm Charts are powerful because they use Go templating to create dynamic Kubernetes manifests.
- Templates: Files in the templates/ directory use Go template syntax (e.g., {{ .Values.replicaCount }}) to inject values.
- Values: Data supplied to the templates, primarily from values.yaml, but also from --set flags on the helm install or helm upgrade command, or from separate -f values files.
Code Example: Customizing a Chart with Templates and Values
Let’s customize the mynginxapp chart to add an environment variable.
Edit mynginxapp/templates/deployment.yaml: Find the containers section and add an env block:
containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
    imagePullPolicy: {{ .Values.image.pullPolicy }}
    env:
      - name: MY_ENV_VAR
        value: "{{ .Values.myCustomEnvVar }}" # New environment variable
    ports:
      - name: http
        containerPort: 80
        protocol: TCP
Edit mynginxapp/values.yaml: Add the myCustomEnvVar key:
replicaCount: 1
image:
  repository: nginx
  tag: latest
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
myCustomEnvVar: "Hello from Helm!" # Default value for the new env var
Install or upgrade the chart:
helm install my-nginx-custom mynginxapp/
# or, if already installed
helm upgrade my-nginx-custom mynginxapp/
Verify the environment variable: Get the name of your Nginx pod, then exec into it and check the environment variables:
POD_NAME=$(kubectl get pods -l app.kubernetes.io/instance=my-nginx-custom -o custom-columns=NAME:.metadata.name --no-headers)
kubectl exec -it $POD_NAME -- env | grep MY_ENV_VAR
You should see MY_ENV_VAR=Hello from Helm!.
Clean up:
helm uninstall my-nginx-custom
Exercise/Mini-Challenge:
- In your my-backend-app chart, add a new ConfigMap template (templates/configmap.yaml).
- Define a key in values.yaml like config.message (e.g., config.message: "Welcome to my app!").
- Have the ConfigMap use {{ .Values.config.message }} as its data.
- Mount this ConfigMap as a file into your my-backend-app Pod and ensure the Pod's command prints the content of the mounted file.
- Install the chart and verify.
2.2.3 Releases and History
When you install a Helm Chart, it creates a Release. A release is an instance of a chart running in a Kubernetes cluster. Helm tracks the state of each release, including its configuration and revisions.
- Helm Releases: Each helm install or helm upgrade operation creates a new revision of a release.
- History: You can view the history of a release, including past configurations and statuses, using helm history. This is crucial for debugging and understanding changes over time.
Code Example: (Covered in 2.2.1 Chart section with helm install, helm upgrade, helm history, helm rollback).
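For quick reference, these are the release-inspection commands Helm provides (the release name is illustrative):
helm list                          # releases in the current namespace
helm status my-nginx-release       # current state of a release
helm history my-nginx-release      # all revisions of a release
helm get values my-nginx-release   # values used by the current revision
helm rollback my-nginx-release 1   # roll back to revision 1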
Exercise/Mini-Challenge:
- Install a simple chart (e.g., bitnami/nginx from a Helm repository).
- Perform two upgrades, changing a different value each time (e.g., replicaCount, then service.type).
- Check the helm history for the release.
- Roll back to the first revision.
3. Intermediate Topics
Now that you have a solid foundation, let’s explore more advanced aspects of Kubernetes and Helm with AKS.
3.1 Advanced Kubernetes Concepts
3.1.1 Ingress Controllers
While Services of type LoadBalancer expose a single application, Ingress manages external access to services in a cluster, typically HTTP/S. An Ingress resource defines rules for routing external HTTP/S traffic to internal cluster Services. An Ingress Controller (like Nginx Ingress Controller or Azure Application Gateway Ingress Controller - AGIC) is the actual component that watches the Ingress resources and acts upon them.
Why Ingress?
- Single IP for multiple services: Expose multiple services through a single external IP address.
- Host-based routing: Route traffic to different services based on the hostname (e.g., app1.example.com to Service A, app2.example.com to Service B).
- Path-based routing: Route traffic based on URL paths (e.g., /api to Service A, /web to Service B).
- SSL/TLS termination: Handle SSL certificates at the Ingress layer.
Code Example: Deploying Nginx Ingress Controller and an Ingress Resource
First, let’s install the Nginx Ingress Controller using Helm:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install nginx-ingress ingress-nginx/ingress-nginx \
--namespace ingress-basic --create-namespace \
--set controller.replicaCount=2 \
--set controller.nodeSelector."kubernetes\.io/os"=linux \
--set defaultBackend.nodeSelector."kubernetes\.io/os"=linux
This will deploy the Nginx Ingress Controller and a LoadBalancer Service to expose it. Get the external IP of the Ingress Controller:
kubectl get svc -n ingress-basic nginx-ingress-ingress-nginx-controller
Note down the EXTERNAL-IP.
Now, let’s deploy a simple Nginx application and an Ingress resource to route traffic to it.
# app-with-ingress.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: webapp-deployment
labels:
app: webapp
spec:
replicas: 2
selector:
matchLabels:
app: webapp
template:
metadata:
labels:
app: webapp
spec:
containers:
- name: webapp-container
image: nginx:latest
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: webapp-service
spec:
selector:
app: webapp
ports:
- protocol: TCP
port: 80
targetPort: 80
type: ClusterIP # Internal service
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: webapp-ingress
spec:
  # Select the Nginx Ingress Controller (the kubernetes.io/ingress.class annotation is deprecated)
  ingressClassName: nginx
rules:
- host: myapp.example.com # Replace with your desired hostname or IP
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: webapp-service
port:
number: 80
Important: For myapp.example.com to work, you would typically configure your DNS provider to point myapp.example.com to the EXTERNAL-IP of your Nginx Ingress Controller LoadBalancer. For local testing, you can modify your hosts file (e.g., /etc/hosts on Linux/macOS or C:\Windows\System32\drivers\etc\hosts on Windows) to map the IP to the hostname.
Deploy the application and Ingress:
kubectl apply -f app-with-ingress.yaml
Now, if you access http://myapp.example.com (or the IP if you used hosts file), you should see the Nginx welcome page.
Clean up:
kubectl delete -f app-with-ingress.yaml
helm uninstall nginx-ingress --namespace ingress-basic
Exercise/Mini-Challenge:
- Install the Nginx Ingress Controller (if not already installed).
- Deploy two different web applications (e.g., nginx and httpd), each with its own ClusterIP Service.
- Create an Ingress resource that routes traffic for nginx.example.com to the Nginx service and httpd.example.com to the Httpd service.
- Verify routing using curl or by modifying your hosts file.
3.1.2 Persistent Volumes and Persistent Volume Claims
Containers are ephemeral by nature, meaning any data stored inside them is lost when the container restarts or is deleted. For stateful applications (like databases), you need a way to store data persistently. Kubernetes addresses this with Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).
- Persistent Volume (PV): A piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned by a storage class. It’s a cluster resource, independent of a Pod’s lifecycle.
- Persistent Volume Claim (PVC): A request for storage by a user. It consumes PV resources. Pods then use PVCs to access the storage.
In AKS, Azure provides dynamic provisioning of storage through Storage Classes. When you create a PVC, AKS can automatically provision an Azure Disk or Azure Files resource.
Code Example: Nginx with Persistent Storage
Let's deploy an Nginx application that serves content from a Persistent Volume.
# nginx-pvc-deployment.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nginx-pvc
spec:
accessModes:
- ReadWriteOnce # Can be mounted as read-write by a single node
resources:
requests:
storage: 1Gi # Request 1 Gigabyte of storage
storageClassName: default # Use the default StorageClass for Azure Disks
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-pv-deployment
labels:
app: nginx-pv
spec:
replicas: 1
selector:
matchLabels:
app: nginx-pv
template:
metadata:
labels:
app: nginx-pv
spec:
volumes:
- name: nginx-persistent-storage
persistentVolumeClaim:
claimName: nginx-pvc
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
volumeMounts:
- name: nginx-persistent-storage
mountPath: /usr/share/nginx/html # Mount the PV to serve HTML content
lifecycle:
postStart: # Populate some content after container starts
exec:
command: ["/bin/sh", "-c", "echo '<h1>Hello from Persistent Volume!</h1>' > /usr/share/nginx/html/index.html"]
---
apiVersion: v1
kind: Service
metadata:
name: nginx-pv-service
spec:
selector:
app: nginx-pv
ports:
- protocol: TCP
port: 80
targetPort: 80
type: LoadBalancer
Deploy the resources:
kubectl apply -f nginx-pvc-deployment.yaml
Check the PVC and Deployment status. Once the LoadBalancer gets an IP, access it to see “Hello from Persistent Volume!”.
Even if you delete and recreate the nginx-pv-deployment (but not the PVC), the content will persist.
kubectl delete deployment nginx-pv-deployment
# Wait for pods to terminate
kubectl apply -f nginx-pvc-deployment.yaml # Deploy again, the data is still there
Clean up:
kubectl delete -f nginx-pvc-deployment.yaml
Important: Deleting the PVC will also delete the underlying Azure Disk unless the reclaimPolicy of the StorageClass is set to Retain. The default StorageClass in AKS usually has Delete.
Exercise/Mini-Challenge:
- Create a PVC requesting 2Gi of storage.
- Deploy a Pod running ubuntu that mounts this PVC at /data.
- Inside the Pod, create a file message.txt with some content in /data.
- Delete the Pod, then create a new Pod that mounts the same PVC at /data.
- Verify that message.txt still exists in the new Pod.
- Delete the Pod and PVC.
3.1.3 Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pod replicas in a Deployment or ReplicaSet based on observed CPU utilization or other select metrics. This ensures your application can handle varying loads efficiently without manual intervention.
How HPA works:
- HPA continuously monitors the specified metrics (e.g., average CPU utilization) of the Pods targeted by a Deployment.
- If the metrics exceed a predefined threshold, HPA increases the number of Pod replicas.
- If the metrics fall below the threshold, HPA decreases the number of Pod replicas.
Prerequisites: For CPU/memory-based HPA, your Pods must have resource requests defined and the metrics server must be running (it is deployed by default in AKS). Scaling on custom metrics additionally requires a custom metrics adapter.
Code Example: HPA for Nginx Deployment
Let's configure HPA for our Nginx Deployment to scale based on CPU utilization.
# nginx-deployment-hpa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hpa-nginx-deployment
labels:
app: hpa-nginx
spec:
replicas: 1 # Start with 1 replica
selector:
matchLabels:
app: hpa-nginx
template:
metadata:
labels:
app: hpa-nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
resources:
requests:
cpu: "100m" # Request 100 millicores of CPU
limits:
cpu: "200m" # Limit to 200 millicores
ports:
- containerPort: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: nginx-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: hpa-nginx-deployment
minReplicas: 1
maxReplicas: 5 # Scale up to 5 replicas
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 50 # Target 50% CPU utilization
Deploy the resources:
kubectl apply -f nginx-deployment-hpa.yaml
Check HPA status:
kubectl get hpa
Initially, it will show 1 replica.
To simulate load, you can use a busybox Pod to continuously hit the Nginx service (if exposed, or use a port-forward). For simplicity, we won’t show load generation here, but you would see the replica count increase if CPU utilization goes above 50%.
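If you do want to watch the HPA scale, here is a minimal load-generation sketch (the Service name hpa-nginx-svc is created only for this test and is not part of the manifest above):
kubectl expose deployment hpa-nginx-deployment --name=hpa-nginx-svc --port=80
kubectl run -it --rm load-gen --image=busybox:latest -- /bin/sh -c "while true; do wget -q -O- http://hpa-nginx-svc > /dev/null; done"
# In another terminal, watch the replica count change:
kubectl get hpa nginx-hpa -w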
Clean up:
kubectl delete -f nginx-deployment-hpa.yaml
Exercise/Mini-Challenge:
- Deploy a simple php-apache application (e.g., kubernetes/examples/hpa/php-apache). Ensure the Deployment has CPU requests defined.
- Create an HPA that targets this Deployment, with minReplicas: 1, maxReplicas: 10, and targetCPUUtilizationPercentage: 50.
- Simulate load on the php-apache service. You can use a busybox pod:
kubectl run -it --rm load-generator --image=busybox:latest -- /bin/sh
# Inside busybox:
# while true; do wget -q -O- http://php-apache; done
- Observe the HPA increasing the replica count using kubectl get hpa -w.
- Stop the load generator and observe the HPA scaling down.
3.1.4 Network Policies
Network Policies allow you to define rules for how Pods communicate with each other and with external network endpoints. By default, Pods are non-isolated and can accept traffic from any source. Network Policies enable a zero-trust approach by enforcing communication segmentation.
Key concepts:
- Isolation: Pods selected by a Network Policy become isolated for the traffic direction(s) the policy covers; only traffic matching an allow rule is permitted (see the default-deny sketch after this list).
- Ingress/Egress rules: Define rules for incoming (Ingress) and outgoing (Egress) traffic.
- Selectors: Use Pod selectors and Namespace selectors to specify which Pods or Namespaces the policy applies to.
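A common zero-trust starting point, referenced in the isolation note above, is a namespace-wide default-deny ingress policy; a minimal sketch (not used in the walkthrough below):
# default-deny-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {} # selects every Pod in the namespace
  policyTypes:
  - Ingress       # no ingress rules defined, so all inbound traffic is denied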
Code Example: Restricting Nginx access
Let's restrict our Nginx application so it is only accessible from Pods with a specific label. Note that enforcing Network Policies on AKS requires a network policy engine (Azure or Calico) enabled on the cluster, e.g., az aks create ... --network-policy azure.
First, create a test-app Deployment that will try to access Nginx.
# restricted-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: restricted-nginx-deployment
labels:
app: restricted-nginx
spec:
replicas: 1
selector:
matchLabels:
app: restricted-nginx
template:
metadata:
labels:
app: restricted-nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: restricted-nginx-service
spec:
selector:
app: restricted-nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-client-deployment
labels:
app: test-client
spec:
replicas: 1
selector:
matchLabels:
app: test-client
template:
metadata:
labels:
app: test-client
spec:
containers:
- name: busybox-container
image: busybox:latest
command: ["sh", "-c", "sleep 3600"]
Deploy these:
kubectl apply -f restricted-nginx.yaml
Get the cluster IP of the restricted-nginx-service:
kubectl get svc restricted-nginx-service
Note the CLUSTER-IP.
Now, exec into the test-client pod and try to wget the Nginx service:
CLIENT_POD=$(kubectl get pods -l app=test-client -o custom-columns=NAME:.metadata.name --no-headers)
NGINX_IP=$(kubectl get svc restricted-nginx-service -o jsonpath='{.spec.clusterIP}')
kubectl exec -it $CLIENT_POD -- wget -O- -T 2 http://$NGINX_IP
It should succeed (you’ll see the Nginx welcome page).
Now, let’s apply a Network Policy to isolate Nginx.
# nginx-network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-specific-app
namespace: default # Policy applies to pods in the default namespace
spec:
podSelector:
matchLabels:
app: restricted-nginx # This policy applies to pods with label app: restricted-nginx
policyTypes:
- Ingress # Only applies to Ingress traffic
ingress:
- from:
- podSelector:
matchLabels:
app: test-client # Only allow traffic from pods with label app: test-client
Apply the policy:
kubectl apply -f nginx-network-policy.yaml
Wait a few seconds for the policy to take effect. Then, run the wget again from the test-client pod:
kubectl exec -it $CLIENT_POD -- wget -O- -T 2 http://$NGINX_IP
This should still succeed, because the policy explicitly allows traffic from pods labeled app: test-client. To see the policy blocking traffic, run the same wget from a pod without that label (for example, kubectl run -it --rm blocked-client --image=busybox:latest -- wget -O- -T 2 http://$NGINX_IP); that request should time out, because only pods matching the podSelector listed under from are allowed.
Clean up:
kubectl delete -f nginx-network-policy.yaml
kubectl delete -f restricted-nginx.yaml
Exercise/Mini-Challenge:
- Deploy two Deployments: frontend (with label app: frontend) and backend (with label app: backend). Each should have a ClusterIP Service.
- Deploy a busybox Pod (label app: admin-tool).
- Create a Network Policy that allows frontend Pods to communicate with backend Pods on a specific port (e.g., 8080) and allows the admin-tool Pod to communicate with backend Pods on a different port (e.g., 9000), but permits no other ingress traffic to backend.
- Test connectivity from the frontend and admin-tool Pods to backend, and try from a third, unprivileged Pod to verify the policy.
3.1.5 RBAC (Role-Based Access Control)
Role-Based Access Control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within an organization. In Kubernetes, RBAC allows you to define who can do what to which resources.
Key concepts:
- Role: Defines permissions within a specific namespace.
- ClusterRole: Defines permissions across the entire cluster.
- RoleBinding: Grants the permissions defined in a Role to a user or ServiceAccount within a specific namespace.
- ClusterRoleBinding: Grants the permissions defined in a ClusterRole to a user or ServiceAccount across the entire cluster.
- ServiceAccount: An identity used by processes running in Pods. Pods that access the Kubernetes API do so using a ServiceAccount.
In AKS, you typically integrate Kubernetes RBAC with Azure Active Directory (Azure AD) RBAC. This allows you to manage Kubernetes permissions using your existing Azure AD identities.
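With an Azure AD-enabled AKS cluster, a Kubernetes RoleBinding can reference an Azure AD group directly by its object ID; a minimal sketch (the group object ID below is a placeholder, and the pod-reader Role is defined in the example that follows):
# azure-ad-group-rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: aks-dev-readers-binding
  namespace: dev-environment
subjects:
- kind: Group
  name: "00000000-0000-0000-0000-000000000000" # Azure AD group object ID (placeholder)
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io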
Code Example: Granting Read-Only Access to a Namespace
Create a new namespace and a test user/group in Azure AD: This step is external to Kubernetes manifests. You would create an Azure AD group (e.g., aks-dev-readers) and add Azure AD users to it. (For this example, we'll assume an Azure AD group ID exists, or you can simulate restricted access by creating a new ServiceAccount.)
Create a Role for read-only access:
# readonly-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev-environment # Define the Role in the 'dev-environment' namespace
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods", "pods/log"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "watch", "list"]
Create a ServiceAccount and RoleBinding: Let's create a ServiceAccount and bind the pod-reader Role to it.
# readonly-serviceaccount-rolebinding.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dev-reader-sa
  namespace: dev-environment
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-reader-binding
  namespace: dev-environment
subjects:
- kind: ServiceAccount
name: dev-reader-sa
namespace: dev-environment
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
Deploy these:
kubectl create namespace dev-environment
kubectl apply -f readonly-role.yaml
kubectl apply -f readonly-serviceaccount-rolebinding.yaml
Test with the ServiceAccount: You can simulate using this ServiceAccount by creating a Pod that uses it.
# test-pod-with-sa.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-sa-pod
  namespace: dev-environment
spec:
  serviceAccountName: dev-reader-sa # Assign the ServiceAccount
  containers:
  - name: busybox-container
    image: busybox:latest
    command: ["sh", "-c", "sleep 3600"]
Deploy the pod:
kubectl apply -f test-pod-with-sa.yaml
Now, from your local machine, try to list deployments as dev-reader-sa (a handy kubectl trick for testing RBAC):
# Get a short-lived token for the ServiceAccount
SA_TOKEN=$(kubectl create token dev-reader-sa -n dev-environment --duration 1h)
# Use the token to list deployments in dev-environment
kubectl get deployments -n dev-environment --token=$SA_TOKEN
# This should succeed, as the 'pod-reader' Role allows listing deployments.
# Try to delete a deployment (should fail due to insufficient permissions)
kubectl delete deployment -n dev-environment my-deployment --token=$SA_TOKEN
# You should get an "Error from server (Forbidden)" message.
This demonstrates the concept of restricted access. For actual Azure AD integration, you'd map Azure AD groups to RoleBindings or ClusterRoleBindings.
Clean up:
kubectl delete -f test-pod-with-sa.yaml -n dev-environment
kubectl delete -f readonly-serviceaccount-rolebinding.yaml -n dev-environment
kubectl delete -f readonly-role.yaml -n dev-environment
kubectl delete namespace dev-environment
Exercise/Mini-Challenge:
- Create a namespace hr-app.
- Create a ServiceAccount named hr-operator in hr-app.
- Define a Role called hr-app-full-access in the hr-app namespace that grants the create, get, list, update, delete, and watch verbs on pods and deployments.
- Bind the hr-operator ServiceAccount to the hr-app-full-access Role.
- Try to list pods in hr-app using the hr-operator ServiceAccount token (similar to the example above).
- Try to list pods in kube-system using the hr-operator ServiceAccount token (should fail).
3.2 Advanced Helm Concepts
3.2.1 Chart Dependencies (Subcharts)
Complex applications often consist of multiple components. Helm allows you to manage these components as subcharts within a parent chart. This enables modularity and reusability.
Key characteristics:
- Modularization: Break down large applications into smaller, manageable charts.
- Reusability: Use existing, stable charts (e.g., from Bitnami) as dependencies.
- Dependencies: Defined in the Chart.yaml of the parent chart.
Code Example: Application with a Database Subchart
Let’s create a parent chart (my-full-app) that depends on a database subchart (e.g., Bitnami’s PostgreSQL).
Create the parent chart:
helm create my-full-app
cd my-full-app
Add the dependency in Chart.yaml: Edit my-full-app/Chart.yaml and add a dependencies section:
# my-full-app/Chart.yaml
apiVersion: v2
name: my-full-app
description: A Helm chart for my full application with a database
version: 0.1.0
appVersion: "1.0.0"
dependencies:
  - name: postgresql
    version: 12.x.x # Use a specific major version; check the Bitnami repo for the latest
    repository: https://charts.bitnami.com/bitnami
    condition: postgresql.enabled # Enable/disable the subchart via values.yaml
(Note: Check the Bitnami PostgreSQL chart for its latest stable version.)
Update Helm dependencies: This downloads the subchart into the charts/ directory.
helm dependency update .
You should now see charts/postgresql-12.x.x.tgz.
Configure subchart values (optional for this example): You can override default values of the subchart by nesting them under the subchart's name in my-full-app/values.yaml. For example, to set the PostgreSQL password:
# my-full-app/values.yaml
replicaCount: 1
image:
  repository: nginx
  tag: latest
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
# PostgreSQL subchart specific values
postgresql:
  enabled: true # Explicitly enable the subchart
  auth:
    postgresPassword: "mysecretpostgrespassword"
Install the parent chart:
helm install my-full-stack my-full-app/
Verify both components:
kubectl get pods -l app.kubernetes.io/instance=my-full-stack
kubectl get svc -l app.kubernetes.io/instance=my-full-stack
You should see both your Nginx pod (from the parent chart) and the PostgreSQL pod (from the subchart).
Clean up:
helm uninstall my-full-stack
Exercise/Mini-Challenge:
- Create a parent Helm Chart called ecommerce-stack.
- Add mongodb (from Bitnami) and redis (from Bitnami) as subchart dependencies in Chart.yaml.
- Update the dependencies using helm dependency update.
- In the values.yaml of ecommerce-stack, configure custom passwords for both the MongoDB and Redis subcharts.
- Install ecommerce-stack and verify that all pods and services are running.
3.2.2 Conditional Logic and Loops in Templates
Helm’s Go templating language allows for powerful conditional logic (if/else) and iteration (range) to create flexible and dynamic manifests.
Use cases:
- Conditional resource creation: Create a resource only if a certain value is set (e.g., if .Values.ingress.enabled).
- Dynamic configuration: Generate multiple environment variables or port mappings based on a list in values.yaml.
Code Example: Conditional Ingress and Multiple Ports
Let’s modify my-full-app to conditionally create an Ingress and expose multiple ports if defined.
Edit my-full-app/values.yaml: Add ingress.enabled and service.additionalPorts.
# my-full-app/values.yaml
# ... other values ...
ingress:
  enabled: false
  host: myapp.local
service:
  type: ClusterIP
  port: 80
  additionalPorts: # List of additional ports
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
    - name: metrics
      port: 9090
      targetPort: 9090
      protocol: TCP
Edit my-full-app/templates/service.yaml: Add a range loop to create the additional ports.
# my-full-app/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ include "my-full-app.fullname" . }}
  labels:
    {{- include "my-full-app.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
    {{- range .Values.service.additionalPorts }}
    - name: {{ .name }}
      port: {{ .port }}
      targetPort: {{ .targetPort }}
      protocol: {{ .protocol }}
    {{- end }}
  selector:
    {{- include "my-full-app.selectorLabels" . | nindent 4 }}
Create my-full-app/templates/ingress.yaml (new file): Use an if block to conditionally create the Ingress.
# my-full-app/templates/ingress.yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "my-full-app.fullname" . }}
  labels:
    {{- include "my-full-app.labels" . | nindent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  rules:
    - host: {{ .Values.ingress.host }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ include "my-full-app.fullname" . }}
                port:
                  number: {{ .Values.service.port }}
{{- end }}
(For this Ingress to work, you need an Ingress Controller installed separately, such as the Nginx Ingress Controller.)
Install the chart with Ingress enabled:
helm install my-conditional-app my-full-app/ --set ingress.enabled=true --set ingress.host=myconditionalapp.local
Verify: Check the Service for the additional ports and confirm the Ingress resource was created.
kubectl get svc my-conditional-app
kubectl get ing my-conditional-app
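You can also confirm the conditional rendering without touching the cluster by using helm template, which renders the manifests locally:
helm template my-conditional-app my-full-app/ --set ingress.enabled=true | grep "kind: Ingress"    # Ingress is rendered
helm template my-conditional-app my-full-app/ --set ingress.enabled=false | grep "kind: Ingress"   # no output; Ingress is skipped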
Clean up:
helm uninstall my-conditional-app
Exercise/Mini-Challenge:
- In your ecommerce-stack chart, modify the deployment.yaml template.
- Add a conditional if block that, if app.envVars.enabled is true in values.yaml, iterates through a list app.envVars.list and creates environment variables for your application Pod.
- Test by installing the chart with app.envVars.enabled: true and a few custom environment variables, then with app.envVars.enabled: false.
4. Advanced Topics and Best Practices
This section delves into production-ready patterns, operations, and security for your Helm and Kubernetes deployments on AKS.
4.1 Infrastructure as Code (IaC) with Terraform for AKS
Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. Terraform is a popular IaC tool that allows you to define and manage your Azure resources, including AKS clusters, declaratively.
Why use Terraform for AKS?
- Reproducibility: Create identical environments reliably.
- Version Control: Track infrastructure changes in Git.
- Automation: Automate the provisioning of complex infrastructure.
- State Management: Terraform keeps a state file to map real-world resources to your configuration.
Key Terraform resources for AKS:
- azurerm_resource_group: To manage Azure Resource Groups.
- azurerm_kubernetes_cluster: To provision the AKS cluster itself.
- azurerm_kubernetes_cluster_node_pool: To manage additional node pools.
- azurerm_container_registry: For Azure Container Registry.
Code Example: Provisioning AKS with Terraform
Create a Terraform project directory:
mkdir azure-aks-infra
cd azure-aks-infra
main.tf: Define the Azure provider and resources.
# main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # Pin to a major version
    }
  }
  backend "azurerm" {
    # Configure remote state storage in an Azure Storage Account
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstateuksouth2025"
    container_name       = "tfstate"
    key                  = "aks.terraform.tfstate"
  }
}

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "aks_rg" {
  name     = "my-aks-terraform-rg"
  location = "East US"
}

resource "azurerm_kubernetes_cluster" "aks_cluster" {
  name                = "my-terraform-aks"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  dns_prefix          = "myterraformaks"

  default_node_pool {
    name       = "systempool"
    node_count = 2
    vm_size    = "Standard_DS2_v2"
    # only_critical_addons_enabled = true # Recommended for production
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    environment = "dev"
    managed_by  = "terraform"
  }

  # Optional: Enable AKS Automatic or Deployment Safeguards for production
  # automatic_upgrade_channel = "stable"
  # node_auto_provisioning {
  #   enabled = true
  # }
  # deployment_safeguards {
  #   enabled = true
  # }

  # Optional: Private Cluster for enhanced security
  # private_cluster_enabled = true
  # private_dns_zone_id     = "..." # Reference an existing private DNS zone
}

output "aks_cluster_name" {
  value       = azurerm_kubernetes_cluster.aks_cluster.name
  description = "The name of the AKS cluster."
}

output "aks_kube_config" {
  value       = azurerm_kubernetes_cluster.aks_cluster.kube_config_raw
  sensitive   = true # Mark as sensitive to prevent plain text output
  description = "The raw Kubernetes configuration for the AKS cluster."
}
Initialize Terraform and create state storage (if not already done): Before terraform init, you need an Azure Storage Account and container for remote state.
# Create a resource group for state
az group create --name tfstate-rg --location eastus
# Create a storage account
az storage account create --name tfstateuksouth2025 --resource-group tfstate-rg --location eastus --sku Standard_LRS
# Create a storage container
az storage container create --name tfstate --account-name tfstateuksouth2025
Now, initialize Terraform:
terraform init
Plan and Apply:
terraform plan
terraform apply --auto-approve # Use --auto-approve only in automated pipelines; otherwise review the plan and type 'yes'
This will provision your AKS cluster.
Get Kubeconfig: After apply, you can use the exported kubeconfig output:
# Store the kube_config_raw output into a file
terraform output -raw aks_kube_config > kubeconfig_aks
# Set the KUBECONFIG environment variable to use it
export KUBECONFIG=$(pwd)/kubeconfig_aks
# Test the connection
kubectl get nodes
Clean up:
terraform destroy --auto-approve
az group delete --name tfstate-rg --yes --no-wait # Delete the resource group used for state storage
Best Practices for Terraform and AKS:
- Module Usage: For complex setups, use official Terraform modules (e.g., Azure/aks/azurerm) to create well-architected clusters.
- State Management: Always use remote state (like Azure Storage Blob) and state locking to enable collaboration and prevent concurrent modifications.
- Version Pinning: Pin your azurerm provider version to prevent unexpected changes due to new provider versions.
- Prevent Destroy: For production clusters, use lifecycle { prevent_destroy = true } on azurerm_kubernetes_cluster to prevent accidental deletion (see the sketch after this list).
- Private Clusters: Enable private_cluster_enabled for enhanced security where appropriate.
- Managed Identities: Leverage Managed Identities for AKS and your applications to interact with other Azure services securely.
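The "Prevent Destroy" recommendation maps to a Terraform lifecycle block; a minimal sketch added to the cluster resource from main.tf:
resource "azurerm_kubernetes_cluster" "aks_cluster" {
  # ... existing arguments from main.tf ...

  lifecycle {
    prevent_destroy = true # Terraform will refuse to destroy this resource
  }
}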
Exercise/Mini-Challenge:
- Modify the main.tf to add an additional azurerm_kubernetes_cluster_node_pool named apppool with node_count = 1 and a different vm_size (e.g., Standard_D2_v3).
- Set only_critical_addons_enabled = true on the default_node_pool.
- Deploy the infrastructure and verify the two node pools.
- Implement automatic_upgrade_channel = "stable" in your azurerm_kubernetes_cluster resource.
4.2 Logging, Debugging, and Tracing
Observability is crucial for understanding the health and performance of your applications and infrastructure.
4.2.1 Logging with Azure Monitor and Container Insights
Logging provides a record of events happening within your applications and Kubernetes cluster. Azure Monitor Container Insights is a feature of Azure Monitor that monitors the performance of container workloads deployed to AKS. It collects metrics and logs from containers, nodes, and the control plane.
Key features:
- Automatic collection: Collects logs, metrics, and events from AKS automatically.
- Log Analytics Workspace: Stores collected data in a centralized Log Analytics Workspace for querying and analysis.
- Pre-built dashboards: Provides out-of-the-box dashboards for cluster health, node performance, and container activity.
- Kubelet logs: You can also collect kubelet logs for node-level troubleshooting.
Enabling Container Insights (usually enabled by default when creating AKS via Azure portal/CLI): You can enable it during cluster creation or afterwards:
az aks create --resource-group myAKSResourceGroup --name myAKSCluster --enable-managed-identity --enable-addons monitoring
# Or for existing cluster:
az aks enable-addons --addons monitoring --name myAKSCluster --resource-group myAKSResourceGroup
Viewing Logs in Azure Portal:
- Navigate to your AKS cluster in the Azure portal.
- Under “Monitoring”, click “Insights”.
- Explore the various views (Cluster, Nodes, Controllers, Containers) to see performance metrics and logs.
- Use “Logs” (or “Log Analytics Workspace” directly) to write Kusto Query Language (KQL) queries.
Example KQL Queries:
- Get container logs:
ContainerLogV2 | where TimeGenerated > ago(1h) | order by TimeGenerated desc
- Get container CPU usage:
Perf | where TimeGenerated > ago(1h) and ObjectName == "K8SContainer" and CounterName == "cpuUsageNanoCores" | summarize max(CounterValue) by InstanceName | order by max_CounterValue desc
- Kubelet logs: You can access kubelet logs for a specific node via a debug pod (kubectl debug node/<node-name> -it --image=mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.12) and then chroot /host journalctl -u kubelet -o cat, or directly via kubectl get --raw "/api/v1/nodes/<node-name>/proxy/logs/messages" | grep kubelet.
Best Practices for Logging:
- Centralized Logging: Always send your application logs to a centralized logging solution (Azure Monitor, Splunk, ELK, Grafana Loki).
- Structured Logging: Use JSON or other structured formats for application logs to make them easier to parse and query (see the sketch after this list).
- Log Levels: Implement appropriate log levels (DEBUG, INFO, WARN, ERROR) in your applications.
- Sensitive Data: Avoid logging sensitive information.
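To make the structured-logging recommendation concrete, here is a minimal Python sketch using only the standard library (the logger name and fields are illustrative):
# structured_logging.py
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Emit each log record as a single JSON object per line
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("my-web-app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Application started")
logger.error("Something went wrong!")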
Exercise/Mini-Challenge:
- Deploy a simple application that logs messages (e.g., “INFO: Application started” every 5 seconds).
- Enable Container Insights on your AKS cluster if it’s not already.
- Go to the Log Analytics Workspace linked to your AKS cluster and write a KQL query to filter for your application’s logs.
- Modify the application to log an “ERROR: Something went wrong!” message occasionally and observe it in your logs.
4.2.2 Debugging
Debugging involves identifying and resolving issues in your applications and Kubernetes environment.
Common Kubernetes debugging steps:
- kubectl get pods, kubectl describe pod <pod-name>: Check pod status, events, and configuration.
- kubectl logs <pod-name>: View container logs.
- kubectl exec -it <pod-name> -- /bin/sh: Get a shell into a running container for interactive debugging.
- kubectl port-forward <pod-name> <local-port>:<container-port>: Access a service running inside a Pod from your local machine.
- kubectl debug: A powerful command for creating ephemeral debug containers, especially useful for debugging distroless images or containers without a shell. For example, kubectl debug -it <pod-name> --image=ubuntu:latest --share-processes creates a new container alongside the target, sharing its process namespace, allowing you to debug the main container from a full OS environment.
- Inspect events: kubectl get events --all-namespaces can reveal scheduling issues, image pull failures, or other cluster-level problems.
- Network troubleshooting: Use ping, curl, and netstat from within a busybox or debug Pod to diagnose connectivity issues.
Debugging Pending Pods Example: If a Pod is stuck in `Pending`, `kubectl describe pod <pod-name>` is your first stop; look at the Events section. Common reasons include insufficient resources (CPU/memory), node selectors that match no nodes, or Persistent Volume issues. A minimal manifest that reproduces this follows.
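As an illustration, the Deployment below intentionally requests far more CPU than a typical node provides, so its Pod stays `Pending`; the 64-core figure is arbitrary, any value above your node size works:

```yaml
# pending-demo.yaml - intentionally unschedulable, for practicing debugging
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pending-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pending-demo
  template:
    metadata:
      labels:
        app: pending-demo
    spec:
      containers:
        - name: hog
          image: busybox:latest
          command: ["sleep", "3600"]
          resources:
            requests:
              cpu: "64"   # far more CPU than a typical AKS node offers
```

Running `kubectl describe pod` on the resulting Pod should show a `FailedScheduling` event explaining that no node has enough CPU.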
Best Practices for Debugging:
- Start small: Isolate the problem to the smallest possible component (Pod, Service, Ingress).
- Check logs first: Application logs and Kubernetes events are often the quickest way to find clues.
- Reproduce consistently: Try to reproduce the issue in a development environment.
- Use ephemeral debug containers: Leverage `kubectl debug` for efficient in-cluster troubleshooting.
Exercise/Mini-Challenge:
- Create a Deployment that requests more CPU than any single node in your AKS cluster has (e.g., 4000m on a 2-core node).
- Observe the Pods in a `Pending` state.
- Use `kubectl describe deployment` and `kubectl describe pod <pending-pod>` to identify the reason for the `FailedScheduling` error.
- Correct the CPU request in your Deployment manifest and redeploy to resolve the issue.
4.2.3 Tracing
Tracing helps you understand the flow of requests through complex distributed systems (microservices). It visualizes how different services interact and where latency occurs.
Key concepts:
- Spans: A unit of work within a trace, representing an operation (e.g., API call, database query).
- Traces: A collection of spans that represent a single request’s journey through the system.
- OpenTelemetry: A vendor-neutral set of APIs, SDKs, and tools to instrument, generate, collect, and export telemetry data (metrics, logs, and traces).
- Azure Application Insights: An Application Performance Management (APM) service that can collect and visualize traces (among other telemetry).
Implementing Tracing:
- Instrument your applications: Use OpenTelemetry SDKs in your application code to generate trace data.
- Deploy an OpenTelemetry Collector: This component collects trace data from your applications and exports it to a tracing backend.
- Choose a tracing backend: Integrate with services like Azure Application Insights, Jaeger, or Zipkin to visualize traces.
Example (Conceptual): Application Instrumentation
# app.py (Python example with OpenTelemetry)
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
ConsoleSpanExporter,
SimpleSpanProcessor,
)
from flask import Flask
# Configure TracerProvider
resource = Resource.create({"service.name": "my-web-app"})
provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)
# Configure Jaeger Exporter (or Azure Application Insights exporter)
jaeger_exporter = JaegerExporter(
agent_host_name="otel-collector.monitoring", # Or the IP of your collector
agent_port=6831,
)
provider.add_span_processor(SimpleSpanProcessor(jaeger_exporter))
tracer = trace.get_tracer(__name__)
app = Flask(__name__)
@app.route("/")
def hello_world():
with tracer.start_as_current_span("hello-request"):
return "Hello, Traced World!"
if __name__ == "__main__":
app.run(host="0.0.0.0", port=80)
You would then deploy an OpenTelemetry Collector and a Jaeger/Azure Application Insights instance in your cluster to visualize these traces.
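To make the Collector side concrete, here is a minimal, illustrative Collector configuration. It is a sketch, not a production setup: the Jaeger service name and the `monitoring` namespace are assumptions, and the `jaeger` receiver is included because the app example above emits Jaeger Thrift spans on port 6831.

```yaml
# otel-collector-config.yaml - a sketch, assuming Jaeger runs in the "monitoring" namespace
receivers:
  jaeger:                      # accepts the Jaeger Thrift spans sent by the app example above
    protocols:
      thrift_compact:
        endpoint: 0.0.0.0:6831
  otlp:                        # also accept OTLP, the OpenTelemetry-native protocol
    protocols:
      grpc: {}
      http: {}
processors:
  batch: {}
exporters:
  otlp:
    endpoint: jaeger-collector.monitoring:4317   # assumed in-cluster Jaeger OTLP endpoint
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [batch]
      exporters: [otlp]
```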
Best Practices for Tracing:
- Consistency: Ensure all services in your distributed system are instrumented consistently.
- Context Propagation: Use HTTP headers or other mechanisms to propagate trace context (trace ID, span ID) across service calls.
- Sampling: Implement sampling to manage the volume of trace data, especially in high-traffic environments.
- Link with Logs/Metrics: Correlate trace IDs with logs and metrics for a holistic view of your application’s health.
Exercise/Mini-Challenge (Conceptual):
- Research OpenTelemetry for a programming language you are familiar with (e.g., Python, Node.js, Java).
- Find an example of instrumenting a simple web service with OpenTelemetry.
- Outline the steps you would take to deploy this application to AKS, send its traces to an OpenTelemetry Collector (deployed via Helm), and visualize them in a tool like Jaeger (also deployed via Helm).
4.3 Handling Production Situations
4.3.1 Health Checks (Readiness and Liveness Probes)
Kubernetes uses probes to determine the health of your application Pods:
- Liveness Probe: Checks if a container is running. If it fails, Kubernetes restarts the container. Essential for applications that might get into a broken state without crashing.
- Readiness Probe: Checks if a container is ready to serve traffic. If it fails, Kubernetes removes the Pod from Service load balancers. Useful for applications that need time to warm up or load data.
Types of Probes:
- HTTP GET: Makes an HTTP request to a specified path on a given port.
- TCP Socket: Attempts to open a TCP socket on a specified port.
- Exec: Executes a command inside the container and checks the exit code.
Code Example: Nginx with Liveness and Readiness Probes
# nginx-probes.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-probes-deployment
labels:
app: nginx-probes
spec:
replicas: 1
selector:
matchLabels:
app: nginx-probes
template:
metadata:
labels:
app: nginx-probes
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
livenessProbe:
httpGet:
path: /healthz
port: 80
initialDelaySeconds: 5 # Wait 5 seconds before first check
periodSeconds: 5 # Check every 5 seconds
failureThreshold: 3 # After 3 failures, restart
readinessProbe:
httpGet:
path: /ready
port: 80
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 1 # After 1 failure, take out of service
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "echo 'OK' > /usr/share/nginx/html/healthz; echo 'OK' > /usr/share/nginx/html/ready"] # Create health endpoints
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 10 && rm -f /usr/share/nginx/html/ready"] # Simulate unreadiness before shutdown
Deploy:
kubectl apply -f nginx-probes.yaml
To observe, you would expose this with a Service. To test liveness failure, you could kubectl exec into the pod and rm /usr/share/nginx/html/healthz. Kubernetes should restart the container. To test readiness, you could rm /usr/share/nginx/html/ready. The Pod would remain running but be removed from the service endpoints.
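For example, assuming the Deployment above is running and your shell has access to the cluster, you could exercise both probes like this (`deploy/...` is shorthand that picks one of the Deployment's Pods; the Service name is a placeholder):

```bash
# Trigger a liveness failure - after 3 failed checks Kubernetes restarts the container
kubectl exec deploy/nginx-probes-deployment -- rm /usr/share/nginx/html/healthz
kubectl get pods -l app=nginx-probes -w      # watch the RESTARTS column increase

# Trigger a readiness failure - the Pod stays Running but leaves the Service endpoints
kubectl exec deploy/nginx-probes-deployment -- rm /usr/share/nginx/html/ready
kubectl get endpoints <your-service-name>    # the Pod's IP disappears from the list
```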
Clean up:
kubectl delete -f nginx-probes.yaml
Best Practices for Probes:
- Realistic checks: Probes should reflect the actual health and readiness of your application.
- Lightweight endpoints: Health endpoints should be fast and not resource-intensive.
- Graceful shutdown: Use `preStop` hooks to gracefully drain traffic and perform cleanup before termination.
- Configuration: Tune `initialDelaySeconds`, `periodSeconds`, `timeoutSeconds`, and `failureThreshold` based on application characteristics.
Exercise/Mini-Challenge:
- Deploy an application (e.g., a simple Python Flask app) with both a liveness probe (checking `/healthz`) and a readiness probe (checking `/ready`).
- Implement a `/healthz` endpoint that always returns 200.
- Implement a `/ready` endpoint that initially returns 200, but after 30 seconds starts returning 500 (simulating unreadiness). A minimal app that behaves this way is sketched after this list.
- Observe the Pod's status change to "Not Ready" after 30 seconds without the Pod restarting.
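One possible sketch of such an app, assuming Flask is available; the 30-second threshold and port 5000 are arbitrary choices for this exercise:

```python
# health_app.py - illustrative Flask app for the probe exercise
import time
from flask import Flask

app = Flask(__name__)
START = time.time()

@app.route("/healthz")
def healthz():
    return "OK", 200                      # liveness: always healthy

@app.route("/ready")
def ready():
    if time.time() - START > 30:          # simulate becoming unready after 30 seconds
        return "not ready", 500
    return "ready", 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```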
4.3.2 Resource Quotas and Limit Ranges
Resource Quotas and Limit Ranges are mechanisms to manage and constrain resource consumption within a Kubernetes cluster.
- Resource Quotas: Limits the total amount of resources (CPU, memory, storage) that can be consumed by all Pods within a namespace.
- Limit Ranges: Enforces default resource limits and requests for Pods within a namespace, and can also define minimum and maximum resource constraints for containers.
Why use them?
- Resource Governance: Prevent resource starvation and ensure fair resource distribution across teams/namespaces.
- Cost Control: Manage cloud spending by setting limits on resource usage.
- Stability: Prevent “noisy neighbor” issues where one application consumes excessive resources.
Code Example: Resource Quota and Limit Range in a Namespace
1. **Create a namespace**:
   ```bash
   kubectl create namespace constrained-env
   ```
2. **Define a Resource Quota**:
   ```yaml
   # resource-quota.yaml
   apiVersion: v1
   kind: ResourceQuota
   metadata:
     name: dev-quota
     namespace: constrained-env
   spec:
     hard:
       pods: "10"                    # Max 10 pods
       requests.cpu: "1"             # Total CPU requests for all pods: max 1 CPU core
       requests.memory: "2Gi"        # Total memory requests: max 2 GB
       limits.cpu: "2"               # Total CPU limits for all pods: max 2 CPU cores
       limits.memory: "4Gi"          # Total memory limits: max 4 GB
       persistentvolumeclaims: "2"   # Max 2 PVCs
       requests.storage: "5Gi"       # Total storage requests: max 5 GB
   ```
3. **Define a Limit Range**:
```yaml
# limit-range.yaml
apiVersion: v1
kind: LimitRange
metadata:
name: cpu-mem-limit-range
namespace: constrained-env
spec:
limits:
- default: # Default limits if not specified by container
cpu: 500m
memory: 512Mi
defaultRequest: # Default requests if not specified by container
cpu: 100m
memory: 256Mi
max: # Maximum allowed for a single container
cpu: 1
memory: 1Gi
type: Container
```
Apply them to the namespace:
```bash
kubectl apply -f resource-quota.yaml -n constrained-env
kubectl apply -f limit-range.yaml -n constrained-env
```
Now, try to deploy a Pod without specifying requests/limits in `constrained-env`:
# test-pod-no-limits.yaml
apiVersion: v1
kind: Pod
metadata:
name: test-pod-no-limits
namespace: constrained-env
spec:
containers:
- name: my-container
image: busybox:latest
command: ["sleep", "3600"]
Deploy: kubectl apply -f test-pod-no-limits.yaml -n constrained-env
kubectl describe pod test-pod-no-limits -n constrained-env will show that default requests and limits were applied from the LimitRange.
Now, try to create a Pod that violates the quota (e.g., requesting 3 CPU cores):
# test-pod-exceed-quota.yaml
apiVersion: v1
kind: Pod
metadata:
name: test-pod-exceed-quota
namespace: constrained-env
spec:
containers:
- name: my-container
image: busybox:latest
command: ["sleep", "3600"]
resources:
requests:
cpu: "3" # Exceeds resource quota total requests.cpu: 1
limits:
cpu: "3" # Exceeds resource quota total limits.cpu: 2
Deploying this will result in an Error from server (Forbidden) due to resource quota violation.
Clean up:
kubectl delete -f test-pod-no-limits.yaml -n constrained-env
# kubectl delete -f test-pod-exceed-quota.yaml -n constrained-env (if it was created)
kubectl delete -f limit-range.yaml -n constrained-env
kubectl delete -f resource-quota.yaml -n constrained-env
kubectl delete namespace constrained-env
Best Practices for Resource Governance:
- Define for every namespace: Apply Resource Quotas and Limit Ranges to all non-system namespaces.
- Sensible defaults: Set reasonable `defaultRequest` and `default` limits in `LimitRange`s to prevent runaway containers.
- Monitor usage: Monitor actual resource consumption against quotas to fine-tune your limits (see the commands sketched below).
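For example, with the `constrained-env` namespace and objects created above, you can compare consumed resources against the quota at any time:

```bash
# The "Used" vs. "Hard" columns show how much of the quota is currently consumed
kubectl describe resourcequota dev-quota -n constrained-env

# Shows the default requests/limits that will be injected into new containers
kubectl describe limitrange cpu-mem-limit-range -n constrained-env
```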
Exercise/Mini-Challenge:
- Create a namespace `staging`.
- Apply a `ResourceQuota` to `staging` that limits `pods` to 5 and `requests.memory` to 1Gi.
- Apply a `LimitRange` to `staging` that sets a `defaultRequest.memory` of 128Mi and `default.memory` of 256Mi for containers.
- Deploy a Deployment with 4 replicas of a simple `nginx` app (without explicit memory requests/limits). Verify they use the default values and count towards the quota.
- Try to scale the Deployment past the 5-Pod quota (e.g., to 6 replicas) and observe the quota error.
4.4 Automated Deployment with CI/CD (GitHub Actions)
Continuous Integration/Continuous Delivery (CI/CD) pipelines automate the process of building, testing, and deploying your applications. GitHub Actions provides a powerful, flexible, and fully integrated CI/CD solution directly within your GitHub repositories.
Workflow for deploying to AKS with GitHub Actions:
- Code Commit: Developer pushes code to GitHub.
- CI Build: GitHub Action triggers, builds the Docker image, runs tests.
- Image Push: Pushes the Docker image to Azure Container Registry (ACR).
- CD Deploy: GitHub Action triggers, logs into AKS, and deploys the Helm Chart (or Kubernetes manifests).
Key GitHub Actions components for AKS/Helm:
- `azure/login`: Authenticate to Azure.
- `azure/docker-login`: Authenticate to ACR.
- `azure/aks-set-context`: Set the `kubectl` context to AKS.
- `helm/helm-action`: Install/upgrade Helm Charts.
- `actions/checkout`: Check out your repository code.
Code Example: GitHub Actions for Helm Deployment to AKS
1. **Set up an Azure Service Principal**: GitHub Actions needs credentials to interact with Azure. Create a Service Principal with the `Contributor` role on your AKS resource group and ACR:
   ```bash
   az ad sp create-for-rbac --name "github-actions-sp" --role contributor \
     --scopes /subscriptions/<subscription-id>/resourceGroups/<your-aks-resource-group> \
     --sdk-auth
   ```
   This will output a JSON object. Save it as a GitHub Secret (e.g., `AZURE_CREDENTIALS`).
2. **Create an ACR (Azure Container Registry)**:
   ```bash
   az acr create --resource-group myAKSResourceGroup --name myacr2025example --sku Basic --admin-enabled true
   ```
   Note your ACR login server (e.g., `myacr2025example.azurecr.io`).
3. **Create your application and Helm chart**: Assume you have a simple Nginx application and the `mynginxapp` Helm chart from earlier.
4. **`.github/workflows/deploy.yaml`**:
   ```yaml
   # .github/workflows/deploy.yaml
   name: Deploy to AKS
   on:
     push:
       branches:
         - main
     workflow_dispatch:   # Allows manual trigger
   env:
     AZURE_CONTAINER_REGISTRY: myacr2025example.azurecr.io   # Replace with your ACR name
     AZURE_AKS_CLUSTER_NAME: myAKSCluster                    # Replace with your AKS cluster name
     RESOURCE_GROUP: myAKSResourceGroup                      # Replace with your AKS resource group
     HELM_CHART_PATH: mynginxapp                             # Path to your Helm chart directory
     HELM_RELEASE_NAME: my-webapp
   jobs:
     build-and-deploy:
       runs-on: ubuntu-latest
       steps:
         - name: Checkout repository
           uses: actions/checkout@v4
         - name: Azure Login
           uses: azure/login@v1
           with:
             creds: ${{ secrets.AZURE_CREDENTIALS }}
         - name: Docker Login to ACR
           uses: azure/docker-login@v1
           with:
             login-server: ${{ env.AZURE_CONTAINER_REGISTRY }}
             username: ${{ secrets.ACR_USERNAME }}   # Use ACR Admin username
             password: ${{ secrets.ACR_PASSWORD }}   # Use ACR Admin password
         - name: Build and Push Docker image
           run: |
             docker build . -t ${{ env.AZURE_CONTAINER_REGISTRY }}/mywebapp:${{ github.sha }}
             docker push ${{ env.AZURE_CONTAINER_REGISTRY }}/mywebapp:${{ github.sha }}
         - name: Set AKS Kubeconfig
           uses: azure/aks-set-context@v1
           with:
             resource-group: ${{ env.RESOURCE_GROUP }}
             cluster-name: ${{ env.AZURE_AKS_CLUSTER_NAME }}
         - name: Install or Upgrade Helm Chart
           uses: helm/helm-action@v1.2.0   # Use a specific version
           with:
             command: upgrade
             chart: ${{ env.HELM_CHART_PATH }}
             release-name: ${{ env.HELM_RELEASE_NAME }}
             namespace: default            # Or your target namespace
             values: |
               image:
                 repository: ${{ env.AZURE_CONTAINER_REGISTRY }}/mywebapp
                 tag: ${{ github.sha }}
               service:
                 type: LoadBalancer        # Expose for testing
             set-values: |                 # Example of setting individual values
               replicaCount=2
             wait: true                    # Wait for the deployment to be ready
             atomic: true                  # Rollback on failure
   ```
5. **GitHub Secrets**:
   - `AZURE_CREDENTIALS`: The JSON output from `az ad sp create-for-rbac`.
   - `ACR_USERNAME`, `ACR_PASSWORD`: Get these from your ACR with `az acr credential show --name <your-acr-name> --query 'username'` and `--query 'passwords[0].value'`.
6. **Commit and push**: Push this `deploy.yaml` to your `main` branch. GitHub Actions will trigger the workflow.
Best Practices for CI/CD with AKS/Helm:
- GitOps: Adopt a GitOps approach where all desired state (infrastructure and application configs) is stored in Git, and an operator (like Argo CD or Flux CD) applies changes to the cluster. GitHub Actions can be used to update the Git repository that the GitOps operator watches.
- Separate Environments: Use different branches or separate workflows/pipelines for dev, staging, and production environments.
- Security: Use short-lived credentials (OpenID Connect with Azure AD for GitHub Actions) instead of long-lived Service Principal secrets when possible, and grant least-privilege permissions (a login sketch follows this list).
- Testing: Integrate unit tests, integration tests, and end-to-end tests into your CI pipeline. Use `helm lint` and `helm template --debug` in CI.
- Rollback Strategy: Ensure your deployments are atomic and have clear rollback procedures.
- Notifications: Configure notifications for pipeline success/failure.
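A minimal sketch of the OIDC-based login mentioned above. It assumes you have created a federated credential on an Azure AD app registration (or user-assigned managed identity) for your repository and stored its IDs as the secrets shown; the job and step names are illustrative, and this is only a workflow fragment:

```yaml
# Illustrative fragment: OIDC login instead of a stored Service Principal secret
permissions:
  id-token: write      # lets the job request an OIDC token from GitHub
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
```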
Exercise/Mini-Challenge:
- Set up an Azure Service Principal and add it to your GitHub repository secrets.
- Create a simple Dockerfile for a basic web server (e.g., Python Flask).
- Create a Helm Chart for this web server.
- Implement a GitHub Actions workflow that:
- Builds the Docker image.
- Pushes the image to your ACR.
- Deploys/upgrades the Helm Chart to your AKS cluster, using the newly built image tag.
- Trigger the workflow and verify the deployment in AKS.
4.5 Secret Management (Azure Key Vault Provider for Secrets Store CSI Driver)
Kubernetes Secrets store sensitive data, but they are base64 encoded, not truly encrypted by default at rest in all scenarios. For enhanced security, especially in production, you should integrate with a dedicated secret management solution. Azure Key Vault is Azure’s fully managed secret store, and the Secrets Store CSI Driver allows you to mount secrets from Key Vault directly into your Pods as volumes.
Benefits:
- Centralized Secret Management: Manage all secrets in a secure, audited Key Vault.
- Encryption at Rest: Key Vault encrypts secrets at rest.
- Reduced Attack Surface: Secrets are not stored directly in Kubernetes etcd, but are injected dynamically.
- Automatic Rotation: Leverage Key Vault’s secret rotation capabilities.
How it works:
- Install Secrets Store CSI Driver and Azure Key Vault Provider: These components are installed in your AKS cluster.
- `SecretProviderClass`: Defines which secrets to fetch from Key Vault.
- Pod `volumeMounts`: Pods reference the `SecretProviderClass` and mount the secrets as a volume.
- Managed Identity: The Pod uses an Azure AD Managed Identity to authenticate to Key Vault.
Code Example: Azure Key Vault Secrets with CSI Driver
1. **Enable the Azure Key Vault Secrets Provider add-on for AKS**: This is the easiest way to deploy the CSI driver and provider.
   ```bash
   az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myAKSResourceGroup
   ```
   Ensure your AKS cluster has a Managed Identity (system-assigned or user-assigned).
2. **Create an Azure Key Vault and a secret**:
   ```bash
   az keyvault create --name my-aks-keyvault-2025 --resource-group myAKSResourceGroup --location eastus
   az keyvault secret set --vault-name my-aks-keyvault-2025 --name MyDbPassword --value "SuperSecretP@ssw0rd!"
   ```
3. **Grant the AKS Managed Identity access to Key Vault**: Get the object ID of your AKS cluster's identity (if system-assigned):
   ```bash
   AKS_MI_OBJECT_ID=$(az aks show --resource-group myAKSResourceGroup --name myAKSCluster --query identity.principalId -o tsv)
   # Or for the Kubelet identity (more granular):
   # KUBELET_MI_CLIENT_ID=$(az aks show -g myAKSResourceGroup -n myAKSCluster --query identityProfile.kubeletidentity.clientId -o tsv)

   # Grant Get and List permissions on secrets
   az keyvault set-policy --name my-aks-keyvault-2025 --resource-group myAKSResourceGroup --object-id $AKS_MI_OBJECT_ID --secret-permissions get list
   ```
   (Note: Using the cluster's Kubelet identity or a dedicated user-assigned managed identity is a best practice for production. For simplicity, we use the main AKS identity above.)
4. **Create a `SecretProviderClass`**:
   ```yaml
   # secret-provider-class.yaml
   apiVersion: secrets-store.csi.k8s.io/v1
   kind: SecretProviderClass
   metadata:
     name: azure-kv-secrets
     namespace: default                         # Or your target namespace
   spec:
     provider: azure
     parameters:
       usePodIdentity: "false"                  # Use cluster-level identity (Kubelet or AKS MI)
       useVMManagedIdentity: "true"
       # tenantId: "<your-azure-tenant-id>"     # Optional, if not using the AKS cluster MI
       keyvaultName: my-aks-keyvault-2025       # Replace with your Key Vault name
       objects: |
         array:
           - |
             objectName: MyDbPassword
             objectType: secret
             objectVersion: ""                  # Use latest version
   ```
5. **Deploy a Pod to consume the secret**:
   ```yaml
   # pod-with-kv-secret.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: myapp-kv-deployment
     labels:
       app: myapp-kv
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: myapp-kv
     template:
       metadata:
         labels:
           app: myapp-kv
       spec:
         containers:
           - name: myapp-container
             image: busybox:latest
             command: ["sh", "-c", "echo 'Application started...'; cat /mnt/secrets-store/MyDbPassword; sleep 3600"]
             volumeMounts:
               - name: secrets-store-volume
                 mountPath: "/mnt/secrets-store"
                 readOnly: true
         volumes:
           - name: secrets-store-volume
             csi:
               driver: secrets-store.csi.k8s.io
               readOnly: true
               volumeAttributes:
                 secretProviderClass: azure-kv-secrets
   ```
Deploy the SecretProviderClass and the Pod:
kubectl apply -f secret-provider-class.yaml
kubectl apply -f pod-with-kv-secret.yaml
Check the pod logs:
kubectl logs -l app=myapp-kv
You should see “SuperSecretP@ssw0rd!” printed, demonstrating the secret was mounted.
Clean up:
kubectl delete -f pod-with-kv-secret.yaml
kubectl delete -f secret-provider-class.yaml
az keyvault delete --name my-aks-keyvault-2025 --resource-group myAKSResourceGroup
az keyvault purge --name my-aks-keyvault-2025 # Permanent delete if soft-delete is enabled
Best Practices for Secret Management:
- Centralized Key Vault: Use Azure Key Vault for all application secrets.
- Managed Identities: Always use Azure AD Managed Identities for AKS and application Pods to authenticate to Key Vault, following the principle of least privilege.
- CSI Driver: Leverage the Secrets Store CSI Driver for dynamic secret injection, avoiding direct storage in etcd.
- Rotation: Implement secret rotation policies in Key Vault.
- Auditing: Use Key Vault’s auditing capabilities to track access to secrets.
Exercise/Mini-Challenge:
- Create a second secret in your Azure Key Vault (e.g., `ApiToken`).
- Modify the `SecretProviderClass` to also fetch this `ApiToken`.
- Modify your `myapp-kv-deployment` Pod to mount and print the `ApiToken` as well.
- Verify that both secrets are successfully mounted and accessible by the Pod.
5. Guided Projects
These projects will consolidate your learning by guiding you through deploying more complete applications.
Project 1: Deploying a Multi-Tier Web Application with Helm to AKS
Objective: Deploy a simple multi-tier web application (e.g., a frontend, backend API, and a database) to AKS using a single Helm Chart. Configure Ingress for external access and persistent storage for the database.
Problem Statement: You need to deploy a Guestbook application, which consists of a Python Flask frontend, a Redis database, and potentially an Ingress.
Steps:
1. **Clone the sample application and Helm chart (or create your own)**: For simplicity, we'll outline the structure; you can create these files yourself.
   ```
   guestbook-chart/
     Chart.yaml
     values.yaml
     charts/                      # redis subchart
     templates/
       frontend-deployment.yaml
       frontend-service.yaml
       backend-deployment.yaml    # Optional, if you add a backend API
       backend-service.yaml       # Optional
       ingress.yaml               # Optional, if ingress is enabled
   ```
2. **Frontend (Python Flask) Dockerfile** (example `guestbook-chart/app/Dockerfile`):
   ```dockerfile
   # Dockerfile
   FROM python:3.9-slim-buster
   WORKDIR /app
   COPY requirements.txt .
   RUN pip install -r requirements.txt
   COPY . .
   EXPOSE 5000
   CMD ["python", "app.py"]
   ```
   (`app.py` would connect to Redis and display/store guestbook entries; `requirements.txt` would contain `Flask` and `redis`. A minimal sketch of `app.py` appears after these steps.)
3. **Create a Helm Chart (`guestbook-chart`)**:
   ```bash
   helm create guestbook-chart
   cd guestbook-chart
   ```
4. **Add Redis as a subchart dependency**: Edit `Chart.yaml`:
   ```yaml
   # guestbook-chart/Chart.yaml
   apiVersion: v2
   name: guestbook-app
   description: A multi-tier guestbook application
   version: 0.1.0
   appVersion: "1.0.0"
   dependencies:
     - name: redis
       version: 17.x.x   # Use a recent stable version
       repository: https://charts.bitnami.com/bitnami
       condition: redis.enabled
   ```
   Run `helm dependency update .`
5. **Configure `values.yaml`**:
   - Enable Redis and set a password.
   - Define the frontend image (build your own and push to ACR).
   - Configure the service type (e.g., LoadBalancer for the frontend) or Ingress.
   - Set replica counts.
   ```yaml
   # guestbook-chart/values.yaml
   frontend:
     image:
       repository: <your-acr-name>.azurecr.io/guestbook-frontend
       tag: latest
     replicaCount: 2
     service:
       type: LoadBalancer   # Or ClusterIP if using Ingress
       port: 80
       targetPort: 5000
   redis:
     enabled: true
     password: "myredispassword"
     master:
       persistence:
         enabled: true
         size: 1Gi
   ingress:
     enabled: false         # Set to true to enable ingress
     className: "nginx"
     host: guestbook.local
     annotations: {}
   ```
6. **Create frontend Kubernetes manifests in `templates/`**:
   - `frontend-deployment.yaml` (using your custom image, connecting to Redis via the `redis-master` service from the subchart).
   - `frontend-service.yaml` (expose the frontend).
7. **Build and push the frontend Docker image**: You'll need to build your Python Flask app image and push it to your ACR.
   ```bash
   # From the guestbook-chart/app directory
   docker build -t <your-acr-name>.azurecr.io/guestbook-frontend:latest .
   docker push <your-acr-name>.azurecr.io/guestbook-frontend:latest
   ```
8. **Deploy the Helm Chart**:
   ```bash
   helm install guestbook-release guestbook-chart/
   ```
9. **Verify and test**:
   - `kubectl get pods`, `kubectl get svc`.
   - Access the frontend via the LoadBalancer IP or Ingress hostname.
   - Verify guestbook entries persist across frontend Pod restarts (Redis persistence).
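The sketch below is one way `app.py` (step 2) could look. The Redis host name assumes the Bitnami subchart's default naming for a release called `guestbook-release`, and the `REDIS_HOST`/`REDIS_PASSWORD` environment variables are illustrative; wire them up in your Deployment template:

```python
# guestbook-chart/app/app.py - a minimal, illustrative implementation
import os
from flask import Flask, request
import redis

app = Flask(__name__)
# The Bitnami Redis subchart exposes the master as <release>-redis-master; adjust to your release name.
r = redis.Redis(
    host=os.environ.get("REDIS_HOST", "guestbook-release-redis-master"),
    port=6379,
    password=os.environ.get("REDIS_PASSWORD", ""),
    decode_responses=True,
)

@app.route("/", methods=["GET", "POST"])
def guestbook():
    if request.method == "POST":
        r.rpush("entries", request.form.get("entry", ""))   # store a new entry
    entries = r.lrange("entries", 0, -1)                    # read all entries
    return "<br>".join(entries) or "No entries yet."

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```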
Encourage independent problem-solving:
- Task: Implement a `/healthz` and `/ready` endpoint in the Flask frontend and add liveness/readiness probes to its Deployment.
- Task: Configure `ResourceQuotas` and `LimitRanges` for the namespace where the Guestbook app is deployed.
- Task: Modify the chart to support an optional backend API service if `backend.enabled` is `true` in `values.yaml` (a sketch of the conditional template follows).
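One way to approach the optional-backend task is to wrap the backend manifests in a condition on `.Values.backend.enabled`. The value names under `backend` below (`replicaCount`, `image.*`) are assumptions you would define in `values.yaml`:

```yaml
{{- if .Values.backend.enabled }}
# templates/backend-deployment.yaml - rendered only when backend.enabled is true
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-backend
spec:
  replicas: {{ .Values.backend.replicaCount | default 1 }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-backend
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-backend
    spec:
      containers:
        - name: backend
          image: "{{ .Values.backend.image.repository }}:{{ .Values.backend.image.tag }}"
          ports:
            - containerPort: 8080
{{- end }}
```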
Project 2: Implementing CI/CD with GitHub Actions, IaC, and Secret Management
Objective: Extend Project 1 to include a full CI/CD pipeline using GitHub Actions for automated deployment to an AKS cluster provisioned with Terraform, securely managing secrets with Azure Key Vault.
Problem Statement: Automate the deployment of the Guestbook application from Project 1 to AKS using GitHub Actions, while the AKS cluster itself is managed by Terraform, and Redis password is fetched from Azure Key Vault.
Steps:
1. **Review the Terraform AKS setup**: Ensure your AKS cluster is provisioned via Terraform as described in Section 4.1. The AKS cluster should have the Azure Key Vault Secrets Provider add-on enabled.
2. **Configure Azure Key Vault**: Create an Azure Key Vault and store the Redis password (e.g., `RedisPassword`) there. Grant the AKS cluster's managed identity (or a dedicated user-assigned managed identity) `get` and `list` permissions to secrets in the Key Vault.
3. **Create a `SecretProviderClass`**: Define a `SecretProviderClass` in Kubernetes that fetches the `RedisPassword` from your Azure Key Vault (a sketch appears after these steps).
4. **Modify `guestbook-chart` to use the Key Vault secret**:
   - In `guestbook-chart/templates/redis-deployment.yaml` (or wherever the Redis password is configured), modify it to consume the secret mounted by the CSI driver.
   - Mount the secrets volume into the Redis Pod:
   ```yaml
   # Example snippet for the Redis Pod in its deployment template
   spec:
     # ...
     volumes:
       - name: secrets-store-volume
         csi:
           driver: secrets-store.csi.k8s.io
           readOnly: true
           volumeAttributes:
             secretProviderClass: azure-kv-secrets    # Name of your SecretProviderClass
     containers:
       - name: redis
         # ...
         volumeMounts:
           - name: secrets-store-volume
             mountPath: "/mnt/secrets-store"          # Mount path for secrets
         env:
           - name: REDIS_PASSWORD                     # Redis chart usually expects this env var
             valueFrom:
               secretKeyRef:
                 name: redis-secret-from-kv           # Kubernetes Secret synced from Key Vault
                 key: RedisPassword                   # Key inside that Secret
   # You also need a Kubernetes Secret for Redis to reference. This is often done with an
   # initContainer or a separate chart for secret syncing. For simplicity, if the Redis chart
   # can read the password directly from the mounted file, use that. Otherwise, the CSI driver
   # can sync a Kubernetes Secret directly from Key Vault via secretObjects in the
   # SecretProviderClass (see the sketch after these steps).
   ```
5. **Create the GitHub Actions workflow** (`.github/workflows/deploy-guestbook.yaml`):
   - IaC pipeline (Terraform): a separate job/workflow that runs `terraform plan` and `terraform apply` for your AKS infrastructure.
   - Application CI/CD pipeline:
     - Checks out code.
     - Logs into Azure (using the `AZURE_CREDENTIALS` secret).
     - Logs into ACR (`ACR_USERNAME`, `ACR_PASSWORD` secrets).
     - Builds the `guestbook-frontend` Docker image and pushes it to ACR (tagged with `github.sha`).
     - Sets the `kubectl` context to your AKS cluster.
     - Deploys the `SecretProviderClass` (if not already managed by Terraform).
     - Installs or upgrades the `guestbook-chart` Helm release to AKS, passing the dynamic image tag.
   ```yaml
   # .github/workflows/deploy-guestbook.yaml
   name: Guestbook App CI/CD to AKS
   on:
     push:
       branches:
         - main
       paths:
         - 'guestbook-chart/**'
         - 'azure-aks-infra/**'   # If you include Terraform here
     workflow_dispatch:
   env:
     AZURE_CONTAINER_REGISTRY: myacr2025example.azurecr.io
     AZURE_AKS_CLUSTER_NAME: my-terraform-aks
     RESOURCE_GROUP: my-aks-terraform-rg
     HELM_CHART_PATH: guestbook-chart          # Path to your Helm chart directory
     HELM_RELEASE_NAME: guestbook-app-release
     FRONTEND_IMAGE_NAME: guestbook-frontend
   jobs:
     terraform-apply:
       runs-on: ubuntu-latest
       steps:
         - name: Checkout repository
           uses: actions/checkout@v4
         - name: Azure Login
           uses: azure/login@v1
           with:
             creds: ${{ secrets.AZURE_CREDENTIALS }}
         - name: Setup Terraform
           uses: hashicorp/setup-terraform@v3
         - name: Terraform Init
           run: terraform init
           working-directory: azure-aks-infra   # Your Terraform folder
         - name: Terraform Plan
           run: terraform plan
           working-directory: azure-aks-infra
         - name: Terraform Apply
           run: terraform apply -auto-approve
           working-directory: azure-aks-infra
           if: github.event_name == 'push' && github.ref == 'refs/heads/main'   # Only apply on push to main
           # Consider manual approval for production IaC deployments
     build-and-deploy-app:
       needs: terraform-apply   # Ensure infrastructure is ready
       runs-on: ubuntu-latest
       steps:
         - name: Checkout repository
           uses: actions/checkout@v4
         - name: Azure Login
           uses: azure/login@v1
           with:
             creds: ${{ secrets.AZURE_CREDENTIALS }}
         - name: Docker Login to ACR
           uses: azure/docker-login@v1
           with:
             login-server: ${{ env.AZURE_CONTAINER_REGISTRY }}
             username: ${{ secrets.ACR_USERNAME }}
             password: ${{ secrets.ACR_PASSWORD }}
         - name: Build and Push Frontend Docker image
           run: |
             docker build ${{ env.HELM_CHART_PATH }}/app -t ${{ env.AZURE_CONTAINER_REGISTRY }}/${{ env.FRONTEND_IMAGE_NAME }}:${{ github.sha }}
             docker push ${{ env.AZURE_CONTAINER_REGISTRY }}/${{ env.FRONTEND_IMAGE_NAME }}:${{ github.sha }}
         - name: Set AKS Kubeconfig
           uses: azure/aks-set-context@v1
           with:
             resource-group: ${{ env.RESOURCE_GROUP }}
             cluster-name: ${{ env.AZURE_AKS_CLUSTER_NAME }}
         - name: Deploy SecretProviderClass
           # You can also manage the SecretProviderClass via Helm or Terraform.
           # For this project, we assume its definition is a static .yaml file.
           run: kubectl apply -f secret-provider-class.yaml
           # working-directory: path/to/your/secret-provider-class-definition
         - name: Install or Upgrade Helm Chart
           uses: helm/helm-action@v1.2.0
           with:
             command: upgrade
             chart: ${{ env.HELM_CHART_PATH }}
             release-name: ${{ env.HELM_RELEASE_NAME }}
             namespace: default
             values: |
               frontend:
                 image:
                   repository: ${{ env.AZURE_CONTAINER_REGISTRY }}/${{ env.FRONTEND_IMAGE_NAME }}
                   tag: ${{ github.sha }}
               redis:
                 enabled: true       # Ensure subchart is enabled
                 password: ""        # Password now comes from the CSI driver
               service:
                 type: LoadBalancer
             set-values: |
               replicaCount=2
             wait: true
             atomic: true
   ```
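A sketch of the `SecretProviderClass` referenced in steps 3-5, using `secretObjects` to sync the Key Vault secret into a native Kubernetes Secret (`redis-secret-from-kv`) that the Redis template can reference. All names here are illustrative and must match what your chart expects:

```yaml
# secret-provider-class.yaml - illustrative; adjust names to your chart and Key Vault
apiVersion: secrets-store.csi.k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv-secrets
spec:
  provider: azure
  secretObjects:                             # sync the mounted object into a Kubernetes Secret
    - secretName: redis-secret-from-kv
      type: Opaque
      data:
        - key: RedisPassword                 # key inside the Kubernetes Secret
          objectName: RedisPassword          # must match an object fetched below
  parameters:
    useVMManagedIdentity: "true"
    # userAssignedIdentityID: "<client-id>"  # if using a user-assigned / add-on identity
    keyvaultName: <your-key-vault-name>
    objects: |
      array:
        - |
          objectName: RedisPassword
          objectType: secret
```

Note that the synced Kubernetes Secret is only created once at least one Pod actually mounts a CSI volume that references this `SecretProviderClass`.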
Encourage independent problem-solving:
- Task: Implement separate `dev`, `staging`, and `production` workflows/jobs in GitHub Actions, possibly using environment-specific secrets and values files for Helm.
- Task: Add a step in the GitHub Actions pipeline to run `helm lint` and `helm template --debug` to validate your Helm Chart before deployment.
- Task: Configure monitoring alerts in Azure Monitor for your deployed application (e.g., high CPU usage, HTTP 500 errors).
6. Bonus Section: Further Learning and Resources
Congratulations on making it this far! This guide provides a strong foundation, but the cloud-native landscape is vast and ever-evolving. Here are some resources to continue your learning journey:
Recommended Online Courses/Tutorials:
- Official Kubernetes Documentation: The best place for in-depth concepts and examples.
- Helm Documentation: Comprehensive guide for Helm Chart development and management.
- Azure Kubernetes Service (AKS) Documentation: Microsoft’s official documentation for AKS, including best practices and integrations.
- Pluralsight/Udemy/Coursera: Look for courses like “Certified Kubernetes Administrator (CKA)”, “Certified Kubernetes Application Developer (CKAD)”, or specific AKS and Helm courses.
- KodeKloud: Offers excellent hands-on labs and courses for Kubernetes.
Official Documentation:
- Kubernetes Documentation
- Helm Documentation
- Azure Kubernetes Service (AKS) Documentation
- Azure Container Registry (ACR) Documentation
- Azure Key Vault Documentation
- Terraform Azure Provider Documentation
- GitHub Actions Documentation
- OpenTelemetry Documentation
Blogs and Articles:
- Microsoft Azure Blog: Stay updated on the latest Azure and AKS features.
- Kubernetes Blog: Official blog for Kubernetes project updates.
- Helm Blog: News and updates from the Helm community.
- CNCF (Cloud Native Computing Foundation) Blog: Broader cloud-native ecosystem news.
- Developer.microsoft.com/reactor/events: Look for live or recorded sessions on AKS and related technologies.
YouTube Channels:
- Azure Friday: Weekly videos on various Azure services, often including AKS.
- Kubernetes Official Channel: Talks, tutorials, and conference recordings.
- TechWorld with Nana: Excellent beginner-friendly tutorials on Docker, Kubernetes, and DevOps.
- Fireship: Quick, engaging overviews of new technologies.
Community Forums/Groups:
- Stack Overflow: For specific technical questions.
- Kubernetes Slack: Active community for discussions and help (`#helm-users`, `#aks-users`, etc.).
- GitHub Issues/Discussions: For specific projects (Helm, AKS-engine, CSI drivers).
- Reddit Communities: r/kubernetes, r/helm, r/azure, r/devops.
Next Steps/Advanced Topics:
- Service Mesh (Istio/Linkerd): For advanced traffic management, security, and observability between microservices. AKS offers an Istio add-on.
- GitOps with Argo CD/Flux CD: Advanced automated deployment where Git is the single source of truth, and a controller reconciles the cluster state.
- Custom Resource Definitions (CRDs) and Operators: Extend Kubernetes functionality with custom resources and controllers to manage complex applications.
- Advanced Networking (Azure CNI, Calico): Deeper dive into network policies, IP addressing, and connectivity.
- AKS Security Best Practices: In-depth security hardening, vulnerability management, and compliance for AKS.
- Performance Tuning and Cost Optimization: Optimize resource requests/limits, auto-scaling, and cluster sizing for efficiency.
- Disaster Recovery and Business Continuity: Strategies for multi-region deployments and data backup/restore in AKS.
- Open Policy Agent (OPA) Gatekeeper: Policy enforcement for Kubernetes clusters to ensure compliance and security.
- Kubernetes with WebAssembly (Wasm): An emerging trend for running highly efficient, sandboxed workloads.
By continuously exploring these resources and engaging with the community, you’ll stay at the forefront of cloud-native development and master the complexities of Helm and Kubernetes on AKS. Happy learning!