Introduction

Kubernetes has become the standard for hosting containerized applications, with Azure Kubernetes Service (AKS) being one of the most popular managed options. AKS simplifies management by handling the control plane, but securing the environment remains the customer’s responsibility. The shared responsibility model requires you to focus on strengthening worker nodes, controlling cluster access, and ensuring that workloads and pods operate with the least privileges.
Security in Kubernetes involves multiple layers. The base layer is the host security of the virtual machines in the node pools. Next is the cluster layer, where identity, networking, and governance must be carefully established and maintained. At the top, securing pods and containers that run your business logic is crucial to prevent privilege escalation or unauthorized communication. This approach, known as “defense in depth,” emphasizes addressing security across various boundaries instead of relying on a single tool or configuration for full protection.

Let’s take a closer look at each of the layers that make up this defense in depth.

Host Security in AKS

Although AKS provides a managed service, the worker nodes are virtual machines within your subscription. This means that patching, upgrading, and monitoring are your responsibilities. The first decision you need to make is which operating system your nodes should run. AKS supports both Ubuntu and Azure Linux, formerly known as CBL-Mariner. Ubuntu is a more familiar choice for many teams, but Azure Linux has been specifically designed for Azure infrastructure. It has a smaller footprint, a more streamlined kernel, and faster patch delivery, which helps reduce the attack surface.
Keeping the hosts updated is a vital operational task. AKS enables automatic node pool upgrades, ensuring nodes are regularly cycled and patched with the latest security fixes. You can also set maintenance windows so that upgrades happen outside business-critical hours, reducing the impact of restarts. For teams that want maximum control, upgrading can also be done manually by recreating node pools, draining pods, and reattaching them after patching.

Storage and networking on the nodes need careful attention. All disks should be encrypted, using either platform-managed keys or customer-managed keys from Azure Key Vault. On the network side, placing nodes in a dedicated subnet with a network security group adds an extra layer of security. Inbound access should be highly restricted, and, whenever possible, the cluster API should be private to ensure only requests from the Azure backbone are accepted.

Monitoring host security is an ongoing responsibility. Azure Monitor for containers offers telemetry on CPU, memory, and disk usage, and can surface rogue processes and abnormal node behavior. Combining this with Microsoft Defender for Containers adds runtime protection that can identify suspicious binaries, cryptomining attempts, or privilege escalations at the node level.

Cluster Security

The cluster layer is where governance and identity meet. Access to the Kubernetes API is often the most critical control point. A public AKS cluster makes its API accessible to the internet, but in most cases, a private cluster is the better choice, ensuring traffic only flows through the Azure backbone. If a public endpoint is necessary, IP allowlists should be used to limit access to known corporate networks.

Authentication in AKS should always be integrated with Microsoft Entra ID (formerly Azure Active Directory). This lets you map groups from your corporate directory to Kubernetes roles. For example, a developer group can be granted permissions to list and create pods in a development namespace, while a platform engineering group may hold broader rights for cluster setup. The essential principle here is least privilege: the cluster-admin role should only be used in emergencies, and audit trails should be kept for all actions.

 

What “Least Privilege” Really Means

The principle of least privilege is fundamental to security in Kubernetes and beyond. It means that each user, service account, and workload should only have the permissions necessary to perform their tasks — no more, no less.

For example, a developer who only needs to read logs from a single namespace should not be granted full cluster-admin rights to simplify the process. Such broad access could allow them to delete pods, create secrets, or even shut down the cluster, and if compromised, could cause severe damage.

In Kubernetes, enforcing least privilege involves precisely mapping Entra ID groups to Kubernetes RBAC roles, setting permissions at the namespace level when possible, and regularly reviewing bindings. By limiting privileges, you reduce the impact of potential breaches and hinder attackers from moving laterally within the cluster. Think of least privilege as creating secure corridors: users and workloads can only access designated areas, keeping everything else protected.

Role-Based Access Control (RBAC) provides the mechanism for defining these permissions. For instance, the following role grants developers limited access inside a namespace:

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: dev
  name: developer-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "create", "delete"]
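A Role on its own grants nothing; it must be bound to a principal. As a sketch, the following RoleBinding maps a hypothetical Entra ID developer group to the role above — the object ID is a placeholder you would replace with your own group’s ID:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: dev
  name: developer-role-binding
subjects:
  - kind: Group
    # Object ID of the Entra ID developer group (placeholder value)
    name: "00000000-0000-0000-0000-000000000000"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role
  apiGroup: rbac.authorization.k8s.io

Because the binding is namespaced, members of the group receive these permissions only in the dev namespace, keeping the blast radius small.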

Regular auditing of roles and bindings is essential, since lingering permissions are a common source of escalation in compromised clusters.

Networking within the cluster should follow a zero-trust approach. AKS supports Azure CNI Overlay as well as Azure CNI powered by Cilium for pod networking. Enabling network policies ensures that pods only communicate with the services they actually need. Starting from a deny-all policy, you add specific allow rules as required. Combined with Azure Firewall or a Web Application Firewall at ingress points, this setup puts multiple security layers between workloads and external access.
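A minimal deny-all starting point could look like the following NetworkPolicy, which selects every pod in a namespace (the namespace name here is illustrative) and blocks all traffic until explicit allow rules are added:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}        # an empty selector matches all pods in the namespace
  policyTypes:
    - Ingress
    - Egress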

Secrets and configuration management require careful handling. Storing passwords or connection strings in ConfigMaps poses a security risk. Instead, Azure Key Vault should serve as the central repository for secrets, mounted into pods via the Secrets Store CSI Driver. For instance, a pod can access a database connection string directly from Key Vault by mounting it into its file system, thereby removing the necessity to embed secrets in YAML files.
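As a sketch of how this looks with the Secrets Store CSI Driver, a SecretProviderClass tells the driver which Key Vault objects to mount — the vault name, secret name, and tenant ID below are placeholders for illustration:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-secrets
spec:
  provider: azure
  parameters:
    keyvaultName: "my-keyvault"          # placeholder vault name
    tenantId: "<tenant-id>"              # your Entra ID tenant
    objects: |
      array:
        - |
          objectName: db-connection-string
          objectType: secret

A pod then references this class in a CSI volume, and the secret appears as a file in the container’s file system rather than as plain text in a manifest.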

Another key concern is securing the supply chain. Container images must be sourced from trusted registries, such as Azure Container Registry (ACR). Defender for Containers can automatically scan images on push to ACR, identifying vulnerabilities before deployment. To enforce this within the cluster, admission controllers such as Gatekeeper with Open Policy Agent (OPA) or Kyverno can be configured to reject images that come from untrusted sources or lack proper signatures. Signing images with Notary v2 or Cosign guarantees their integrity, and policies can require signature verification at admission time.
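As an example of registry enforcement, a Kyverno ClusterPolicy along these lines would reject any pod whose image does not come from an approved registry — the registry name is a placeholder for your own ACR instance:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
    - name: allow-only-trusted-registry
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from the approved registry."
        pattern:
          spec:
            containers:
              # placeholder registry; replace with your ACR login server
              - image: "myregistry.azurecr.io/*"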

 

Notary vs. Cosign: What You Need to Know

When discussing supply chain security in Kubernetes, the issue of image signing naturally arises. The two main tools you’ll often see are Notary and Cosign. Although both aim to verify that container images are authentic and unchanged, they handle the task in different ways.

Notary has a longstanding history and is closely linked with Docker and OCI registries. Its second version, Notary v2, is tailored to function within the OCI artifacts ecosystem, embedding signatures and metadata directly into registries. With Notary, the trust is anchored in the registry, allowing policies to ensure that only signed and verified images are admitted into your cluster. In Azure, this naturally integrates with Azure Container Registry, which utilizes Notary to support content trust.

Cosign, part of the Sigstore project, adopts a modern approach by not relying solely on registry-based trust. It can attach signatures as OCI artifacts and introduces ‘keyless signing,’ allowing images to be signed with ephemeral keys associated with your identity via an OpenID Connect (OIDC) provider like GitHub Actions or Azure AD. This facilitates smooth integration into CI/CD pipelines without the hassle of key management, making it highly appealing for contemporary DevSecOps teams.

Essentially, Notary offers a conventional, registry-based model suitable for organizations that prefer deterministic trust rooted in their container registry. Cosign provides a flexible, cloud-native developer experience, especially when integrated with automated pipelines and keyless workflows. Many enterprises choose to adopt both: using Notary for registry-level enforcement and Cosign for signing within developer pipelines.

Think of Notary as the vault-based approach, while Cosign is the pipeline-native method. Both aim to verify authenticity, but the choice depends on where in your workflow you want to establish trust.

Cluster-level security relies on robust logging and auditing. Kubernetes audit logs can be sent to Azure Monitor and stored in a Log Analytics Workspace. From there, they can be integrated with Microsoft Sentinel for correlation and threat detection across your broader environment.

Pod Security

At the top of the stack are the workloads themselves. Pods are frequently the initial entry point for attackers, particularly when applications have exploitable vulnerabilities. In Kubernetes, Pod Security Admission (PSA) has replaced PodSecurityPolicy, offering a native way to enforce the baseline or restricted security standards across namespaces. Implementing the restricted standard helps prevent pods from running as root or mounting sensitive host paths.
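PSA is enabled per namespace through labels. A sketch of a namespace enforcing the restricted standard (the namespace name is illustrative) looks like this:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

The enforce label rejects non-compliant pods outright, while audit and warn surface violations in audit logs and kubectl warnings, which is useful when rolling the standard out gradually.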

An example restricted policy might include the following configuration within a pod specification:

securityContext:
  runAsUser: 1000
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]

This guarantees that the container runs as a non-root user, cannot escalate privileges, and does not inherit unnecessary Linux capabilities. Network segmentation should also be implemented at the pod level. For example, a frontend service could be allowed to communicate only with its backend service, preventing lateral movement to unrelated workloads.
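A sketch of such a rule could look like the following NetworkPolicy, which admits ingress to backend pods only from frontend pods — the labels, namespace, and port are hypothetical and would match your own workloads:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend        # the pods being protected
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # the only permitted caller
      ports:
        - protocol: TCP
          port: 8080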

This is enforced through network policies, which effectively implement microsegmentation within the cluster.

Resource requests and limits are another aspect of pod security. By setting appropriate CPU and memory requests, you prevent noisy neighbors from consuming all available resources and ensure that denial-of-service attempts cannot starve other workloads. A basic specification might look like this:

resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"

Finally, runtime protection is crucial. Defender for Containers works with AKS to identify anomalies, such as reverse shells or crypto-mining activities, within pods. Open-source options like Falco provide similar capabilities by monitoring system calls in real-time and triggering alerts for unusual behavior.

Security as a Continuous Process

Securing AKS isn’t a one-time task. Azure Policy for AKS helps enforce rules across your clusters, such as blocking privileged pods or requiring integration with Key Vault. Ongoing security management with Defender CSPM detects and fixes misconfigurations quickly. Using tools like Azure Chaos Studio or open-source kube-hunter to simulate attacks offers a proactive way to assess resilience before actual incidents happen.

Incorporating security into the DevOps pipeline, commonly referred to as DevSecOps, is essential. Images need to be scanned during build processes using tools like Trivy or Defender, policies should be enforced at the admission stage, and runtime environments should be under continuous surveillance. This approach helps identify vulnerabilities early and ensures a layered security strategy from the build stage through to production workloads.
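As one possible shape for the build stage, the sketch below assumes a GitHub Actions pipeline and the community Trivy action; the registry name and tag are placeholders:

name: build-and-scan
on: [push]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        # placeholder registry; replace with your ACR login server
        run: docker build -t myregistry.azurecr.io/app:${{ github.sha }} .
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myregistry.azurecr.io/app:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: "1"   # fail the build when findings are reported

Failing the pipeline on critical findings keeps vulnerable images from ever reaching the registry, which is where the admission policies discussed earlier take over.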

Conclusion

AKS offers a robust enterprise platform for Kubernetes, but security responsibilities span multiple layers. At the host level, selecting an appropriate OS, performing regular updates, encrypting disks, and monitoring runtime activity form the basis of security. On the cluster level, controlling access via Entra ID and RBAC, isolating networks with policies, securing supply chains, and auditing logs enhance resilience. At the pod level, implementing non-root execution, dropping capabilities, segmenting traffic, and monitoring runtime behavior completes the security framework.

Securing AKS is an ongoing process. With regulations like NIS2 and DORA influencing the European scene, organizations must view security as more than just a technical issue; it’s a core business concern. By integrating these practices into daily operations and adopting a defense-in-depth approach, enterprises can safely run critical workloads in AKS and stay ahead of emerging threats.

References & Further Reading

• Azure Kubernetes Service Security Baseline
• Defender for Containers
• Pod Security Standards
• NIS2 Directive
• DORA Regulation
