AKS with Terraform
How to create a Kubernetes cluster with Azure Kubernetes Service - AKS using Terraform.
AKS - Azure Kubernetes Service
Azure Kubernetes Service (AKS) manages your hosted Kubernetes environment. AKS allows you to deploy and manage containerized applications without container orchestration expertise. AKS also enables you to do many common maintenance operations without taking your app offline. These operations include provisioning, upgrading, and scaling resources on demand.
AKS - Terraform
Terraform provides the following Azure providers to provision infrastructure in the Azure public cloud. We will be using the azurerm and azuread providers for creating AKS.
AKS - Terraform Prerequisites
- Azure subscription: Create a free account to get an Azure subscription.
- Terraform: Install the latest version of Terraform.
- Azure service principal: Follow the directions in the "Create the service principal" section of Create an Azure service principal with Azure CLI. Take note of the values for appId, displayName, password, and tenant. Alternatively, I will show you how to create a service principal using Terraform.
Terraform AKS
Terraforming AKS doesn't involve as many moving components as AWS EKS. Make sure you have Terraform installed and your Azure service principal details ready.
The Terraform variables used below must be declared in variables.tf.
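As an illustration only, here is a minimal sketch of what variables.tf could declare for the names referenced later; the defaults and descriptions are my own assumptions, not values from the original setup.
variables.tf (sketch)
# Sketch only: defaults below are illustrative assumptions
variable "subscription_id" {}
variable "client_id" {}
variable "client_secret" {}
variable "tenant_id" {}
variable "svc_prpl_pwd" {
  description = "Password for the AKS service principal"
}
variable "resource_group_name" {
  default = "aks-rg"
}
variable "location" {
  default = "westeurope"
}
variable "cluster_name" {
  default = "aks-cluster"
}
variable "dns_prefix" {
  default = "aks"
}
variable "admin_username" {
  default = "azureuser"
}
variable "agent_size" {
  description = "VM size for the worker nodes"
  default     = "Standard_D2s_v3"
}
variable "base_tags" {
  type = map(string)
}
variable "enable_log_analytics_workspace" {
  default = true
}
variable "log_analytics_workspace_name" {
  default = "aks-logs"
}
variable "log_analytics_workspace_sku" {
  default = "PerGB2018"
}
variable "log_retention_in_days" {
  default = 30
}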
#1 - Terraform Environment Variables
terraform.tfvars is used for common or secret variables; these can also be injected as environment variables.
terraform.tfvars
base_tags = {
  Tier       = "Internal"
  CostCentre = "xyzxyz"
  Compliance = "no"
  Owner      = "azure-sub"
  Escalation = "azure-subm@me.com"
}
svc_prpl_pwd        = ""
ARM_ENVIRONMENT     = "public"
ARM_CLIENT_ID       = ""
ARM_SUBSCRIPTION_ID = ""
ARM_TENANT_ID       = ""
#2 - Terraform Providers
Terraform providers are responsible for understanding API interactions with a given provider's resources. They can create any resource, as long as proper credentials for the cloud account are supplied.
For AKS, we will need four providers to run our Terraform code successfully (the random provider used later in the logs section is downloaded automatically by terraform init).
terraform providers
- azurerm
- azuread
- local
- tls
The definition of the providers in Terraform is shown below. In Azure, with proper permissions, we can get all four variables needed to initialise the azurerm provider for the AKS Terraform code.
You can pin providers to specific versions in Terraform as shown below.
providers.tf
provider "azurerm" {
version = "=2.14.0"
subscription_id = var.subscription_id
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
features {}
}
provider "azuread" {
version = "=0.10.0"
}
provider "local" {
version = "~> 1.4"
}
provider "tls" {
version = "~> 2.1"
}
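If you are on Terraform 0.13 or newer, the same pinning can alternatively be expressed through a required_providers block; this is a sketch and not required for the code above.
terraform {
  required_version = ">= 0.13"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.14.0"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = "0.10.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "~> 1.4"
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 2.1"
    }
  }
}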
#3 - Create AD Service Principal
We can create an Azure service principal and a password with an expiry date. The password can be stored in the same repo, encrypted with sops, or it can be retrieved from Azure Key Vault.
azure-ad.tf
resource "azuread_application" "aks_app" {
name = "aks_rbac"
}
resource "azuread_service_principal" "aks_svc_prnpl"{
application_id = azuread_application.aks_app.application_id
app_role_assignment_required = false
tags = ["aks", "azure", "team-1"]
}
resource "azuread_service_principal_password" "aks_svc_prnpl_pwd" {
service_principal_id = azuread_service_principal.aks_svc_prnpl.id
value = var.svc_prpl_pwd
end_date = "2099-01-01T01:02:03Z"
description = "My managed password"
}
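If you would rather not pass var.svc_prpl_pwd in yourself, one option (my own addition, not part of the original setup) is to generate the password with the hashicorp/random provider, which is not in the provider list above, and feed its result into azuread_service_principal_password.
# Sketch: requires the hashicorp/random provider in addition to the four listed above
resource "random_password" "aks_svc_prnpl_pwd" {
  length  = 32
  special = true
}

# then, in azuread_service_principal_password:
# value = random_password.aks_svc_prnpl_pwd.result   # instead of var.svc_prpl_pwd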
#4 - Create Resource Group
We will start building AKS by creating a Resource Group first. All the resources we create will be placed under this Resource Group.
rg.tf
resource "azurerm_resource_group" "k8s" {
name = var.resource_group_name
location = var.location
tags = local.tags
}
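local.tags is referenced throughout but not defined in the snippets above; a minimal sketch, assuming it is simply derived from the base_tags set in terraform.tfvars, could look like this.
locals.tf (sketch)
locals {
  # Assumption: the tags applied to every resource are just the base tags from terraform.tfvars
  tags = var.base_tags
}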
#5 - Create SSH-Key (Optional)
This part is optional. If you want to SSH into the worker nodes, it's better to create a key pair. The private key can be stored in the same repo, encrypted with sops, or it can be put in Azure Key Vault.
module.tf
variable "public_ssh_key" {
description = "A custom ssh key to control access to the AKS cluster"
default = ""
}
module "ssh-key" {
source = "./modules/ssh-key"
public_ssh_key = var.public_ssh_key == "" ? "" : var.public_ssh_key
}
The ssh-key module looks like below.
./modules/ssh-key/main.tf
variable "public_ssh_key" {
description = "An ssh key set in the main variables of the terraform-azurerm-aks module"
default = ""
}
resource "tls_private_key" "ssh" {
algorithm = "RSA"
rsa_bits = 4096
}
resource "local_file" "private_key" {
count = var.public_ssh_key == "" ? 1 : 0
content = tls_private_key.ssh.private_key_pem
filename = "./aks_private_ssh_key"
}
output "public_ssh_key" {
# Only output a generated ssh public key
value = var.public_ssh_key != "" ? "" : tls_private_key.ssh.public_key_openssh
}
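Optionally, and as an addition of my own rather than part of the original module, the generated private key can also be exposed as a sensitive output instead of only being written to a local file.
output "private_ssh_key" {
  # Only output the generated private key, and mark it sensitive so it is not printed in plan output
  value     = var.public_ssh_key != "" ? "" : tls_private_key.ssh.private_key_pem
  sensitive = true
}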
#6 - Create AKS
azurerm_kubernetes_cluster is the main resource which creates and manages Azure AKS.
The azurerm_kubernetes_cluster resource below has many argument blocks and arguments. Let's explore them one by one.
Arguments can be Required or Optional. The names are quite self-explanatory here.
AKS arguments
name = var.cluster_name
location = azurerm_resource_group.k8s.location
resource_group_name = azurerm_resource_group.k8s.name
dns_prefix = var.dns_prefix
kubernetes_version = "1.16.9"
private_cluster_enabled = false
tags = local.tags
sku_tier = "Free"
The service_principal block contains the service principal application id and the secret, as shown below. The values come from the resources created in Step 3.
service_principal block
service_principal {
client_id = azuread_service_principal.aks_svc_prnpl.application_id
client_secret = azuread_service_principal_password.aks_svc_prnpl_pwd.value
}
The default_node_pool block contains the worker-node details such as the total node count, min/max node counts, VM size, disk size, tags, taints on nodes, whether nodes have public IPs, and so on. Note that min_count and max_count only take effect when enable_auto_scaling is true.
We are using the arguments below for the worker nodes.
default_node_pool block
default_node_pool {
name = "default"
enable_node_public_ip = false
enable_auto_scaling = false
node_count = 2
min_count = 2
max_count = 6
vm_size = var.agent_size
type = "VirtualMachineScaleSets"
os_disk_size_gb = 50
node_taints = ["vm=OnDemand:NoSchedule"]
tags = local.tags
node_labels = {
Tier = "internal"
Team = "team-1"
Type = "OnDemand"
}
}
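default_node_pool only covers the default pool. If you later need additional worker pools, they are created with the separate azurerm_kubernetes_cluster_node_pool resource; here is a rough sketch, where the pool name, size and labels are assumptions of mine.
resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  # Additional worker pool attached to the cluster defined below
  name                  = "extra"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = var.agent_size
  node_count            = 1
  node_labels = {
    Tier = "internal"
    Team = "team-1"
  }
  tags = local.tags
}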
The auto_scaler_profile block configures the cluster autoscaler within the AKS cluster. If you have already worked with cluster-autoscaler, the arguments in this block will look quite familiar.
auto_scaler_profile block
auto_scaler_profile {
balance_similar_node_groups = true
max_graceful_termination_sec = 300
scale_down_delay_after_add = "10m"
scale_down_delay_after_delete = "10s"
scan_interval = "10s"
scale_down_delay_after_failure = "3m"
scale_down_unneeded = "10m"
scale_down_unready = "20m"
scale_down_utilization_threshold = 0.5
}
The addon_profile block can enable Azure Policy, HTTP application routing, the Kubernetes dashboard and, most importantly, Azure Monitor for the cluster.
addon_profile block
dynamic "addon_profile" {
for_each = var.enable_log_analytics_workspace ? ["log_analytics"] : []
content {
oms_agent {
enabled = true
log_analytics_workspace_id = azurerm_log_analytics_workspace.main[0].id
}
}
}
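The dynamic block above only wires up the oms_agent add-on. As an alternative sketch (the enabled/disabled choices are my own assumptions), a static addon_profile could switch on the other add-ons mentioned as well:
addon_profile {
  azure_policy {
    enabled = true
  }
  http_application_routing {
    enabled = false
  }
  kube_dashboard {
    enabled = false
  }
  oms_agent {
    enabled                    = true
    log_analytics_workspace_id = azurerm_log_analytics_workspace.main[0].id
  }
}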
The linux_profile block contains the admin username for the cluster and the public SSH key used to log in to the VMs. The ssh-key is coming from the module that we created in Step 5.
linux_profile block
linux_profile {
admin_username = var.admin_username
ssh_key {
key_data = replace(var.public_ssh_key == "" ? module.ssh-key.public_ssh_key : var.public_ssh_key, "\n", "")
}
}
The timeouts block allows you to specify timeouts for certain actions:
timeouts arguments
- create - (Defaults to 90 minutes) Used when creating the Kubernetes Cluster.
- update - (Defaults to 90 minutes) Used when updating the Kubernetes Cluster.
- read - (Defaults to 5 minutes) Used when retrieving the Kubernetes Cluster.
- delete - (Defaults to 90 minutes) Used when deleting the Kubernetes Cluster.
timeouts block
timeouts {
create = "2h"
delete = "2h"
update = "2h"
read = "5m"
}
Similarly, there are other blocks in azurerm_kubernetes_cluster, as listed below. We can add them as per cluster requirements; a brief sketch of two of them follows the list.
other blocks
azure_active_directory {}
azure_policy {}
http_application_routing {}
kube_dashboard {}
oms_agent {}
network_profile {}
role_based_access_control {}
api_server_authorized_ip_ranges = ""
identity {}
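As a quick illustration of two of these, here is a sketch with assumed values (not part of the final file below):
role_based_access_control {
  enabled = true
}

network_profile {
  network_plugin = "kubenet"
}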
Combining all the above arguments and blocks, azurerm_kubernetes_cluster looks like below.
aks.tf
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.cluster_name
location = azurerm_resource_group.k8s.location
resource_group_name = azurerm_resource_group.k8s.name
dns_prefix = var.dns_prefix
kubernetes_version = "1.16.9"
private_cluster_enabled = false
tags = local.tags
sku_tier = "Free"
default_node_pool {
name = "default"
enable_node_public_ip = false
enable_auto_scaling = false
node_count = 2
min_count = 2
max_count = 6
vm_size = var.agent_size
type = "VirtualMachineScaleSets"
os_disk_size_gb = 50
node_taints = ["vm=OnDemand:NoSchedule"]
tags = local.tags
node_labels = {
Tier = "internal"
Team = "team-1"
Type = "OnDemand"
}
}
service_principal {
client_id = azuread_service_principal.aks_svc_prnpl.application_id
client_secret = azuread_service_principal_password.aks_svc_prnpl_pwd.value
}
linux_profile {
admin_username = var.admin_username
ssh_key {
key_data = replace(var.public_ssh_key == "" ? module.ssh-key.public_ssh_key : var.public_ssh_key, "\n", "")
}
}
dynamic "addon_profile" {
for_each = var.enable_log_analytics_workspace ? ["log_analytics"] : []
content {
oms_agent {
enabled = true
log_analytics_workspace_id = azurerm_log_analytics_workspace.main[0].id
}
}
}
auto_scaler_profile {
balance_similar_node_groups = true
max_graceful_termination_sec = 300
scale_down_delay_after_add = "10m"
scale_down_delay_after_delete = "10s"
scan_interval = "10s"
scale_down_delay_after_failure = "3m"
scale_down_unneeded = "10m"
scale_down_unready = "20m"
scale_down_utilization_threshold = 0.5
}
timeouts {
create = "2h"
delete = "2h"
update = "2h"
read = "5m"
}
}
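To reach the cluster once it is created, it helps to expose the generated kubeconfig. This is a sketch of my own (output names and file path are assumptions), reusing the local provider declared earlier.
outputs.tf (sketch)
output "kube_config_raw" {
  value     = azurerm_kubernetes_cluster.k8s.kube_config_raw
  sensitive = true
}

output "cluster_fqdn" {
  value = azurerm_kubernetes_cluster.k8s.fqdn
}

# Optionally write the kubeconfig to disk with the local provider
resource "local_file" "kubeconfig" {
  content  = azurerm_kubernetes_cluster.k8s.kube_config_raw
  filename = "./aks_kubeconfig"
}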
#7 - AKS Logs
azurerm_log_analytics_workspace manages a Log Analytics (formerly Operational Insights) Workspace.
azurerm_log_analytics_solution manages a Log Analytics (formerly Operational Insights) Solution.
aks-logs.tf
resource "random_id" "log_analytics_workspace_name_suffix" {
byte_length = 8
}
resource "azurerm_log_analytics_workspace" "main" {
count = var.enable_log_analytics_workspace ? 1 : 0
name = join("-", [var.log_analytics_workspace_name,
random_id.log_analytics_workspace_name_suffix.dec,
"workspace"])
location = azurerm_resource_group.k8s.location
resource_group_name = azurerm_resource_group.k8s.name
sku = var.log_analytics_workspace_sku
retention_in_days = var.log_retention_in_days
tags = local.tags
}
resource "azurerm_log_analytics_solution" "main" {
count = var.enable_log_analytics_workspace ? 1 : 0
solution_name = "ContainerInsights"
location = azurerm_resource_group.k8s.location
resource_group_name = azurerm_resource_group.k8s.name
workspace_resource_id = azurerm_log_analytics_workspace.main[0].id
workspace_name = azurerm_log_analytics_workspace.main[0].name
plan {
publisher = "Microsoft"
product = "OMSGallery/ContainerInsights"
promotion_code = ""
}
}
#8 - AKS Create
Run terraform with the commands below to create AKS.
aks create
terraform init
terraform validate
terraform plan
terraform apply