
Overview

The EKS (Elastic Kubernetes Service) Module creates a fully-managed Kubernetes cluster on AWS with compute options (EC2 node groups or Fargate), essential add-ons, and OIDC-based service account authentication for the Artos platform. The cluster is configured for high security with private endpoints, encryption, and comprehensive logging.

Key Features

  • Private Cluster: API endpoint accessible only from within VPC
  • Flexible Compute: Mix EC2 node groups and Fargate profiles based on workload needs
  • Essential Add-ons: Pre-configured VPC CNI, CoreDNS, and kube-proxy
  • OIDC Authentication: IAM roles for Kubernetes service accounts (IRSA)
  • Encrypted Secrets: KMS encryption for Kubernetes secrets at rest
  • Comprehensive Logging: All control plane logs sent to CloudWatch

Core Resources

1. EKS Cluster

The EKS cluster is the Kubernetes control plane managed by AWS, including the API server, scheduler, and controller manager. Key Configuration:
  • Kubernetes Version: Defaults to 1.28 (configurable); production clusters are typically pinned to 1.31 or later
  • Endpoint Access: Private only (no public internet access)
  • Authentication Mode: API_AND_CONFIG_MAP (supports both OIDC and traditional aws-auth ConfigMap)
  • Secrets Encryption: KMS encryption for etcd secrets
VPC Configuration:
endpoint_private_access = true   # API accessible from VPC
endpoint_public_access  = false  # No public internet access
public_access_cidrs     = []      # No external CIDR ranges allowed
Private Endpoint: With endpoint_public_access = false, the Kubernetes API is only accessible from within the VPC. Use a bastion host or VPN for cluster administration. This significantly reduces the attack surface.
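Inside the module, these settings plausibly live on the cluster resource itself; a minimal sketch (resource layout and variable names are assumptions, not the module's verified internals):

```hcl
resource "aws_eks_cluster" "this" {
  name     = var.cluster_name
  version  = var.cluster_version
  role_arn = var.eks_service_role_arn

  vpc_config {
    subnet_ids              = var.subnet_ids                      # Private subnets only
    security_group_ids      = [var.eks_nodes_security_group_id]
    endpoint_private_access = true                                # API reachable from the VPC
    endpoint_public_access  = false                               # No public endpoint
    public_access_cidrs     = []
  }

  # Kubernetes secrets encrypted at rest in etcd with the module's KMS key
  encryption_config {
    provider {
      key_arn = var.kms_key_arn
    }
    resources = ["secrets"]
  }
}
```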
Enabled Logging:
  • api - API server audit logs
  • audit - Kubernetes audit logs
  • authenticator - Authentication attempts
  • controllerManager - Controller manager logs
  • scheduler - Scheduler decisions
All logs are sent to CloudWatch Log Group: /aws/eks/{cluster_name}/cluster
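A sketch of how this might be wired up in the module, assuming the log_retention_days variable shown in the configuration examples later on this page:

```hcl
# EKS writes control plane logs to a log group with this exact name
resource "aws_cloudwatch_log_group" "cluster" {
  name              = "/aws/eks/${var.cluster_name}/cluster"
  retention_in_days = var.log_retention_days
}

# On the aws_eks_cluster resource, every control plane log type is enabled:
#   enabled_cluster_log_types = [
#     "api", "audit", "authenticator", "controllerManager", "scheduler",
#   ]
```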

2. Compute Options

The module supports two types of compute for running Kubernetes workloads:

EC2 Node Groups

Purpose: Traditional EC2 instances for running pods, giving full control over instance types and scaling. Configuration:
node_groups = {
  general = {
    instance_types = ["t3.large", "t3.xlarge"]
    capacity_type  = "ON_DEMAND"
    min_size       = 2
    max_size       = 10
    desired_size   = 3
  }

  compute_optimized = {
    instance_types = ["c5.2xlarge"]
    capacity_type  = "SPOT"
    min_size       = 0
    max_size       = 20
    desired_size   = 2
  }
}
Features:
  • AMI: Amazon Linux 2 (AL2_x86_64)
  • Disk Size: 50 GB GP3 per node
  • Update Strategy: 25% max unavailable during updates
  • Scaling: Managed by Kubernetes Cluster Autoscaler or manual adjustment
Use Cases:
  • General application workloads
  • Artos document generation workloads
  • Artos regeneration workflows
  • Artos document editing functionality
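The node_groups map above plausibly drives a for_each over aws_eks_node_group resources; a sketch reflecting the listed features (AL2 AMI, 50 GB disks, 25% max unavailable), with assumed variable names:

```hcl
resource "aws_eks_node_group" "this" {
  for_each = var.node_groups

  cluster_name    = aws_eks_cluster.this.name
  node_group_name = each.key
  node_role_arn   = var.eks_node_group_role_arn
  subnet_ids      = var.subnet_ids

  ami_type       = "AL2_x86_64"               # Amazon Linux 2
  disk_size      = 50                         # GB per node
  instance_types = each.value.instance_types
  capacity_type  = each.value.capacity_type   # ON_DEMAND or SPOT

  scaling_config {
    min_size     = each.value.min_size
    max_size     = each.value.max_size
    desired_size = each.value.desired_size
  }

  update_config {
    max_unavailable_percentage = 25           # Rolling-update ceiling
  }
}
```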

Fargate Profiles

Purpose: Serverless compute that runs each pod in an isolated environment without managing EC2 instances. Configuration:
fargate_profiles = {
  app_profile = {
    namespace = "artos-system"
    labels = {
      compute-type = "fargate"
    }
  }
  
  dev_profile = {
    namespace = "development"
    labels = {}  # All pods in namespace
  }
}
How It Works:
  • Pods matching the selector (namespace + labels) run on Fargate
  • Each pod gets dedicated compute resources (no node sharing)
  • AWS manages all infrastructure (no EC2 instances to maintain)
Use Cases:
  • Workloads that benefit from per-pod isolation (no node sharing)
  • Batch jobs confined to a dedicated namespace (as in the batch-processing profile shown later)
  • Development namespaces where managing EC2 capacity isn’t worthwhile
Limitations:
  • No DaemonSets support
  • No privileged containers
  • No hostNetwork or hostPort
  • Limited to specific regions and availability zones
CoreDNS Dependency: CoreDNS must run on EC2 nodes for Fargate pods to resolve DNS. Always configure at least one EC2 node group when using Fargate profiles.
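Each entry in fargate_profiles plausibly maps to an aws_eks_fargate_profile resource; a sketch with assumed variable names:

```hcl
resource "aws_eks_fargate_profile" "this" {
  for_each = var.fargate_profiles

  cluster_name           = aws_eks_cluster.this.name
  fargate_profile_name   = each.key
  pod_execution_role_arn = var.eks_fargate_role_arn
  subnet_ids             = var.subnet_ids   # Fargate requires private subnets

  selector {
    namespace = each.value.namespace
    labels    = each.value.labels   # Empty map = every pod in the namespace
  }
}
```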

3. EKS Add-ons

Add-ons are essential Kubernetes components provided and managed by AWS.

VPC CNI (vpc-cni)

Purpose: Enables pod networking using AWS VPC networking primitives (ENIs and secondary IPs). Configuration:
{
  "env": {
    "ENABLE_PREFIX_DELEGATION": "true",
    "ENABLE_POD_ENI": "true"
  }
}
Features:
  • Prefix Delegation: Increases pod density per node by delegating IP prefixes instead of individual IPs
  • Pod ENI: Allows pods to have dedicated Elastic Network Interfaces for advanced networking
Why It Matters: Without VPC CNI, pods cannot receive IP addresses from your VPC subnets, making them unreachable.
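In Terraform, the JSON above would be passed to the managed add-on via configuration_values; a sketch (the aws_iam_role.vpc_cni reference assumes the VPC CNI IRSA role described under "IAM Roles for Add-ons"):

```hcl
resource "aws_eks_addon" "vpc_cni" {
  cluster_name             = aws_eks_cluster.this.name
  addon_name               = "vpc-cni"
  service_account_role_arn = aws_iam_role.vpc_cni.arn  # IRSA role for aws-node

  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"  # Delegate /28 prefixes for higher pod density
      ENABLE_POD_ENI           = "true"  # Dedicated ENIs for pods that need them
    }
  })
}
```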

CoreDNS (coredns)

Purpose: Provides DNS resolution for services and pods within the cluster. How It Works:
  • Runs as a deployment in the kube-system namespace
  • Handles DNS queries like service-name.namespace.svc.cluster.local
  • Integrates with VPC DNS for external domain resolution
Why It Matters: Services cannot discover each other without DNS. CoreDNS is critical for inter-service communication.

kube-proxy (kube-proxy)

Purpose: Manages network rules on each node to enable Kubernetes service networking. How It Works:
  • Runs as a DaemonSet on every node
  • Implements Kubernetes Service abstraction using iptables or IPVS
  • Routes traffic to correct pods based on service selectors
Why It Matters: Without kube-proxy, Kubernetes Services don’t function: traffic sent to a Service’s cluster IP never reaches the backend pods.

4. OIDC Identity Provider

The OIDC (OpenID Connect) provider enables Kubernetes service accounts to assume AWS IAM roles. How It Works:
  1. EKS cluster has an OIDC issuer URL (e.g., https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE)
  2. AWS IAM trust policy references this OIDC provider
  3. Kubernetes service accounts annotated with IAM role ARN can assume the role
  4. Pods using these service accounts get temporary AWS credentials
Example Trust Policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub": "system:serviceaccount:namespace:service-account-name",
        "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:aud": "sts.amazonaws.com"
      }
    }
  }]
}
Use Cases:
  • Application pods accessing S3, RDS, or other AWS services
  • Add-on controllers (AWS Load Balancer Controller, EBS CSI Driver)
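The trust policy above translates to Terraform roughly as follows; the role name, namespace, and service account are illustrative placeholders:

```hcl
# OIDC provider registered for the cluster's issuer URL
data "aws_iam_openid_connect_provider" "eks" {
  url = aws_eks_cluster.this.identity[0].oidc[0].issuer
}

locals {
  # Issuer host without the https:// scheme, as used in IAM condition keys
  oidc_issuer = replace(data.aws_iam_openid_connect_provider.eks.url, "https://", "")
}

resource "aws_iam_role" "app" {
  name = "artos-app-irsa"  # Illustrative name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = data.aws_iam_openid_connect_provider.eks.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "${local.oidc_issuer}:sub" = "system:serviceaccount:artos-system:app"
          "${local.oidc_issuer}:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}

# Then annotate the Kubernetes service account with:
#   eks.amazonaws.com/role-arn: <ARN of the role above>
```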

5. IAM Roles for Add-ons

Each add-on requiring AWS API access gets a dedicated IAM role using OIDC authentication.

VPC CNI Role

Service Account: system:serviceaccount:kube-system:aws-node Permissions: AmazonEKS_CNI_Policy (AWS managed policy) Capabilities:
  • Attach/detach ENIs to EC2 instances
  • Assign secondary IP addresses
  • Manage network interfaces for pod networking

CoreDNS Role

Service Account: system:serviceaccount:kube-system:coredns Permissions: AmazonEKSClusterPolicy (AWS managed policy) Capabilities:
  • Describe cluster resources
  • Access cluster configuration

EBS CSI Driver Role

Service Account: system:serviceaccount:kube-system:ebs-csi-controller-sa Permissions: Custom policy for EBS operations Capabilities:
  • Create, attach, detach, and delete EBS volumes
  • Create and delete snapshots
  • Describe volumes and instances
  • Manage volume tags
Use Case: Enables Kubernetes PersistentVolumes backed by EBS volumes for stateful applications.

Module Configuration

Basic Configuration

module "eks" {
  source = "./modules/eks"

  cluster_name    = "artos-production"
  cluster_version = "1.28"
  
  # Networking
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
  
  # IAM roles (created in IAM module)
  eks_service_role_arn    = module.iam.eks_service_role_arn
  eks_node_group_role_arn = module.iam.eks_node_group_role_arn
  eks_fargate_role_arn    = module.iam.eks_fargate_role_arn
  
  # Security
  kms_key_arn                = module.kms.eks_key_arn
  eks_nodes_security_group_id = module.security_groups.eks_nodes_sg_id
  
  # Compute
  node_groups = {
    general = {
      instance_types = ["t3.large"]
      capacity_type  = "ON_DEMAND"
      min_size       = 2
      max_size       = 10
      desired_size   = 3
    }
  }
  
  # Logging
  log_retention_days = 7
  
  tags = {
    Environment = "production"
  }
}

Production Configuration with Mixed Compute

module "eks_production" {
  source = "./modules/eks"

  cluster_name    = "artos-production"
  cluster_version = "1.28"
  
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
  
  eks_service_role_arn    = module.iam.eks_service_role_arn
  eks_node_group_role_arn = module.iam.eks_node_group_role_arn
  eks_fargate_role_arn    = module.iam.eks_fargate_role_arn
  
  kms_key_arn                = module.kms.eks_key_arn
  eks_nodes_security_group_id = module.security_groups.eks_nodes_sg_id
  
  # Multiple node groups for different workloads
  node_groups = {
    # General workloads - on-demand for stability
    general = {
      instance_types = ["t3.large", "t3.xlarge"]
      capacity_type  = "ON_DEMAND"
      min_size       = 3
      max_size       = 10
      desired_size   = 5
    }
    
    # CPU-intensive workloads - spot for cost savings
    compute = {
      instance_types = ["c5.2xlarge", "c5.4xlarge"]
      capacity_type  = "SPOT"
      min_size       = 0
      max_size       = 20
      desired_size   = 2
    }
    
    # Memory-intensive workloads
    memory = {
      instance_types = ["r5.xlarge", "r5.2xlarge"]
      capacity_type  = "ON_DEMAND"
      min_size       = 1
      max_size       = 10
      desired_size   = 2
    }
  }
  
  # Fargate for specific namespaces
  fargate_profiles = {
    batch_jobs = {
      namespace = "batch-processing"
      labels = {
        compute-type = "fargate"
      }
    }
  }
  
  # Extended log retention for production
  log_retention_days = 30
  
  tags = {
    Environment = "production"
    Backup      = "daily"
  }
}

Development Configuration

module "eks_dev" {
  source = "./modules/eks"

  cluster_name    = "artos-dev"
  cluster_version = "1.28"
  
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
  
  eks_service_role_arn    = module.iam.eks_service_role_arn
  eks_node_group_role_arn = module.iam.eks_node_group_role_arn
  eks_fargate_role_arn    = module.iam.eks_fargate_role_arn
  
  kms_key_arn                = module.kms.eks_key_arn
  eks_nodes_security_group_id = module.security_groups.eks_nodes_sg_id
  
  # Minimal node group for development
  node_groups = {
    dev = {
      instance_types = ["t3.medium"]
      capacity_type  = "SPOT"  # Use spot for cost savings
      min_size       = 1
      max_size       = 5
      desired_size   = 2
    }
  }
  
  # Shorter log retention for development
  log_retention_days = 3
  
  tags = {
    Environment = "development"
    AutoShutdown = "true"
  }
}

Accessing the Cluster

Configure kubectl

# Update kubeconfig
aws eks update-kubeconfig \
    --region us-east-1 \
    --name artos-production

# Verify connection
kubectl get nodes
kubectl get pods -A

Grant Additional IAM Users/Roles Access

The module uses bootstrap_cluster_creator_admin_permissions = false, meaning the cluster creator doesn’t automatically get admin access. You must explicitly grant access using EKS Access Entries or the aws-auth ConfigMap. Option 1: EKS Access Entries (Recommended)
# Grant cluster admin access to an IAM role
aws eks create-access-entry \
    --cluster-name artos-production \
    --principal-arn arn:aws:iam::123456789012:role/DevOpsRole \
    --type STANDARD

aws eks associate-access-policy \
    --cluster-name artos-production \
    --principal-arn arn:aws:iam::123456789012:role/DevOpsRole \
    --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy \
    --access-scope type=cluster
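If you prefer to manage the grant in Terraform rather than the CLI, the AWS provider (v5.33+) exposes equivalent resources; a sketch using the same example role ARN:

```hcl
resource "aws_eks_access_entry" "devops" {
  cluster_name  = "artos-production"
  principal_arn = "arn:aws:iam::123456789012:role/DevOpsRole"
  type          = "STANDARD"
}

resource "aws_eks_access_policy_association" "devops_admin" {
  cluster_name  = aws_eks_access_entry.devops.cluster_name
  principal_arn = aws_eks_access_entry.devops.principal_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }
}
```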
Option 2: aws-auth ConfigMap (Legacy)
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::123456789012:role/DevOpsRole
      username: devops-user
      groups:
        - system:masters
    - rolearn: arn:aws:iam::123456789012:role/artos-bastion-role
      username: bastion-user
      groups:
        - system:masters

Module Maintenance: This module is compatible with Terraform 1.0+ and AWS Provider 5.x. The cluster uses Kubernetes 1.28 by default. Review AWS EKS release notes before upgrading cluster versions.