Overview
The RDS Module deploys an Amazon Aurora PostgreSQL database cluster optimized for the Artos platform. Aurora provides a fully managed, highly available database with automated backups, point-in-time recovery, and read scaling capabilities. The module handles secure credential management, network isolation, parameter optimization, and monitoring integration.
Key Features
- High Availability: Multi-AZ deployment with automatic failover
- Secure Credentials: Automated password generation and Secrets Manager storage
- Encryption: Data encrypted at rest using KMS and in transit using SSL
- Automated Backups: Continuous backups with configurable retention
- Performance Insights: Built-in query performance monitoring
- Read Scaling: Support for multiple reader instances
- Serverless Option: Aurora Serverless v2 for variable workloads
Core Components
1. Database Credentials Management
The module automatically generates and securely stores database credentials.
Random Password Generation
Purpose: Creates a cryptographically secure password for the database master user.
Configuration:
- Length: 32 characters
- Character Set: Letters, numbers, and special characters
- Special Characters: !#$%&*()-_=+[]{}|;:,.<>?~ (excludes problematic characters)
Benefits:
- No hardcoded passwords in code or configuration
- Meets complexity requirements for enterprise security policies
- Unique per deployment
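The generation rule above can be illustrated outside Terraform. The following is a minimal Python sketch, not the module's implementation: it draws 32 characters from the documented character set using the standard-library `secrets` module. The function name and the rule that every character class must appear at least once are assumptions made for illustration.

```python
import secrets
import string

# Special characters listed in the configuration above.
SPECIAL_CHARACTERS = "!#$%&*()-_=+[]{}|;:,.<>?~"
ALPHABET = string.ascii_letters + string.digits + SPECIAL_CHARACTERS


def generate_master_password(length: int = 32) -> str:
    """Return a random password drawn from letters, digits, and the special set."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Assumed complexity rule: retry until each class appears at least once.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
            and any(c in SPECIAL_CHARACTERS for c in candidate)
        ):
            return candidate
```

`secrets.choice` uses the operating system's cryptographic random source, which is the property the "cryptographically secure" requirement depends on; the general-purpose `random` module would not be suitable here.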
Secrets Manager Storage
Purpose: Securely stores database credentials for application access.
Secret Name: {db_identifier}-password
Secret Content:
Recovery Window:
- Production: 30 days (allows recovery from accidental deletion)
- Non-Production: 0 days (immediate deletion for faster iteration)
Lifecycle Management: The module sets ignore_changes on secret_string to prevent Terraform from updating the password after initial creation. This means:
- Password rotations are managed outside Terraform
- State file doesn’t contain password updates
- Manual rotation doesn’t trigger Terraform changes
Accessing Credentials: Applications retrieve credentials from Secrets Manager using IAM authentication. The IAM module grants the necessary permissions to application pods.
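As a rough sketch of the retrieval step, the helper below builds the secret name from the `{db_identifier}-password` pattern and reads it through a Secrets Manager client. The helper names are hypothetical, and the shape of the secret is an assumption: the code accepts either a JSON object or a plain password string, since the secret's exact content is not shown here.

```python
import json
from typing import Any, Dict


def secret_name(db_identifier: str) -> str:
    """Build the secret name from the naming pattern documented above."""
    return f"{db_identifier}-password"


def load_db_credentials(client: Any, db_identifier: str) -> Dict[str, str]:
    """Fetch the secret and normalise it to a dict of strings.

    `client` is expected to behave like boto3.client("secretsmanager").
    """
    response = client.get_secret_value(SecretId=secret_name(db_identifier))
    raw = response["SecretString"]
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        return {str(key): str(value) for key, value in parsed.items()}
    # Assumption: a secret that is not a JSON object holds only the password.
    return {"password": raw}
```

In a pod with the IAM permissions described above, the client would typically be created with `boto3.client("secretsmanager")`. Passing the client in as an argument keeps the helper easy to exercise with a stub.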
2. Aurora PostgreSQL Cluster
The Aurora cluster is the logical database container that manages replication, backups, and endpoints.
Engine: Aurora PostgreSQL (PostgreSQL-compatible)
Engine Version: 15.4 (default, configurable)
Database Name: artos (default, configurable)
Key Configurations:
Storage and Encryption
- Storage Type: Aurora cluster storage (auto-scaling)
- Encryption: Enabled by default using KMS
- KMS Key: Customer-managed key for compliance requirements
Benefits:
- Storage grows automatically from 10 GB up to 128 TB
- Only pay for storage actually used
- KMS encryption enables fine-grained access control
Backup Configuration
Automated Backups:
- Retention Period: 7 days (default), configurable up to 35 days
- Backup Window: 03:00-04:00 UTC (default, configurable)
- Continuous Backups: Transaction logs continuously backed up to S3
Point-in-Time Recovery:
- Restore to any second within retention period
- No impact on production performance
- Fast restore times (minutes vs hours for snapshots)
Final Snapshot:
- skip_final_snapshot: false (default)
- Creates final snapshot when cluster is deleted
- Enables data recovery after accidental deletion
Maintenance Configuration
Preferred Maintenance Window: sun:04:00-sun:05:00 (default)
What Happens During Maintenance:
- Minor version patches applied
- Security updates installed
- Parameter changes requiring reboot take effect
Best Practices:
- Schedule during low-traffic periods
- Coordinate with application maintenance windows
- Monitor CloudWatch alarms during maintenance
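To coordinate application jobs with the window, it can help to test whether a given moment falls inside it. The sketch below parses the `ddd:hh:mm-ddd:hh:mm` format shown above and treats the times as UTC; the function names are hypothetical and the end of the window is treated as exclusive.

```python
from datetime import datetime, timezone

DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
MINUTES_PER_DAY = 24 * 60


def _to_week_minute(point: str) -> int:
    """Convert 'ddd:hh:mm' to minutes elapsed since Monday 00:00."""
    day, hour, minute = point.lower().split(":")
    return DAYS.index(day) * MINUTES_PER_DAY + int(hour) * 60 + int(minute)


def in_maintenance_window(window: str, moment: datetime) -> bool:
    """Return True if the timezone-aware `moment` falls inside the window."""
    start_text, end_text = window.split("-")
    start, end = _to_week_minute(start_text), _to_week_minute(end_text)
    utc = moment.astimezone(timezone.utc)
    now = utc.weekday() * MINUTES_PER_DAY + utc.hour * 60 + utc.minute
    if start <= end:
        return start <= now < end
    # The window wraps around the end of the week (e.g. sun:23:30-mon:00:30).
    return now >= start or now < end
```

A deployment script could call `in_maintenance_window("sun:04:00-sun:05:00", datetime.now(timezone.utc))` and postpone schema migrations or load tests while it returns True.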
CloudWatch Logs Integration
Enabled Logs: PostgreSQL logs
Log Group: /aws/rds/cluster/{db_identifier}/postgresql
Retention: 7 days (default, configurable)
Log Contents:
- SQL queries (if logging enabled)
- Connection events
- Error messages
- Slow query logs
- Database startup/shutdown events
3. Cluster Instances
Aurora uses a cluster architecture with separate compute instances.
Instance Configuration:
| Component | Description |
|---|---|
| Instance Count | 2 (default) - 1 writer, 1 reader |
| Instance Class | db.r6g.large (default) |
| Writer Instance | Handles all write operations |
| Reader Instances | Handle read operations for scaling |
High Availability:
- Instances are automatically deployed across multiple availability zones
- Aurora maintains 6 copies of data across 3 AZs
- Automatic failover in case of AZ failure (typically < 30 seconds)
Failover Process:
- Primary instance becomes unavailable
- Aurora promotes a reader instance to writer
- DNS endpoint automatically updates
- Applications reconnect through the same endpoint (existing connections are dropped, so clients should retry)
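Because a failover closes open connections, client code usually needs a retry loop around connection setup. The following is a minimal sketch of one, assuming exponential backoff is acceptable for the application. The function name and default timings are illustrative: the defaults wait 1, 2, 4, 8, and 15 seconds, roughly covering the typical failover time quoted above.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 15.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise RuntimeError("unreachable")
```

With psycopg2, the call might look like `connect_with_retry(lambda: psycopg2.connect(dsn), retry_on=(psycopg2.OperationalError,))`, since psycopg2 reports connection failures as `OperationalError` rather than `OSError`.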
Instance Types:
Provisioned Instances:
- db.r6g.large: 2 vCPU, 16 GB RAM
- db.r6g.xlarge: 4 vCPU, 32 GB RAM
- db.r6g.2xlarge: 8 vCPU, 64 GB RAM
Best for:
- Standard production workloads
- Predictable traffic patterns
- Consistent performance requirements
Serverless v2 (db.serverless): Auto-scaling compute
- ACUs (Aurora Capacity Units): 0.5 to 4 (default)
- Scales in 0.5 ACU increments
Best for:
- Variable workloads with unpredictable traffic
- Development/staging environments
- Applications with long idle periods
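When choosing a serverless capacity range, it helps to translate ACUs into memory. The sketch below validates a range against the 0.5 ACU step described above and estimates memory at roughly 2 GiB per ACU, which is the approximation AWS documents for Aurora Serverless v2. The function names are hypothetical.

```python
GIB_PER_ACU = 2.0  # approximate memory per Aurora Capacity Unit


def validate_acu_range(min_acu: float = 0.5, max_acu: float = 4.0) -> None:
    """Raise ValueError if the range does not follow the 0.5 ACU steps above."""
    for name, value in (("min_acu", min_acu), ("max_acu", max_acu)):
        if value < 0.5:
            raise ValueError(f"{name} is below the 0.5 ACU lower bound documented above")
        if (value * 2) % 1 != 0:
            raise ValueError(f"{name} must be a multiple of 0.5")
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")


def approx_memory_gib(acu: float) -> float:
    """Estimate the memory available at a given capacity."""
    return acu * GIB_PER_ACU
```

Under this approximation the default range of 0.5 to 4 ACUs spans about 1 GiB to 8 GiB of memory, which is below the 16 GB of a db.r6g.large.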
4. Parameter Group
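The parameter group defines database configuration settings.
Family: aurora-postgresql16 (the family must match the cluster's major engine version; the default 15.4 engine requires aurora-postgresql15)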
Configured Parameters:
max_connections
Default: 1000 connections
Purpose: Maximum number of concurrent database connections.
Formula: Based on instance memory
Considerations:
- Connection Pooling: Use PgBouncer or application-level pooling
- Monitor: Track connection usage via the CloudWatch metric DatabaseConnections
- Adjust: Increase if seeing “too many connections” errors
Recommended Values:
- Development: 100-200
- Staging: 500-1000
- Production: 1000-2000
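A quick way to sanity-check pool sizes against max_connections is to multiply them out before deploying. The sketch below assumes every application replica opens a fixed-size pool and reserves headroom at 80% of the limit, a figure chosen to match a common alarm threshold rather than anything the module enforces. The function and field names are hypothetical.

```python
from typing import Dict, Union


def connection_budget(
    max_connections: int,
    replicas: int,
    pool_size_per_replica: int,
    headroom: float = 0.8,
) -> Dict[str, Union[int, bool]]:
    """Compare the connections all pools can open with the usable share of the limit."""
    allowed = int(max_connections * headroom)
    requested = replicas * pool_size_per_replica
    return {"requested": requested, "allowed": allowed, "fits": requested <= allowed}
```

For example, 20 replicas with pools of 30 request 600 connections, which fits under the 800 allowed for a limit of 1000. 50 replicas with pools of 20 would request 1000 and should be fronted by a pooler such as PgBouncer instead.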
shared_buffers
Default: 262144 KB (256 MB)
Purpose: Memory used for caching data pages.
Best Practice: Set to 25% of available memory for dedicated database server.
Formula for Aurora:
- db.r6g.large (16 GB): ~4 GB shared_buffers
- db.r6g.xlarge (32 GB): ~8 GB shared_buffers
Impact:
- Higher values: Better cache hit ratio, fewer disk reads
- Too high: Less memory for other operations
- Requires reboot to apply changes
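The 25% rule can be worked through numerically. The sketch below reproduces the figures above in kilobytes and also converts to 8 kB pages, because PostgreSQL interprets a bare integer for shared_buffers as a count of 8 kB blocks. Which unit the module's variable expects is an assumption to verify against the parameter group, and the function names are hypothetical.

```python
KB_PER_GB = 1024 * 1024


def shared_buffers_kb(instance_memory_gb: float, fraction: float = 0.25) -> int:
    """Return the share of instance memory for shared_buffers, in kilobytes."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(instance_memory_gb * KB_PER_GB * fraction)


def kb_to_8kb_pages(kilobytes: int) -> int:
    """Convert kilobytes to 8 kB pages, PostgreSQL's native unit for this setting."""
    return kilobytes // 8
```

For a db.r6g.large with 16 GB, `shared_buffers_kb(16)` gives 4194304 KB (4 GB), or 524288 pages. The documented default of 262144 KB corresponds to 256 MB, or 32768 pages.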
Apply Method: Both parameters use apply_method = "pending-reboot", meaning changes take effect after cluster restart. Plan parameter changes during maintenance windows.
5. Network Configuration
DB Subnet Group
Purpose: Defines which subnets the RDS cluster can use.
Requirements:
- Minimum 2 subnets in different availability zones
- Subnets must be in database tier (isolated from application tier)
- All subnets in same VPC
Why Multiple AZs:
- Required for Multi-AZ deployments
- Enables automatic failover
- Aurora distributes replicas across AZs
Security Group
Purpose: Controls network access to the database.
Ingress Rules:
| Port | Protocol | Source | Description |
|---|---|---|---|
| 5432 | TCP | Allowed CIDR blocks | PostgreSQL from specified networks |
| 5432 | TCP | Allowed security groups | PostgreSQL from EKS nodes |
Egress Rules:
- All traffic to VPC CIDR (enables communication within VPC)
Security Best Practices:
- Prefer security group references over CIDR blocks
- Never use 0.0.0.0/0 for database access
- Use bastion host for administrative access
- Enable VPC Flow Logs to monitor connection attempts
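The rule against open CIDR ranges can be checked before values reach Terraform. This is a minimal sketch using the standard-library `ipaddress` module; the function name is hypothetical. It rejects any network with a zero-length prefix (0.0.0.0/0 or ::/0) as well as malformed entries.

```python
import ipaddress
from typing import Iterable, List


def validate_allowed_cidrs(cidrs: Iterable[str]) -> List[str]:
    """Return normalised CIDR strings, rejecting ranges that cover every address."""
    checked = []
    for cidr in cidrs:
        # strict=True raises ValueError when host bits are set, e.g. 10.0.1.5/24.
        network = ipaddress.ip_network(cidr, strict=True)
        if network.prefixlen == 0:
            raise ValueError(f"{cidr} opens the database to every address")
        checked.append(str(network))
    return checked
```

A check like this could run in CI against the list of allowed CIDR blocks so that an overly broad rule fails review rather than reaching the security group.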
6. Cluster Endpoints
Aurora provides multiple endpoints for different access patterns.
Writer Endpoint
Format: {cluster-identifier}.cluster-xxxxx.{region}.rds.amazonaws.com
Purpose: All write operations and consistent reads
Use Cases:
- INSERT, UPDATE, DELETE operations
- Schema changes (DDL)
- Transactions requiring consistency
- Administrative operations
Reader Endpoint
Format: {cluster-identifier}.cluster-ro-xxxxx.{region}.rds.amazonaws.com
Purpose: Load-balanced read operations across reader instances
Use Cases:
- SELECT queries for reporting
- Analytics workloads
- Read-only API endpoints
- Background processing tasks
Load Balancing:
- Aurora automatically distributes connections across readers
- Round-robin with session-level stickiness
- Unhealthy readers automatically removed from rotation
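One way an application might apply the writer/reader split is to let the caller declare the intent of each operation and pick the endpoint from that. The sketch below is illustrative only: the class, function, and flag names are hypothetical, and the hostnames in the usage note are placeholders in the formats shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterEndpoints:
    writer: str
    reader: str


def endpoint_for(
    endpoints: ClusterEndpoints,
    read_only: bool,
    needs_fresh_data: bool = False,
) -> str:
    """Pick the endpoint for an operation.

    Writes, and reads that must observe the latest committed data, go to the
    writer; other reads go to the load-balanced reader endpoint.
    """
    if read_only and not needs_fresh_data:
        return endpoints.reader
    return endpoints.writer
```

A reporting job would call `endpoint_for(endpoints, read_only=True)`, while a read that immediately follows a write in the same request would pass `needs_fresh_data=True` so that it is not affected by replica lag.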
Module Configuration
Basic Configuration
Production Configuration
Serverless v2 Configuration
Connecting to the Database
From Application Pods
Applications access the database using credentials from Secrets Manager:
Python (using psycopg2):
From Bastion Host
For administrative access and troubleshooting:
SSH to Bastion:
connect-bastion.sh script:
Connection Pooling
For production applications, use connection pooling to optimize database connections.
Database Maintenance
Backup and Restore
Manual Snapshot
Create a manual snapshot for specific points in time:
Point-in-Time Restore
Restore to any second within the retention period:
Restore from Snapshot
Parameter Changes
Applying parameter changes:
Version Upgrades
Minor version upgrades are automatic during maintenance windows. For major version upgrades:
Monitoring and Troubleshooting
CloudWatch Metrics
Key metrics to monitor:
| Metric | Description | Alarm Threshold |
|---|---|---|
| CPUUtilization | CPU usage percentage | > 80% |
| DatabaseConnections | Active connections | > 80% of max_connections |
| FreeableMemory | Available memory | < 1 GB |
| ReadLatency | Read operation latency | > 20ms |
| WriteLatency | Write operation latency | > 50ms |
| CommitLatency | Transaction commit time | > 100ms |
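The thresholds in the table can be expressed as a small check, for instance in a health script. The sketch below is hypothetical and takes values in the table's own units (percent, connection count, GB, and milliseconds). CloudWatch reports some of these metrics in other units, such as FreeableMemory in bytes and ReadLatency/WriteLatency in seconds, so datapoints would need converting first.

```python
from typing import Callable, Dict, List


def breached_alarms(metrics: Dict[str, float], max_connections: int = 1000) -> List[str]:
    """Return the names of metrics that cross the alarm thresholds in the table."""
    thresholds: Dict[str, Callable[[float], bool]] = {
        "CPUUtilization": lambda v: v > 80,                          # percent
        "DatabaseConnections": lambda v: v > 0.8 * max_connections,  # count
        "FreeableMemory": lambda v: v < 1,                           # GB
        "ReadLatency": lambda v: v > 20,                             # ms
        "WriteLatency": lambda v: v > 50,                            # ms
        "CommitLatency": lambda v: v > 100,                          # ms
    }
    return [
        name
        for name, is_breached in thresholds.items()
        if name in metrics and is_breached(metrics[name])
    ]
```

Metrics missing from the input are skipped rather than treated as healthy or unhealthy, so the caller decides how to handle gaps in the data.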
Common Issues
Too Many Connections
Symptoms: FATAL: sorry, too many clients already
Solutions:
- Increase max_connections parameter
- Implement connection pooling (PgBouncer)
- Review application connection management
- Close idle connections
Connection Timeouts
Symptoms: Applications cannot connect to database
Troubleshooting:
Security
Security practices for the RDS cluster.
1. Network Isolation
- Deploy in private database subnets (no internet access)
- Use security groups to restrict access to EKS nodes only
- Enable VPC Flow Logs for connection audit trail
2. Encryption
At Rest:
- Always use KMS encryption with customer-managed keys
- Enables fine-grained access control via key policies
- Meets compliance requirements (HIPAA, PCI-DSS)
In Transit:
- Enforce SSL/TLS connections
- Set rds.force_ssl = 1 parameter (optional)
- Verify SSL in application connection strings
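On the client side, SSL is requested through the `sslmode` keyword of the libpq connection string. The helper below is a minimal sketch with hypothetical names. It builds a keyword/value DSN, refuses modes that do not enforce encryption, and defaults the database name to artos as described earlier.

```python
from typing import Optional

# libpq sslmode values that refuse an unencrypted connection.
ENFORCING_SSL_MODES = {"require", "verify-ca", "verify-full"}


def _quote(value: object) -> str:
    """Quote a value for a libpq keyword/value connection string."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_dsn(
    host: str,
    user: str,
    password: str,
    dbname: str = "artos",
    port: int = 5432,
    sslmode: str = "require",
    sslrootcert: Optional[str] = None,
) -> str:
    """Build a DSN that always asks for an encrypted connection."""
    if sslmode not in ENFORCING_SSL_MODES:
        raise ValueError(f"sslmode {sslmode!r} does not enforce SSL")
    params = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": sslmode,
    }
    if sslrootcert is not None:
        params["sslrootcert"] = sslrootcert
    return " ".join(f"{key}={_quote(value)}" for key, value in params.items())
```

The resulting string can be passed to `psycopg2.connect(dsn)`. Using `sslmode="verify-full"` together with `sslrootcert` pointing at the RDS certificate bundle additionally verifies the server's identity, at the cost of distributing that bundle with the application.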
3. Credential Management
- Never hardcode database passwords
- Rotate passwords regularly (outside Terraform)
- Use Secrets Manager for automatic rotation
- Grant minimal IAM permissions for secret access
Related Modules
- Networking Module - Provides database subnets and security groups
- IAM Module - Grants permissions to access RDS and Secrets Manager
- Monitoring Module - CloudWatch log groups for RDS logs
- Bastion Module - Administrative access to database
Module Maintenance: This module is compatible with Terraform 1.0+ and AWS Provider 5.x. Aurora PostgreSQL version 15.4 is the default. Review AWS Aurora release notes before upgrading to newer versions. Always test database upgrades in non-production environments first.