
Overview

The S3 Module creates secure, scalable object storage for the Artos platform. It provisions an S3 bucket with encryption, versioning, lifecycle management, and an optional CloudFront distribution for static asset delivery. The module also manages access control and event notifications, and secures data by default through KMS encryption and public access blocking.

Key Features

  • Secure Storage: KMS encryption and public access blocking
  • Data Protection: Versioning for file recovery and audit trails
  • Lifecycle Management: Automated data retention and storage tier transitions
  • Event Notifications: Lambda triggers for document processing workflows
  • IAM Access Control: Dedicated role and policy for application access
  • Optional CDN: CloudFront distribution for fast global content delivery

Core Components

1. S3 Bucket

The S3 bucket is the primary storage container for all Artos documents, files, and assets.
Bucket Naming: {bucket_name}-{random_suffix}
Random Suffix:
  • 4-byte random hex string (e.g., a3f2c9b1)
  • Ensures globally unique bucket names
  • Prevents naming conflicts across AWS accounts
Example Bucket Name: artos-production-documents-a3f2c9b1
Purpose:
  • Store uploaded documents and files
  • Archive processed data
  • Host static assets (images, PDFs, reports)
  • Store application backups
Bucket Properties:
  • Region-Specific: Deployed in the same region as your infrastructure
  • Private: No public access by default
  • Encrypted: All data encrypted at rest
  • Versioned: Track changes and enable recovery
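As an illustration of the random-suffix naming described above (the module generates the suffix in Terraform; this Python sketch only reproduces the resulting format):
import secrets

# Hypothetical illustration of the {bucket_name}-{random_suffix} pattern;
# the real suffix is generated by Terraform, not by application code.
bucket_name = 'artos-production-documents'
suffix = secrets.token_hex(4)        # 4 random bytes -> 8 hex characters, e.g. 'a3f2c9b1'
print(f"{bucket_name}-{suffix}")     # e.g. artos-production-documents-a3f2c9b1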

2. Versioning

Versioning maintains multiple versions of objects in the bucket.
Status: Enabled by default (versioning_enabled = true)
How It Works:
  • Each file upload creates a new version
  • Previous versions are retained
  • Deleting an object creates a “delete marker” (soft delete)
  • Versions can be permanently deleted if needed
Benefits:
Data Protection:
Upload: document.pdf (Version 1)
Upload: document.pdf (Version 2) - Version 1 still exists
Delete: document.pdf - Delete marker created, both versions still exist
Restore: Remove delete marker to restore latest version
Use Cases:
  • Accidental Deletion Recovery: Restore deleted files
  • Audit Trail: Track who changed what and when
  • Rollback Capability: Revert to previous file versions
  • Compliance: Meet data retention requirements
Storage Implications: Versioning retains all versions of objects, which increases storage usage. Use lifecycle rules to automatically delete old versions after a retention period.
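For example, the restore step described above can be scripted with boto3 by removing the delete marker (a minimal sketch; the bucket name and key are placeholders):
import boto3

s3 = boto3.client('s3')
bucket = 'artos-production-documents-a3f2c9b1'
key = 'documents/report.pdf'

# Find the delete marker that currently hides the object
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for marker in versions.get('DeleteMarkers', []):
    if marker['Key'] == key and marker['IsLatest']:
        # Removing the delete marker makes the most recent version current again
        s3.delete_object(Bucket=bucket, Key=key, VersionId=marker['VersionId'])
        print(f"Removed delete marker {marker['VersionId']}; object restored")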

3. Server-Side Encryption

All objects are automatically encrypted at rest using AWS KMS.
Encryption Algorithm: aws:kms (KMS-managed encryption)
KMS Key: Customer-managed key (provided via kms_key_arn)
Bucket Key: Enabled (reduces KMS API calls and costs)
How It Works:
  1. Application uploads object to S3
  2. S3 generates data encryption key (DEK) using KMS key
  3. S3 encrypts object with DEK
  4. S3 stores encrypted object and encrypted DEK
  5. On retrieval, S3 decrypts DEK with KMS, then decrypts object
Benefits:
Security:
  • Data encrypted at rest in AWS data centers
  • Encryption keys managed by AWS KMS
  • Fine-grained access control via KMS key policies
  • Meets compliance requirements (HIPAA, PCI-DSS)
Transparency:
  • Automatic encryption/decryption
  • No application code changes required
  • AWS SDK handles encryption transparently
Access Control: The KMS key policy controls which principals can use the key for S3 operations. Example key policy statement for the application role:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::123456789012:role/artos-app-role"
    },
    "Action": [
      "kms:Decrypt",
      "kms:GenerateDataKey"
    ],
    "Resource": "*",
    "Condition": {
      "StringEquals": {
        "kms:ViaService": "s3.us-east-1.amazonaws.com"
      }
    }
  }]
}
Bucket Key Optimization: With bucket keys enabled, S3 uses a bucket-level key for encryption operations, reducing the number of KMS API calls. This improves performance and reduces KMS request costs.
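Because encryption is configured as a bucket default, uploads need no extra parameters; a quick way to verify it from application code (a sketch with a placeholder bucket name):
import boto3

s3 = boto3.client('s3')
response = s3.head_object(
    Bucket='artos-production-documents-a3f2c9b1',
    Key='documents/report.pdf'
)
print(response.get('ServerSideEncryption'))   # expected: 'aws:kms'
print(response.get('SSEKMSKeyId'))            # ARN of the customer-managed KMS key
print(response.get('BucketKeyEnabled'))       # expected: True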

4. Public Access Block

The public access block prevents any accidental public exposure of bucket contents. Configuration (All Enabled):
  • block_public_acls = true: Blocks new public ACLs
  • block_public_policy = true: Blocks new public bucket policies
  • ignore_public_acls = true: Ignores existing public ACLs
  • restrict_public_buckets = true: Restricts public bucket policies
Result: The bucket is strictly private and only accessible via IAM permissions.
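To confirm these settings from application code, the configuration can be read back with boto3 (a minimal sketch; the bucket name is a placeholder):
import boto3

s3 = boto3.client('s3')
response = s3.get_public_access_block(
    Bucket='artos-production-documents-a3f2c9b1'
)
print(response['PublicAccessBlockConfiguration'])
# expected: {'BlockPublicAcls': True, 'IgnorePublicAcls': True,
#            'BlockPublicPolicy': True, 'RestrictPublicBuckets': True}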

5. Lifecycle Rules

Lifecycle rules automate data retention, storage tier transitions, and cleanup operations. Rule Components:

Expiration

Purpose: Automatically delete objects after a specified number of days. Example:
lifecycle_rules = [{
  id                  = "delete-old-temp-files"
  status              = "Enabled"
  expiration_days     = 30
  filter_prefix       = "temp/"
}]
Use Cases:
  • Delete temporary uploads after 30 days
  • Remove logs older than 90 days
  • Clean up intermediate processing files

Noncurrent Version Expiration

Purpose: Delete old versions after they are no longer current. Example:
lifecycle_rules = [{
  id                                 = "expire-old-versions"
  status                             = "Enabled"
  noncurrent_version_expiration_days = 90
}]
How It Works:
Day 0: Upload document.pdf (Version 1 - Current)
Day 10: Upload document.pdf (Version 2 - Current, Version 1 - Noncurrent)
Day 100: Version 1 automatically deleted (90 days after becoming noncurrent)
Use Cases:
  • Retain file history for audit purposes (90 days)
  • Automatically clean up old versions
  • Balance compliance requirements with storage costs
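The versions this rule would eventually remove can be previewed from application code (a sketch; bucket and prefix are placeholders, and the check uses each version's creation time as an approximation of when it became noncurrent):
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client('s3')
cutoff = datetime.now(timezone.utc) - timedelta(days=90)

response = s3.list_object_versions(
    Bucket='artos-production-documents-a3f2c9b1',
    Prefix='documents/'
)
for version in response.get('Versions', []):
    # Approximation: LastModified is when the version was created,
    # not when it stopped being the current version
    if not version['IsLatest'] and version['LastModified'] < cutoff:
        print(version['Key'], version['VersionId'], version['LastModified'])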

Storage Class Transitions

Purpose: Move objects to lower-cost storage tiers as they age. Available Storage Classes:
  • STANDARD: Frequent access (default)
  • STANDARD_IA: Infrequent access (30-day minimum storage duration)
  • INTELLIGENT_TIERING: Automatic tiering based on access patterns
  • GLACIER_IR: Archive with instant retrieval (90-day minimum storage duration)
  • GLACIER: Archive storage (90-day minimum storage duration; retrieval takes minutes to hours)
  • DEEP_ARCHIVE: Long-term archive (180-day minimum storage duration; retrieval takes 12-48 hours)
Example:
lifecycle_rules = [{
  id     = "archive-old-documents"
  status = "Enabled"
  transitions = [
    {
      days          = 90
      storage_class = "STANDARD_IA"
    },
    {
      days          = 365
      storage_class = "GLACIER_IR"
    }
  ]
}]
Timeline:
Day 0-89: STANDARD (frequent access)
Day 90-364: STANDARD_IA (infrequent access)
Day 365+: GLACIER_IR (archive with instant retrieval)
Use Cases:
  • Archive old documents while maintaining instant access
  • Reduce storage costs for infrequently accessed data
  • Meet compliance retention requirements
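To confirm transitions are taking effect, application code can check each object's current storage class (a sketch; bucket and prefix are placeholders):
import boto3

s3 = boto3.client('s3')
response = s3.list_objects_v2(
    Bucket='artos-production-documents-a3f2c9b1',
    Prefix='documents/'
)
for obj in response.get('Contents', []):
    # StorageClass is e.g. STANDARD, STANDARD_IA, or GLACIER_IR
    print(obj['Key'], obj.get('StorageClass', 'STANDARD'))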

Abort Incomplete Multipart Uploads

Purpose: Clean up abandoned multipart upload operations. Multipart Upload Process:
  1. Application initiates multipart upload for large file
  2. File divided into parts and uploaded separately
  3. If upload never completes, parts remain in bucket
  4. Incomplete parts incur storage charges
Example:
lifecycle_rules = [{
  id                                     = "cleanup-multipart"
  status                                 = "Enabled"
  abort_incomplete_multipart_upload_days = 7
}]
Result: Parts from incomplete uploads automatically deleted after 7 days. Best Practice: Always configure this rule to prevent storage costs from failed uploads.
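The same cleanup can also be run on demand, which is handy when verifying the rule or clearing abandoned parts immediately (a sketch; the bucket name is a placeholder):
import boto3

s3 = boto3.client('s3')
bucket = 'artos-production-documents-a3f2c9b1'

# List in-flight multipart uploads and abort each one
response = s3.list_multipart_uploads(Bucket=bucket)
for upload in response.get('Uploads', []):
    print('Aborting', upload['Key'], 'started', upload['Initiated'])
    s3.abort_multipart_upload(
        Bucket=bucket, Key=upload['Key'], UploadId=upload['UploadId']
    )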

6. IAM Access Control

The module creates IAM resources for application access to the S3 bucket.

IAM Policy

Policy Name: {bucket_name}-s3-access-policy
Granted Permissions:
  • s3:GetObject: Download objects from bucket
  • s3:PutObject: Upload objects to bucket
  • s3:DeleteObject: Delete objects from bucket
  • s3:ListBucket: List objects in bucket
Resource Scope:
  • Bucket itself: arn:aws:s3:::bucket-name
  • All objects: arn:aws:s3:::bucket-name/*
Example Usage in Application:
import boto3

# IAM role with attached policy automatically provides credentials
s3 = boto3.client('s3')

# Read the local file to upload
with open('report.pdf', 'rb') as f:
    file_data = f.read()

# Upload file
s3.put_object(
    Bucket='artos-production-documents-a3f2c9b1',
    Key='documents/report.pdf',
    Body=file_data
)

# Download file
response = s3.get_object(
    Bucket='artos-production-documents-a3f2c9b1',
    Key='documents/report.pdf'
)
file_data = response['Body'].read()

# List files
response = s3.list_objects_v2(
    Bucket='artos-production-documents-a3f2c9b1',
    Prefix='documents/'
)
for obj in response.get('Contents', []):
    print(obj['Key'])

IAM Role

Role Name: {bucket_name}-s3-access-role
Trust Policy: Allows the EKS service to assume the role
Purpose: Provides application pods with S3 access via IRSA (IAM Roles for Service Accounts)
Integration with IAM Module: In practice, the IAM module typically manages application roles. This role is provided as a convenience and may be replaced with roles from the IAM module that have broader AWS service access.
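When wiring the role up through IRSA, it can help to confirm which identity a pod actually assumed before debugging S3 permissions (a minimal sketch):
import boto3

sts = boto3.client('sts')
identity = sts.get_caller_identity()
print(identity['Arn'])   # expect an assumed-role ARN for the service account's role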

Security Best Practices

1. Encryption

Always Use KMS Encryption:
  • Customer-managed keys for compliance
  • Encryption enabled by default in module
  • Key policies control who can encrypt/decrypt
Enforce Encryption:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::bucket-name/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "aws:kms"
      }
    }
  }]
}
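With a deny rule like this in place, clients that set the encryption header explicitly always pass the check; a minimal boto3 sketch (the bucket name and KMS key ARN are placeholders):
import boto3

s3 = boto3.client('s3')
s3.put_object(
    Bucket='artos-production-documents-a3f2c9b1',
    Key='documents/report.pdf',
    Body=b'...',
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='arn:aws:kms:us-east-1:123456789012:key/example-key-id'
)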

2. Access Control

Principle of Least Privilege:
  • Grant only required permissions (GetObject, PutObject, etc.)
  • Use IAM roles instead of access keys
  • Restrict access to specific prefixes when possible
Example Restricted Policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::bucket-name/uploads/*"
  }]
}

3. Versioning and Backups

Enable Versioning:
  • Protects against accidental deletion
  • Provides audit trail
  • Enables compliance with retention policies
MFA Delete (Optional):
# Require MFA to permanently delete versions
aws s3api put-bucket-versioning \
  --bucket artos-production-documents-a3f2c9b1 \
  --versioning-configuration Status=Enabled,MFADelete=Enabled \
  --mfa "arn:aws:iam::123456789012:mfa/user 123456"

4. Monitoring and Auditing

Enable S3 Access Logging:
resource "aws_s3_bucket_logging" "main" {
  bucket        = aws_s3_bucket.main.id
  target_bucket = aws_s3_bucket.logs.id
  target_prefix = "s3-access-logs/"
}
CloudTrail S3 Data Events:
  • Log all API calls to S3
  • Track who accessed which objects
  • Detect unauthorized access attempts
CloudWatch Metrics:
  • Monitor bucket size and object count
  • Alert on unusual access patterns
  • Track request rates and errors
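For example, the bucket-size metric mentioned above is published daily to CloudWatch and can be read with boto3 (a sketch; the bucket name is a placeholder):
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client('cloudwatch')
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/S3',
    MetricName='BucketSizeBytes',
    Dimensions=[
        {'Name': 'BucketName', 'Value': 'artos-production-documents-a3f2c9b1'},
        {'Name': 'StorageType', 'Value': 'StandardStorage'}
    ],
    StartTime=datetime.utcnow() - timedelta(days=3),
    EndTime=datetime.utcnow(),
    Period=86400,
    Statistics=['Average']
)
for point in sorted(response['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], point['Average'])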

Troubleshooting

Access Denied Errors

Symptoms: AccessDenied or 403 Forbidden errors when accessing S3
Troubleshooting Steps:
  1. Verify IAM Permissions:
aws iam get-role-policy --role-name artos-app-role --policy-name s3-access
  2. Check KMS Key Policy:
aws kms get-key-policy --key-id <key-id> --policy-name default
  3. Test from Pod:
kubectl exec -it <pod-name> -- python3 -c "
import boto3
s3 = boto3.client('s3')
print(s3.list_buckets())
"

Module Maintenance: This module is compatible with Terraform 1.0+ and AWS Provider 5.x. S3 buckets use KMS encryption by default and have public access blocked for security. Review lifecycle rules periodically to ensure they align with your data retention requirements.