Production Backend Migration: IAM User → EC2 Instance Profile

Date: 2026-05-18

Objective

Migrate the production backend (aovis-store-aws on i-01b89c7132555fc68) from long-term IAM user access keys (aovis-backend-service) to EC2 instance profile temporary credentials.

Source of Truth

  • Architecture reference: docs/aws-ai-architecture.md
  • Implementation plan: docs/ai-push-pipeline-implementation-plan.md §3.4

Migration Steps Summary

P8-0: Read-Only Inventory

  • Confirmed EC2 i-01b89c7132555fc68 had no instance profile attached.
  • Old IAM user aovis-backend-service had policy aovis-backend-service-policy (v6) attached.
  • Identified over-permissions (iot:DescribeThing, kinesisvideo:DescribeSignalingChannel) and deferred permissions (sns:CreatePlatformEndpoint/SetEndpointAttributes/DeleteEndpoint).
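The over-permission check in this step amounts to a set difference between the actions the old policy grants and the actions the code calls. A minimal sketch follows, assuming hypothetical action lists: only the two unused actions come from this report, and the contents of aovis-backend-service-policy v6 are not reproduced here.

```python
# Illustrative only: apart from the two unused actions named in this report,
# the action lists below are hypothetical placeholders.
GRANTED_ACTIONS = {
    "s3:GetObject",
    "kinesisvideo:DescribeStream",
    "sts:AssumeRole",
    "bedrock:InvokeModel",
    "iot:DescribeThing",
    "kinesisvideo:DescribeSignalingChannel",
}

# Actions the backend code is assumed to call (hypothetical).
USED_ACTIONS = {
    "s3:GetObject",
    "kinesisvideo:DescribeStream",
    "sts:AssumeRole",
    "bedrock:InvokeModel",
}


def find_over_permissions(granted: set[str], used: set[str]) -> list[str]:
    """Return granted actions that no code path uses, sorted for stable output."""
    return sorted(granted - used)


if __name__ == "__main__":
    for action in find_over_permissions(GRANTED_ACTIONS, USED_ACTIONS):
        print("unused:", action)
```

In practice the "used" set would have to come from a code audit or access logs; the sketch only shows the comparison itself.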

P8-1: Create IAM Resources

  • Role: aovis-backend-ec2-role
  • Policy: aovis-backend-ec2-policy
  • Instance profile: aovis-backend-ec2-instance-profile
  • Attached instance profile to EC2 i-01b89c7132555fc68.
  • Updated WebRTC viewer role trust policy to include both old IAM user and new EC2 role.
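The two trust relationships in this step can be sketched as policy documents. This is an illustration under stated assumptions, not the deployed JSON: the IAM user ARN assumes the user lives in the same account with no IAM path, and the helper name viewer_trust_policy is hypothetical.

```python
import json

ACCOUNT_ID = "288669178338"  # account ID taken from the ARNs in this report

# Trust policy that lets the EC2 service assume aovis-backend-ec2-role.
EC2_ROLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Principals trusted by the WebRTC viewer role during the transition.
TRANSITION_PRINCIPALS = [
    # Assumed ARN shape for the old IAM user (same account, no path).
    f"arn:aws:iam::{ACCOUNT_ID}:user/aovis-backend-service",
    f"arn:aws:iam::{ACCOUNT_ID}:role/aovis-backend-ec2-role",
]


def viewer_trust_policy(principal_arns: list[str]) -> dict:
    """Build a trust policy allowing the given principals to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": list(principal_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(viewer_trust_policy(TRANSITION_PRINCIPALS), indent=2))
```

Trusting both principals at once is what allows the runtime switch in the next step to happen without a window in which neither identity can assume the viewer role.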

P8-2: Switch Runtime to Instance Profile

  • Commented out AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in .env.production.
  • Recreated the PM2 process with pm2 delete followed by pm2 start (rather than a plain restart) to clear the cached runtime env.
  • Verified app now uses assumed-role/aovis-backend-ec2-role/i-01b89c7132555fc68.
  • deploy:verify 7/7 passed.
  • Daily Summary dryRun ok.
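The two checks behind this step (no static keys left in the process environment, and a caller identity that is the instance role) can be sketched as below. The function names are hypothetical, and the ARN string would normally come from an STS GetCallerIdentity call, which is left out so the sketch runs without AWS access.

```python
import os
from typing import Mapping

STATIC_KEY_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def static_keys_in_env(env: Mapping[str, str]) -> list[str]:
    """Return the static credential variables that are still set and non-empty."""
    return [name for name in STATIC_KEY_VARS if env.get(name)]


def is_instance_role_identity(arn: str, role_name: str, instance_id: str) -> bool:
    """Check that an STS caller ARN is the given role assumed by the given instance.

    Expected shape: arn:aws:sts::<account>:assumed-role/<role>/<session>,
    where the session name for an instance profile is the instance ID.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        return False
    return parts[5].split("/") == ["assumed-role", role_name, instance_id]


if __name__ == "__main__":
    print("static keys still set:", static_keys_in_env(os.environ))
    arn = "arn:aws:sts::288669178338:assumed-role/aovis-backend-ec2-role/i-01b89c7132555fc68"
    print(is_instance_role_identity(arn, "aovis-backend-ec2-role", "i-01b89c7132555fc68"))
```

Checking the environment of the running process matters here because a process manager can keep serving an old environment after the .env file changes, which is the reason the PM2 process was deleted and started again.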

P8-2B: Comprehensive Smoke Tests

All passed:

Test                          Result
Caller identity               assumed-role/aovis-backend-ec2-role
S3 HeadObject                 957,923 bytes
KVS DescribeStream            ACTIVE, 720h
KVS HLS                       Skipped (no active stream data)
WebRTC signaling endpoints    HTTPS + WSS
WebRTC ICE config             2 servers
STS AssumeRole (viewer role)  ASIA prefix
Bedrock Nova Lite             OK
Daily Summary dryRun          ok=True
AI summarize S3 video         eventType=Surveillance, confidence=0.95
PM2 error log audit           Zero credential errors post-migration
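The "ASIA prefix" result relies on the AWS convention that temporary STS credentials have access key IDs starting with ASIA, while long-term IAM user keys start with AKIA. A small sketch of that check follows; the function name is hypothetical and the key IDs are AWS-documentation-style placeholders, not real keys.

```python
def credential_kind(access_key_id: str) -> str:
    """Classify an access key ID by its prefix."""
    if access_key_id.startswith("ASIA"):
        return "temporary"  # issued by STS, e.g. via AssumeRole
    if access_key_id.startswith("AKIA"):
        return "long-term"  # IAM user access key
    return "unknown"


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    for key_id in ("ASIAIOSFODNN7EXAMPLE", "AKIAIOSFODNN7EXAMPLE"):
        print(key_id, "->", credential_kind(key_id))
```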

P8-2C: Fix Bedrock Cross-Region Permission Gap

  • Found: us.amazon.nova-pro-v1:0 cross-region inference profile routes to us-west-2, but policy only allowed us-east-1 for the Nova Pro foundation model ARN.
  • Fixed: Changed the resource ARN from arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0 to arn:aws:bedrock:*::foundation-model/amazon.nova-pro-v1:0 in aovis-backend-ec2-policy v2.
  • Matches the existing Nova Lite pattern (*::foundation-model/amazon.nova-lite-v1:0).
  • Access Analyzer validate-policy: 0 findings.
  • Post-fix verification: Nova Pro now works, classifyThumbnails works.
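The permission gap can be reproduced with a simple wildcard match. The sketch below uses Python's fnmatch as an approximation of IAM resource matching (IAM supports * and ? wildcards; fnmatch additionally treats [ ] specially, which does not matter for these ARNs). The function name is hypothetical, and real authorization involves more than resource matching.

```python
from fnmatch import fnmatchcase

OLD_RESOURCE = "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0"
NEW_RESOURCE = "arn:aws:bedrock:*::foundation-model/amazon.nova-pro-v1:0"

# ARN the request resolves to when the inference profile routes to us-west-2.
ROUTED_ARN = "arn:aws:bedrock:us-west-2::foundation-model/amazon.nova-pro-v1:0"


def resource_matches(pattern: str, arn: str) -> bool:
    """Approximate IAM resource matching with case-sensitive glob matching."""
    return fnmatchcase(arn, pattern)


if __name__ == "__main__":
    print("v1 policy allows routed call:", resource_matches(OLD_RESOURCE, ROUTED_ARN))
    print("v2 policy allows routed call:", resource_matches(NEW_RESOURCE, ROUTED_ARN))
```

The wildcard covers only the region segment, so the statement still names a single model rather than opening up every Bedrock foundation model.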

P8-3A: Deactivate Old IAM User Key

  • Old key ending 3JXF: Active → Inactive.
  • Smoke tests all passed with zero credential errors.
  • App restarts: 0. PM2 logs: clean.

P8-3B: Delete Old IAM User Key

  • Old key ending 3JXF: deleted.
  • list-access-keys confirms zero keys remain.
  • Smoke tests all passed.
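The deactivate-then-delete sequence of P8-3A and P8-3B can be sketched as one function. The method names and keyword arguments mirror the boto3 IAM client (update_access_key, delete_access_key, list_access_keys), but the client is passed in as a parameter and exercised here with an in-memory fake, so nothing touches AWS. The function name, the fake class, and the key ID are hypothetical; in the actual migration the two phases were separate steps rather than a single call.

```python
from typing import Callable


def retire_access_key(iam, user_name: str, access_key_id: str,
                      smoke_test: Callable[[], bool]) -> str:
    """Deactivate a key, verify the app, then delete it (or roll back)."""
    # Phase A: deactivation is reversible, so it goes first.
    iam.update_access_key(
        UserName=user_name, AccessKeyId=access_key_id, Status="Inactive"
    )
    if not smoke_test():
        # Something still depended on the key: restore it and stop.
        iam.update_access_key(
            UserName=user_name, AccessKeyId=access_key_id, Status="Active"
        )
        return "reactivated"
    # Phase B: deletion is permanent, so it only follows a clean check.
    iam.delete_access_key(UserName=user_name, AccessKeyId=access_key_id)
    remaining = iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]
    return f"deleted, {len(remaining)} key(s) remain"


class FakeIam:
    """In-memory stand-in for the IAM client, for dry runs only."""

    def __init__(self, key_ids):
        self.keys = {key_id: "Active" for key_id in key_ids}

    def update_access_key(self, UserName, AccessKeyId, Status):
        self.keys[AccessKeyId] = Status

    def delete_access_key(self, UserName, AccessKeyId):
        del self.keys[AccessKeyId]

    def list_access_keys(self, UserName):
        return {"AccessKeyMetadata": [
            {"AccessKeyId": key_id, "Status": status}
            for key_id, status in self.keys.items()
        ]}


if __name__ == "__main__":
    placeholder_key = "AKIAIOSFODNN7EXAMPLE"  # placeholder, not the real key
    fake = FakeIam([placeholder_key])
    print(retire_access_key(fake, "aovis-backend-service",
                            placeholder_key, lambda: True))
```

Keeping a verification gate between the two phases preserves a cheap rollback (reactivating the key) until there is evidence that nothing still depends on it.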

Resources Created

Resource          ARN or name
IAM role          arn:aws:iam::288669178338:role/aovis-backend-ec2-role
IAM policy        arn:aws:iam::288669178338:policy/aovis-backend-ec2-policy (v2)
Instance profile  aovis-backend-ec2-instance-profile

Policy Notes (aovis-backend-ec2-policy v2)

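  • Bedrock Nova Pro (invoked via inference profile us.amazon.nova-pro-v1:0): the foundation model ARN uses a region wildcard (*) so that cross-region inference routing is allowed.
  • Bedrock Nova Lite (us.amazon.nova-lite-v1:0): the foundation model and inference profile ARNs both use a region wildcard (*).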
  • WebRTC viewer STS AssumeRole: retained.
  • Removed: iot:DescribeThing, kinesisvideo:DescribeSignalingChannel (unused by code).
  • Deferred: sns:CreatePlatformEndpoint/SetEndpointAttributes/DeleteEndpoint (mobile push not yet active).

Runtime State

  • PM2: No AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY in env.
  • AWS SDK: Uses EC2 instance profile metadata credentials.
  • App caller identity: arn:aws:sts::288669178338:assumed-role/aovis-backend-ec2-role/i-01b89c7132555fc68

Rollback

The old IAM user access key (ending 3JXF) has been deleted, so rolling back to static credentials would require creating a new access key for aovis-backend-service and adding it to .env.production. The preferred rollback path is to fix the instance role policy rather than reintroduce long-term keys.

Remaining Follow-Ups

  • SNS platform endpoint permissions (sns:CreatePlatformEndpoint etc.) when mobile push is activated.
  • Bedrock embedding model ID (amazon.nova-embed-multimodal-v1:0) appears invalid — investigate correct model ID.
  • Replace dev WebRTC viewer role with production multi-device IAM design.