Enable ECS Service deployment graceful termination and Blue/Green read handler changes #43986
# Conflicts: # internal/service/ecs/service.go # internal/service/ecs/service_test.go
LGTM 🚀.
% make testacc TESTARGS='-run=TestAccECSService_BlueGreenDeployment_\|TestAccECSService_DeploymentConfiguration_\|TestAccECSService_basic' PKG=ecs ACCTEST_PARALLELISM=3
make: Verifying source code with gofmt...
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.24.6 test ./internal/service/ecs/... -v -count 1 -parallel 3 -run=TestAccECSService_BlueGreenDeployment_\|TestAccECSService_DeploymentConfiguration_\|TestAccECSService_basic -timeout 360m -vet=off
2025/08/26 17:22:19 Creating Terraform AWS Provider (SDKv2-style)...
2025/08/26 17:22:19 Initializing Terraform AWS Provider (SDKv2-style)...
=== RUN TestAccECSService_basic
=== PAUSE TestAccECSService_basic
=== RUN TestAccECSService_BlueGreenDeployment_basic
=== PAUSE TestAccECSService_BlueGreenDeployment_basic
=== RUN TestAccECSService_BlueGreenDeployment_outOfBandRemoval
=== PAUSE TestAccECSService_BlueGreenDeployment_outOfBandRemoval
=== RUN TestAccECSService_BlueGreenDeployment_sigintRollback
service_test.go:1084: SIGINT handling can't reliably be tested in CI
--- SKIP: TestAccECSService_BlueGreenDeployment_sigintRollback (0.00s)
=== RUN TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== PAUSE TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== RUN TestAccECSService_BlueGreenDeployment_createFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_createFailure
=== RUN TestAccECSService_BlueGreenDeployment_changeStrategy
=== PAUSE TestAccECSService_BlueGreenDeployment_changeStrategy
=== RUN TestAccECSService_BlueGreenDeployment_updateFailure
=== PAUSE TestAccECSService_BlueGreenDeployment_updateFailure
=== RUN TestAccECSService_BlueGreenDeployment_updateInPlace
=== PAUSE TestAccECSService_BlueGreenDeployment_updateInPlace
=== RUN TestAccECSService_BlueGreenDeployment_waitServiceActive
=== PAUSE TestAccECSService_BlueGreenDeployment_waitServiceActive
=== RUN TestAccECSService_BlueGreenDeployment_withoutTestListenerRule
=== PAUSE TestAccECSService_BlueGreenDeployment_withoutTestListenerRule
=== RUN TestAccECSService_DeploymentConfiguration_strategy
=== PAUSE TestAccECSService_DeploymentConfiguration_strategy
=== CONT TestAccECSService_basic
=== CONT TestAccECSService_BlueGreenDeployment_updateFailure
=== CONT TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
--- PASS: TestAccECSService_basic (74.06s)
=== CONT TestAccECSService_BlueGreenDeployment_withoutTestListenerRule
--- PASS: TestAccECSService_BlueGreenDeployment_updateFailure (1099.56s)
=== CONT TestAccECSService_DeploymentConfiguration_strategy
--- PASS: TestAccECSService_DeploymentConfiguration_strategy (77.54s)
=== CONT TestAccECSService_BlueGreenDeployment_waitServiceActive
--- PASS: TestAccECSService_BlueGreenDeployment_withoutTestListenerRule (1362.71s)
=== CONT TestAccECSService_BlueGreenDeployment_updateInPlace
--- PASS: TestAccECSService_BlueGreenDeployment_circuitBreakerRollback (1543.91s)
=== CONT TestAccECSService_BlueGreenDeployment_outOfBandRemoval
--- PASS: TestAccECSService_BlueGreenDeployment_waitServiceActive (368.59s)
=== CONT TestAccECSService_BlueGreenDeployment_changeStrategy
--- PASS: TestAccECSService_BlueGreenDeployment_updateInPlace (647.56s)
=== CONT TestAccECSService_BlueGreenDeployment_createFailure
--- PASS: TestAccECSService_BlueGreenDeployment_outOfBandRemoval (769.23s)
=== CONT TestAccECSService_BlueGreenDeployment_basic
--- PASS: TestAccECSService_BlueGreenDeployment_createFailure (352.34s)
--- PASS: TestAccECSService_BlueGreenDeployment_changeStrategy (1051.84s)
--- PASS: TestAccECSService_BlueGreenDeployment_basic (466.80s)
PASS
ok github.com/hashicorp/terraform-provider-aws/internal/service/ecs 2785.278s
? github.com/hashicorp/terraform-provider-aws/internal/service/ecs/test-fixtures [no test files]
% make testacc TESTARGS='-run=TestAccECSService_BlueGreenDeployment_circuitBreakerRollback' PKG=ecs
make: Verifying source code with gofmt...
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.24.6 test ./internal/service/ecs/... -v -count 1 -parallel 20 -run=TestAccECSService_BlueGreenDeployment_circuitBreakerRollback -timeout 360m -vet=off
2025/08/27 10:04:31 Creating Terraform AWS Provider (SDKv2-style)...
2025/08/27 10:04:31 Initializing Terraform AWS Provider (SDKv2-style)...
=== RUN TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== PAUSE TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
=== CONT TestAccECSService_BlueGreenDeployment_circuitBreakerRollback
--- PASS: TestAccECSService_BlueGreenDeployment_circuitBreakerRollback (3426.10s)
PASS
ok github.com/hashicorp/terraform-provider-aws/internal/service/ecs 3431.680s
? github.com/hashicorp/terraform-provider-aws/internal/service/ecs/test-fixtures [no test files]
LGTM 🚀
@djglaser Thanks for the contribution 🎉 👏.
This functionality has been released in v6.11.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
Title
Enable customer-initiated Blue/Green deployment rollback using SIGINT and add deployment_configuration monitoring to read handler.
Description
This PR adds support for graceful termination of Blue/Green deployments in ECS services by handling SIGINT signals. When enabled, this allows customers to safely cancel an in-progress Blue/Green deployment and automatically trigger a rollback to the previous stable state. It also implements read handler changes needed for handling out-of-band changes made to Blue/Green deployment configurations.
Key Changes
• sigint_cancellation boolean attribute to ECS service resource (defaults to false)
Example Usage
Additional Notes
• strategy was changed from defaulting to ROLLING to a computed value, as this default value is set by the API.
• bake_time_in_minutes was changed to a computed value, as the default value is enforced by the API as well.
• In TestAccECSService_BlueGreenDeployment_sigintRollback, ExpectNonEmptyPlan: true is needed for step 3: after the SIGINT rollback, the actual ECS service state (using the original task definition :1) differs from the desired step 2 configuration (task definition :2), creating expected configuration drift that validates the rollback succeeded.
Graceful Termination Logic for Blue/Green Deployments
The graceful termination logic handles interruption scenarios during ECS blue/green deployments through a coordinated cancellation and rollback mechanism:
Key Components:
• Cancellation Detection: waitForCancellation() monitors for context cancellation (SIGINT) and triggers automatic rollback when deployment is interrupted
• Smart Rollback: rollbackBlueGreenDeployment() checks deployment status before attempting rollback - skips if already in terminal state (successful/stopped/rollback_failed/rollback_successful)
• Graceful Shutdown: Uses StopServiceDeployment with StopTypeRollback to safely revert to previous stable state rather than leaving deployment in inconsistent state
• Timeout Protection: Implements 1-hour maximum wait before SIGKILL, ensuring process doesn't hang indefinitely
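The "Smart Rollback" check above can be illustrated with a minimal sketch. It is not the provider's code: the status strings are assumed spellings of the four terminal states named in the list, and isTerminal and shouldRollback are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// terminalStatuses holds the four states the description calls terminal.
// The exact string spellings are an assumption made for this sketch.
var terminalStatuses = map[string]bool{
	"SUCCESSFUL":          true,
	"STOPPED":             true,
	"ROLLBACK_FAILED":     true,
	"ROLLBACK_SUCCESSFUL": true,
}

// isTerminal reports whether a deployment has already reached a final state.
func isTerminal(status string) bool {
	return terminalStatuses[strings.ToUpper(status)]
}

// shouldRollback (hypothetical) says a rollback is only worth requesting
// while the deployment is still in flight.
func shouldRollback(status string) bool {
	return !isTerminal(status)
}

func main() {
	for _, s := range []string{"IN_PROGRESS", "SUCCESSFUL", "rollback_failed"} {
		fmt.Printf("%s rollback=%t\n", s, shouldRollback(s))
	}
}
```

Checking the status first keeps a late SIGINT from issuing a stop request against a deployment that has nothing left to undo.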
Flow:
• waitForCancellation() goroutine is spawned only once primary deployment ARN is available, as there is nothing to roll back prior to this being available
• On cancellation signal → immediate rollback initiation → wait for terminal status → cleanup
• Prevents partial deployments and maintains service stability during interruptions
This ensures blue/green deployments can be safely cancelled without leaving services in broken states.
Testing
• TestAccECSService_BlueGreenDeployment_sigintRollback and TestAccECSService_BlueGreenDeployment_outOfBandRemoval
• sigint_helper.go to simulate a customer-initiated SIGINT trigger (i.e. Ctrl+C) on an UpdateService operation
Documentation Updates
• sigint_cancellation attribute
Rollback Plan
If a change needs to be reverted, we will publish an updated version of the library.
Changes to Security Controls
Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.
Relations
Relates #43434
Relates #43502
Relates #43558
References
Output from Acceptance Testing
Note:
• TestAccECSService_BlueGreenDeployment_sigintRollback only passes when run in isolation. Otherwise, it fails with the error: service_test.go:1089: Step 2/3, expected an error but got none.
• TestAccECSService_VolumeConfiguration_throughputTypeChange failure is transient and unrelated to this change (see #38475).