@amandladev amandladev commented Sep 30, 2025

Adds container startup health check timeout configuration to SageMaker endpoint production variants. The setting is available in CloudFormation but missing from the CDK constructs.

Issue #35566

  • Add containerStartupHealthCheckTimeout property to InstanceProductionVariantProps interface
  • Add comprehensive validation for timeout range (60-3600 seconds)
  • Add CloudFormation template generation for ContainerStartupHealthCheckTimeoutInSeconds property
  • Include test coverage for validation scenarios and edge cases
  • Update README documentation with usage examples and constraints

Reason for this change

AWS SageMaker EndpointConfig supports ContainerStartupHealthCheckTimeoutInSeconds in CloudFormation to configure the health check timeout for inference containers, but this property is not exposed in the CDK SageMaker L2 constructs. Users whose models need longer to initialize cannot raise the timeout, so health checks fail prematurely.

Description of changes

Adds container startup health check timeout support to the CDK SageMaker L2 constructs so that users can set a timeout suited to their inference containers:

  • New containerStartupHealthCheckTimeout property in InstanceProductionVariantProps interface with AWS-compliant validation:
    Range: 60-3600 seconds (1 minute to 1 hour)
    Type: cdk.Duration for intuitive time specification
    Optional property maintaining backward compatibility
  • Enhanced addInstanceProductionVariant() method with comprehensive input validation
  • Automatic conversion from cdk.Duration to seconds for CloudFormation compatibility
  • Synthesis-time validation with clear, actionable error messages
  • CloudFormation integration mapping to ContainerStartupHealthCheckTimeoutInSeconds property
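The validation and conversion steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DurationLike` is a stand-in for `cdk.Duration` (only its `toSeconds()` method is assumed), and `healthCheckTimeoutSeconds` and `minutes` are hypothetical helper names used so the sketch runs without `aws-cdk-lib`.

```typescript
// Stand-in for cdk.Duration; only toSeconds() is used in this sketch.
interface DurationLike {
  toSeconds(): number;
}

// Range stated in this PR: 60-3600 seconds.
const MIN_TIMEOUT_SECONDS = 60;
const MAX_TIMEOUT_SECONDS = 3600;

// Hypothetical helper: converts the optional duration to seconds and
// rejects values outside the allowed range. Returns undefined when unset.
function healthCheckTimeoutSeconds(timeout?: DurationLike): number | undefined {
  if (timeout === undefined) {
    return undefined;
  }
  const seconds = timeout.toSeconds();
  if (seconds < MIN_TIMEOUT_SECONDS || seconds > MAX_TIMEOUT_SECONDS) {
    throw new Error(
      `containerStartupHealthCheckTimeout must be between ${MIN_TIMEOUT_SECONDS} ` +
        `and ${MAX_TIMEOUT_SECONDS} seconds, got ${seconds}`,
    );
  }
  return seconds;
}

// Hypothetical constructor so the sketch does not depend on aws-cdk-lib.
const minutes = (n: number): DurationLike => ({ toSeconds: () => n * 60 });
```

Under these assumptions, `healthCheckTimeoutSeconds(minutes(5))` evaluates to 300, while a 59-second or 3601-second value throws at the point where the variant is added.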

Usage Example:

import * as cdk from 'aws-cdk-lib';
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const model: sagemaker.IModel;

// Create endpoint configuration with health check timeout
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
  instanceProductionVariants: [{
    variantName: 'my-variant',
    model: model,
    containerStartupHealthCheckTimeout: cdk.Duration.minutes(5), // 5 minutes timeout
  }],
});

Describe any new or updated permissions being added

N/A - No new IAM permissions required. Leverages existing SageMaker endpoint configuration permissions.

Description of how you validated changes

Unit tests: Added 5 container startup health check timeout tests covering the following scenarios:

  • Property inclusion in CloudFormation template when provided
  • Property absence in CloudFormation template when not provided
  • Range validation for minimum value (60 seconds)
  • Range validation for maximum value (3600 seconds)
  • Acceptance of valid timeout values at boundaries
  • Duration to seconds conversion verification
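The first two scenarios (property present when provided, absent when not) depend on the key being omitted rather than emitted as an empty value. A simplified sketch of that behavior, assuming a hypothetical `renderVariant` helper and a reduced `RenderedVariant` shape that are not part of this PR:

```typescript
// Hypothetical, reduced shape of one rendered production variant.
interface RenderedVariant {
  VariantName: string;
  ContainerStartupHealthCheckTimeoutInSeconds?: number;
}

// Hypothetical helper: includes the timeout key only when a value was given.
function renderVariant(variantName: string, timeoutSeconds?: number): RenderedVariant {
  return {
    VariantName: variantName,
    ...(timeoutSeconds !== undefined
      ? { ContainerStartupHealthCheckTimeoutInSeconds: timeoutSeconds }
      : {}),
  };
}

const withTimeout = renderVariant('my-variant', 300);
const withoutTimeout = renderVariant('my-variant');
```

Here `withTimeout` carries `ContainerStartupHealthCheckTimeoutInSeconds: 300`, and `withoutTimeout` has no such key at all, which is what the two template assertions would check.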

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK p2 labels Sep 30, 2025
@aws-cdk-automation aws-cdk-automation requested a review from a team September 30, 2025 03:59
@aws-cdk-automation aws-cdk-automation added the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Sep 30, 2025