providerID not set with karpenter nodes #1156

@Sindvero

Hello,

I'm seeing an issue in an AWS cluster (not EKS) with Karpenter deployed: nodes created by Karpenter never get their providerID set by the AWS CCM. The cluster itself is created with an ASG, and nodes launched through the ASG join the cluster without issue and get their providerID set by the CCM. Here's the error from the CCM:

error syncing 'ip-<node-ip>.<region>.compute.internal': failed to get instance metadata for node ip-<node-ip>.<region>.compute.internal: instance not found, requeuing
E0602 19:56:35.311131       1 node_controller.go:244] "Unhandled Error" err="error syncing 'ip-<node-ip>.<region>.compute.internal': failed to get instance metadata for node ip-<node-ip>.<region>.compute.internal: instance not found, requeuing" logger="UnhandledError" 
E0602 19:56:36.976054       1 node_lifecycle_controller.go:156] error checking if node ip-<node-ip>.<region>.compute.internal exists: instance not found
I0602 19:56:39.224230       1 node_controller.go:271] Update 7 nodes status took 1.627947232s.         
E0602 19:56:42.128159       1 node_lifecycle_controller.go:156] error checking if node ip-<node-ip>.<region>.compute.internal exists: instance not found
E0602 19:56:47.302222       1 node_lifecycle_controller.go:156] error checking if node ip-<node-ip>.<region>.compute.internal exists: instance not found
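
To see which nodes the CCM keeps failing on, and how often, the lines above can be grouped by node name. This is only a rough sketch that assumes the two message shapes shown in the log excerpt; the function name `failing_nodes` is made up for illustration.

```python
import re
from collections import Counter

# Matches the node name in messages shaped like the ones in the excerpt:
#   "... error checking if node <name> exists: instance not found"
#   "... failed to get instance metadata for node <name>: instance not found"
NODE_RE = re.compile(r"\bnode (\S+?)(?: exists)?: instance not found")


def failing_nodes(log_text: str) -> Counter:
    """Count 'instance not found' log lines per node name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = NODE_RE.search(line)
        if match:
            # One count per log line, even if the message repeats the name.
            counts[match.group(1)] += 1
    return counts
```

Feeding it the CCM pod's log text returns a `Counter` keyed by node name, which makes it easy to confirm that only the Karpenter-launched nodes are affected.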

When the providerID is set manually or through another application, the node gets picked up by the CCM and everything's fine.
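
For reference, the manual workaround amounts to writing a value of the form `aws:///<availability-zone>/<instance-id>` into `spec.providerID`. Below is a minimal sketch of building that value and a merge-patch body for it. The helper names and the example zone and instance ID are hypothetical, and the `i-` prefix check is just a sanity guard, not something the CCM is known to require.

```python
import json


def build_provider_id(availability_zone: str, instance_id: str) -> str:
    """Return a providerID in the aws:///<zone>/<instance-id> form."""
    if not instance_id.startswith("i-"):
        raise ValueError(f"not an EC2 instance ID: {instance_id!r}")
    return f"aws:///{availability_zone}/{instance_id}"


def provider_id_patch(availability_zone: str, instance_id: str) -> str:
    """Return a JSON merge-patch body that sets spec.providerID on a Node."""
    provider_id = build_provider_id(availability_zone, instance_id)
    return json.dumps({"spec": {"providerID": provider_id}})


if __name__ == "__main__":
    # Hypothetical zone and instance ID, for illustration only.
    print(provider_id_patch("us-east-1a", "i-0123456789abcdef0"))
```

The printed body can be passed to `kubectl patch node <node-name> --type merge -p '<body>'`.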

NodeClass config:

spec:
  amiFamily: Custom
  amiSelectorTerms:
  - tags:
      <custom-name>-karpenter-version: current
  associatePublicIPAddress: false
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      deleteOnTermination: true
      encrypted: true
      kmsKeyID: <kms-key-id>
      volumeSize: 100Gi
      volumeType: gp3
  detailedMonitoring: true
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 3
    httpTokens: optional
  role: <iam-role>
  securityGroupSelectorTerms:
  - id: <sg>
  subnetSelectorTerms:
  - id: <subnet1>
  - id: <subnet2>
  - id: <subnet3>
  tags:
    <some-tag>
    <name>:rancher:managementApi: <api-url>
  userData: |
    #cloud-config
    yum_repos:
      artifactory:
        baseurl: <url>
        enabled: true
        gpgcheck: false
        name: "<name>"
    package_update: true
    package_upgrade: true
    write_files:
      - <file_config>
    runcmd:
      - |
        <Some script and initialization steps with CA and so on>

NodePool config:

spec:
  disruption:
    budgets:
    - nodes: "1"
    consolidateAfter: 5m
    consolidationPolicy: WhenEmpty
  limits:
    cpu: 512
  template:
    metadata:
      labels:
        <some-tag>
    spec:
      expireAfter: Never
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: prometheus
      requirements:
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - on-demand
      startupTaints:
      - effect: NoExecute
        key: node.cilium.io/agent-not-ready
        value: "true"
      taints:
      - effect: NoExecute
        key: <prefix>/dedicated
        value: prometheus

The cluster is managed with RKE2 and has cloud-provider=external set.
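
With an external cloud provider, the kubelet registers nodes with the `node.cloudprovider.kubernetes.io/uninitialized` taint, and the CCM removes it once it has initialized the node. A small sketch for checking both that taint and the providerID on a node object (for example, the parsed output of `kubectl get node <name> -o json`) could look like this. The function name and the returned keys are made up for illustration.

```python
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"


def cloud_init_status(node: dict) -> dict:
    """Summarize whether a Node object has been initialized by the CCM."""
    spec = node.get("spec", {})
    # "taints" may be absent or null when the node has no taints.
    taints = spec.get("taints") or []
    return {
        "name": node.get("metadata", {}).get("name", ""),
        "provider_id": spec.get("providerID", ""),
        "uninitialized": any(t.get("key") == UNINITIALIZED_TAINT for t in taints),
    }
```

Comparing the result for an ASG-created node with that of a Karpenter-created node shows at a glance whether the CCM has completed initialization on each.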

Is there a particular setting that needs to be configured on either side that I'm missing?

(cross-posting from kubernetes-sigs/karpenter#2281)

Thanks!

Labels: lifecycle/stale, needs-triage
