AWS EKS 1.21: Bound Service Account Token Volume breaks postgres-operator and pods run into read-only mode #1904

@kost2191

  • Which image of the operator are you using? e.g. registry.opensource.zalan.do/acid/postgres-operator:v1.6.1
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? AWS K8s 1.21
  • Are you running Postgres Operator in production? yes
  • Type of issue? question

After upgrading our AWS EKS cluster to 1.21, we ran into the stale service account token issue (https://docs.aws.amazon.com/eks/latest/userguide/service-accounts.html#identify-pods-using-stale-tokens). The Postgres Operator is configured to use the postgres-pod service account. 90 days after the EKS upgrade, Postgres pods that were 90 days old started logging this error:

2022-05-25 00:31:10,670 ERROR: Unexpected error from Kubernetes API
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 481, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 1012, in touch_member
    ret = self._api.patch_namespaced_pod(self._name, self._namespace, body)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 466, in wrapper
    return getattr(self._core_v1_api, func)(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 402, in wrapper
    return self._api_client.call_api(method, path, headers, body, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 371, in call_api
    return self._handle_server_response(response, _preload_content)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 201, in _handle_server_response
    raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (401)
Reason: Unauthorized

and this one:

2022-05-25 02:33:10,501 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:10,501 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:11,507 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:11,508 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:12.222 39 LOG {ticks: 0, maint: 0, retry: 0}
2022-05-25 02:33:12,513 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:12,514 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:13,524 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:13,525 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:14,532 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:14,532 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:15,547 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:15,547 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:16,564 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:16,565 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:17,572 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:17,572 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:18,380 ERROR: get_cluster
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 701, in _load_cluster
    self._wait_caches(stop_time)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 693, in _wait_caches
    raise RetryFailedError('Exceeded retry deadline')
patroni.utils.RetryFailedError: 'Exceeded retry deadline'
2022-05-25 02:33:18,380 ERROR: Error communicating with DCS
2022-05-25 02:33:18,381 INFO: DCS is not accessible
2022-05-25 02:33:18,382 WARNING: Loop time exceeded, rescheduling immediately.
2022-05-25 02:33:18,580 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:18,581 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:19,591 ERROR: ObjectCache.run ApiException()
2022-05-25 02:33:19,591 ERROR: ObjectCache.run ApiException()

Is there an option to set a refresh interval for the tokens?
We worked around it by deleting the pods one by one, but that is not a viable option in the long run.
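
To illustrate what I mean by "refresh": with bound service account tokens the kubelet rotates the token file mounted into the pod, so a client only keeps working if it re-reads that file instead of caching the token it loaded at startup. Below is a minimal sketch of that idea in Python. It is not Patroni's actual code; the `RefreshingToken` class, its `ttl` parameter, and the injectable `clock` are hypothetical names used only for illustration. The token path is the standard in-pod mount location.

```python
import time
from pathlib import Path

# Standard location of the projected service account token inside a pod.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


class RefreshingToken:
    """Hypothetical helper: re-read the token file at most every `ttl` seconds."""

    def __init__(self, path=TOKEN_PATH, ttl=60.0, clock=time.monotonic):
        self._path = Path(path)
        self._ttl = ttl
        self._clock = clock  # injectable so the behaviour can be tested
        self._token = None
        self._loaded_at = None

    def get(self):
        """Return the cached token, reloading it from disk once it is older than ttl."""
        now = self._clock()
        if self._token is None or now - self._loaded_at >= self._ttl:
            self._token = self._path.read_text().strip()
            self._loaded_at = now
        return self._token

    def auth_header(self):
        """Build the Authorization header for a Kubernetes API request."""
        return {"Authorization": "Bearer " + self.get()}
```

A client would call `auth_header()` on every API request rather than building the header once at startup. The `ttl` value here is an arbitrary assumption; it only needs to be well below the lifetime of the rotated token.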

Labels: patroni (issue more related to Patroni), spilo (issue more related to Spilo)
