
HPO 659: Mixed read/write with md5 checking halts all I/O on all MetalLB IPs and does not recover unless the NooBaa endpoint pods are restarted #6934

@MonicaLemay

Description


Environment info

  • NooBaa Version:
    [root@c83f1-app1 ~]# noobaa status
    INFO[0000] CLI version: 5.9.2
    INFO[0000] noobaa-image: noobaa/noobaa-core:nsfs_backport_5.9-20220331
    INFO[0000] operator-image: quay.io/rhceph-dev/odf4-mcg-rhel8-operator@sha256:01a31a47a43f01c333981056526317dfec70d1072dbd335c8386e0b3f63ef052
    INFO[0000] noobaa-db-image: quay.io/rhceph-dev/rhel8-postgresql-12@sha256:98990a28bec6aa05b70411ea5bd9c332939aea02d9d61eedf7422a32cfa0be54
  • Platform:
    [root@c83f1-app1 ~]# oc get csv
    NAME                  DISPLAY                       VERSION   REPLACES              PHASE
    mcg-operator.v4.9.5   NooBaa Operator               4.9.5     mcg-operator.v4.9.4   Succeeded
    ocs-operator.v4.9.5   OpenShift Container Storage   4.9.5     ocs-operator.v4.9.4   Succeeded
    odf-operator.v4.9.5   OpenShift Data Foundation     4.9.5     odf-operator.v4.9.4   Succeeded

Actual behavior

This is not the same as issue 6930.

In this issue, the node remained in the Ready state, so no IP failover is expected; this defect is not about MetalLB IPs failing to fail over. Here, I/O was running against MetalLB IP 172.20.100.31, which belongs to node master1. On node master0, inside the CNSA Spectrum Scale core pod (namespace ibm-spectrum-scale), mmshutdown was issued for just that node. The other nodes remained active with the filesystem mounted. master0 has MetalLB IP 172.20.100.30, and no I/O was going to that IP.

What was observed after mmshutdown on master0 was that all I/O going to 172.20.100.31 stopped. Because of issue 6930 there was no failover; that is fine and expected. What is not expected is for all I/O to stop.

sh-4.4# date; mmshutdown
Tue Apr  5 17:10:28 UTC 2022
Tue Apr  5 17:10:28 UTC 2022: mmshutdown: Starting force unmount of GPFS file systems
Tue Apr  5 17:10:34 UTC 2022: mmshutdown: Shutting down GPFS daemons
Shutting down!

When mmshutdown was issued, the only error from the NooBaa endpoint pods was "Stale file handle". The logs show it:

pod/noobaa-endpoint-7fdb5b75fd-t99nd/endpoint] Apr-5 17:12:26.930 [Endpoint/14] [ERROR] core.endpoint.s3.s3_rest:: S3 ERROR <?xml version="1.0" encoding="UTF-8"?><Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><Resource>/s5001b85</Resource><RequestId>l1mefzt6-3wj6yz-8x</RequestId></Error> PUT /s5001b85 {"host":"172.20.100.30","accept-encoding":"identity","user-agent":"aws-cli/2.3.2 Python/3.8.8 Linux/4.18.0-240.el8.x86_64 exe/x86_64.rhel.8 prompt/off command/s3.mb","x-amz-date":"20220405T171225Z","x-amz-content-sha256":"61d056dc66f1882c0f4053be381523a7a28d384abde04fcf5b0021c716bb0ea1","authorization":"AWS4-HMAC-SHA256 Credential=QzhyXj9wVDH9DvnK97L9/20220405/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=6d9fa5c22501bfed4f312ac47621b6cec691bf1cf8f719e8250fcdc0f61522f1","content-length":"154"} Error: Stale file handle
[pod/noobaa-endpoint-7fdb5b75fd-t99nd/endpoint] Apr-5 17:12:27.589 [Endpoint/14]    [L0] core.sdk.bucketspace_nb:: could not create underlying directory - nsfs, deleting bucket [Error: Stale file handle] { code: 'Unknown system error -116' }
[pod/noobaa-endpoint-7fdb5b75fd-t99nd/endpoint] Apr-5 17:12:27.793 [Endpoint/14] [ERROR] core.endpoint.s3.s3_rest:: S3 ERROR <?xml version="1.0" encoding="UTF-8"?><Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><Resource>/s5001b85</Resource><RequestId>l1meg1if-7apxvy-1bas</RequestId></Error> PUT /s5001b85 {"host":"172.20.100.30","accept-encoding":"identity","user-agent":"aws-cli/2.3.2 Python/3.8.8 Linux/4.18.0-240.el8.x86_64 exe/x86_64.rhel.8 prompt/off command/s3.mb","x-amz-date":"20220405T171227Z","x-amz-content-sha256":"61d056dc66f1882c0f4053be381523a7a28d384abde04fcf5b0021c716bb0ea1","authorization":"AWS4-HMAC-SHA256 Credential=QzhyXj9wVDH9DvnK97L9/20220405/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=bb34c5aa7ed665ae73a2d172ab33d4056c2611ad2c09331ece80089cff46df05","content-length":"154"} Error: Stale file handle

This error is a bit odd because it appears on the endpoint pod for master0, whose MetalLB IP is 172.20.100.30; the COSBench workload was only set up against 172.20.100.31.

An additional observation is that the s3 list command still works, but writes do not (an equivalent AWS CLI check is sketched after the transcript below).

[root@c83f1-dan4 RW_workloads]# date; s5001_2_31 ls
Tue Apr  5 18:33:40 EDT 2022
2022-04-05 18:33:43 s5001b100
2022-04-05 18:33:43 s5001b63
2022-04-05 18:33:43 s5001b62
2022-04-05 18:33:43 s5001b61


[root@c83f1-dan4 RW_workloads]# date;  s5001_2_31 cp alias_commands s3://s5001b1
Tue Apr  5 18:35:36 EDT 2022
^C^Cfatal error:
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "concurrent/futures/thread.py", line 40, in _python_exit
  File "threading.py", line 1011, in join
  File "threading.py", line 1027, in _wait_for_tstate_lock
KeyboardInterrupt
[root@c83f1-dan4 RW_workloads]# date
Tue Apr  5 18:36:43 EDT 2022
[root@c83f1-dan4 RW_workloads]#
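
For context, s5001_2_31 is a local shell alias. A minimal sketch of an equivalent check with the plain AWS CLI, assuming the alias simply wraps aws s3 with the account's credentials and --endpoint-url pointed at 172.20.100.31 (the profile name and URL are illustrative):

# listing still returns
aws --profile s5001 --endpoint-url http://172.20.100.31 s3 ls
# a write hangs until it times out (or is interrupted with Ctrl-C)
aws --profile s5001 --endpoint-url http://172.20.100.31 s3 cp alias_commands s3://s5001b1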

All subsequent PUTs to 172.20.100.31 and 172.20.100.32 time out (if I don't Ctrl-C), and the endpoint pods record "Error: Semaphore Timeout".
From .31 and .32 we can still do GETs and read from the NooBaa database. If we rsh into the endpoint pods for IPs 172.20.100.31 and 172.20.100.32, Spectrum Scale is still mounted in the correct place and we can write to it manually with touch. This tells us that IPs .31 and .32 are still alive, the NooBaa db is still online, and the Spectrum Scale filesystem is still mounted and writable. The timeouts on subsequent PUTs indicate that a connection is made but a response never comes back.
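
For reference, the manual in-pod check was along these lines (a sketch; the pod name is copied from the log above, the namespace is assumed to be openshift-storage, and the NSFS mount path is illustrative and will differ per cluster):

oc -n openshift-storage rsh pod/noobaa-endpoint-7fdb5b75fd-t99nd
mount | grep -i gpfs                          # Spectrum Scale still shows as mounted
touch /nsfs/remote-sample/probe_file          # a manual write still succeeds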

The endpoint pods never restarted and they still have their labels.

Also, in the Scale core pod we run mmhealth node show -N all and see that everything is HEALTHY, except of course the one node where we issued mmshutdown.

  sh-4.4# mmhealth node show

Node name:      master1-daemon
Node status:    HEALTHY
Status Change:  1 day ago

Component      Status        Status Change     Reasons & Notices
----------------------------------------------------------------
GPFS           HEALTHY       1 day ago         -
GUI            HEALTHY       1 day ago         -
NETWORK        HEALTHY       9 days ago        -
FILESYSTEM     HEALTHY       9 days ago        -
NOOBAA         HEALTHY       26 min. ago       -
PERFMON        HEALTHY       1 day ago         -
THRESHOLD      HEALTHY       9 days ago        -
sh-4.4# set -o vi
sh-4.4# mmhealth node show -N all

Node name:      master0-daemon
Node status:    FAILED
Status Change:  36 min. ago

Component      Status        Status Change     Reasons & Notices
--------------------------------------------------------------------------------
GPFS           FAILED        36 min. ago       gpfs_down, quorum_down
NETWORK        HEALTHY       1 day ago         -
FILESYSTEM     DEPEND        36 min. ago       unmounted_fs_check(remote-sample)
PERFMON        HEALTHY       1 day ago         -
THRESHOLD      HEALTHY       1 day ago         -

Node name:      master1-daemon
Node status:    HEALTHY
Status Change:  1 day ago

Component      Status        Status Change     Reasons & Notices
----------------------------------------------------------------
GPFS           HEALTHY       1 day ago         -
GUI            HEALTHY       1 day ago         -
NETWORK        HEALTHY       9 days ago        -
FILESYSTEM     HEALTHY       9 days ago        -
NOOBAA         HEALTHY       27 min. ago       -
PERFMON        HEALTHY       1 day ago         -
THRESHOLD      HEALTHY       9 days ago        -

Node name:      master2-daemon
Node status:    HEALTHY
Status Change:  1 day ago

Component       Status        Status Change     Reasons & Notices
-----------------------------------------------------------------
CALLHOME        HEALTHY       9 days ago        -
GPFS            HEALTHY       1 day ago         -
NETWORK         HEALTHY       9 days ago        -
FILESYSTEM      HEALTHY       9 days ago        -
GUI             HEALTHY       3 days ago        -
HEALTHCHECK     HEALTHY       9 days ago        -
PERFMON         HEALTHY       9 days ago        -
THRESHOLD       HEALTHY       9 days ago        -

Something is clearly hung in the PUT path, but the logs and noobaa health output don't point to anything.
Even after we issue mmstartup, the PUTs still fail. The only way to recover is to delete the NooBaa endpoint pods and have new ones generated.
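
For reference, the recovery amounts to recreating the endpoint pods; a minimal sketch, assuming the endpoints come from the standard noobaa-endpoint deployment in the openshift-storage namespace:

# delete the existing endpoint pods so the deployment recreates them
oc -n openshift-storage get pods -o name | grep noobaa-endpoint | xargs oc -n openshift-storage delete
# or, equivalently, roll the deployment and wait for the new pods
oc -n openshift-storage rollout restart deployment/noobaa-endpoint
oc -n openshift-storage get pods -w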

I have been able to recreate this very easily, so if required I can set this up on my test stand.

Expected behavior

1. Doing mmshutdown on one node should not impact cluster-wide I/O capability; it should not be an outage. If an outage is indeed expected, then mmstartup should restore I/O capability.

Steps to reproduce

  1. Start a COSBench run (I can provide the XML if needed). Once I/O is running, issue mmshutdown from within one CNSA IBM Spectrum Scale core pod (see the sketch below).
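
A rough sketch of the fault-injection step, assuming the CNSA core pods in the ibm-spectrum-scale namespace are named after their nodes (the pod name is illustrative; adjust to your cluster):

# with COSBench writing to the MetalLB IP of a different node (172.20.100.31), shut down GPFS on master0 only
oc -n ibm-spectrum-scale rsh master0
date; mmshutdown       # the other nodes keep the filesystem mounted
# later, mmstartup on the same node does not restore S3 PUTs; only recreating the endpoint pods does
mmstartup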

More information - Screenshots / Logs / Other output

Must-gather and noobaa diagnose output are in https://ibm.ent.box.com/folder/145794528783?s=uueh7fp424vxs2bt4ndrnvh7uusgu6tocd

This issue started as HPO https://github.ibm.com/IBMSpectrumScale/hpo-core/issues/659. Screeners determined that it was with NooBaa. I have also Slacked the CNSA team for input but have not heard back.
