Instance metadata will be lost after Nacos restart #11890

@nkorange

Describe the bug

Instance metadata will be lost after Nacos restart

Expected behavior

Instance metadata is not affected after Nacos restart

Actual behavior

Instance metadata is lost after Nacos restart

How to Reproduce
Steps to reproduce the behavior:

  1. Deploy 3 Nacos servers.
  2. Run a Nacos 2.x client that registers an instance with service name 'test.1'.
  3. Change the instance to offline on the Nacos console.
  4. Restart the Nacos server that the Nacos client is connected to.
  5. Wait for 5 minutes.
  6. Call the following command to force a refresh of the data:
     curl 'http://127.0.0.1:8848/nacos/v1/ns/instance/list?serviceName=test.1&udpPort=1111' -H 'User-Agent:Nacos-Java-Client:v2.0.0'
  7. The instance status is back to online.

Desktop (please complete the following information):

  • OS: macOS
  • Version: nacos-server 2.3.1, nacos-client 2.1.2
  • Module: naming
  • SDK: original

Additional context

Issue #10975 reported a similar problem. The fix for that issue did solve the metadata loss after client reconnection.

After a Nacos restart, however, the metadata loss still occurs.

The cause of this bug is that Nacos has an ExpiredClientCleaner that removes all expired clients.

Suppose the three Nacos servers are Nacos1, Nacos2 and Nacos3:

  1. The client connects to Nacos1 with client ID client_1.
  2. The client_1 connection data is synced to Nacos2 and Nacos3.
  3. The instance is set to offline on the Nacos console.
  4. Nacos1 is restarted.
  5. The client reconnects to Nacos2 with a new client ID, client_2.
  6. Nacos1 did not send a Client-Delete event to Nacos2 and Nacos3 before restarting, so the client_1 connection data is still present on both. Because client_1 sends no more heartbeats, Nacos2 and Nacos3 consider it expired, and after 3 minutes they trigger the clean task in ExpiredClientCleaner.
  7. ExpiredClientCleaner publishes a ClientDisconnectEvent.
  8. NamingMetadataManager receives the ClientDisconnectEvent and marks the instance metadata as expired.
  9. After 1 more minute, the instance metadata is deleted, so the offline status set in step 3 is lost.
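
To make the timing easier to follow, the sequence above can be sketched as a small toy model of one surviving server (for example Nacos2). This is only an illustration under stated assumptions, not Nacos source code: the class MetadataLossSketch and all of its fields and methods are hypothetical, and the only values taken from this report are the 3-minute client expiry and the 1-minute metadata expiry.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of one surviving server. Every name here is hypothetical;
// only the two timeouts come from the description in this issue.
public class MetadataLossSketch {
    static final long CLIENT_EXPIRE_MS = 3 * 60_000L; // stale client cleaned after 3 minutes
    static final long METADATA_EXPIRE_MS = 60_000L;   // expired metadata deleted 1 minute later

    final Map<String, Long> clientLastSeen = new HashMap<>();      // clientId -> last heartbeat
    final Map<String, String> clientService = new HashMap<>();     // clientId -> service name
    final Map<String, Boolean> enabledMetadata = new HashMap<>();  // service -> enabled flag from console
    final Map<String, Long> metadataExpiredAt = new HashMap<>();   // service -> time marked expired

    void connect(String clientId, String service, long now) {
        clientLastSeen.put(clientId, now);
        clientService.put(clientId, service);
    }

    void heartbeat(String clientId, long now) {
        clientLastSeen.put(clientId, now);
    }

    // Console operation: mark the instance offline via metadata.
    void setOffline(String service) {
        enabledMetadata.put(service, false);
    }

    // Without console metadata an instance defaults to online.
    boolean isOnline(String service) {
        return enabledMetadata.getOrDefault(service, true);
    }

    // One pass of the periodic cleaners.
    void tick(long now) {
        // Stand-in for ExpiredClientCleaner: drop clients with no recent heartbeat.
        Iterator<Map.Entry<String, Long>> it = clientLastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() >= CLIENT_EXPIRE_MS) {
                it.remove();
                // Stand-in for the ClientDisconnectEvent handler: the metadata is
                // marked expired even though another client still serves the instance.
                metadataExpiredAt.put(clientService.remove(e.getKey()), now);
            }
        }
        // Stand-in for the metadata cleanup: delete metadata marked expired long enough ago.
        metadataExpiredAt.entrySet().removeIf(e -> {
            if (now - e.getValue() >= METADATA_EXPIRE_MS) {
                enabledMetadata.remove(e.getKey());
                return true;
            }
            return false;
        });
    }

    // Replays steps 1 to 9 and reports whether the instance ended up online again.
    static boolean reproduce() {
        MetadataLossSketch node = new MetadataLossSketch();
        node.connect("client_1", "test.1", 0);     // synced connection data from Nacos1
        node.setOffline("test.1");                 // console sets the instance offline
        node.connect("client_2", "test.1", 1_000); // after the restart the client reconnects
        for (long t = 60_000; t <= 5 * 60_000; t += 60_000) {
            node.heartbeat("client_2", t);         // only client_2 keeps sending heartbeats
            node.tick(t);
        }
        return node.isOnline("test.1");
    }

    public static void main(String[] args) {
        System.out.println("online after restart: " + reproduce());
    }
}
```

In this model the stale client_1 entry expires at the 3-minute tick and the metadata is removed at the 4-minute tick, which matches the 5-minute wait in the reproduction steps. The sketch also suggests one possible direction for a fix: skip marking the metadata as expired when another live client still publishes the same instance. Whether that fits the real NamingMetadataManager is an open question for the maintainers.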


Labels: area/Naming, kind/bug, kind/discussion
