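Eureka synchronization failure caused by network problems
Symptoms
During testing, one node was lost while Nacos was syncing to Eureka. The Eureka Server log shows the following: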
2022-06-09 10:49:30.830 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
2022-06-09 10:49:40.842 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
2022-06-09 10:49:50.860 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
2022-06-09 10:50:00.877 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
2022-06-09 10:50:10.893 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
2022-06-09 10:50:20.904 WARN 41149 --- [cosSynchronizer] c.n.e.registry.AbstractInstanceRegistry : DS: Registry: lease doesn't exist, registering resource: NACOS-SERVICE-DEMO - nacos-service-demo:192.168.8.13:8889
Code trace
NacosSynchronizer
public void syncService() throws Exception {
    ListView<String> serviceList = namingService.getServicesOfServer(1, 1000);
    for (String service : serviceList.getData()) {
        List<Instance> instances = namingService.getAllInstances(service);
        for (Instance instance : instances) {
            if (!isFromEureka(instance)) {
                String instanceId = String.format("%s:%s:%s", service, instance.getIp(), instance.getPort());
                peerAwareInstanceRegistry.renew(service.toUpperCase(), instanceId, false);
            }
        }
        List<ServiceInfo> list = namingService.getSubscribeServices();
        Optional<ServiceInfo> optional = list.stream().filter(serviceInfo -> serviceInfo.getName().equals(service)).findFirst();
        if (!optional.isPresent()) {
            namingService.subscribe(service, listener);
        }
    }
}
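This code renews each instance's lease and subscribes to Nacos state changes so that they are synced to Eureka.
The call peerAwareInstanceRegistry.renew(service.toUpperCase(), instanceId, false); is handled as follows: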
public boolean renew(String appName, String id, boolean isReplication) {
    RENEW.increment(isReplication);
    Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
    Lease<InstanceInfo> leaseToRenew = null;
    if (gMap != null) {
        leaseToRenew = gMap.get(id);
    }
    if (leaseToRenew == null) {
        RENEW_NOT_FOUND.increment(isReplication);
        logger.warn("DS: Registry: lease doesn't exist, registering resource: {} - {}", appName, id);
        return false;
    } else {
        // do something (renewal logic elided in this excerpt)
    }
}
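When the node does not exist in Eureka (leaseToRenew == null), renew does nothing: it neither renews the lease nor registers the node, despite what the log message says.
If a network problem cuts the Eureka Server off from a Nacos node, Eureka removes that Nacos node. After the network recovers, the Eureka Server scheduled task pulls the nodes and calls renew, but because the node no longer exists nothing is renewed. The Nacos cluster state has not changed either, so NacosEventListener is never triggered to register the node.
As a result, the node is never synced to Eureka again, even after the network has recovered.
The temporary workaround is to restart the Nacos node so that the NacosEventListener event fires again.
Should syncService include a fallback, something like the following, or check whether the instance exists and then either renew or register?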
if (!isFromEureka(instance)) {
    String instanceId = String.format("%s:%s:%s", service, instance.getIp(), instance.getPort());
    boolean renewSuccess = peerAwareInstanceRegistry.renew(service.toUpperCase(), instanceId, false);
    if (!renewSuccess) {
        peerAwareInstanceRegistry.register(xxx);
    }
}
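To make the difference between the current behaviour and the proposed fallback concrete, here is a minimal, self-contained sketch. It does not use the real Eureka classes: FakeRegistry, renewOnly and renewOrRegister are hypothetical names introduced only for illustration, and the in-memory map is an assumption that stands in for the Eureka registry. In the real code, register takes an InstanceInfo, which would have to be built from the Nacos Instance; that part is left out here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RenewOrRegisterSketch {

    /** Hypothetical in-memory stand-in for the Eureka registry; not the real API. */
    static class FakeRegistry {
        // key: "APPNAME/instanceId", value: last renewal timestamp in milliseconds
        private final Map<String, Long> leases = new ConcurrentHashMap<>();

        private static String key(String appName, String id) {
            return appName + "/" + id;
        }

        /** Like renew() in the issue: returns false and changes nothing when no lease exists. */
        boolean renew(String appName, String id) {
            return leases.computeIfPresent(key(appName, id),
                    (k, v) -> System.currentTimeMillis()) != null;
        }

        void register(String appName, String id) {
            leases.put(key(appName, id), System.currentTimeMillis());
        }

        /** Simulates Eureka dropping the lease while the network is down. */
        void evict(String appName, String id) {
            leases.remove(key(appName, id));
        }

        boolean contains(String appName, String id) {
            return leases.containsKey(key(appName, id));
        }
    }

    /** Current behaviour: renew only, so an evicted instance never comes back. */
    static boolean renewOnly(FakeRegistry registry, String appName, String id) {
        return registry.renew(appName, id);
    }

    /** Proposed fallback: register when renew reports a missing lease. */
    static boolean renewOrRegister(FakeRegistry registry, String appName, String id) {
        if (!registry.renew(appName, id)) {
            registry.register(appName, id);
        }
        // true when the instance is present in the registry after the call
        return registry.contains(appName, id);
    }

    public static void main(String[] args) {
        String app = "NACOS-SERVICE-DEMO";
        String id = "nacos-service-demo:192.168.8.13:8889";

        FakeRegistry registry = new FakeRegistry();
        registry.register(app, id); // initial registration through the listener
        registry.evict(app, id);    // lease removed during the network outage

        System.out.println("renew only recovers: " + renewOnly(registry, app, id));
        System.out.println("renew or register recovers: " + renewOrRegister(registry, app, id));
    }
}
```

The sketch only models the control flow. Whether registering from the periodic sync is safe in a real deployment (for example with respect to replication between Eureka peers) is not covered by it.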