【原创】zookeeper 使用中的 TCP 链路残留问题

本文涉及的产品
服务治理 MSE Sentinel/OpenSergo,Agent数量 不受限
简介:
最近公司环境中报了一个 zookeeper 相关问题,问题描述如下:  
业务人员发现xx资源被重复上报了,导致资源池中的内容出了问题,问题的发现是在一次机房断电后,据相关人员说,上报资源的业务也重启(也可能是被强杀)过几次。
      最终看到的结果是,连接到 zookeeper 的 TCP 连接出现了多条,而正常情况下,业务和 zookeeper 之间的 TCP 连接只有一条,同时发现,多出的几条 TCP 连接似乎都是该业务每次重启(也可能是被强杀)后残留下来的。  
      那么问题就来了,为什么系统中仅仅跑了一个业务进程,而一个进程只会建立一条到 zookeeper 的 TCP 链路,但却同时存在多条?退一步讲,即使在一段时间内,存在了多条残留的 TCP 连接,但 zookeeper 也应该能够通过心跳机制,检测到相应连接已经断开了呀(遗憾的时,排查问题的相关人员,在发现问题后没有针对几条疑似残留的 TCP 连接进行抓包分析),为何在观察的几个小时内(观察时间超过了 2 个小时)毫无反应?  

该问题由于不是我主要负责,所以我也就基于自己了解到的信息查询了相关资料,排查过程如下:  

1.确认 zookeeper 心跳相关内容  

客户端侧代码如下
 

 

 

 
         对一个正常 zookeeper TCP 连接进行抓包,可以看到下图
 
 
      从上图中可以看出,在空闲状态下,每隔 3 秒左右,业务(客户端)会主动向 zookeeper 发送心跳请求(ping),zookeeper 也会进行相应的应答回复。  
         另一个发现,当根据包的大小进行排序后(例如长度为 74 的包对应了 ping),可以看到:所有的心跳请求都是由业务(客户端)发送给 zookeeper ,反方向却一个没有~~    
这个行为就有点意思了,假设 zookeeper 不会主动发送心跳请求给客户端,那么如果其没有实现一个连接超时检测机制(或者该机制存在 bug)的话,当客户端侧发生异常,则是由可能导致 TCP 连接残留的。  
      由于 zookeeper 服务端代码我没看过,咨询过相关维护人员,其告诉我 zookeeper 侧是存在这种超时检测机制的,所以理论上讲应该不存在上述问题~~  

查看了 zookeeper 的配置信息,似乎也没什么特别问题
[root@upu_2 ~]# ps aux|grep zook
root      7853  0.1  5.4 6213632 888956 ?      Sl   Mar10   3:59 /usr/java/jdk1.6.0_45/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,CONSOLE -cp /usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.5.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf: org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/local/zookeeper/bin/../conf/zoo.cfg
root     17915  0.0  0.0 103252   844 pts/1    S+   10:11   0:00 grep zook
[root@upu_2 ~]# 
[root@upu_2 ~]# 
[root@upu_2 ~]# cat /usr/local/zookeeper/bin/../conf/zoo.cfg
tickTime=2000
dataDir=/usr/local/zookeeper/data
initLimit=5
syncLimit=2
clientPort=2181
autopurge.snapRetainCount=6
autopurge.purgeInterval=12
server.2=10.23.46.35:2888:3888
[root@upu_2 ~]#

2.怀疑 zookeeper 在 ipv6 地址上进行监听存在缺陷

      后来有大牛在排查时注意到出现问题的 TCP 连接,均属于 IPv6 而并非常见的 IPv4 ,那么是否和这个有关呢?  
查看《   ZooKeeper Administrator's Guide   》,有如下说明:
Configuration Parameters
...
clientPort
    the port to listen for client connections; that is, the port that clients attempt to connect to.
...
clientPortAddress
    New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections; that is, the address that clients attempt to connect to. This is optional, by default we bind in such a way that any connection to the clientPort for any address/interface/nic on the server will be accepted.
另外,网上可以查到的文章都说  
  • 若不配置 clientPortAddress 的话,则 zookeeper 默认会监听 IPv6 地址(当然,前提是你的机器配置了 IPv6 相关内容);
  • 可以通过在 zookeeper 的启动脚本中配置 "-Djava.net.preferIPv4Stack=true" 以强制其优先使用 IPv4 接收来自客户端的连接(可能存在客户端通过 IPv4 地址连接不上 zookeeper 的 IPv6 监听上的情况);
根据我观察到的情况,结论如下:  
  • 在不配置 clientPortAddress 的情况下,服务器确实监听的是 IPv6 ;
  • 在不强制 zookeeper 优先使用 IPv4 的情况下,其依然可以接收来自 IPv4 的连接;
[root@upu_2 zookeeper]# lsof -i :2181 -P
COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java     7853 root   16u  IPv6    21294      0t0  TCP *:2181 (LISTEN)
...
java     7853 root   27u  IPv6 15051913      0t0  TCP 10.23.46.36:2181->10.23.46.36:26888 (ESTABLISHED)
css     29804 root  113u  IPv4 15051910      0t0  TCP 10.23.46.36:26888->10.23.46.36:2181 (ESTABLISHED)
既然 TCP 连接(即 IPv6 协议栈能够正确处理 IPv4 连接)没有问题,那么接下来的问题就是,出现问题的连接和其他的连接有和不同?  

实验如下
[root@upu_2 zookeeper]# lsof -i :2181 -P
COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java     7853 root   16u  IPv6    21294      0t0  TCP *:2181 (LISTEN)
java     7853 root   19u  IPv6 11514633      0t0  TCP 10.23.46.36:2181->10.23.46.39:26570 (ESTABLISHED)
java     7853 root   20u  IPv6 11514635      0t0  TCP 10.23.46.36:2181->10.23.46.42:37657 (ESTABLISHED)
java     7853 root   21u  IPv6 11515884      0t0  TCP 10.23.46.36:2181->10.23.46.43:33259 (ESTABLISHED)
java     7853 root   22u  IPv6 11515899      0t0  TCP 10.23.46.36:2181->10.23.46.39:26577 (ESTABLISHED)
java     7853 root   23u  IPv6 11516014      0t0  TCP 10.23.46.36:2181->10.23.46.39:26581 (ESTABLISHED)
java     7853 root   24u  IPv6 11516017      0t0  TCP 10.23.46.36:2181->10.23.46.39:26584 (ESTABLISHED)
java     7853 root   25u  IPv6 11516068      0t0  TCP 10.23.46.36:2181->10.23.46.39:26586 (ESTABLISHED)
java     7853 root   26u  IPv6 11516069      0t0  TCP 10.23.46.36:2181->10.23.46.39:26587 (ESTABLISHED)
java     7853 root   27u  IPv6 15051913      0t0  TCP 10.23.46.36:2181->10.23.46.36:26888 (ESTABLISHED)
css     29804 root  113u  IPv4 15051910      0t0  TCP 10.23.46.36:26888->10.23.46.36:2181 (ESTABLISHED)
[root@upu_2 zookeeper]# 


[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26570 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:41:40.342746 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [P.], seq 2527560660:2527560664, ack 198355173, win 269, options [nop,nop,TS val 170901466 ecr 170833048], length 4
14:41:40.342764 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 269, options [nop,nop,TS val 170901466 ecr 170833048], length 8
14:41:40.343032 IP 10.23.46.36.eforward > 10.23.46.39.26570: Flags [.], ack 12, win 114, options [nop,nop,TS val 170836385 ecr 170901466], length 0
14:41:40.343332 IP 10.23.46.36.eforward > 10.23.46.39.26570: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170836385 ecr 170901466], length 20
14:41:40.343418 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [.], ack 21, win 269, options [nop,nop,TS val 170901467 ecr 170836385], length 0
^C
5 packets captured
11 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 37657 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:41:50.340486 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [P.], seq 4258153112:4258153116, ack 4088319963, win 18, length 4
14:41:50.340501 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 18, length 8
14:41:50.340625 IP 10.23.46.36.eforward > 10.23.46.42.37657: Flags [.], ack 12, win 537, length 0
14:41:50.340798 IP 10.23.46.36.eforward > 10.23.46.42.37657: Flags [P.], seq 1:21, ack 12, win 537, length 20
14:41:50.340927 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [.], ack 21, win 18, length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 33259 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:00.802239 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [P.], seq 3634190124:3634190128, ack 3831351430, win 18, length 4
14:42:00.802258 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 18, length 8
14:42:00.802377 IP 10.23.46.36.eforward > 10.23.46.43.33259: Flags [.], ack 12, win 501, length 0
14:42:00.802562 IP 10.23.46.36.eforward > 10.23.46.43.33259: Flags [P.], seq 1:21, ack 12, win 501, length 20
14:42:00.802691 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [.], ack 21, win 18, length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26577 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:08.255987 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [P.], seq 245299443:245299447, ack 584627376, win 58, options [nop,nop,TS val 170929380 ecr 170860963], length 4
14:42:08.256011 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 58, options [nop,nop,TS val 170929380 ecr 170860963], length 8
14:42:08.256176 IP 10.23.46.36.eforward > 10.23.46.39.26577: Flags [.], ack 12, win 1539, options [nop,nop,TS val 170864298 ecr 170929380], length 0
14:42:08.256503 IP 10.23.46.36.eforward > 10.23.46.39.26577: Flags [P.], seq 1:21, ack 12, win 1539, options [nop,nop,TS val 170864298 ecr 170929380], length 20
14:42:08.256640 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [.], ack 21, win 58, options [nop,nop,TS val 170929380 ecr 170864298], length 0

^C
5 packets captured
6 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26581 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:17.330446 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [P.], seq 311282941:311282945, ack 758925432, win 58, options [nop,nop,TS val 170938454 ecr 170870040], length 4
14:42:17.330466 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 58, options [nop,nop,TS val 170938454 ecr 170870040], length 8
14:42:17.330588 IP 10.23.46.36.eforward > 10.23.46.39.26581: Flags [.], ack 12, win 114, options [nop,nop,TS val 170873373 ecr 170938454], length 0
14:42:17.330816 IP 10.23.46.36.eforward > 10.23.46.39.26581: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170873373 ecr 170938454], length 20
14:42:17.330976 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [.], ack 21, win 58, options [nop,nop,TS val 170938455 ecr 170873373], length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26584 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:25.784667 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [P.], seq 2592715739:2592715743, ack 2440448357, win 58, options [nop,nop,TS val 170946908 ecr 170880076], length 4
14:42:25.784688 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [P.], seq 4:109, ack 1, win 58, options [nop,nop,TS val 170946908 ecr 170880076], length 105
14:42:25.784825 IP 10.23.46.36.eforward > 10.23.46.39.26584: Flags [.], ack 109, win 114, options [nop,nop,TS val 170881827 ecr 170946908], length 0
14:42:25.789914 IP 10.23.46.36.eforward > 10.23.46.39.26584: Flags [P.], seq 1:89, ack 109, win 114, options [nop,nop,TS val 170881832 ecr 170946908], length 88
14:42:25.790030 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [.], ack 89, win 58, options [nop,nop,TS val 170946914 ecr 170881832], length 0

^C
5 packets captured
12 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26586 -s 0          
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:37.566211 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [P.], seq 1062233276:1062233280, ack 4128653018, win 769, options [nop,nop,TS val 170958690 ecr 170890272], length 4
14:42:37.566230 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 769, options [nop,nop,TS val 170958690 ecr 170890272], length 8
14:42:37.566348 IP 10.23.46.36.eforward > 10.23.46.39.26586: Flags [.], ack 12, win 114, options [nop,nop,TS val 170893608 ecr 170958690], length 0
14:42:37.566500 IP 10.23.46.36.eforward > 10.23.46.39.26586: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170893608 ecr 170958690], length 20
14:42:37.566641 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [.], ack 21, win 769, options [nop,nop,TS val 170958690 ecr 170893608], length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26587 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:45.986057 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [P.], seq 335958315:335958381, ack 2473985165, win 114, options [nop,nop,TS val 170902028 ecr 170965365], length 66
14:42:45.986197 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [.], ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 0
14:42:45.986330 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [P.], seq 1:5, ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 4
14:42:45.986346 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [P.], seq 5:52, ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 47
14:42:45.986453 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [.], ack 52, win 114, options [nop,nop,TS val 170902028 ecr 170967110], length 0
14:42:45.986671 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [P.], seq 66:209, ack 52, win 114, options [nop,nop,TS val 170902029 ecr 170967110], length 143
14:42:46.026053 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [.], ack 209, win 771, options [nop,nop,TS val 170967150 ecr 170902029], length 0

^C
7 packets captured
8 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26888 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes

(过了很久也没抓到任何东西,后来发现原来是抓错了网口,应该抓 lo 才对,ORZ...)

^C
0 packets captured
1 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]#
总之,从实验中可以看到,针对所有的 zookeeper 连接进行抓包,所有连接上都抓到了心跳交互。  

3.怀疑业务与 zookeeper 通过虚地址建立的 TCP 在异常断电时有问题  

从连接信息上看,本地业务通过虚地址连接到了 zookeeper 上。
 
难道这个情况和虚地址有关系?这个问题目前无法确认,因为测试人员在自己的环境中进行模拟测试时,无法复现该问题。  
相关实践学习
基于MSE实现微服务的全链路灰度
通过本场景的实验操作,您将了解并实现在线业务的微服务全链路灰度能力。
目录
相关文章
|
15天前
|
网络协议 Linux 网络架构
TCP中的两类故障模式
【4月更文挑战第10天】故障分为两大类,一类是对端无 FIN 包,需要通过巡检或超时来发现;另一类是对端有 FIN 包发出,需要通过增强 read 或 write 操作的异常处理,帮助我们发现此类异常。
|
6月前
|
canal 消息中间件 监控
使用TCP方式拉取Canal数据
使用TCP方式拉取Canal数据
62 0
|
8月前
|
存储 网络协议 Linux
深入理解Linux网络——TCP连接建立过程(三次握手源码详解)-1
一、相关实际问题 为什么服务端程序都需要先listen一下 半连接队列和全连接队列长度如何确定 “Cannot assign requested address”这个报错是怎么回事
|
8月前
|
缓存 网络协议 NoSQL
深入理解Linux网络——TCP连接建立过程(三次握手源码详解)-3
五、异常TCP建立情况 1)connect系统调用耗时失控 客户端在发起connect系统调用的的时候,主要工作就是端口选择。在选择的过程中有一个大循环
|
8月前
|
存储 缓存 网络协议
深入理解Linux网络——TCP连接建立过程(三次握手源码详解)-2
三、深入理解connect 客户端再发起连接的时候,创建一个socket,如何瞄准服务端调用connect就可以了,代码可以简单到只有两句。
深入理解Linux网络——TCP连接建立过程(三次握手源码详解)-2
|
自然语言处理 网络协议 Java
【网络篇】第十一篇——简单的TCP英译汉服务器
【网络篇】第十一篇——简单的TCP英译汉服务器
【网络篇】第十一篇——简单的TCP英译汉服务器
|
缓存 网络协议 Java
在项目中使用Curator的Java 客户端搭建后进行长TCP连接和TCP权限配置【Zookeeper】
在项目中使用Curator的Java 客户端搭建后进行长TCP连接和TCP权限配置【Zookeeper】
257 0
在项目中使用Curator的Java 客户端搭建后进行长TCP连接和TCP权限配置【Zookeeper】
|
网络协议 中间件 Java
解Bug之路-记一次对端机器宕机后的tcp行为
解Bug之路-记一次对端kernel宕机后的tcp行为 前言 机器一般过质保之后,就会因为各种各样的问题而宕机。而这一次的宕机,让笔者观察到了平常观察不到的tcp在对端宕机情况下的行为。经过详细跟踪分析原因之后,发现可以通过调整内核tcp参数来减少宕机造成的影响。
2216 0