记一次DNS问题排查
阅读原文时间:2023年09月02日阅读:1

排查过程

抓包分析:tcpdump -i eth0 -n -s 500 port domain

1 14:40:44.548553 IP 10.13.21.38.29551 > 10.13.255.1.domain: 18621+ A? flow.nzkong.com. (33)
2 14:40:44.549297 IP 10.13.255.1.domain > 10.13.21.38.29551: 18621| 0/0/0 (33)
3 14:40:44.549395 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [S], seq 472124968, win 64952, options [mss 1412,sackOK,TS val 217687079 ecr 0,nop,wscale 8], length 0
4 14:40:44.549913 IP 10.13.255.1.domain > 10.13.21.38.51700: Flags [S.], seq 4192295067, ack 472124969, win 24160, options [mss 1220,sackOK,TS val 2624377172 ecr 217687079,nop,wscale 9], length 0
5 14:40:44.549949 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [.], ack 1, win 254, options [nop,nop,TS val 217687079 ecr 2624377172], length 0
6 14:40:44.550135 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [P.], seq 1:71, ack 1, win 254, options [nop,nop,TS val 217687080 ecr 2624377172], length 7018621+ A? flow.nzkong.com. (68)
7 14:40:44.550260 IP 10.13.255.1.domain > 10.13.21.38.51700: Flags [.], ack 71, win 48, options [nop,nop,TS val 2624377173 ecr 217687080], length 0
8 14:40:44.558209 IP 10.13.255.1.domain > 10.13.21.38.51700: Flags [P.], seq 1:52, ack 71, win 48, options [nop,nop,TS val 2624377180 ecr 217687080], length 5118621 1/0/0 A 106.75.178.212 (49)
9 14:40:44.558224 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [.], ack 52, win 254, options [nop,nop,TS val 217687088 ecr 2624377180], length 0
10 14:40:44.791651 IP 10.13.255.1.domain > 10.13.21.38.51700: Flags [P.], seq 52:161, ack 71, win 48, options [nop,nop,TS val 2624377414 ecr 217687088], length 10939640 0/1/0 (107)
11 14:40:44.791698 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [.], ack 161, win 254, options [nop,nop,TS val 217687321 ecr 2624377414], length 0
12 14:40:44.792061 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [F.], seq 71, ack 161, win 254, options [nop,nop,TS val 217687321 ecr 2624377414], length 0
13 14:40:44.792766 IP 10.13.255.1.domain > 10.13.21.38.51700: Flags [F.], seq 161, ack 72, win 48, options [nop,nop,TS val 2624377415 ecr 217687321], length 0
14 14:40:44.792774 IP 10.13.21.38.51700 > 10.13.255.1.domain: Flags [.], ack 162, win 254, options [nop,nop,TS val 217687322 ecr 2624377415], length 0
15
16
17
18 14:41:10.737255 IP 10.13.21.38.21895 > 10.13.255.1.domain: 35380+ A? flow.nzkong.com. (33)
19 14:41:10.737760 IP 10.13.255.1.domain > 10.13.21.38.21895: 35380 1/0/0 A 106.75.178.212 (49)
20 14:41:10.737848 IP 10.13.21.38.54115 > 10.13.255.1.domain: 19011+ AAAA? flow.nzkong.com. (33)
21 14:41:10.738131 IP 10.13.255.1.domain > 10.13.21.38.54115: 19011| 0/0/0 (33)
22 14:41:10.738300 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [S], seq 4213947779, win 64952, options [mss 1412,sackOK,TS val 217713268 ecr 0,nop,wscale 8], length 0
23 14:41:10.738668 IP 10.13.255.1.domain > 10.13.21.38.51704: Flags [S.], seq 1145570450, ack 4213947780, win 24160, options [mss 1220,sackOK,TS val 2624403361 ecr 217713268,nop,wscale 9], length 0
24 14:41:10.738693 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [.], ack 1, win 254, options [nop,nop,TS val 217713268 ecr 2624403361], length 0
25 14:41:10.738855 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [P.], seq 1:71, ack 1, win 254, options [nop,nop,TS val 217713268 ecr 2624403361], length 7035380+ A? flow.nzkong.com. (68)
26 14:41:10.738980 IP 10.13.255.1.domain > 10.13.21.38.51704: Flags [.], ack 71, win 48, options [nop,nop,TS val 2624403361 ecr 217713268], length 0
27 14:41:10.739073 IP 10.13.255.1.domain > 10.13.21.38.51704: Flags [P.], seq 1:52, ack 71, win 48, options [nop,nop,TS val 2624403361 ecr 217713268], length 5135380 1/0/0 A 106.75.178.212 (49)
28 14:41:10.739081 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [.], ack 52, win 254, options [nop,nop,TS val 217713269 ecr 2624403361], length 0
29 14:41:10.747067 IP 10.13.255.1.domain > 10.13.21.38.51704: Flags [P.], seq 52:161, ack 71, win 48, options [nop,nop,TS val 2624403369 ecr 217713269], length 10919011 0/1/0 (107)
30 14:41:10.747076 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [.], ack 161, win 254, options [nop,nop,TS val 217713277 ecr 2624403369], length 0
31 14:41:10.747175 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [F.], seq 71, ack 161, win 254, options [nop,nop,TS val 217713277 ecr 2624403369], length 0
32 14:41:10.747387 IP 10.13.255.1.domain > 10.13.21.38.51704: Flags [F.], seq 161, ack 72, win 48, options [nop,nop,TS val 2624403370 ecr 217713277], length 0
33 14:41:10.747394 IP 10.13.21.38.51704 > 10.13.255.1.domain: Flags [.], ack 162, win 254, options [nop,nop,TS val 217713277 ecr 2624403370], length 0
34
35
36
37 14:41:20.125630 IP 10.13.21.38.32649 > 10.13.255.1.domain: 4265+ A? flow.nzkong.com. (33)
38 14:41:20.125851 IP 10.13.255.1.domain > 10.13.21.38.32649: 4265 1/0/0 A 106.75.178.212 (49)
39 14:41:20.125935 IP 10.13.21.38.28137 > 10.13.255.1.domain: 63155+ AAAA? flow.nzkong.com. (33)

  1.抓包分析发现其会解析该域名的AAAA记录,然后卡住
  2.判断是因为没有ipv6的地址导致的原因
  3.考虑添加失败缓存

原因

  • 该域名没有ipv6的地址,但是dns先会查询ipv6,如果没有的话再查ipv4,
  • ipv6查询不到的话会有个超时以及重试时长,然后这个时长在我们的机器上默认差不多是2到3秒,总的来说就是缓存的问题,失败缓存时长设置小了

解决方案    

  设置bind参数:min-ncache-ttl 600  # 设置否定答案的缓存时长

排查过程

抓包分析:tcpdump -i eth0 -n -s 500 port domain

1 18:19:59.737280 IP 10.35.170.87.38128 > 10.35.255.200.domain: 15552+ A? rec.vnugc.com. (31)
2 18:19:59.737296 IP 10.35.170.87.38128 > 10.35.255.200.domain: 24775+ AAAA? rec.vnugc.com. (31)
3 18:19:59.738138 IP 10.35.255.200.domain > 10.35.170.87.38128: 24775 0/1/0 (107)
4 18:19:59.755957 IP 10.35.255.200.domain > 10.35.170.87.38128: 15552 7/0/0 CNAME cloudbase-sg.vnugc.com., A 118.194.235.163, A 118.194.233.238, A 118.194.235.140, A 118.194.233.192, A 118.194.234.167, A 118.194.235.150 (154)
5 18:19:59.758176 IP 10.35.170.87.42698 > 10.35.255.200.domain: 21552+ PTR? 163.235.194.118.in-addr.arpa. (46)
6 18:20:01.010025 IP 10.35.255.200.domain > 10.35.170.87.42698: 21552 ServFail 0/0/0 (46)
7 18:20:01.010324 IP 10.35.170.87.58235 > 10.35.255.200.domain: 21552+ PTR? 163.235.194.118.in-addr.arpa. (46)
8 18:20:01.010665 IP 10.35.255.200.domain > 10.35.170.87.58235: 21552 ServFail 0/0/0 (46)
9 18:20:08.268323 IP 10.35.170.87.41102 > 10.35.255.200.domain: 28888+ A? rec.vnugc.com. (31)
10 18:20:08.268339 IP 10.35.170.87.41102 > 10.35.255.200.domain: 64223+ AAAA? rec.vnugc.com. (31)
11 18:20:08.268660 IP 10.35.255.200.domain > 10.35.170.87.41102: 28888 7/0/0 CNAME cloudbase-sg.vnugc.com., A 118.194.234.167, A 118.194.233.192, A 118.194.235.150, A 118.194.235.163, A 118.194.233.238, A 118.194.235.140 (154)
12 18:20:08.268703 IP 10.35.255.200.domain > 10.35.170.87.41102: 64223 1/1/0 CNAME cloudbase-sg.vnugc.com. (134)
13 18:20:08.270051 IP 10.35.170.87.39988 > 10.35.255.200.domain: 5211+ PTR? 167.234.194.118.in-addr.arpa. (46)
14 18:20:09.220012 IP 10.35.255.200.domain > 10.35.170.87.39988: 5211 ServFail 0/0/0 (46)
15 18:20:09.220288 IP 10.35.170.87.47297 > 10.35.255.200.domain: 5211+ PTR? 167.234.194.118.in-addr.arpa. (46)
16 18:20:09.220560 IP 10.35.255.200.domain > 10.35.170.87.47297: 5211 ServFail 0/0/0 (46)

  1.抓包分析发现DNS会进行反解
  2.发现该反解地址如果不存在,会导致servfail
  3.考虑设置servfail的缓存时长

原因

  • ping的时候会进行dns解析,然后dns会对该地址进行反解,而该地址没有PTR记录,导致servFail,超时时间有点长,需要servFail的缓存

解决方案

  设置bind参数:servfail-ttl 30 # 设置servfail的缓存时长

手机扫一扫

移动阅读更方便

阿里云服务器
腾讯云服务器
七牛云服务器

你可能感兴趣的文章