Missing DHCP Replies
After I upgraded my WRT3200ACM router to OpenWrt v23.05.3, my laptop took minutes to get an IPv4 address when connected to my WRT3200ACM router’s WiFi. It only takes seconds to get an IPv4 address when connected to my other WiFi.
I ended up downgrading OpenWrt on the WRT3200ACM, but first I did a lot of work trying to track down the problem.
Here’s what a portion of my network looks like:
I have the kea DHCP server running on the Qotom server. I have dhcpcd running on the laptop.
The WRT3200ACM runs OpenWrt, and has a bridged LAN,
which means the radios and RJ-45 ethernet ports are one broadcast segment.
If I cabled the Qotom server to one of the LAN RJ-45 ports,
kea
can reply to DCHP requests made by my laptop
after it’s authenticated to the radio(s).
Running tcpdump -i wlp1s0
on my laptop showed that dhcpcd
made
numerous DHCP requests,
but didn’t get any answers for quite a while.
Below, tcpdump
output.
My laptop’s MAC address is e4:a4:71:30:21:86.
kea-dhcp4
is running on 172.24.0.1.
1 2024-04-17 15:34:26.350086 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
2 2024-04-17 15:34:26.525390 IP6 :: > ff02::1:ff6d:957b: ICMP6, neighbor solicitation, who has fe80::9402:3c1e:566d:957b, length 32
3 2024-04-17 15:34:26.989267 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
4 2024-04-17 15:34:27.415887 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
5 2024-04-17 15:34:27.565167 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
6 2024-04-17 15:34:27.565887 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
7 2024-04-17 15:34:27.917358 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
8 2024-04-17 15:34:30.573376 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
9 2024-04-17 15:34:31.567336 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
10 2024-04-17 15:34:32.440654 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
11 2024-04-17 15:34:35.570931 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
12 2024-04-17 15:34:35.974216 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
13 2024-04-17 15:34:39.574937 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
14 2024-04-17 15:34:43.917214 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
15 2024-04-17 15:34:43.937481 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
16 2024-04-17 15:34:43.938082 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
17 2024-04-17 15:34:43.946884 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
18 2024-04-17 15:34:43.947433 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
19 2024-04-17 15:34:45.206090 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
20 2024-04-17 15:34:47.072357 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
21 2024-04-17 15:34:49.077026 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 28
22 2024-04-17 15:34:51.079467 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 28
Lines 4, 8, 10, 12 and 14 show my laptop sending out DHCPREQUEST messages,
asking someone, anyone, for IPv4 address 172.24.0.7,
which is apparently what that interface had before.
dhcpcd
(on my laptop) would have sent a DHCPDISCOVER message otherwise.
The format above doesn’t show all the details of the packets.
The important aspect to notice is that dhcpcd
on the laptop
makes 5 DHCPREQUESTs to get a single DHCPOFFER.
appearing on line 15, after the fifth DHCPREQUEST.
What the hell is going on?
That’s 16 seconds after the first DHCPREQUEST.
kea
not responding was my first guess.
The corresponding dhcpcd
log from my laptop:
1 Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: carrier acquired
2 Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: connected to Access Point: bongrips4jesus42069
3 Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: IAID 71:30:21:86
4 Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: adding address fe80::9402:3c1e:566d:957b
5 Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: soliciting an IPv6 router
6 Apr 17 15:34:27 hazard dhcpcd[321]: wlp1s0: rebinding lease of 172.24.0.7
7 Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: DHCP lease expired
8 Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: DHCP lease expired
9 Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: soliciting a DHCP lease
10 Apr 17 15:34:39 hazard dhcpcd[321]: wlp1s0: no IPv6 Routers available
11 Apr 17 15:34:43 hazard dhcpcd[321]: wlp1s0: offered 172.24.0.7 from 172.24.0.1
12 Apr 17 15:34:43 hazard dhcpcd[321]: wlp1s0: probing address 172.24.0.7/16
13 Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: leased 172.24.0.7 for 3600 seconds
14 Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: adding route to 172.24.0.0/16
15 Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: adding default route via 172.24.0.1
Line 6 of the dhcpcd
log chronologically corresponds to line 4 of
the tcpdump
decoding above.
Line 11 of the dhcpcd
log chronologically corresponds to line 17 of
the tcpdump
decoding.
This confirms that dhcpcd
doesn’t see any
DHCPOFFER, DHCPACK, DHCPNAK or any other reply from kea
.
Conveniently, I’m not documenting my first test.
I had tcpdump -i enp4s0 -w some.file.pcap
running on the Qotom server.
Here’s the decoded PCAP:
1 2024-04-17 15:34:26.361338 e4:a4:71:30:21:86 > ff:ff:ff:ff:ff:ff Null Unnumbered, xid, Flags [Response], length 6: 01 00
2 2024-04-17 15:34:26.525843 IP6 :: > ff02::1:ff6d:957b: ICMP6, neighbor solicitation, who has fe80::9402:3c1e:566d:957b, length 32
3 2024-04-17 15:34:26.989719 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
4 2024-04-17 15:34:27.416662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
5 2024-04-17 15:34:27.419969 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
6 2024-04-17 15:34:27.566643 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
7 2024-04-17 15:34:27.567030 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
8 2024-04-17 15:34:27.917795 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
9 2024-04-17 15:34:30.574167 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
10 2024-04-17 15:34:30.577152 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
11 2024-04-17 15:34:31.567694 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
12 2024-04-17 15:34:32.443312 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
13 2024-04-17 15:34:32.446095 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
14 2024-04-17 15:34:35.571498 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
15 2024-04-17 15:34:35.974960 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
16 2024-04-17 15:34:35.978001 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
17 2024-04-17 15:34:39.576770 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
18 2024-04-17 15:34:43.919164 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
19 2024-04-17 15:34:43.922934 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
20 2024-04-17 15:34:43.938673 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
21 2024-04-17 15:34:43.944427 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
22 2024-04-17 15:34:43.947755 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
23 2024-04-17 15:34:45.207743 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
24 2024-04-17 15:34:47.074012 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
25 2024-04-17 15:34:49.078737 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 46
26 2024-04-17 15:34:51.081142 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 46
Line 4 in the server’s tcpdump
chronologically matches
line 4 in the laptop’s tcpdump
.
Laptop records 2024-04-17 15:34:27.415887 as the timestamp
for the DHCPREQUEST, which the server records 2024-04-17 15:34:27.416662,
0.000775 seconds later.
NTP for the win.
Lines 4, 9, 12, 15, 18 match the five DHCPREQUESTs appearing in
the laptop’s tcpdump
, lines 4, 8 , 10, 12 and 14.
Lines 5, 10, 13, 16 and 19 show DHCPACK messages from kea
on the Qotom server to 172.24.0.7, with the laptop’s MAC address.
That’s the goofy, ugly part of DHCP:
a UDP packet get sent to an IPv4 address and a MAC address
where the computer with that MAC address doesn’t have the IPv4 address yet.
I don’t have a PCAP from running tcpdump
on the OpenWrt 23.05.3 br-lan
bridged network,
but you’ll have to believe me that the DHCPACK messages
from kea
appear on the bridged network.
Other Things I tried
The same sort of “missing DHCPACK” problem showed up
when I used other USB WiFi dongles,
and when I used dhclient
instead of dhcpcd
on the laptop.
I put kea on my production server
to see if kea
misbehaved in some subtle way.
I downgraded my current laptop to
linux-zen-5.13.5.zen1-1
linux-zen-headers-5.13.5.zen1-1
linux-firmware-20210716.b7c134f-1
because an old laptop from my scrap heap with those package versions subjectively seemed to get an address faster. No luck, still took ages to get an address most of the time.
The dhtest
program would exhibit the same problem between when wpa_supplicant
connected to the WRT3200ACM’s radio,
and when dhcpcd
acquired an address.
That is, it would send repeated DHCPDISCOVER packets and not get any kind of reply.
I should have turned off dhcpcd
on my laptop, then run dhtest
.
It sends a DHCPDISCOVER every 5 seconds until it gets an answer.
Annoying aspects of the problem
- It didn’t occur every time I switched my laptop to the WRT3200ACM’s network
- It didn’t occur if I used a CAT-5 cable to connect my laptop to one of the WRT3200ACM’s LAN ports
- It took various lengths of time to get a DHCPACK from
kea
, from 0 to 389 seconds, with a mean of 71.3 seconds and a median of 25 seconds. The first quartile is 7 seconds, which means at least 25% of connections take a normal amount of time. - I couldn’t definitively tell if other devices had the problem. This is a consequence of (1) and things like not having a DHCP log file for my Google Pixel 6a phone.
Others with the same problem
Found a reddit post where someone complains about packet loss and lower speed with OpenWrt versions greater than 19.07 on a WRT3200ACM router. The reddit post even says that cabled connections work perfectly.
A fair amount of packet loss could explain the missing DHCPACK/DHCPOFFER messages.
DHCP is conducted using UDP, there’s no attempt at reliable delivery.
The only impediment to adopting this theory is IPv6 and ARP packets.
There’s an exact correspondence between IPv6 packets in the two PCAPs,
and the ARP packets broadcast by dhcpcd
when it finally receives a DHCPACK.
Only the DHCPACK packets are missing in the laptop’s PCAP.