Bug 16

Summary: Receiving UDPv6 message is flaky
Product: passt
Reporter: Alona Paz <alkaplan>
Component: UDP
Assignee: Stefano Brivio <sbrivio>
Status: RESOLVED FIXED    
Severity: normal    
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   

Description Alona Paz 2022-07-06 12:58:28 UTC
A machine has to send at least one UDPv6 message before it can receive UDPv6 messages.

Steps to reproduce:
1. Start two VMs with a passt interface.
2. Run UDP server on one of the VMs (`nc -klp 3000 -u`)
3. Run UDP client on the second VM (`nc server_ipv6_addr 3000 -u`)
4. Try to send text client -> server, server -> client.

Desired result:
The text should be received by the server/client.

Actual result:
The text is not received.

Workaround:
Send some IPv6 traffic from the server so that the passt process learns its address. For example, ping the client's IPv6 address from the server.
Comment 1 Stefano Brivio 2022-07-06 17:00:56 UTC
Additional information from checking this on the environment: the instance of passt associated with the server VM does this:

recvmmsg(6, [{msg_hdr={msg_name={sa_family=AF_INET6, sin6_port=htons(52906), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fd10:244::c48d", &sin6_addr), sin6_scope_id=0}, msg_namelen=28, msg_iov=[{iov_base="Hello Server", iov_len=65487}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=12}], 32, 0, NULL) = 1
sendmmsg(73, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\0\0JRT\0\0224V\346\23\6b\276\305\206\335`\0\0\0\0\24\21\377\375\20\2D\0\0\0\0\0\0\0\0\0\0\304\215\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\316\252\v\270\0\24\22AHello Server", iov_len=78}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=78}], 1, MSG_DONTWAIT|MSG_NOSIGNAL) = 1

and the server VM sees a message like this from the client:

21:17:45.417060 IP6 (hlim 255, next-header UDP (17) payload length: 20) fd10:244::c48d.36276 > ::.3000: [udp sum ok] UDP, length 12
	0x0000:  6000 0000 0014 11ff fd10 0244 0000 0000  `..........D....
	0x0010:  0000 0000 0000 c48d 0000 0000 0000 0000  ................
	0x0020:  0000 0000 0000 0000 8db4 0bb8 0014 5337  ..............S7
	0x0030:  4865 6c6c 6f20 5365 7276 6572            Hello.Server

note that the destination address is ::, so the server doesn't answer. If we now ping the client VM from the server VM:

# ping fd10:244::c48d
PING fd10:244::c48d(fd10:244::c48d) 56 data bytes
64 bytes from fd10:244::c48d: icmp_seq=1 ttl=255 time=2.45 ms

and then send another message, passt sees the guest address and maps :: correctly:

21:19:30.743466 IP6 (hlim 255, next-header UDP (17) payload length: 20) fd10:244::c48d.56504 > fe80::5054:ff:fe12:3456.3000: [udp sum ok] UDP, length 12
	0x0000:  6000 0000 0014 11ff fd10 0244 0000 0000  `..........D....
	0x0010:  0000 0000 0000 c48d fe80 0000 0000 0000  ................
	0x0020:  5054 00ff fe12 3456 dcb8 0bb8 0014 81f5  PT....4V........
	0x0030:  4865 6c6c 6f20 5365 7276 6572            Hello.Server
21:19:30.744479 IP6 (flowlabel 0x000dd, hlim 255, next-header UDP (17) payload length: 20) fe80::5054:ff:fe12:3456.3000 > fd10:244::c48d.56504: [udp sum ok] UDP, length 12
	0x0000:  6000 00dd 0014 11ff fe80 0000 0000 0000  `...............
	0x0010:  5054 00ff fe12 3456 fd10 0244 0000 0000  PT....4V...D....
	0x0020:  0000 0000 0000 c48d 0bb8 dcb8 0014 91fd  ................
	0x0030:  4865 6c6c 6f20 436c 6965 6e74            Hello.Client

in this case, that's a link-local address because the global unicast address via NDP expired. After refreshing the unicast address via NDP:

21:21:26.362242 IP6 (hlim 255, next-header UDP (17) payload length: 20) fd10:244::c48d.60670 > fd10:244::5054:ff:fe12:3456.3000: [udp sum ok] UDP, length 12
	0x0000:  6000 0000 0014 11ff fd10 0244 0000 0000  `..........D....
	0x0010:  0000 0000 0000 c48d fd10 0244 0000 0000  ...........D....
	0x0020:  5054 00ff fe12 3456 ecfe 0bb8 0014 70db  PT....4V......p.
	0x0030:  4865 6c6c 6f20 5365 7276 6572            Hello.Server
21:21:26.363079 IP6 (flowlabel 0x22ce8, hlim 255, next-header UDP (17) payload length: 20) fd10:244::5054:ff:fe12:3456.3000 > fd10:244::c48d.60670: [udp sum ok] UDP, length 12
	0x0000:  6002 2ce8 0014 11ff fd10 0244 0000 0000  `.,........D....
	0x0010:  5054 00ff fe12 3456 fd10 0244 0000 0000  PT....4V...D....
	0x0020:  0000 0000 0000 c48d 0bb8 ecfe 0014 80e3  ................
	0x0030:  4865 6c6c 6f20 436c 6965 6e74            Hello.Client

and this finally gets relayed back to the client:

21:21:53.760948 IP6 (flowlabel 0x39e04, hlim 255, next-header UDP (17) payload length: 20) fe80::5054:ff:fe12:3456.34328 > fd10:244::8c5d.3000: [udp sum ok] UDP, length 12
	0x0000:  6003 9e04 0014 11ff fe80 0000 0000 0000  `...............
	0x0010:  5054 00ff fe12 3456 fd10 0244 0000 0000  PT....4V...D....
	0x0020:  0000 0000 0000 8c5d 8618 0bb8 0014 10c6  .......]........
	0x0030:  4865 6c6c 6f20 5365 7276 6572            Hello.Server
21:21:53.764261 IP6 (hlim 255, next-header UDP (17) payload length: 20) fd10:244::8c5d.3000 > fd10:244::5054:ff:fe12:3456.34328: [udp sum ok] UDP, length 12
	0x0000:  6000 0000 0014 11ff fd10 0244 0000 0000  `..........D....
	0x0010:  0000 0000 0000 8c5d fd10 0244 0000 0000  .......]...D....
	0x0020:  5054 00ff fe12 3456 0bb8 8618 0014 1ffa  PT....4V........
	0x0030:  4865 6c6c 6f20 436c 6965 6e74            Hello.Client
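
The destination addresses in the captures above can be read straight out of the hex dumps. As a minimal sketch (a standalone Python helper, not part of passt; the name `parse_ipv6_udp` is made up for illustration), the following decodes the fixed 40-byte IPv6 header and the 8-byte UDP header of the 21:17:45 capture and of the first 21:19:30 packet. It assumes there are no extension headers, which matches next-header 17 in both dumps:

```python
import ipaddress
import struct

# Hex words copied from the 21:17:45 capture (before the ping) and from
# the first packet of the 21:19:30 capture (after the ping).
CAPTURE_BEFORE_PING = """
6000 0000 0014 11ff fd10 0244 0000 0000
0000 0000 0000 c48d 0000 0000 0000 0000
0000 0000 0000 0000 8db4 0bb8 0014 5337
4865 6c6c 6f20 5365 7276 6572
"""
CAPTURE_AFTER_PING = """
6000 0000 0014 11ff fd10 0244 0000 0000
0000 0000 0000 c48d fe80 0000 0000 0000
5054 00ff fe12 3456 dcb8 0bb8 0014 81f5
4865 6c6c 6f20 5365 7276 6572
"""


def parse_ipv6_udp(hex_dump):
    """Decode an IPv6 packet that carries UDP directly after the fixed header."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    # Fixed IPv6 header: 4 bytes version/traffic class/flow label,
    # 2 bytes payload length, 1 byte next header, 1 byte hop limit,
    # then 16 bytes source and 16 bytes destination address.
    _ver_class_flow, payload_len, next_header, hop_limit = struct.unpack(
        "!IHBB", raw[:8]
    )
    if next_header != 17:
        raise ValueError("expected UDP (17) directly after the IPv6 header")
    src = ipaddress.IPv6Address(raw[8:24])
    dst = ipaddress.IPv6Address(raw[24:40])
    # UDP header: source port, destination port, length, checksum.
    sport, dport, _udp_len, _checksum = struct.unpack("!HHHH", raw[40:48])
    return {
        "src": src,
        "dst": dst,
        "sport": sport,
        "dport": dport,
        "hop_limit": hop_limit,
        "payload": raw[48:40 + payload_len],
    }


for label, dump in (("before ping", CAPTURE_BEFORE_PING),
                    ("after ping", CAPTURE_AFTER_PING)):
    pkt = parse_ipv6_udp(dump)
    print(label, "dst:", pkt["dst"], "unspecified:", pkt["dst"].is_unspecified)
```

The first packet decodes to the unspecified destination `::`, the second to the guest's link-local address, in line with what tcpdump printed.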

Probably, the checks for address mapping of :: for UDP should be made less strict somehow. This happened on Alpine, which is relatively quiet network-wise -- the current mechanism works on noisier VMs running Fedora or Debian, but not here.
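
To make the idea of a less strict mapping concrete, here is a purely hypothetical Python sketch. It is not passt's code and not the fix that was eventually applied; the function names and the notion of a fallback address that the forwarder already knows (for instance, one it assigned or advertised itself) are assumptions made only for illustration. It contrasts a strict lookup, which yields `::` until the guest has been seen sending traffic, with a relaxed one that falls back to a known address:

```python
import ipaddress

UNSPECIFIED = ipaddress.IPv6Address("::")


def strict_destination(seen_addr):
    """Hypothetical strict mapping: only an address observed in guest traffic.

    Returns "::" when nothing has been observed yet, which is the
    situation shown in the first capture of this report.
    """
    return seen_addr if seen_addr is not None else UNSPECIFIED


def relaxed_destination(seen_addr, fallback_addr):
    """Hypothetical relaxed mapping: prefer the observed address, else fall back.

    fallback_addr stands for any guest address known without having seen
    traffic from the guest (an assumption of this sketch).
    """
    if seen_addr is not None and not seen_addr.is_unspecified:
        return seen_addr
    return fallback_addr


# Address taken from the captures in this report, used here as the fallback.
fallback = ipaddress.IPv6Address("fd10:244::5054:ff:fe12:3456")

print("strict, quiet guest: ", strict_destination(None))
print("relaxed, quiet guest:", relaxed_destination(None, fallback))
```

With a quiet guest, the strict variant produces the unusable `::` destination, while the relaxed variant still yields an address the guest would accept.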
Comment 3 Stefano Brivio 2022-10-15 14:37:19 UTC
Definitely fixed, tested over and over again with Podman test suite (pending upstream).