Bug 36 - In two_guests/basic test, one instance of pasta fails to open target namespace
Status: RESOLVED FIXED
Alias: None
Product: passt
Classification: Unclassified
Component: pasta
Version: unspecified
Hardware: All Linux
Importance: Normal quite bad
Assignee: nobody
Reported: 2022-11-07 18:39 UTC by Stefano Brivio
Modified: 2023-03-08 21:49 UTC

Description Stefano Brivio 2022-11-07 18:39:42 UTC
This happens about every third time on the two_guests/basic test, and on that test only: we clone() twice, first to spawn a child, then to spawn a thread to check that we can enter the target network namespace.

In this thread, we open a file descriptor associated with the target namespace. The corresponding file might not exist yet: the kernel can legitimately take its time to create it after clone(). In this case, at least on a 5.15 Linux kernel, trying to open that file again always yields EACCES, and we get stuck there.

This only occurs if we spawn two instances of pasta very close together, as it's done in the two_guests/basic case.

See also:
  https://archives.passt.top/passt-dev/20221104015328.3831630-1-sbrivio@redhat.com/

...but that patch doesn't actually fix the issue. I think it makes the issue less likely to occur because we wait a bit longer before open() and setns() in the parent.

I tried explicitly synchronising child and parent (with SIGSTOP and SIGCONT), in both directions, but that doesn't help either. What reliably hides the issue is a sleep(1) at the beginning of pasta_wait_for_ns().
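
For reference, this is the general shape of a SIGSTOP/SIGCONT handshake between parent and child. It is a stand-alone sketch with a hypothetical function name, using fork() instead of clone() for brevity. It is not the code that was tried in pasta, and, as noted above, this kind of synchronisation did not make the issue go away.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child stops itself, the parent resumes it once it
 * is ready. Returns the child's exit code, or -1 on error.
 */
static int stop_cont_handshake(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		raise(SIGSTOP);	/* Child: block here until the parent resumes us */
		_exit(42);	/* Stand-in for the real work after the handshake */
	}

	/* Parent: wait until the child has actually stopped */
	if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
		return -1;

	/* ...parent-side setup would go here... */

	if (kill(pid, SIGCONT) < 0)
		return -1;

	/* Reap the child and report its exit code */
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```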
Comment 1 Stefano Brivio 2023-03-08 21:49:14 UTC
It turns out this is also fixed by commit d8921dafe506 ("pasta: Wait for tap to be set up before spawning command").
