Bug 56 - pasta rejects matching IPv4+IPv6 port forwarding rules as overlapping
Summary: pasta rejects matching IPv4+IPv6 port forwarding rules as overlapping
Status: RESOLVED FIXED
Alias: None
Product: passt
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: All Linux
: Normal normal
Assignee: David Gibson
URL:
Depends on:
Blocks:
 
Reported: 2023-06-24 14:24 UTC by Daniel Rudolf
Modified: 2023-08-15 03:56 UTC (History)
1 user (show)

See Also:


Attachments
podman run --log-level=debug (13.70 KB, text/plain)
2023-06-24 14:24 UTC, Daniel Rudolf
Details

Description Daniel Rudolf 2023-06-24 14:24:07 UTC
Created attachment 17 [details]
podman run --log-level=debug

Passing two matching port forwarding rules of a single port to Podman for both IPv4 and IPv6, pasta bails with an "Overlapping port specifier" error:

```
$ podman run --rm --network=pasta --publish=127.0.0.1:8080:8080/tcp --publish=[::1]:8080:8080/tcp alpine sleep 3
Error: failed to start pasta:
No external routable interface for IPv6
Overlapping port specifier ::1/8080-8080:8080-8080
```

Please note that the "No external routable interface for IPv6" notice is unrelated, it's just the result of me creating a reproducer in a virtual machine.

Pasta is invoked using the following arguments. Attached you can find the full output of the above podman run with `--log-level=debug`.

```
--config-net -t 127.0.0.1/8080-8080:8080-8080 -t ::1/8080-8080:8080-8080 -u none -T none -U none --no-map-gw --netns /run/user/1000/netns/netns-c3be99f6-fc2d-eccb-dcd1-8cb6b9186715
```

Running pasta 0^20230603.g429e1a7-1.fc38.x86_64 on Fedora 38.
Comment 1 Daniel Rudolf 2023-06-24 15:24:01 UTC
For the record: Removing the overlapping port check in `conf.c` fixes this issue, i.e. the actual functionality is there and works, the checks in `conf.c` are just too strict.
Comment 2 David Gibson 2023-07-05 04:02:35 UTC
So, long term this should be fixed by the improved forwarding model we're gradually working on (https://pad.passt.top/p/NewForwardingModel).  That's a fair way off, though.

Short term, it's not feasible for us to explicitly keep track of bindings on different addresses, because it would require an essentially unbounded amount of memory with the current structures.

We could simply remove the overlap check, as suggested in comment 1 and this case would work.  However, it would have some consequences:

1) Exact overlaps would result in a less explicit error message:

Before: $ ./pasta --config-net -t 8000 -t 8000
        Overlapping port specifier 8000

After: $ ./pasta --config-net -t 8000 -t 8000
       Failed to bind any port for '-t 8000', exiting

2) Partial overlaps would not be reported

We don't give an error as long as we're able to bind some ports in a range so:

Before: $ ./pasta -t 8000 -t 7000-9000
        Overlapping port specifier 7000-9000

After: $ ./pasta -t 8000 -t 7000-9000
       #

3) Partial overlaps with different target port range deltas will behave confusingly

Before: you can't even attempt to map overlapping ranges to different target ranges, because you'll get the overlapping port range error.

After: This will succeed, and the results will be.. counterintuitive.  For TCP I think it will be basically "first mapping wins" for each port, since the target port gets baked into the socket epoll reference.  For UDP it will be the same for incoming packets, but for reply packets, the reverse mapping is calculated after all the mappings are inserted, so it will be "last mapping wins", meaning that in this case UDP reply packets are likely to be misdirected.


I think (1) is pretty harmless, (2) isn't great, but we could probably live with.  (3) makes me pretty uncomfortable.

Stefano, any opinions on the above?

Before: $ ./pasta -t 8000:8000 -t 8000-8001:8010-8011
        Overlapping port specifier 8000-8001:8010-8011
After: $ ./pasta -t 8000:8000 -t 8000-8001:8010-8011
       #
Comment 3 Daniel Rudolf 2023-07-05 13:49:04 UTC
What's the reason that (1) yields an error, but (2) doesn't? Is it because passt passes the range as range to the system and the API happens to fail silently if one mapping succeeds? If this is the case, would changing this to subsequent calls, i.e. adding one port mapping after another (e.g. forward 7000 to 7000, then 7001 to 7001, …), bring back the error? If yes, wouldn't this also solve (3)? If not, still considering to add one mapping after another, is there an easy way to detect such conflicts, e.g. by comparing the new mapping with the existing mappings in the system?

Might be silly questions, I've no idea about how any of this works, so if they are silly questions don't feel obligated to even answer, I just wanted to give an outsider's perspective :-)
Comment 4 David Gibson 2023-07-06 00:40:07 UTC
(In reply to Daniel Rudolf from comment #3)
> What's the reason that (1) yields an error, but (2) doesn't? Is it because
> passt passes the range as range to the system and the API happens to fail
> silently if one mapping succeeds?

No. This an explicit policy choice in passt itself.  We don't give an error on range mappings if we bind any of the ports in the range, even if some did fail.  The reasoning for this is that you can specify a broad range which might include ports already bound by other processes on the host system, and passt will still run with everything it can forward forwarded.  It's certainly debatable whether that's the right choice - what you want really kind of depends on the use case - but it's the policy for the time being.

> If this is the case, would changing this
> to subsequent calls, i.e. adding one port mapping after another (e.g.
> forward 7000 to 7000, then 7001 to 7001, …), bring back the error? If yes,
> wouldn't this also solve (3)? If not, still considering to add one mapping
> after another, is there an easy way to detect such conflicts, e.g. by
> comparing the new mapping with the existing mappings in the system?

No, changing that policy wouldn't fix (3), at least not in all cases.  The fundamental problem is that our data structures only accommodate one target port per incoming port.  Checking for any conflicting binds would mean you get an error for the example given in (3) above, however then there's this case:
      $ ./pasta -u 127.0.0.1/8000:8001 -u ::1/8000:8002

In that case, both binds would succeed (that's the whole point of this bug), but we'd have conflicting information about what the target port is.  For TCP it would more or less work, because we have some extra information squirreled away in the epoll references that would (I think) make us do the right thing despite the incorrect information in the main data structure.  For UDP however, I'm pretty sure things would go wrong.

Ultimately, we really need to fix that data structure.  We have plans to do that, but it's going to be a fairly long road to get there.

> Might be silly questions, I've no idea about how any of this works, so if
> they are silly questions don't feel obligated to even answer, I just wanted
> to give an outsider's perspective :-)
Comment 5 Daniel Rudolf 2023-07-06 11:59:54 UTC
Thank you very much for the comprehensive explanation, this cleared many things up! Okay, so it's definitely no easy choice in the short-term.

Unless someone has an easy solution up her/his sleeves I don't think that putting too much work into this is reasonable considering that there are plans to solve it in the long-term. So, what about just switching from an error to a warning? Like "Warning: Overlapping port specifier: Ambiguities in the port specifier might lead to unexpected behaviour; check whether your mappings are unambiguous". Possibly complemented by some short notice with the explanations and examples you just gave in the man page? This would allow users to check whether they might be affected, especially when they see issues. Should be fine as a temporary solution.
Comment 6 Daniel Rudolf 2023-08-09 18:50:20 UTC
Any updates on this yet? Unfortunately this is a blocking issue for me, I can't use passt right now due to this issue :/ I guess just switching from an error to a warning would be fine for now :+1: An additional notice in the man page would be great, but since I'm apparently the only user with this issue I guess such notice isn't strictly necessary either (plus: if we replicate the error message here people will find this report using their favourite search engine). A long-term solution is planned anyway, so the most simple solution is probably the best.
Comment 7 David Gibson 2023-08-10 00:33:01 UTC
(In reply to Daniel Rudolf from comment #6)
> Any updates on this yet? Unfortunately this is a blocking issue for me, I
> can't use passt right now due to this issue :/ I guess just switching from
> an error to a warning would be fine for now :+1: An additional notice in the
> man page would be great, but since I'm apparently the only user with this
> issue I guess such notice isn't strictly necessary either (plus: if we
> replicate the error message here people will find this report using their
> favourite search engine). A long-term solution is planned anyway, so the
> most simple solution is probably the best.

Nothing that's likely to be useful to you, I'm afraid.  I've made a start on the big rework that will eventually fix this, but actually addressing this aspect remains quite some way away,
Comment 8 Daniel Rudolf 2023-08-10 08:53:24 UTC
Thanks for the quick response!

Do you think that you'll finish the big rework in the next couple of weeks, or is it rather far away? If it's rather far away, could we implement a quick fix in meantime? Like switching this from a fatal error to a warning. I don't know pasta's code, actually I'm not even a C dev, but since removing the error message worked I presume that this would be a rather small and quick change? Is this feasible in your opinion or are there major downsides I might have simply missed?

As said, this rather small issue currently prevents me from using pasta in production. However, I want to use it, pasta is just amazing, so I'd appreciate an intermediate fix very much.
Comment 9 David Gibson 2023-08-11 05:05:28 UTC
(In reply to Daniel Rudolf from comment #8)
> Thanks for the quick response!
> 
> Do you think that you'll finish the big rework in the next couple of weeks,
> or is it rather far away?

No, it's a fair way away.  I'd say a couple of months at least.

> If it's rather far away, could we implement a
> quick fix in meantime? Like switching this from a fatal error to a warning.
> I don't know pasta's code, actually I'm not even a C dev, but since removing
> the error message worked I presume that this would be a rather small and
> quick change? Is this feasible in your opinion or are there major downsides
> I might have simply missed?

So, I'm not neccessarily against dropping this down to a warning.  I'm a bit concerned by the possible weird edge cases this might open up, but at least for inbound TCP forwarding, it's probably goint to be ok in practice.

Stefano, do you have an opinion?

That said.. I thought about this a bit more.  I had been thinking that a more robust solution done now would be some unneeded churn before the more sweeping changes that are in practice.  But... I have an idea for an approach that would be based on some mechanisms that we will need anyway for the larger changes, and shouldn't add *too* much extra churn.  That's an approach that I think *could* be tackled in the next couple of weeks (to the point of being written and maybe merged upstream, probably not put into all the downstream packages).  I'll have a look and see what I can do.

> As said, this rather small issue currently prevents me from using pasta in
> production. However, I want to use it, pasta is just amazing, so I'd
> appreciate an intermediate fix very much.
Comment 10 David Gibson 2023-08-11 07:36:30 UTC
After some discussion with Stefano, decided to go with the quick and easy option for now: demote the error to a warning.

Patch posted.
Comment 11 Daniel Rudolf 2023-08-13 19:41:24 UTC
And it got merged already. Thank you very much David for the quick fix! Looking forward to use passt in production now; considering my previous experience I don't think that it will take very long till the patch lands in Fedora CoreOS.

Still looking forward to the big rework resp. a possible bigger intermediate solution though; you can keep this issue updated if you like, I could give the changes a try then. Shall we keep this issue open or mark as resolved in the meantime?
Comment 12 David Gibson 2023-08-15 03:56:43 UTC
I think mark this resolved.  The new model will come along in due course.

Note You need to log in before you can comment on or make changes to this bug.