Release IPv6 address if IPv6 is disabled on an interface#48971
Release IPv6 address if IPv6 is disabled on an interface#48971robmry merged 5 commits intomoby:masterfrom
Conversation
190b4b4 to
6a1d94f
Compare
|
@robmry was this accidentally closed or superseded by the other PRs? |
Oh - thank you! How do you manage to spot this stuff?! I thought it was superseded by the unsolicited ARPs PR, but you prompted me to double-check - the last commit here is just stacked on that other change. I've forgotten where I got to with this one, but it probably needs resurrecting. |
baa0db9 to
6d1c268
Compare
LOL, any activity gets in my notifications inbox; I tend to check up quickly on all of them, but also look for "this looks non-standard" things (like closing a PR); I didn't see a comment "why" it was closed, but I've had it myself that I accidentally pushed to the wrong PR branch, and GitHub automatically closing a PR, so thought "let me check!" ❤️ |
5903609 to
d88aa44
Compare
|
As this was already broken in 27.x, it fails with an error, and it's a corner case (disabling IPv6 on a single endpoint in an IPv6 network, rather than disabling it in the container) ... I've dropped the 28.0 milestone. It's working, but I'll give it a check over before marking it as ready for review. |
d88aa44 to
c5da96d
Compare
c5da96d to
7440d06
Compare
7440d06 to
634c7fa
Compare
1f84a50 to
082d73e
Compare
corhere
left a comment
There was a problem hiding this comment.
LGTM aside from the review comments.
daemon/libnetwork/sandbox_linux.go
Outdated
| if eps := sb.Endpoints(); len(eps) > 0 { | ||
| sb.finishEndpointConfig(ctx) |
There was a problem hiding this comment.
TOCTTOU: sb.Endpoints() is called here, then immediately called again in finishEndpointConfig. Use the same copy of the endpoints slice for both the test and the loop. Either pass the slice of endpoints into sb.finishEndpointConfig, or move the len(eps) > 0 check into finishEndpointConfig and call it unconditionally.
There was a problem hiding this comment.
I've moved it into finishEndpointConfig to avoid creating the extra slice. Although, now sb.joinLeaveMu is held at this point, the set of endpoints won't be modified by anything else.
daemon/libnetwork/sandbox_linux.go
Outdated
| // If the Sandbox already has endpoints it's because sbJoin has been called for | ||
| // them - but configuration of addresses/routes (and so on) didn't complete because | ||
| // there was nowhere for it to go before the osSbox was set up. So, finish that | ||
| // configuration now. | ||
| if eps := sb.Endpoints(); len(eps) > 0 { | ||
| sb.finishEndpointConfig(ctx) | ||
| } |
There was a problem hiding this comment.
Is populating network resources always an idempotent operation? Otherwise we might have a problem. If (*Endpoint).sbJoin is called concurrently with SetKey, the joined endpoint could be configured twice. Since (*Sandbox).canPopulateNetworkResources() returns true as soon as sb.osSbox != nil, the sbJoin may populate the network resources before SetKey returns, which would result in the resources getting populated twice in a row if the sbJoin call adds the endpoint to the store before SetKey calls finishEndpointConfig.
There was a problem hiding this comment.
SetKey now acquires sb.joinLeaveMu before updating sb.osSbox to make sure the list of endpoints is stable.
| return nil | ||
| } | ||
|
|
||
| func (d *driver) ReleaseIPv6(ctx context.Context, nid, eid string) error { |
There was a problem hiding this comment.
Since it would lead to buggy behaviour if a driver which intended to implement IPv6Releaser was to no longer implement it, please add a compile-time assertion that the bridge driver implements the interface.
var _ driverapi.IPv6Releaser = (*driver)(nil)082d73e to
ad33bc4
Compare
When the SetKey hook is used (by a build container) it's called after Endpoint.sbJoin, which will have called Sandbox.populateNetworkResources to set up address, routes, sysctls and so on - but it's not able to do any config until the osSbox exists. So, Sandbox.populateNetworkResources is called again by SetKey to finish that config. But, that means the rest of Endpoint.sbJoin has already happened before the osSbox existed - it will have configured DNS, /etc/hosts, gateways and so on before anything was set up for the OS. So, if the osSbox configuration isn't applied as expected (for example, a sysctl disables IPv6 on the endpoint), that sbJoin configuration is incorrect. To avoid unnecessary config+cleanup in thoses cases - delay the config currently done by sbJoin until the osSbox exists. Signed-off-by: Rob Murray <[email protected]>
DNS is set up when the endpoint is joined to a network. It was added in commit 4850c5f (Avoid duplicate entries in /etc/hosts) then simplified in bcca214 (libnetwork: open-code updating svc records) and seems to be related to setting up a name on a swarm node that isn't running the container with the endpoint. But, all callers of Network.createEndpoint follow up with an Endpoint.Join, which also sets up the DNS entry. Those callers are: Network.createLoadBalancerSandbox Network.CreateEndpoint - called by Daemon.connectToNetwork - called by Sandbox.setupDefaultGateway - called by builder-net/executor.go: iface.init None of them bail out before the Join for a Swarm case. So, it looks like enough has changed that the createEndpoint code is no longer needed (it predates the internal DNS server) ... remove it. Signed-off-by: Rob Murray <[email protected]>
And move the Endpoint.populateNetworkResources code into the all-platforms part of the Sandbox method. Signed-off-by: Rob Murray <[email protected]>
Signed-off-by: Rob Murray <[email protected]>
When running:
docker network create --ipv6 b46
docker run --rm -ti \
--network name=b46,driver-opt=com.docker.network.endpoint.sysctls=net.ipv6.conf.IFNAME.disable_ipv6=1 \
busybox
IPv6 is enabled in the container and the network, so an IPv6 address
will be allocated for the endpoint.
But, when the sysctl is applied, the IPv6 address will be removed
from the interface ... so, no unsolicited neighbour advertisement
should be (or can be) sent and, the endpoint should not be treated
as dual-stack when selecting a gateway endpoint and, if it is
selected as the gateway endpoint, setting up an IPv6 route via the
network will fail.
So, if the IPv6 address disappears after sysctls have been applied,
release the address and remove it from the endpoint's config.
Signed-off-by: Rob Murray <[email protected]>
ad33bc4 to
2bb0443
Compare
akerouanton
left a comment
There was a problem hiding this comment.
One minor nit but feel free to ignore it. LGTM otherwise.
| // Current endpoint(s) providing external connectivity for the Sandbox. | ||
| // If ep is selected as a gateway endpoint once it's been added to the Sandbox, | ||
| // these are the endpoints that need to be un-gateway'd. | ||
| gwepBefore4, gwepBefore6 := sb.getGatewayEndpoint() |
There was a problem hiding this comment.
This can be moved inside of the condition below. We can also consider swapping the condition - call storeEndpoint and early return when !sb.canPopulateNetworkResources().
There was a problem hiding this comment.
In the case where the old gateways are needed, they have to be looked up before sb.addEndpoint(ep) - because adding the endpoint might change the gateway selection. And, addEndpoint needs to be called when they're not needed.
So, it could be guarded with sb.canPopulateNetworkResources() ... but I'm not sure it's worth updating - in the common case, "canPopulate" is true, and in the buildkit case there's probably only one gateway endpoint to select from anyway (so getGatewayEndpoint is cheap).
I remember dithering about whether to just store the endpoint and early return - but decided to keep the store at the end, just in case the endpoint is updated by the populate/update calls, and those updates don't make it into the store.
- What I did
When running:
IPv6 is enabled in the container and the network, so an IPv6 address will be allocated for the endpoint.
But, when the sysctl is applied, the IPv6 address will be removed from the interface ... so, no unsolicited neighbour advertisement should be (or can be) sent and, the endpoint should not be treated as dual-stack when selecting a gateway endpoint and, if it is selected as the gateway endpoint, setting up an IPv6 route via the network will fail.
In 27.x (the release that added per-endpoint sysctls), the error is ...
In 28.0 (because the
SetKeyhook isn't used any more, in this case), it's ...- How I did it
See the individual commit messages ... refactoring, with the bug fix in the final commit.
The refactoring means Sandbox and Endpoint config happens in the same order, whether or not the OS Sandbox has been created when the Endpoint is added.
- How to verify it
Exiting and new integration tests.
- Description for the changelog
- Fix an issue preventing container startup or selection of its network gateway when IPv6 is only disabled on a specific interface.