virtio/net: Advertise MTU to the guest via VIRTIO_NET_F_MTU#5828
Conversation
Codecov Report❌ Patch coverage is Please upload reports for the commit 9fce4f3 to get more accurate results.
Additional details and impacted files@@ Coverage Diff @@
## main #5828 +/- ##
==========================================
+ Coverage 82.87% 83.08% +0.20%
==========================================
Files 276 275 -1
Lines 29728 29481 -247
==========================================
- Hits 24637 24494 -143
+ Misses 5091 4987 -104
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi Anastassios, Thank you for your PR. We'll need some integration test for this. Perhaps a test where different TAPs are configured with different MTUs in the host and then checking that the guest sees the MTU we expect for each of them? Also, we should probably mention this change in the CHANGELOG.
This seems to purely be advertisement rather than negotiation, right? Best, |
63ad995 to
060fd39
Compare
58e8459 to
de8ca22
Compare
| .map_err(NetError::TapSetVnetHdrSize)?; | ||
|
|
||
| Self::new_with_tap(id, tap, guest_mac, rx_rate_limiter, tx_rate_limiter) | ||
| let mtu = tap.mtu().map_err(NetError::TapGetMtu)?; |
There was a problem hiding this comment.
I think perhaps it's better to not fail here, but instead log a warning and create the device without VIRTIO_NET_F_MTU.
| } | ||
|
|
||
| /// Get the MTU of the tap interface. | ||
| pub fn mtu(&self) -> Result<u16, TapError> { |
There was a problem hiding this comment.
Perhaps it would be easier to just read /sys/class/net/{iface}/mtu?
| import pytest | ||
| from tenacity import Retrying, stop_after_attempt, wait_fixed | ||
|
|
||
| import host_tools.network as net_tools # pylint: disable=import-error |
There was a problem hiding this comment.
This error seems to be globally suppressed in tests/pyproject.toml
|
|
||
| def test_tap_mtu_advertised_to_guest(uvm_plain_any): | ||
| """ | ||
| Verify that VIRTIO_NET_F_MTU correctly advertises the TAP MTU to the guest. |
There was a problem hiding this comment.
Can we also check that VIRTIO_NET_F_MTU is set?
| number of suppressed messages. | ||
| - [#5828](https://github.com/firecracker-microvm/firecracker/pull/5828): | ||
| Advertise TAP interface MTU to the guest via `VIRTIO_NET_F_MTU`. The | ||
| virtio-net device now reads the host TAP MTU using `SIOCGIFMTU` and stores it |
There was a problem hiding this comment.
nitpick: How the MTU is obtained (e.g. via SIOCGIFMTU ) is an implementation detail and perhaps it's not very important for the Changelog.
|
Hi Anastassios. Thanks for sending the new version. Could you rebase on top of main and resolve the conflicts so that we can run the full testing pipeline and make sure everything passes? |
f048435 to
4ee57a6
Compare
running the tests locally ATM, trying to resolve some linting issues - will let you know when I'm ready to kick off the CI. |
ok, linters fail with a timeout, but I'm not sure if its this PR changes or some glitch. Can you kick-off the CI and we can iterate? thanks! |
|
Thinking further about this. I had a quick look at what QEMU does and it doesn't seem to try and detect the MTU. It rather lets the user optionally specify it in the command line. This also works when the backend is something like Also a practical difference I noticed is that with |
totally agree, happy to include this as well as part of this PR.
are you concerned about not being able to use larger MTU sizes even when the host is bound to 1500, when(/if) FC virtio-net supports this kind of functionality (LRO/TSO etc.)? I'm assuming the config option you suggested resolves this issue. If the config is set, then we take into account the MTU set by the host interface. If not, we ignore it (not advertise it via the driver) and the guest can set whatever they want. |
Sounds good.
Yep.
I guess the current behaviour allows you to do weird things such as increasing both the host TAP MTU and the guest interface MTU after starting the VM. I don't know if anyone does this, but if we tried to set
To be clear, the config option will carry the MTU value, so it's on the administrator/operator to specify the MTU value (rather than Firecracker trying to determine it via sysfs or otherwise).
Yes, thus preserving the current behaviour by default. |
7efe2a8 to
12ad4c3
Compare
Add an optional `mtu` field to the `NetworkInterface` API (both the REST API and JSON config file). When set, the virtio-net device advertises VIRTIO_NET_F_MTU and places the value in config space at offset 10 (matching the virtio_net_config layout), allowing the guest driver to configure the interface MTU automatically. When `mtu` is absent (the default), VIRTIO_NET_F_MTU is not advertised and the guest uses its own default MTU — preserving existing behaviour. Implementation details: - Extend `ConfigSpace` with padding fields `_status` and `_max_virtqueue_pairs` at offsets 6 and 8 to correctly place `mtu` at offset 10 per the virtio spec (#[repr(C)], ByteValued safe) - `Net::new()` and `new_with_tap()` accept `mtu: Option<u16>`; the feature bit is set only when `Some` - `Net::mtu()` accessor reflects the advertised value for use in snapshot save and API GET responses - `NetConfigSpaceState` gains `mtu: Option<u16>` with `#[serde(default)]` for backward-compatible snapshot restore - Swagger, docs/device-api.md, and CHANGELOG updated accordingly Signed-off-by: Anastassios Nanos <[email protected]>
|
made an attempt -- have a look at your convenience and let me know if you think the implementation makes sense. |
| mtu: | ||
| type: integer | ||
| description: | ||
| MTU to advertise to the guest via VIRTIO_NET_F_MTU. When set, the guest |
There was a problem hiding this comment.
nit: Perhaps we can say "a compatible guest driver will configure the interface..."
| /// Returns the configured MTU if `VIRTIO_NET_F_MTU` is advertised, otherwise `None`. | ||
| pub fn mtu(&self) -> Option<u16> { | ||
| if self.avail_features & (1 << VIRTIO_NET_F_MTU) != 0 { | ||
| Some(self.config_space.mtu) |
There was a problem hiding this comment.
What happens if the guest overwrites the mtu field in write_config()? Should we ignore writes to it?
| pub host_dev_name: String, | ||
| /// Guest MAC address. | ||
| pub guest_mac: Option<MacAddr>, | ||
| /// MTU to advertise to the guest via VIRTIO_NET_F_MTU. When set, the guest driver |
There was a problem hiding this comment.
I feel this comment is a bit too verbose for this location. Perhaps "Maximum Transmission Unit" is enough here?
| # The iperf version to run this tests with | ||
| IPERF_BINARY = "iperf3" | ||
|
|
||
| # VIRTIO_NET_F_MTU feature bit index (virtio spec 5.1.3) |
There was a problem hiding this comment.
This references a section of the spec, but it doesn't specify the spec version
| rate-limiter support to virtio-pmem device to allow control over I/O bandwidth | ||
| generated by the FLUSH requests from the guest. | ||
| - [#5828](https://github.com/firecracker-microvm/firecracker/pull/5828): | ||
| Advertise TAP interface MTU to the guest via `VIRTIO_NET_F_MTU`. |
There was a problem hiding this comment.
via a new optional mtu option in the network-interfaces API endpoint
| for idx, mtu in iface_mtus: | ||
| iface = net_tools.NetIfaceConfig.with_id(idx) | ||
| vm.add_net_iface(iface, api=False) | ||
| vm.api.network.put( |
There was a problem hiding this comment.
This call is done by add_net_iface, why don't we append the mtu option above?
Changes
Add MTU negotiation support to the virtio-net device. The tap interface MTU is read using SIOCGIFMTU and stored in the ConfigSpace alongside the MAC address, with the intermediate padding fields (_status, _max_virtqueue_pairs) added to match the virtio_net_config layout.
VIRTIO_NET_F_MTU is advertised in the available features so the guest driver can read the MTU from the config space and configure the network interface accordingly, avoiding fragmentation or oversized-packet issues when the underlying tap MTU differs from the default.
Reason
An MTU != 1500 is not honored in the VM, causing packet drops from the VM.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkbuild --allto verify that the PR passesbuild checks on all supported architectures.
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.