The first half of the 6.8 merge window

By Jonathan Corbet
January 12, 2024
The 6.8 merge window has gotten off to a relatively slow start; reasons for that include a significant scheduler performance regression that Linus Torvalds stumbled into and has spent time tracking down. Even so, 4,282 non-merge changesets have found their way into the mainline repository for the 6.8 release as of this writing. These commits have brought a number of significant changes and new features.

Some of the more interesting changes merged so far include:

Core kernel

  • The deadline servers mechanism has been added as a way to prevent the starvation of normal tasks when realtime tasks are using all available CPU time.
  • The zswap subsystem has gained the ability to force cold pages out to (real) swap when memory gets tight. This commit includes some documentation on how to opt into or out of this feature.

    There is also a new zswap mode that disables writing back to swap entirely; see this commit for details.

  • The DAMON memory-management facility now supports an auto-tuning mechanism; see this changelog for more information.
  • The new TRANSPARENT_HUGEPAGE_NEVER configuration option causes the use of transparent huge pages to be disabled by default.
  • Transparent huge pages can now be allocated in multiple sizes below the normal huge-page size. See this commit for some documentation on how to control this feature.
  • The new UFFDIO_MOVE operation for userfaultfd() allows pages to be moved within a virtual address space; see this commit for details.
  • The "KSM advisor" feature allows for automated tuning of the kernel samepage merging subsystem; see this commit and this documentation patch for details.
  • The BPF verifier has seen a considerable amount of work that should result in successful verification of a wider range of correct programs.
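
As a rough illustration of the multi-size transparent huge page control mentioned above: the per-size knobs live under /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/enabled and, like the long-standing top-level "enabled" file, mark the active mode with square brackets. The sketch below parses such a line. The helper names thp_active_mode() and thp_read_mode() are hypothetical, and any particular size (64kB, say) is an assumption that may not hold on a given system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the bracketed (active) value from a sysfs line such as
 * "always inherit madvise [never]". Returns 0 on success, -1 if no
 * bracketed value is found or it does not fit in out.
 * Hypothetical helper, not a kernel or libc API.
 */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;
    const char *close = strchr(open, ']');
    if (close == NULL)
        return -1;

    size_t len = (size_t)(close - open - 1);
    if (len == 0 || len >= outlen)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/*
 * Read the active mode for one huge-page size, given in kB.
 * Returns -1 if the file cannot be opened (older kernel, or a size
 * the hardware does not support) or cannot be parsed.
 */
int thp_read_mode(unsigned int size_kb, char *out, size_t outlen)
{
    char path[128];
    char line[128];

    snprintf(path, sizeof(path),
             "/sys/kernel/mm/transparent_hugepage/hugepages-%ukB/enabled",
             size_kb);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ret = -1;
    if (fgets(line, sizeof(line), f) != NULL)
        ret = thp_active_mode(line, out, outlen);
    fclose(f);
    return ret;
}
```

Calling thp_read_mode(64, buf, sizeof(buf)) on a kernel that offers a 64kB size would fill buf with the active mode; on other systems it simply returns -1.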

Filesystems and block I/O

  • The kernel is now able to prevent direct writes to block devices that contain mounted filesystems. This feature, controlled by the BLK_DEV_WRITE_MOUNTED configuration option, is disabled by default but seems likely to be enabled by distributors if it is shown to not break existing workloads. Writes to devices containing mounted Btrfs filesystems remain unrestricted in any case for now, pending the merging of some support patches into that filesystem. (See this article for some background on this change.)
  • The listmount() and statmount() system calls have been merged; they allow user space to obtain detailed information about mounted filesystems. See this changelog for more information.
  • The XFS filesystem continues to accumulate changes adding support for the eventual online-repair feature.
  • The SMB filesystem has gained the ability to create block and character special files.
  • Bcachefs now has a partial (but functional) online filesystem check and repair mechanism.

Hardware support

  • Miscellaneous: DesignWare PCIe performance-monitoring units, Intel IAA compression accelerators, Intel QAT_420xx crypto accelerators, and Lantiq PEF2256 (FALC56) pin controllers.
  • Networking: Lantiq PEF2256 (FALC56) framers and Texas Instruments DP83TG720 Ethernet 1000Base-T1 PHYs. Also: a number of ancient wireless drivers (atmel, hostap, zd1201, orinoco, ray_cs, wl3501, rndis_wlan, and libertas 16-bit PCMCIA) have been removed.

Miscellaneous

  • Rust support has been added for the creation of network PHY drivers. This work includes a set of abstractions making the driver API available and a reference driver for Asix PHYs. This is the first user-visible Rust code added to the kernel, though it duplicates the functionality of an existing driver and thus does not add new features — yet.

Networking

  • There has been a fair amount of low-level work to reorganize a number of core networking data structures for better cache efficiency. This may seem like a small change but, as the networking pull request noted: "This improves TCP performances with many concurrent connections up to 40%".
  • The bpfilter subsystem was meant to be a way of writing firewall rules using BPF; it was first merged for the 4.18 kernel in 2018, but never got to a point where it was usable and has seen little development in recent years. The bpfilter code has now been removed, though development is said to continue in an external repository. The associated "usermode blob" mechanism (which was transformed into "usermode driver" in 2020) remains in the kernel, though there are no users for it.
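
To make the cache-efficiency point above a little more concrete, here is a small userspace sketch. It is not the kernel's code: the structures, field names, and the 64-byte line size are all assumptions chosen for illustration. It only shows why grouping frequently accessed fields together reduces the number of cache lines a hot path has to touch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed line size; real hardware may differ */

/* Hot fields interleaved with rarely used ones (illustrative layout). */
struct conn_scattered {
    uint32_t rx_seq;            /* hot */
    uint8_t  config_blob[60];   /* cold */
    uint32_t tx_seq;            /* hot */
    uint8_t  stats_blob[60];    /* cold */
    uint32_t window;            /* hot */
};

/* The same fields, with the hot ones grouped at the front. */
struct conn_grouped {
    uint32_t rx_seq;            /* hot */
    uint32_t tx_seq;            /* hot */
    uint32_t window;            /* hot */
    uint8_t  config_blob[60];   /* cold */
    uint8_t  stats_blob[60];    /* cold */
};

/* Index of the cache line a byte offset falls into, assuming the
 * structure itself starts on a cache-line boundary. */
size_t line_of(size_t offset)
{
    return offset / CACHE_LINE;
}

/* Count the distinct cache lines covered by three field offsets. */
size_t lines_touched(size_t a, size_t b, size_t c)
{
    size_t la = line_of(a), lb = line_of(b), lc = line_of(c);
    size_t n = 1;

    if (lb != la)
        n++;
    if (lc != la && lc != lb)
        n++;
    return n;
}

size_t scattered_hot_lines(void)
{
    return lines_touched(offsetof(struct conn_scattered, rx_seq),
                         offsetof(struct conn_scattered, tx_seq),
                         offsetof(struct conn_scattered, window));
}

size_t grouped_hot_lines(void)
{
    return lines_touched(offsetof(struct conn_grouped, rx_seq),
                         offsetof(struct conn_grouped, tx_seq),
                         offsetof(struct conn_grouped, window));
}

/* Sanity check: every hot field of the grouped layout shares one line. */
void check_layout(void)
{
    assert(grouped_hot_lines() == 1);
}
```

With these layouts, the scattered structure spreads its three hot fields over three cache lines, while the grouped one keeps them in a single line. With many concurrent connections, each touching its own structure, that difference is multiplied accordingly.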

Security-related

  • There are three new system calls for working with Linux security modules: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). See Documentation/userspace-api/lsm.rst for details.
  • The BPF token mechanism, which allows fine-grained delegation of BPF-related permissions, was initially merged into the networking tree for inclusion in 6.8. That code ran into trouble, though, when Torvalds realized that it was still treating file descriptor zero as being special; suffice to say he was not pleased. So this code was reverted for repairs; discussions are still underway and it will not be ready for this kernel release.

Internal kernel changes

  • The scope-based resource management mechanism has gained some new guards for conditional locks (as obtained with mutex_trylock() and the like). See this commit for a bit more information.
  • As expected, the venerable SLAB memory allocator has been removed, leaving SLUB as the only object-level allocator in the kernel. According to the merge message: "Removing the choice of allocators has already allowed to simplify and optimize the code wiring up the kmalloc APIs to the SLUB implementation".
  • The MAX_ORDER macro is no more; see this article for the whole story.
  • The kernel now builds with -Wmissing-prototypes (which generates warnings for non-static functions that are defined without a previously declared prototype) on all architectures.
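
For readers unfamiliar with scope-based guards, the following userspace sketch shows the general idea behind a conditional ("trylock") guard, using the compiler's cleanup attribute and POSIX mutexes. It is an analogy under stated assumptions, not the kernel's API: the TRYLOCK_GUARD macro, guard_release(), and try_increment() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Cleanup handler: unlock only if the guard actually holds the mutex. */
void guard_release(pthread_mutex_t **held)
{
    assert(held != NULL);
    if (*held != NULL)
        pthread_mutex_unlock(*held);
}

/*
 * Hypothetical userspace macro (not the kernel's API): try to take the
 * mutex. The guard variable is NULL if the attempt failed; otherwise
 * the mutex is released automatically when the variable goes out of
 * scope. Note that mutex_ptr is evaluated twice.
 */
#define TRYLOCK_GUARD(name, mutex_ptr)                                  \
    pthread_mutex_t *name __attribute__((cleanup(guard_release))) =     \
        (pthread_mutex_trylock(mutex_ptr) == 0) ? (mutex_ptr) : NULL

pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
int counter;

/* Returns 0 on success, -1 if the lock was busy. */
int try_increment(void)
{
    TRYLOCK_GUARD(guard, &counter_lock);

    if (guard == NULL)
        return -1;      /* lock not taken; cleanup sees NULL and does nothing */

    counter++;
    return 0;           /* guard_release() unlocks the mutex on the way out */
}
```

The point of the pattern is that every return path, including early error returns, releases the lock without an explicit unlock call; the conditional variant additionally has to cope with the case where the lock was never acquired.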

The 6.8 merge window can be expected to remain open through January 21. Tune back in once it has closed for a summary of the remaining changes merged for the next kernel release.


The first half of the 6.8 merge window

Posted Jan 12, 2024 15:27 UTC (Fri) by bluca (subscriber, #118303)

> The listmount() and statmount() system calls have been merged; they allow user space to obtain detailed information about mounted filesystems.

Party time!

The first half of the 6.8 merge window

Posted Jan 12, 2024 16:26 UTC (Fri) by adobriyan (subscriber, #30858)

> As expected, the venerable SLAB memory allocator has been removed

It was shipped in 2.2, so it will be 25 years old by the time 6.8 is released. End of an era!

The first half of the 6.8 merge window

Posted Jan 14, 2024 18:37 UTC (Sun) by jd (guest, #26381)

I feel kinda sad. Yes, very small memory machines don't really exist any more, but there was something comforting about the fact that Linux could run perfectly well in a wider range of environments than any other OS in history.

You could also tailor the OS to your very specific needs, optimising it for any imaginable parameter. There were almost zero assumptions, everything could be tuned, and there were even third-party patches for tuning even further.

Bereft of abandoned architectures and barely-used schedulers, some of that absolute freedom has been lost.

Having said that, it's the little-used stuff that accumulates the defects. Linux has as high a defect density as it has because some parts just aren't exercised enough.

The source code is also very very big and navigating round it to understand the consequences of additions will be perilous.

I shudder to think of how big the tree would be if someone created a "holistic" tree containing all the abandoned projects, obsolete filesystems, and deleted sections, updated to work with the rest of the kernel as it now stands. It's possible to imagine it would be 10-15 megs larger, maybe more.

It would certainly have a lot of defects that you could never debug because the necessary combination of hardware would no longer exist anywhere.

(Having said that, the obsolete formats archive that stores all obsoleted technologies probably should have just such a Linux kernel, as that would provide the best chance of actually using any archaic technology. If it has ever been for a computer, odds are someone wrote Linux support for it at some point.)

The first half of the 6.8 merge window

Posted Jan 14, 2024 18:56 UTC (Sun) by willy (subscriber, #9762)

Before SLOB was removed, there was considerable effort put into making sure that SLUB worked as well on low memory machines as SLOB did. But this is SLAB. I don't think there are any remaining workloads where SLAB outperformed SLUB. This simplification allows us to make more optimisations going forward -- which will also benefit low memory machines! There is nothing to be sad about here.

Having unused code in the tree holds us back. I'm just looking at NTFS and wondering whether I need to put in the effort to convert it to folios or whether we should delete it, since we now have NTFS3 in-tree. It'll probably save three months to just delete it.

The first half of the 6.8 merge window

Posted Jan 14, 2024 22:05 UTC (Sun) by pm215 (subscriber, #98099)

I think in such a hypothetical "holistic kernel" project, the size of the code base would be no issue but the "updated to work with the rest of the kernel as it now stands" part would be a colossal amount of work, even to get to "compiles, maybe in theory could work, untested" state. It's often the prospect of that work that causes those obsolete, abandoned or deleted lumps of code to be abandoned in the first place... If you actually wanted to use any of that code then doing so in the context of the kernel it was developed in would have much higher chances of success IMHO.

The first half of the 6.8 merge window

Posted Jan 15, 2024 12:35 UTC (Mon) by farnz (subscriber, #17727)

There's a pattern that I've seen, more than once, in software, that the removal of SLAB and SLOB allocators to leave just SLUB fits.

First, you need a hard problem where there's no obvious solution without compromising on at least one "performance" axis; be that memory consumed, CPU time taken, complexity of understanding the solution, whatever. You need it to be hard, because otherwise people will find the optimum solution quickly.

Second, you need a solution to the problem that's a local optimum; in other words, if you exclude certain cases (such as low memory systems), this solution is optimal, and it looks like it's impossible to make it work for those cases. SLAB met this.

Third, someone needs to come forwards with a second solution to the problem that falls into a different local optimum, and is better than the previous solution for some cases, but worse for others. This gives you two ways to solve the problem - e.g. SLOB for low memory systems, SLAB for big systems - depending on your use case, and leaves most people content.

With all of the above in place, sooner or later someone will come along who's able to learn from both existing solutions and provide one that can be made equal to or better than the best of both existing solutions on all axes you care about. And once you have that solution, you might as well use it for every case that the old solutions were used for.

Note, too, that in this case, you end up not needing the old solutions (e.g. SLOB, SLAB) at all, because the new solution has now been optimized to the point where it's always the right choice. Nothing of significant value has been lost by scrapping the old solutions, because the new solution beats the old solution in every respect (bar backwards compatibility with something that assumes details about the old solution that weren't set in stone to begin with - e.g. something assuming that it can inline a hand-written free method for a SLAB-allocated object "knowing" how SLAB's data structures are laid out).

The first half of the 6.8 merge window

Posted Jan 12, 2024 16:37 UTC (Fri) by NightMonkey (subscriber, #23051)

What does an "empty" kernel build mean in this context (referring to Linus' message warning of regressions linked to above)? I searched a bit but found nothing definitive. Thanks.

The first half of the 6.8 merge window

Posted Jan 12, 2024 17:03 UTC (Fri) by adobriyan (subscriber, #30858)

An allnoconfig build takes about 20-30 seconds on many-core desktop machines.

The first half of the 6.8 merge window

Posted Jan 12, 2024 18:40 UTC (Fri) by NightMonkey (subscriber, #23051)

Ah, 'allnoconfig'. Thank you.

The first half of the 6.8 merge window

Posted Jan 13, 2024 0:21 UTC (Sat) by PhilippeRoussel (subscriber, #23227)

Running make in a tree when there is nothing to build because nothing has changed since the last build?


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds