The 3.11 merge window closes
![[New logo]](https://static.lwn.net/images/2013/3.11-logo.png)
Of those 9,494 changes, 1,219 were pulled since last week's summary. User-visible changes in that final batch of patches include:
- The new O_TMPFILE ABI has changed slightly in response to concerns expressed by Linus. In short,
open() ignores unknown flags, so software using
O_TMPFILE on older kernels has no way of knowing that it is
not, in fact, getting the expected temporary file semantics.
Following a suggestion from Rasmus
Villemoes, Al Viro changed the user-space view of O_TMPFILE
to include the O_DIRECTORY and O_RDWR bits — a
combination that always results in an error on previous kernels. So
applications should always get an error if they attempt to use
O_TMPFILE on a kernel that does not support that option.
- The zswap compressed swap cache has
been merged into the mainline. The changes to make the memory
allocation layer modular, called for
at this year's Storage, Filesystem, and Memory Management Summit,
appear not to have been made, though.
- The "blk-throttle" I/O bandwidth controller now properly supports
control group hierarchies — but only if the non-default
"sane_behavior" flag is set.
- The "dm-switch" device mapper target maps I/O requests to a set of
underlying devices. It is intended for situations where the mapping
is more complicated than can be expressed with a simple target like
"stripe"; see Documentation/device-mapper/switch.txt
for more information.
- New hardware support includes:
- Systems and processors:
ARM System I/O memory management units (hopefully pointing to an
era where ARM processors ship with a standard IOMMU) and
Broadcom BCM3368 Cable Modem SoCs.
- InfiniBand:
Mellanox Connect-IB PCI Express host channel adapters.
- Miscellaneous:
Intel's "Rapid Start Technology" suspend-to-disk mechanism and
Intel x86 package thermal sensors (see Documentation/thermal/x86_pkg_temperature_thermal
for more information).
- Video4Linux:
OKI Semiconductor ML86V7667 video decoders,
Texas Instruments THS8200 video encoders, and
Fushicai USBTV007-based video capture devices.
- Watchdog:
Broadcom BCM2835 hardware watchdogs and
MEN A21 VME CPU carrier board watchdog timers.
- Staging graduations: TI OMAP thermal management subsystems.
Changes visible to kernel developers include:
- Module loading behavior has been changed slightly in that the
load will no longer fail in the presence of unknown module
parameters. Instead, such parameters will be ignored after the
issuing of a log message. This change allows system configurations to
continue working after a module parameter is removed or when an older
kernel is booted.
- The MIPS architecture now supports building with -fstack-protector buffer overflow detection.
Recent development cycles have lasted for about 70 days (though 3.10, at 63
days, was significantly shorter). If that pattern holds for this cycle,
the 3.11 kernel can be expected around September 9.
Index entries for this article | |
---|---|
Kernel | Releases/3.11 |
Posted Jul 16, 2013 18:03 UTC (Tue)
by yokem_55 (guest, #10498)
[Link] (3 responses)
Posted Jul 16, 2013 19:48 UTC (Tue)
by smoogen (subscriber, #97)
[Link] (2 responses)
Posted Jul 16, 2013 20:10 UTC (Tue)
by djc (subscriber, #56880)
[Link]
Posted Jul 18, 2013 15:31 UTC (Thu)
by mgross (guest, #38112)
[Link]
Posted Jul 16, 2013 18:44 UTC (Tue)
by geofft (subscriber, #59789)
[Link] (18 responses)
Posted Jul 17, 2013 3:10 UTC (Wed)
by proski (subscriber, #104)
[Link] (17 responses)
Posted Jul 17, 2013 3:42 UTC (Wed)
by geofft (subscriber, #59789)
[Link]
The additional feature here is that there's no effort spent picking a unique name, just to stop using that name.
Posted Jul 17, 2013 4:52 UTC (Wed)
by viro (subscriber, #7872)
[Link] (15 responses)
So there are two uses - one is race-free temp files (deleted when closed, never reachable from any directory, not subject to symlink attacks, not requiring to come up with unique names, etc. - basically, tmpfile(3) done right) and another is "create an initially unreachable file, write whatever you want into it, fchmod()/fchown()/fsetxattr() it as you wish, then atomically link it in, already fully set up".
Posted Jul 17, 2013 6:15 UTC (Wed)
by epa (subscriber, #39769)
[Link] (14 responses)
The piece that's missing is an atomic link-and-unlink operation where you link a file into a directory with a given name, at the same time unlinking any file that was previously there with that name (or even renaming the existing file atomically).
Posted Jul 17, 2013 6:16 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Jul 17, 2013 9:27 UTC (Wed)
by Karellen (subscriber, #67644)
[Link] (12 responses)
You can do that already by creating a file with the "wrong" name (e.g. "config.cfg.new") and calling rename(2) when you're finished. e.g.
rename("config.cfg.new", "config.cfg");
"instead open with O_TMPFILE and atomically link it with a certain name once you're finished."
Unfortunately, there currently is no way to create a new link to a file for which you only have a file handle.
There are some cases in which it wouldn't make sense, such as trying to link a socket fd into a filesystem, or creating a link in a mountpoint where the file does not reside, but a hypothetical fdlink() could return EXDEV as per rename() in that case. In any case, no-one's implemented it yet.
"The piece that's missing is an atomic link-and-unlink operation where you link a file into a directory with a given name, at the same time unlinking any file that was previously there with that name"
rename(2) already does this.
"(or even renaming the existing file atomically)."
You can do this with link(2), by e.g.
link("config.cfg", "config.cfg.old");
Posted Jul 17, 2013 10:12 UTC (Wed)
by epa (subscriber, #39769)
[Link] (2 responses)
I believe that rename() is not fully atomic.
However, it is atomic in the looser sense that the filename at any moment links to either the old file or the new one. That may be good enough for many applications.
Posted Jul 17, 2013 10:40 UTC (Wed)
by Karellen (subscriber, #67644)
[Link]
Doh! That'll teach me to skim some messages.
Wow. That's incredibly neat. Hadn't thought of/seen that before. Thanks.
Posted Jul 17, 2013 17:22 UTC (Wed)
by dlang (guest, #313)
[Link]
as long as the target path always exists, and always points at either the old or the new, you should be in good shape.
now, to be crash safe, you need to fsync the file before doing the rename, and you need to not be using ext3 which has such horrid fsync behavior.
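The write, fsync, rename sequence discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: `replace_file()` and its parameters are hypothetical names, and the sketch makes no attempt to flush the containing directory.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the discussion): replace `path` with new
 * contents by writing them to `tmp_path` first and renaming that file into
 * place. Returns 0 on success, -1 on failure with errno set. */
int replace_file(const char *path, const char *tmp_path,
                 const void *data, size_t len)
{
    const char *p = data;
    int saved;
    int fd;

    assert(path != NULL && tmp_path != NULL);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until all data is out. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Flush the data before the rename so that, after a crash, the name
     * does not end up pointing at an empty or partial file. */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        fd = -1;
        goto fail;
    }

    /* From here on the target name refers to the old or the new file. */
    if (rename(tmp_path, path) == 0)
        return 0;
    fd = -1;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp_path);   /* do not leave a half-written temporary behind */
    errno = saved;
    return -1;
}
```

A caller would use it as `replace_file("config.cfg", "config.cfg.new", buf, len)`, with both names on the same filesystem so that rename() does not fail with EXDEV.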
Posted Jul 17, 2013 10:43 UTC (Wed)
by mjg59 (subscriber, #23239)
[Link] (8 responses)
Posted Jul 17, 2013 23:35 UTC (Wed)
by Karellen (subscriber, #67644)
[Link] (2 responses)
I've seen that argument before, but it's always confused me. Surely that's only wanted as protection against an unexpected system crash/failure? Except - I didn't think that POSIX made any guarantees at all in that event. I thought your OS was "allowed" to overwrite your partition tables and FS journals completely in the event of a crash and still be POSIX-compliant.
(If not, how does POSIX expect to guarantee otherwise, unless POSIX compliance requires the absence of certain classes of bugs?)
Looking at the rationale section of POSIX fsync[0] documentation, fsync() is allowed to be the null operation, or to not cause data to actually be written, and that fsync() correctness could be considered a QoI issue.
However, the Open Group website documentation is the closest thing I have to the actual POSIX spec. If there is another section somewhere dealing with the general problem of compliance in the face of bugs/power outages which is more enlightening, I would welcome a link to it, or a quote from it.
(FWIW, I think that Linux writing metadata before data is a poor QoI decision, and that the filesystem devs should strive to do otherwise, no matter what POSIX allows. However, IANA Kernel/FS developer, and am not properly informed on how hard, impractical or pessimal that might be.)
[0] http://pubs.opengroup.org/onlinepubs/009695399/functions/...
Posted Jul 18, 2013 0:25 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link]
Posted Jul 20, 2013 17:56 UTC (Sat)
by giraffedata (guest, #1954)
[Link]
The thing about adherence to any standard is that one specifies the very adherence with myriad conditions, most of them implied. So POSIX doesn't say, "if the system crashes, a read doesn't have to get back the same data that was written." Rather, the system designer says, "the system is POSIX-compliant as long as the system never crashes." And as I said, that condition is usually not actually spoken. There are tons of similar conditions: the superuser does not write directly to the disk; the disk drive never makes a mistake; cosmic rays don't change magnetic state; etc.
Of course, designers do whatever they can to reduce the conditions; few systems today are offered on a "if the power ever goes out, nothing in the POSIX standard applies" basis.
Fsync drives us into the awkward territory of robustness. Robustness is a system's ability to work when it is broken. That contradiction in terms is why any specification of fsync is bound to be fuzzy. It's like saying, "I will pay you back by Tuesday. If I don't, ..."
Posted Jul 18, 2013 14:06 UTC (Thu)
by Tobu (subscriber, #24111)
[Link] (4 responses)
The unavailability of good (O_PONIES) semantics continues to amaze me.
The only option right now seems to be a combination of f(data)sync and deferred threads; but introducing threads has a nasty engineering cost.
The last I've seen of these issues (on the XFS list), maintainers were willing to take a new flag (don't know if that's possible; the VFS seems misdesigned to ignore new flags, see O_TMPFILE above) or a new VFS syscall that might be progressively implemented.
Posted Jul 18, 2013 14:23 UTC (Thu)
by viro (subscriber, #7872)
[Link] (3 responses)
See the talk by Michael Kerrisk re ABI suckitude a while ago - this is a prime example of such ;-/
Posted Jul 18, 2013 15:18 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Just create a new syscall, say open2(), with a better-designed ABI. Old programs can still use open() and new ones can use the new syscall to get new features.
Posted Jul 18, 2013 19:58 UTC (Thu)
by paulj (subscriber, #341)
[Link]
Posted Jul 21, 2013 22:05 UTC (Sun)
by nix (subscriber, #2304)
[Link]
No, what you do if you really think programs will care is introduce a new open2(), wire it to a new version of open() in glibc, change the values of all the O_* constants in glibc (but *not* the kernel) to some new value range that doesn't intersect the old, and have glibc redirect all calls using any old flag values to the old open() and all new ones to open2(), mapping the 'new' flag values in the userspace API to the kernel values (probably by subtracting a constant). You can also expose the old flags under new names, OBS_EXCL and the like. That way, old apps get the old syscall, new ones get the new syscall, and new apps that really, really want the old semantics can get them.
If you thought it mattered that much, and really needed to do it, that's how you'd do it. No uglifying programs with horrible open2() nonsense. (Yes, you need a new glibc version to use this, but you need a new glibc to use any new syscall *anyway*.)
Posted Jul 16, 2013 22:43 UTC (Tue)
by roskegg (subscriber, #105)
[Link] (5 responses)
Was this patch series to fix swap problems included in the 3.11 merge?
Posted Jul 16, 2013 22:48 UTC (Tue)
by corbet (editor, #1)
[Link] (4 responses)
Posted Jul 17, 2013 4:57 UTC (Wed)
by roskegg (subscriber, #105)
[Link] (3 responses)
Posted Jul 17, 2013 17:04 UTC (Wed)
by jimparis (guest, #38647)
[Link] (2 responses)
Given a git commit ID and a local up-to-date clone, you can figure
that out with:
Posted Jul 17, 2013 17:58 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Posted Jul 17, 2013 18:27 UTC (Wed)
by jimparis (guest, #38647)
[Link]
I sent a mail to the kernel.org webmaster about a month ago, because they said "If you notice that something that used to work with gitweb no longer works for you with cgit, please drop us a note", but haven't received a reply. I haven't sent anything to cgit upstream though.
Posted Jul 17, 2013 3:05 UTC (Wed)
by timrichardson (subscriber, #72836)
[Link] (9 responses)
Posted Jul 17, 2013 8:00 UTC (Wed)
by rvfh (guest, #31018)
[Link]
Posted Jul 17, 2013 8:16 UTC (Wed)
by dsommers (subscriber, #55274)
[Link] (1 responses)
Posted Jul 17, 2013 9:04 UTC (Wed)
by dgm (subscriber, #49227)
[Link]
Posted Jul 17, 2013 16:18 UTC (Wed)
by drag (guest, #31333)
[Link] (3 responses)
I also expect that most people here already know this, but Microsoft's first operating systems were Unix variants, not DOS.
Posted Jul 17, 2013 19:14 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
And by the time 3.11 was released, NT was almost done. And the NT group had been dogfooding their own builds for more than a year by then.
Posted Jul 17, 2013 22:59 UTC (Wed)
by man_ls (guest, #15091)
[Link] (1 responses)
Posted Jul 18, 2013 13:33 UTC (Thu)
by anselm (subscriber, #2796)
[Link]
When Microsoft got started, the BASIC interpreter on many computers doubled up as the operating system since it was the (ROM-based) program that was booted when the computer was switched on and supported commands for operating-system-like tasks. Since Microsoft was behind most BASIC interpreters of late-70s/early-80s vintage computers, that would make those Microsoft's first »operating systems«.
{MS,PC}-DOS, when it was new, was basically a CP/M-80 knockoff for the 8088 CPU. It would probably have been reasonable at the time to do DOS development on CP/M, at least to the degree necessary to get a kernel running on the 8088 machine. It was possible to translate Z80 assembly language to 8088 assembly language automatically, so once you had a call-compatible CP/M-like kernel around you could probably get most of your other stuff across fairly quickly.
Posted Jul 25, 2013 13:37 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (1 responses)
WfW 3.11 was the first version to contain a TCP/IP stack, and the TCP/IP update disk for Windows 3.1 was released at the same time. It effectively upgraded 3.1 to 3.11.
I was there - the network stack (was designed to?) break WordPerfect 6, and I remember the grief it gave us, forcing us to choose between networking or word processing. That was a major factor in our office shifting from WordPerfect to Word.
Cheers,
Posted Jul 25, 2013 20:19 UTC (Thu)
by BenHutchings (subscriber, #37955)
[Link]
Posted Jul 18, 2013 14:30 UTC (Thu)
by Tobu (subscriber, #24111)
[Link] (2 responses)
Posted Jul 19, 2013 6:00 UTC (Fri)
by rusty (guest, #26)
[Link] (1 responses)
You were never supposed to remove module parameters, but that rule has proven difficult to enforce. This was chosen as a lesser evil, but if I'm wrong I will revert it before release.
Cheers,
Posted Jul 19, 2013 16:37 UTC (Fri)
by pflugstad (subscriber, #224)
[Link]
<moduleName>.blacklist=yes
on the kernel command line. I'm guessing this works because of the old behavior (fail module load due to unknown param). So this change would break that? Or are actual "blacklist" parameters handled differently?
[1]: http://askubuntu.com/questions/110341/how-to-blacklist-ke...
Posted Jul 25, 2013 0:46 UTC (Thu)
by heijo (guest, #88363)
[Link] (1 responses)
It seems to me adding any such flags breaks compatibility, and that adding a new "openat2" syscall is needed due to the preexisting mistake of not returning an error on invalid flags.
Posted Jul 25, 2013 20:27 UTC (Thu)
by BenHutchings (subscriber, #37955)
[Link]
I missed the definition of O_TMPFILE. The best explanation seems to be the previous LWN article:
The new O_TMPFILE option to the open() and openat() system calls allows filesystems to optimize the creation of temporary files — files which need not be visible in the filesystem. When O_TMPFILE is present, the provided pathname is only used to locate the containing directory (and thus the filesystem where the temporary file should be). So, among other things, programs using O_TMPFILE should have fewer concerns about vulnerabilities resulting from symbolic link attacks.
Pretty useful.
I hope the files would still be visible in the filesystem. Otherwise we would get DOS-style "hidden" and "system" files.
_Without_ O_EXCL you are able to create a link to them later with something like linkat(AT_FDCWD, "/proc/self/fd/42", AT_FDCWD, pathname, AT_SYMLINK_FOLLOW) (if 42 is a descriptor of normal opened-and-unlinked file, you'd get -ENOENT from that, and so you would with O_TMPFILE|O_EXCL opened descriptors).
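A rough sketch of that second use (create unnamed, fill in, then link into place) might look like the following. The helper name `publish_file()`, its parameters, and the fallback definition of O_TMPFILE are assumptions made for illustration; the sketch also assumes /proc is mounted and that the filesystem supports O_TMPFILE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifndef O_TMPFILE
/* Assumption: fallback for C libraries that predate the flag; this is the
 * value used on x86 and ARM. */
#define O_TMPFILE (020000000 | O_DIRECTORY)
#endif

/* Hypothetical helper: create an unnamed file in `dir`, fill it, and only
 * then give it the name `final_path`, which must be on the same filesystem
 * and must not exist yet. Returns 0 on success, -1 with errno set. */
int publish_file(const char *dir, const char *final_path,
                 const void *data, size_t len)
{
    const char *p = data;
    char proc_path[64];
    int saved;
    int fd;

    assert(dir != NULL && final_path != NULL);

    /* No O_EXCL here: with O_EXCL the file could not be linked in later. */
    fd = open(dir, O_TMPFILE | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* fchmod(), fchown() and similar calls could be applied to fd here,
     * while the file is still unreachable from any directory. */

    /* Link the still-unnamed file into place through its /proc entry. */
    snprintf(proc_path, sizeof proc_path, "/proc/self/fd/%d", fd);
    if (linkat(AT_FDCWD, proc_path, AT_FDCWD, final_path,
               AT_SYMLINK_FOLLOW) < 0)
        goto fail;

    close(fd);
    return 0;

fail:
    saved = errno;
    close(fd);          /* the unnamed file disappears with its descriptor */
    errno = saved;
    return -1;
}
```

Note that linkat() refuses to overwrite an existing name (EEXIST), so this covers creating a fully set-up file but not replacing one.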
rename("config.cfg.new", "config.cfg");
Unfortunately, there currently is no way to create a new link to a file for which you only have a file handle.
I thought the linkat() trick described by Al Viro in the grandparent comment would achieve that.
rename() is not fully atomic. As the manual page says, "However, when overwriting there will probably be a window in which both oldpath and newpath refer to the file being renamed." It's also not atomic over NFS (though enhancements to the NFS protocol may be out of scope for Linux kernel discussions).
The meaning of fsync
I thought your OS was "allowed" to overwrite your partition tables and FS journals completely in the event of a crash and still be POSIX-compliant.
Was this patch series included?
I'm not sure I understand the question...you gave a link to Linus's repository, so you're aware that the patch is in the mainline. Yes, it was pulled in during the 3.11 merge window, if that's the question.
$ git describe --contains 75485363ce8552698bfb9970d901f755d5713cca
v3.11-rc1~99^2~392
which tells you that the first tag containing this commit was
v3.11-rc1, so it made the window. It used to be that you could get this information from kernel.org's "raw"
gitweb output, saving the need for a local clone, but they recently switched to cgit which doesn't seem to provide that info.
First Microsoft OS
I also expect that most people here already know this, but Microsoft's first operating systems were Unix variants, not DOS.
I found a detailed article which does not agree: apparently the first Xenix release was late in 1982, while DOS was released in August 1981. It is true that Microsoft had been working on Xenix since before they started with DOS, though. So it depends on how you define "first operating systems".
Wol
Module loading behavior has been changed slightly in that the load will no longer fail in the presence of unknown module parameters. Instead, such parameters will be ignored after the issuing of a log message. This change allows system configurations to continue working after a module parameter is removed or when an older kernel is booted.
That's short-sighted; it trades easy-to-diagnose failures for hard-to-diagnose failures. Udev rules won't be able to unambiguously require module features anymore. Surely the fact that previously working functionality now requires a human to read a logfile would be a red flag.
Rusty.
module changes
O_TMPFILE now includes multiple bits that were already defined and could not be used together, plus one new bit (__O_TMPFILE). All of those bits must be set to create an unnamed temporary file. So, new userland that uses O_TMPFILE will fail cleanly on old kernels, and old userland that sets __O_TMPFILE will run unchanged on new kernels.
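What "fail cleanly" could look like from user space is sketched below. The function names are hypothetical, the fallback flag value is an assumption for older C libraries, and the errno interpretation follows the open(2) manual page as I understand it, so treat it as a sketch and not a definitive test.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_TMPFILE
/* Assumption: fallback for C libraries that predate the flag; this is the
 * value used on x86 and ARM. */
#define O_TMPFILE (020000000 | O_DIRECTORY)
#endif

/* Hypothetical helper: open an unnamed temporary file in `dir`.
 * Returns a descriptor, or -1 with errno set. */
int open_unnamed(const char *dir)
{
    assert(dir != NULL);
    return open(dir, O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
}

/* Turn the errno from a failed open_unnamed() into a short explanation. */
const char *explain_tmpfile_error(int err)
{
    switch (err) {
    case EISDIR:
        /* An older kernel ignores the new bit and sees an attempt to open
         * a directory for writing, which always fails. */
        return "kernel does not implement O_TMPFILE";
    case EOPNOTSUPP:
        return "filesystem does not implement O_TMPFILE";
    default:
        return "unrelated failure (permissions, missing directory, ...)";
    }
}
```

On failure, a program can fall back to a traditional named temporary file (mkstemp() followed by unlink(), for example) when the explanation points at the kernel or the filesystem.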