
linux-next and patch management process

By Jonathan Corbet
February 13, 2008
The kernel development process operates at a furious pace, merging on the order of 10,000 changesets over the course of a 2-3 month release cycle. There have been many changes over the last few years which have helped to make this level of patch flow possible, and the process has been optimized significantly. An ongoing discussion on the kernel mailing list has made it clear, though, that a truly optimal solution has not yet been found.

It started with the announcement of the linux-next tree. This tree, to be maintained by Stephen Rothwell, is intended to be a gathering point for the patches which are planned to be merged in the next development cycle. So, since we are currently in the 2.6.25 cycle, linux-next will accumulate patches for 2.6.26. The idea is to solve the patch integration issues there and reduce the demands on Andrew Morton's time.

The question which was immediately raised was this: how do we deal with big API changes which require changes in multiple subsystems? These changes are already problematic, often requiring maintainers to rework their trees in the middle of the merge window. Trying to integrate such changes earlier, in a separate tree, could bring a new set of problems. There will be a lot of conflicts between patches done before and after the API change, and somebody is going to have to put the pieces back together again. Andrew does some of that now, but the problem is big enough that not even Andrew can solve it all the time. The bidirectional SCSI patches merged for 2.6.25 were held up as an example; that change required coordinated SCSI and block layer patches, and it never was possible to get the whole thing working in -mm.

Arjan van de Ven asserted that the only way to make large API changes work is to merge them first, at the beginning of the merge window. The merged patch would fix all in-tree users of the changed API, as is the usual rule. Maintainers of all other trees could then merge with the updated mainline, fixing any new code which might be affected by the API change. This is, essentially, the approach which was taken for the big device model changes in 2.6.25; they hit the mainline at the beginning of the merge window, then everybody else got to adapt to the new way of doing things.

Greg Kroah-Hartman worries that this approach is not sufficient, especially when live trees are being merged. If an API change in one tree forces a change to a separate tree, the coordination issues just get hard. Keeping the secondary changes in the primary tree risks conflicts with patches in the proper subsystem tree. Patches which reach across trees are also, increasingly, being discouraged as making life harder for everybody. But the fixup patch will not apply to its nominal subsystem tree as long as the API change itself is not there. In the -mm tree, this sort of problem is glued together by a series of fixup patches maintained by Andrew; Greg says that the linux-next tree would need something similar.

David Miller's suggestion was to resolve this sort of conflict through frequent rebasing of the -next tree. Rebasing is an operation (supported by git and other code management tools) which takes a set of patches against one tree and does what's required to make them apply to a different version of the tree. It can be quite useful for maintaining patches against a moving target - which kernel trees tend to be. David talked about how he rebases his (networking subsystem) trees frequently as a way of eliminating conflicts with the mainline and, in the process, cleaning some cruft out of the development history.

It turns out, though, that this frequent rebasing is not popular with the developers who are downstream of David. Rebasing the tree forces all downstream contributors to do the same thing, and to deal with any merge conflicts that result. It makes it much harder to prepare trees which can be pulled upstream and creates extra work.

This was where Linus jumped into the conversation and expressed his dislike of rebasing. He echoed the complaints from downstream developers that a constantly-rebased tree is hard to prepare patches against. It also confuses the development history, making changes to other developers' patches in silent ways. After somebody's patch set has been rebased, it is no longer the patches that were sent. So, says Linus:

So there's a real reason why we strive to *not* rewrite history. Rewriting history silently turns tested code into totally untested code, with absolutely no indication left to say that it now is untested.

It is about here that Andrew Morton commented that git does not appear to be matching entirely well with the way that kernel developers work. Some of the solution may be found in tools more oriented toward the management of patch queues - such as quilt. There may be a renewed push to get more quilt-like functionality built into git (along the lines of the stacked git project) in the near future.

Linus is also not entirely pleased with how the integration of patches only happens in the mainline:

I'm also a bit unhappy about the fact you think all merging has to go through my tree and has to be visible during the two-week merge period. Quite frankly, I think that you guys could - and should - just try to sort API changes out more actively against each other, and if you can't, then that's a problem too.

His suggestion is that a separate git tree should be created to contain a large API change - and nothing else. Affected subsystem maintainers could then merge that tree and develop against the result. In the end, all of the pieces should merge nicely in the mainline.

This approach raises a number of interesting issues. The API-change tree has to be agreed upon by everybody, and it must be quite stable - lots of changes at that level will create downstream trouble. There must also be a high degree of confidence that this API-change tree will, in fact, get merged into the mainline; should Linus balk, everybody else's trees will no longer be applicable to the mainline. Replacing the current "tree of trees" patch flow with something messier could create a number of coordination issues. And there are fears that a mainline tree built from this process would fail to build in many of its intermediate states, which would make tools like "git bisect" much harder to use. Even so, it could be part of the long-term solution.

Linus also took the opportunity to complain about large-scale API changes in general:

Really. I do agree that we need to fix up bad designs, but I disagree violently with the notion that this should be seen as some ongoing thing. The API churn should absolutely *not* be seen as a constant pain, and if it is (and it clearly is) then I think the people involved should start off not by asking "how can we synchronize", but looking a bit deeper and saying "what are we doing wrong?"

He also stated that the costs of big API changes are high enough that we should, more often, stay with older interfaces, even if they are not as good as they could be. Others disagreed, claiming that Linux must continue to evolve if it is to stay alive and relevant.

The rate of change seems unlikely to fall in the near future. There may be some changes to how big changes are done, though. As suggested by Ted Ts'o, more changes could be done by creating entirely new interfaces rather than breaking old ones. With Ted's scheme, the old interface would be marked "deprecated" at the beginning of the merge window. Developers would then have the entire development cycle to adjust to the change, and the deprecated interface would be removed before the final release.
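As a rough illustration of the deprecation scheme described above, here is a minimal C sketch. Every name in it (widget_name(), widget_name_sized(), OLD_WIDGET_BUF_LEN) is hypothetical and does not correspond to any real kernel interface; the only real feature used is the GCC/Clang "deprecated" function attribute, which makes each remaining caller of the old function produce a compile-time warning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the buffer size the old interface silently assumed. */
#define OLD_WIDGET_BUF_LEN 16

/*
 * New interface (hypothetical): the caller states the buffer size.
 * Copies as much of the name as fits, always NUL-terminates, and
 * returns the number of characters written (excluding the NUL).
 */
size_t widget_name_sized(char *buf, size_t len)
{
    static const char name[] = "widget0";
    size_t n = sizeof name - 1;   /* name length without the NUL */

    assert(buf != NULL && len > 0);
    if (n > len - 1)
        n = len - 1;              /* truncate to fit the buffer */
    memcpy(buf, name, n);
    buf[n] = '\0';
    return n;
}

/*
 * Old interface: kept for the transition period, but marked so that
 * every caller still using it gets a warning from the compiler.
 */
size_t widget_name(char *buf) __attribute__((deprecated));

/* The old entry point becomes a thin wrapper around the new one. */
size_t widget_name(char *buf)
{
    return widget_name_sized(buf, OLD_WIDGET_BUF_LEN);
}
```

Code that still calls widget_name() continues to build and run, so nothing breaks on the day the new interface lands; the warnings simply tell maintainers what has to be converted before the old function is deleted.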

There is resistance to this approach, based on the observation that getting rid of deprecated interfaces tends to be harder than one would expect. But, still, it is a relatively painless way of making changes. The current transition (in the memory management area) from the nopage() VMA operation to fault() is an example of how it can work. Nick Piggin has been slowly changing in-tree users with the eventual goal of removing nopage() altogether. For now, though, both interfaces coexist in the tree and nothing has been broken.
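One way that kind of coexistence can be arranged is for the core code to prefer the new callback and fall back to the old one when a driver has not yet been converted. The following C sketch is a toy model only: the toy_* structures and signatures are simplified stand-ins invented for illustration and are not the kernel's real VMA operations or its actual fault-handling code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a page; not the kernel's struct page. */
struct toy_page { int id; };

/* Toy operations table carrying both the old and the new callback. */
struct toy_vm_ops {
    /* New style: returns 0 on success and stores the page in *out. */
    int (*fault)(unsigned long address, struct toy_page **out);
    /* Old style: returns the page, or NULL on failure. */
    struct toy_page *(*nopage)(unsigned long address);
};

/* Core dispatch: use the new callback if present, else the old one. */
int toy_handle_fault(const struct toy_vm_ops *ops, unsigned long address,
                     struct toy_page **out)
{
    assert(ops != NULL && out != NULL);
    if (ops->fault)
        return ops->fault(address, out);
    if (ops->nopage) {
        *out = ops->nopage(address);
        return *out ? 0 : -1;
    }
    *out = NULL;
    return -1;
}

static struct toy_page old_page = { 1 };
static struct toy_page new_page = { 2 };

/* A driver that has not been converted yet. */
static struct toy_page *legacy_nopage(unsigned long address)
{
    (void)address;
    return &old_page;
}

/* A driver that has already moved to the new callback. */
static int converted_fault(unsigned long address, struct toy_page **out)
{
    (void)address;
    *out = &new_page;
    return 0;
}

const struct toy_vm_ops legacy_ops    = { .fault = NULL, .nopage = legacy_nopage };
const struct toy_vm_ops converted_ops = { .fault = converted_fault, .nopage = NULL };
```

Both legacy_ops and converted_ops work through the same toy_handle_fault() entry point, so users can be converted one at a time; once no table sets the old callback, the fallback branch and the old member can be deleted together.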

Like the kernel itself, its development process is undergoing constant change and (hopefully) improvement. As the development community and the rate of change continue to grow, the process will have to adjust accordingly. What changes come out of this discussion remains to be seen. But it's worth noting that Andrew Morton fears that the biggest problem - regressions and bugs - will be relatively unaffected.



Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds