Linux for little systems

[Posted December 17, 2003 by corbet]

Matt Mackall has picked up a new project: making the 2.6 kernel work on very small systems. This is, he says, "an area Linux mainstream has been moving away from since Linus got a real job." To this end, he has released a tree called 2.6.0-test11-tiny which incorporates a large set of patches aimed at slimming down the kernel. It's worth a look as an expression of just what needs to be done if you want to run Linux on small systems.

So what's required? The -tiny patch includes, among others, the following:

Building the kernel with the -Os compiler option, which instructs gcc to optimize for size. This option results in a smaller kernel; interestingly, there have also been reports that -Os yields better performance on large systems as well, since the resulting executable has better cache behavior.
The 4k kernel stack patch cuts the runtime per-process memory use significantly.
Various patches shrink the size of internal data structures to their minimum values. Target structures include the block and char device names hash tables, the maximum number of swapfiles, the maximum number of processes, the futex hash table, CRC lookup tables, and many others.
For truly daring users, the -tiny kernel has an option to remove printk() from the kernel entirely, along with its associated buffers and most of the strings passed to printk(). The space savings will be considerable; you just have to hope that the kernel has nothing important to tell you. Strings for BUG() and panic() calls can also be removed.
Various subsystems which are not normally optional become so. With the -tiny kernel, it is possible to configure out sysfs (which can take a lot of run-time memory), asynchronous I/O, /proc/kcore, ethtool support, core dump support, etc.
Inline functions are heavily used in the kernel; they can improve performance, and, in some situations, the use of inline code is mandatory. Excessive use of inline functions can bloat the size of the kernel considerably, however. The -tiny kernel includes a patch which makes the compiler complain about the use of inline functions, allowing a size-conscious developer to find which ones are invoked most often.
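
As a rough illustration of how an option to remove printk() can make both the calls and their strings vanish, here is a minimal user-space sketch in C. It is not taken from the -tiny patch set: the TINY_NO_PRINTK switch, the tiny_printk() macro, and probe_device() are all names invented for this example, and printf() stands in for the kernel's printk().

```c
#include <stdio.h>

/*
 * Hypothetical switch standing in for a kernel configuration option.
 * The name TINY_NO_PRINTK is invented for this sketch; define it
 * (for example with -DTINY_NO_PRINTK) to compile the messages out.
 */
#ifdef TINY_NO_PRINTK
/*
 * The arguments never appear in the expansion, so the format strings
 * are not emitted into the object file and the calls generate no code.
 */
#define tiny_printk(...) ((void)0)
#else
/* Messages enabled: forward everything to printf(). */
#define tiny_printk(...) printf(__VA_ARGS__)
#endif

/* Returns 0 on success and -1 on a bad id; logs only when messages are enabled. */
static int probe_device(int id)
{
    tiny_printk("probe: looking for device %d\n", id);
    if (id < 0) {
        tiny_printk("probe: invalid device id %d\n", id);
        return -1;
    }
    tiny_printk("probe: device %d ready\n", id);
    return 0;
}
```

Built with -DTINY_NO_PRINTK, every tiny_printk() call expands to nothing, so the compiler never sees the format strings and the function's behavior is otherwise unchanged. One consequence of this approach is that the arguments are not evaluated either, so a call must not depend on side effects in its arguments.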

There are almost 80 separate patches in all. Matt claims that his kernel, when configured with a full networking stack, fits "comfortably" on a 4MB box, which is, indeed, considered small these days. Matt has some ambitious future plans, including cutting functionality out of the console subsystem and (an idea that is sure to raise some eyebrows) making parts of the kernel pageable. It remains to be seen whether things will get that far, but there is no doubt that making Linux work on small systems is a worthy goal.

Posted Dec 18, 2003 2:34 UTC (Thu) by flewellyn (subscriber, #5047)

>Building the kernel with the -Os compiler option, which instructs gcc to optimize for size. This option results in a smaller kernel; interestingly, there have also been reports that -Os yields better performance on large systems as well, since the resulting executable has better cache behavior.

Actually, this makes some sense. With the wide disparity between modern CPU speeds (blazingly fast) and memory bus speeds (rather slow), anything which helps improve cache locality is going to improve performance greatly. It may even outweigh the improvements from "speed" optimizations such as inlining, loop unrolling, etc. Some benchmarks in this area alone would be interesting.

Posted Dec 18, 2003 9:35 UTC (Thu) by gnb (subscriber, #5132)

>Some benchmarks in this area alone would be interesting.
Yes, provided they were for a system very like the one you cared about. The trouble
is there is no one right answer for a kernel expected to run on everything from
an ARM with 8k + 8k of L1 cache and nothing else to a Xeon that can probably
get the whole kernel into L2.

Posted Dec 18, 2003 9:49 UTC (Thu) by phip (guest, #1715)

Or a PA-RISC that has 1.5M + 750K L1 (D & I respectively) cache and nothing else...

Posted Dec 30, 2003 21:30 UTC (Tue) by joern_engel (guest, #4663)

Actually, cache size shouldn't matter too much, as it is de facto zero anyway.

Unless things go horribly wrong, most CPU time is spent in userspace, not in the kernel. Userspace will therefore flush most kernel instructions out of the cache before the next switch to kernel space, so as far as the kernel is concerned, the cache is always cold.

With a cold cache, smaller code is also faster code. Your bottleneck is the memory bus.
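
To put rough numbers on the cold-cache argument, here is a back-of-the-envelope sketch in C. It assumes a deliberately crude model: every cache line of executed code costs one full miss, with no prefetching and no overlap. The function names and the figures used below (64-byte lines, a 200-cycle miss penalty, a 4096-byte versus a 3072-byte code path) are assumptions for illustration, not measurements.

```c
#include <stddef.h>

/*
 * Number of cache lines occupied by code_bytes of code, assuming the
 * code starts on a line boundary (rounds up to whole lines).
 */
static size_t lines_touched(size_t code_bytes, size_t line_bytes)
{
    return (code_bytes + line_bytes - 1) / line_bytes;
}

/*
 * Cycles spent fetching code on a completely cold cache under the
 * crude model above: one miss, at a fixed penalty, per line touched.
 */
static unsigned long cold_fetch_cycles(size_t code_bytes,
                                       size_t line_bytes,
                                       unsigned long miss_penalty_cycles)
{
    return (unsigned long)lines_touched(code_bytes, line_bytes)
           * miss_penalty_cycles;
}
```

Under these assumed figures, cold_fetch_cycles(4096, 64, 200) comes to 12800 cycles and cold_fetch_cycles(3072, 64, 200) to 9600, so a code path that is 25% smaller spends 25% less time waiting on the memory bus. Real hardware will differ, which is why benchmarks on the target system still matter.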