
LFCS: The LLVMLinux project

By Jake Edge
May 7, 2013

The Linux Foundation Collaboration Summit (LFCS) has become a regular venue for updates on the status of building the kernel with Clang/LLVM; we covered those updates in both 2011 and 2012. LFCS 2013 continued the trend, as LLVMLinux project lead Behan Webster presented the status of and plans for the project. The gathering lived up to its name as well, since two problems faced by the project were solved through collaboration at the summit.

[LLVMLinux logo]

Webster is a computer engineer who has worked in a lot of different industries: automotive, telecommunications, embedded systems, etc. For 20 years or so, he has worked on Linux, using it either as a development environment or as the operating system of a shipping product. He is the project lead for the LLVMLinux project, which is an effort to get the Linux kernel to build with Clang and LLVM.

LLVM is a set of libraries that can be used to build tools, he said. Those tools could be compilers, linkers, or JITs, or they could be tools for source code analysis, metadata extraction, code refactoring, and so forth. Most of the latter kinds of tools don't directly relate to LLVMLinux, but the toolchain for building C, C++, and Objective C programs does. The toolchain starts with Clang, the frontend for C-family programs, but there are other pieces as well, including the compiler-rt library, lld and MCLinker linkers, checker static analyzer, LLDB debugger, and others.

Why?

There are a number of reasons, beyond simply having a choice of compilers, that make compiling Linux with LLVM a useful endeavor. For one thing, it compiles faster, which is important for shortening the code-compile-run-debug cycle. LLVM is a fast-moving project, Webster said, and it is "amazing how fast Clang has caught up" to GCC in some areas. It generates faster code than GCC in some areas and slower code in others.

LLVM is already being used in many different domains, like audio and video in such projects as llvmpipe, CUDA, and Renderscript. There is an advantage to having a single toolchain used in all of these different domains as compiler extensions only need to be written once and can then apply in lots of different places. LLVM also has a different license—BSD-style—which is not better or worse than others, Webster said, but it allows tool vendors to do things with LLVM that they couldn't with GCC. LLVM is also well-supported by a large group of full-time developers.

The static analyzer for LLVM is "amazing", though it does not yet work on the kernel code. This tool has traditionally done better than other, similar tools. One of the headline features is "fix-it hints", which are suggestions for fixing small, localized problems in the source; a similar feature now appears in GCC 4.8. That is one example where we are seeing "more and better" development from both projects because of the competition between them, Webster said.

The Google compiler team has built a tool around LLVM that shows a novel use of the compiler toolkit. The tool looks for common problems that appear as certain patterns in the LLVM intermediate representation and can map them back to the C++ code to show the programmer where they made an error. This shows things that you can do that we haven't seen before, he said.

Beyond Android's use of LLVM for Renderscript and Gallium3D's llvmpipe driver, there are also distributions looking at building all or parts of their repositories using LLVM. For Debian, Sylvestre Ledru has been building the repository using Clang. His most recent results show that there are more failures than before, Webster said, because new versions of Clang are more strict. Gentoo is looking at support for LLVM/Clang, as is FreeBSD.

LLVMLinux

The goal of the LLVMLinux project is to get to the point where the kernel will build with LLVM. Ultimately, that means that LLVM would have everything it needs to build the kernel, while the kernel would get any changes it requires so that it can be built with LLVM.

To that end, the project has a Git repository for its build framework that contains scripts and patches to build the kernel. The scripts fetch, patch, and build various pieces, like LLVM/Clang, toolchains for cross assembly and linking, the kernel, and QEMU.

Several cross-toolchains are supported, including CodeSourcery (which is the default), Linaro/Ubuntu, and Android. Those are necessary for the GNU assembler and linker, as LLVM's own assembler and linker are not yet mature enough to use for building the kernel. There is support in the project's tree for various targets, including Versatile Express, x86-64, Raspberry Pi, and Nexus 7. There are other targets in progress, including the Galaxy S3 and BeagleBone.

The project is using Buildbot for continuous integration. Any time there is a commit in the project, LLVM, or kernel repositories, a full build is kicked off. In addition, the Linux Test Project suite is run nightly using QEMU for the Versatile Express.

Problem areas

Getting to this point has been "challenging", Webster said, as there are a number of difficulties the project has run into. To start with, LLVM's integrated assembler (IA) can't be used on kernel code because it doesn't handle the format used by the kernel's assembly code. In addition, IA does not handle 16-bit code. Building would be faster using IA, he said, but that's just not possible right now, which results in a dependency on the GNU toolchain.

The Linux code is GCC-specific in a number of ways. GCC (and thus the kernel) conforms to the gnu89 standard, while Clang conforms to gnu99 (which is "essentially C99" with some GCC additions). Webster thinks that a future version of GCC will move to C99, which will help. Kernel developers have long said that the standard is insufficient, so the kernel code goes beyond it in various ways. But, the standards have largely caught up, he said, though there are still some "notable exceptions". It is almost as if the Linux kernel has been driving the C standards.
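
As a concrete illustration of the standards gap (a sketch, not an example from the talk), the semantics of a plain "inline" function are one place where gnu89 and C99/gnu99 differ: gnu89 also emits an out-of-line copy of the function, while C99 treats the same definition as an inline definition only, so an unoptimized build can fail to link.

    /* inline-demo.c: a minimal sketch of one gnu89-versus-C99 difference.
     *
     *   gcc -std=gnu89 -O0 inline-demo.c   -> links; gnu89 also emits an
     *                                          out-of-line copy of add_one()
     *   gcc -std=gnu99 -O0 inline-demo.c   -> typically fails to link; a C99
     *                                          "inline" definition alone
     *                                          provides no external definition,
     *                                          and at -O0 the call is not inlined
     */
    inline int add_one(int x)
    {
        return x + 1;
    }

    int main(void)
    {
        return add_one(41) == 42 ? 0 : 1;
    }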

Beyond that, there are some GCC flags that the kernel uses which are not supported by Clang. Some of the built-in functions are very different between Clang and GCC. Another problem area is Kbuild, which is also GCC-specific. In particular, unsupported flags cause different return values from GCC and Clang. In both cases, a warning is issued, but the different return code causes problems.

There are several GCC flags that are not supported by Clang, including -fconserve-stack, -fdelete-null-pointer-checks, -fno-inline-functions-called-once, and -mno-thumb-interwork. More details on these can be found in the slides [ODP] from Webster and Mark Charlebois's 2012 Linux Plumbers Conference talk.

There are a handful of GCC language extensions that will never be part of LLVM/Clang, he said, including variable-length arrays in structures (VLAIS). Zero-length arrays at the end of structures are supported, as are variable-length arrays outside of structures, but constructs like:

    struct foo {
        char a[n];
    } bar;
are not allowed by C99 or the more recent C11. The LLVM developers are not inclined to add VLAIS, as it makes the compiler harder to maintain, so LLVMLinux has to try to convince kernel developers to remove them. Currently, iptables, cryptographic hashing (HMAC), and a few other places use VLAIS and the maintainers like the code the way it is, Webster said. But the project has been running some tests on alternatives and getting the same or slightly better performance by switching away from VLAIS.
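
As an illustration of the kind of alternative being tested (a sketch, not the actual iptables or HMAC patches), a VLAIS member can usually be replaced by a C99 flexible array member that is sized at allocation time, or simply by a pointer:

    #include <stdlib.h>

    /* Sketch only: replace the disallowed "char a[n];" member with a C99
     * flexible array member and size the object when it is allocated. */
    struct foo {
        size_t len;
        char a[];               /* flexible array member */
    };

    static struct foo *foo_alloc(size_t n)
    {
        struct foo *bar = malloc(sizeof(*bar) + n);

        if (bar)
            bar->len = n;
        return bar;
    }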

A member of the audience asked if it made sense to fork Clang to add VLAIS support, but Webster downplayed that option. All of the Clang developers are of a single mind about this particular feature, he said, though they would be supportive of a fork if someone decided to take that path. In the end, it comes down to changing "dozens of lines" of kernel code versus a "significant architectural change" in the LLVM toolchain. Turning the arrays into pointers is an easy way to fix the problem, but some maintainers don't want to take changes just to support a different compiler, he said, in answer to follow-on questions.

Nested functions are another problem area, but they are used infrequently. In general, when patches come in that have nested functions, the kernel developers have pushed back and asked for a rewrite. However, the Thinkpad ACPI driver still has nested functions, though Webster sent a patch to change that. He has not yet heard back, but it is just a rearrangement of the code, without any functional change.
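
For readers unfamiliar with the extension, here is a hypothetical example (not the Thinkpad ACPI code) of a GCC nested function and the kind of purely mechanical rearrangement that removes it:

    /* GCC extension: is_over() is defined inside count_over_nested() and
     * captures 'limit' from the enclosing scope; Clang rejects this. */
    static int count_over_nested(const int *v, int n, int limit)
    {
        int is_over(int x) { return x > limit; }    /* nested function */
        int i, count = 0;

        for (i = 0; i < n; i++)
            count += is_over(v[i]);
        return count;
    }

    /* The usual rewrite: hoist the helper to file scope and pass what it
     * used to capture as an explicit parameter; no functional change. */
    static int is_over(int x, int limit)
    {
        return x > limit;
    }

    static int count_over(const int *v, int n, int limit)
    {
        int i, count = 0;

        for (i = 0; i < n; i++)
            count += is_over(v[i], limit);
        return count;
    }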

Neither of those two features is in the C standard, and neither is used very frequently in the kernel, so it would be relatively easy to do without them. The other extensions used by the kernel are "innocuous and simple", he said. For example, Clang does not support variables explicitly assigned to specific CPU registers, but he would like to see that support get added.
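
The register-variable extension he mentioned looks roughly like the following (a sketch; the register name is architecture-specific and chosen only for illustration):

    /* GCC extension: pin a global variable to a fixed CPU register for the
     * whole translation unit.  The kernel uses this sparingly (for example,
     * to hold a per-CPU or stack pointer); Clang did not support the
     * construct at the time of the talk.  "r10" is an arbitrary choice. */
    register unsigned long pinned_value asm("r10");

    unsigned long read_pinned(void)
    {
        return pinned_value;
    }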

A recently discovered bug in Clang's C preprocessor is behind a problem with section reference mismatches. The attributes being used to put __init and __exit functions into different ELF sections are being mishandled. Webster "firmly believes" that the kernel pushes the tools harder than any other code base, and this is one example where that pushing has found bugs. It is probably just a corner case that had not been tested (or exercised by other code built with Clang), resulting in various "section reference mismatch" and "merged global" messages (as well as kernel module loading problems on x86) when the kernel was linked.
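
The construct at issue is the section attribute that the kernel's __init and __exit macros expand to; here is a simplified sketch of the idea (not the actual <linux/init.h> definitions):

    /* Simplified sketch: tag functions so the linker places them in
     * dedicated ELF sections, which the kernel can discard after boot
     * (".init.text") or handle specially for module unload (".exit.text").
     * The real __init/__exit macros in <linux/init.h> add more attributes. */
    #define __my_init __attribute__((__section__(".init.text")))
    #define __my_exit __attribute__((__section__(".exit.text")))

    static int __my_init example_driver_init(void)
    {
        return 0;
    }

    static void __my_exit example_driver_exit(void)
    {
    }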

Webster then mentioned __builtin_constant_p(), which GCC makes available to test whether the argument is a constant value, but which LLVM cannot support as it cannot really make that distinction. At that point, Steven Rostedt started poking around in the kernel code, asking the occasional question. In a post-summit email, Webster said that Rostedt tracked the use of __builtin_constant_p() down to a particular commit in mm/slab.c made by Christoph Lameter, who was also present at the summit. Within 24 hours, Rostedt and Lameter worked out a replacement; Rostedt was even able to run it by Andrew Morton, who agreed in principle to the change.
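
For readers unfamiliar with the builtin, the usual pattern is a constant-versus-runtime dispatch along these lines (purely illustrative; not the mm/slab.c code that Rostedt and Lameter reworked):

    #include <stdio.h>

    static int fls_runtime(unsigned long n)
    {
        int bit = 0;

        while (n) {
            n >>= 1;
            bit++;
        }
        return bit;
    }

    /* Typical __builtin_constant_p() pattern: when the compiler can prove
     * the argument is a compile-time constant, use an expression that folds
     * away entirely; otherwise call a runtime helper.  The code must still
     * be correct when the builtin conservatively returns 0. */
    #define fls_fast(n)                                                        \
        (__builtin_constant_p(n)                                               \
             ? ((n) ? (int)(8 * sizeof(unsigned long)) - __builtin_clzl(n) : 0) \
             : fls_runtime(n))

    int main(void)
    {
        unsigned long x = 4096;

        printf("%d %d\n", fls_fast(256UL), fls_fast(x));
        return 0;
    }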

Status

For LLVM, all required patches from the project are now upstream. There are still some outstanding issues, including the section reference mismatch problem above. He believes that Clang 3.3 will mostly work out of the box for building kernels with the LLVMLinux patches.

Another outstanding issue is a segmentation fault in Clang when it is compiling arch/arm/mm/context.c. That problem seemed to be related to using atomic64 operations along with nearby inline assembly code, and Webster was having difficulty creating a simple test case. After the talk, another attendee, Konstantin Serebryany, suggested using creduce, an LLVM-based tool that automatically reduces code to a minimal subset that still reproduces a problem. By the time Webster started figuring out the tool later that day, Serebryany had attached a minimal test case to the bug report. Within another few days, it was fixed in the LLVM tree. In his email, Webster paraphrased the famous saying by noting that LFCS showed him that "the right eyes make bugs shallow".

There are still some kernel patches that need to be pushed upstream as well. Kbuild support, removing VLAIS use, handling __builtin_constant_p(), and so on, all need solutions either in the upstream kernel or elsewhere.

It is not strictly necessary, but he would like to use the LLVM integrated assembler for its speed, too. Getting the checker static analyzer to run on the kernel would be useful as well.

Anyone wanting to help will find a variety of ways to do so, Webster said. Helping to push patches upstream or to work on unsupported features is one way. In addition, LLVMLinux only works for x86 and ARM right now; adding MIPS and other architectures would be nice. As might be guessed, interested folks will find information on mailing lists, IRC, and more on the project's web page.


Index entries for this article
KernelDevelopment tools/LLVM
ConferenceCollaboration Summit/2013



Linux driving C standards

Posted May 7, 2013 17:43 UTC (Tue) by ncm (subscriber, #165) [Link]

It is almost as if the Linux kernel has been driving the C standards.

There is literal truth to this. Our own Paul McKenney was deeply involved in defining and standardizing a memory model that rigorously supports C++ multi-threading, for C++11. (Up until now all multi-threaded programs have relied on not-too-clever optimizers, the over-specified x86 memory bus, not-too-many cores, and luck.) The ISO C Standard committee adapted the C++ memory model for their own use, with compatible semantics. The Linux kernel benefits from the newly rigorous C memory model.

The x86 over-specified memory bus design makes interaction of threads on more than a few cores impractical, as the bus spends a growing fraction of its time keeping all the caches up to date, leaving less time for real work. The new C++ and C memory models relax some consistency requirements to make more parallelism practical, but that makes reasoning about shared memory in multi-threaded programs even trickier. (When core A writes to address P, when might core M see the new value there?) While the rules are now rigorous, they are also too hard for mere mortals with other interests to use correctly: clever shortcuts fail for reasons that are hard to explain. Ultimately, better languages will apply the rules implicitly. In the meantime, we must rely on higher-level abstractions and avoid reasonable-seeming shortcuts.
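
For a concrete (if minimal) picture of what the new model does specify, here is a C11 sketch, not taken from the comment, of the release/acquire answer to that question:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Minimal C11 sketch: the release store to 'ready' guarantees that the
     * earlier plain store to 'data' is visible to any thread whose acquire
     * load of 'ready' observes true. */
    static int data;
    static atomic_bool ready;

    void producer(void)
    {
        data = 42;
        atomic_store_explicit(&ready, true, memory_order_release);
    }

    int consumer(void)
    {
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                   /* spin until the flag is published */
        return data;            /* guaranteed to read 42 */
    }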

Linux driving C standards

Posted May 7, 2013 18:06 UTC (Tue) by alonz (subscriber, #815) [Link]

Do you know of any source that explains this new memory model in a way mere mortals can understand (and most importantly—provides examples of "good" and "bad" interactions)?

I'd love to learn more, but from my experience with standards, they are generally the wrong source for study… (Except if you really want to spend significant time with the subject)

Linux driving C standards

Posted May 7, 2013 18:50 UTC (Tue) by ncm (subscriber, #165) [Link]

A good approximation is

http://www.google.com/search?q=memory+model+herb+sutter

The better you understand these topics, the less you will trust code that touches on them, and the more astonished you will be to see anything appear to work right.

Linux driving C standards

Posted May 8, 2013 16:18 UTC (Wed) by kjp (subscriber, #39639) [Link]

That two part talk by herb was scary awesome, at least for this mortal, user-space programmer. Thanks for the link.

Linux driving C standards

Posted May 8, 2013 23:19 UTC (Wed) by bgmarete (subscriber, #47484) [Link]

Rather tangentially, I was recently very intrigued to learn that under Go's memory model, if one goroutine writes to a global variable, another goroutine need not ever see the new value unless you use a lock or a channel for synchronization. Not sure but I guess that this simplifies implementation on various machines (or in C).

Linux driving C standards

Posted May 9, 2013 23:31 UTC (Thu) by jameslivingston (guest, #57330) [Link]

That's similar to Java's memory model, where only synchronization creates happens-before constraints, meaning you need to use synchronization, a volatile variable, or Atomic* to guarantee that other threads see changes.

It both simplifies implementation and improves performance because unless you do something involving concurrency interactions you don't have to pay the cost for ensuring consistency between concurrent threads.

It allows the JIT compiler to make what would otherwise be completely unsafe optimisations. For a chunk of code which contains none of those concurrency interaction points, the JIT compiler can assume that all values untouched by that code are effectively constants (even across method calls).

In the mostly-uncontended case, the JIT compiler will even make known-unsafe optimisations, with a small "did another thread change anything" check which when it fails, causes it to bail out of the unsafely-optimised code into a "fix everything up" slow path. You can't really do that in C since you can do too many things in your own code :)

Linux driving C standards

Posted May 7, 2013 18:19 UTC (Tue) by tstover (guest, #56283) [Link]

Do you have good link for more reading on this?

Linux driving C standards

Posted May 13, 2013 18:42 UTC (Mon) by tvld (guest, #59052) [Link]

To me, the formalizations of the C++11 memory model by a research group at Cambridge are the clearest description:
http://www.cl.cam.ac.uk/~pes20/cpp/

They also have a tool which tells you about all the possible executions of small pieces of code:
http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem

LFCS: The LLVMLinux project

Posted May 8, 2013 9:57 UTC (Wed) by oak (guest, #2786) [Link]

> Nested functions are another problem area, but they are used infrequently.

Does GCC nowadays generate correct code for nested functions also on ARM?

LFCS: The LLVMLinux project

Posted May 8, 2013 23:33 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link]

> LLVM also has a different license—BSD-style—which is not better
> or worse than others, Webster said, but it allows tool vendors
> to do things with LLVM that they couldn't with GCC.

Like change it and not tell you how it's been changed; so if you're on an OS that bundles it you can no longer replicate that compilation environment yourself as you won't necessarily be able to get their changes to the compiler.

I'm not sure that's an improvement (for me, at least).

LFCS: The LLVMLinux project

Posted May 9, 2013 5:03 UTC (Thu) by hpa (guest, #48575) [Link]

If LLVMLinux depends on the kernel removing all uses of __builtin_constant_p() then they might as well give up now. It is an incredibly useful construct when building certain types of low-level building blocks, and we will continue to use it.

LFCS: The LLVMLinux project

Posted May 9, 2013 13:49 UTC (Thu) by jake (editor, #205) [Link]

> If LLVMLinux depends on the kernel removing all uses of
> __builtin_constant_p()

Well, there must be something special about the one in mm/slab.c. I'm not sure what that is, but, as you point out, there are lots of uses of it in the kernel; there was no mention of removing them all.

jake

LFCS: The LLVMLinux project

Posted May 14, 2013 1:37 UTC (Tue) by nlewycky (subscriber, #63373) [Link]

The premise of builtin_constant_p is that if the input can be folded into a constant then it returns true, else it returns false. The first problem is that compiler optimizers are a stack of transformations that run -- they don't iterate until the program converges. Consequently there is a fixed set of optimizations that will run before builtin_constant_p is evaluated and another fixed set of optimizations that run afterwards. If you choose to evaluate builtin_constant_p too early then it may fold to false even though the compiler ends up resolving the expression to a constant, but if you evaluate it too late then you're missing the optimizations that folding it will expose.

The most common use of builtin_constant_p looks something like this:

  if (__builtin_constant_p(val)) {
    return __bswap_constant_64(val);
  } else {
    __asm__("bswap %0", ...);
which is a really bad idea. Inline assembly is a great way to tie your compiler's hands and prevent any optimizations. Even if you called this with a constant input, the compiler sure can't optimize it now. The right answer isn't even __builtin_bswap64, it's to write the functionality in C and file bugs if your compiler doesn't emit the assembly it should.

Of course you might think that this sample code is fine anyways, because if there were constant arguments it takes the __bswap_constant_64 branch which folds it away. Setting aside the optimization ordering problem above, builtin_constant_p doesn't return true on everything you might consider constant.

For instance “static const int x = 0x1234; bool test() { return __builtin_constant_p(x); }” returns false at -O0 and true at -O2. Here's another “static const char *x = "abc"; bool test() { return __builtin_constant_p(x); }” that returns false even at -O2. Why? You see, glibc uses builtin_constant_p in its macro definition of strcpy in such a way that if builtin_constant_p were to return true and builtin_strlen failed to calculate the string length, it would miscompile your program (reading off the end of the string). If builtin_strlen can't fold it, builtin_constant_p must return false. Except that it also returns true on integers. Sometimes. My point is that there are rules here that are undocumented, except that if you changed gcc's builtin_constant_p to return true more often and violated those assumptions you'd notice when programs built with glibc were miscompiled, so that's good I guess?

As for Clang, it evaluates builtin_constant_p() before any optimizations have run, which is correct but conservative. It also makes sense because your code already needs to work when builtin_constant_p returns false, unless you're actually relying on gcc's optimizer not changing in the future.

LFCS: The LLVMLinux project

Posted May 14, 2013 5:08 UTC (Tue) by foom (subscriber, #14868) [Link]

> Here's another “static const char *x = "abc"; bool test() { return __builtin_constant_p(x); }” that returns false even at -O2. Why?

I don't believe your explanation.

To start with, x isn't const in your example: it can be modified. However, it seems that the optimizer *is* smart enough to figure out that a variable not declared const is actually never modified, and treat it as if it was const. That's good and sensible, and thus not actually the issue here.

So, even with:
static const char * const x = "abc";
gcc still returns false for __builtin_constant_p.

This actually makes perfect sense, because you're actually asking if the pointer itself is a constant value, but said pointer value isn't known until link time. And __builtin_constant_p's contract is to return true only if it is a compile-time-constant (and thus if expressions containing it could be constant-folded).

This appears to have nothing to do with builtin_strlen, which *does* in fact evaluate to a constant 3 as one might expect.

LFCS: The LLVMLinux project

Posted May 14, 2013 7:16 UTC (Tue) by nlewycky (subscriber, #63373) [Link]

Sorry, I didn't mean to say that the only reason builtin_constant_p returned false for that example is because of its contract with builtin_strlen. I have no idea why it returns false in that case. I was trying to point out that these undocumented contracts exist.

>And __builtin_constant_p's contract is to return true only if it is a compile-time-constant (and thus if expressions containing it could be constant-folded).

Counterexample, __builtin_constant_p("string literal") is true, even though the address of that string isn't known at compile time. You already pointed out that the equivalent program written with a variable instead of a literal is not considered constant, which is surprising because it makes no difference to the optimizer or any program semantics. For more fun, try &"asd"[0] and &"asd"[1].

So if it does return true for some pointers, why not the one in your example? I thought it's because some builtins are guaranteed to fold when their arguments are literals but not when their arguments are variables, even const variables, hence builtin_constant_p must be conservative enough to work with all builtins—but also aggressive enough to run after inlining or else user code will break once again. I thought "some builtins" included builtin_strlen, but apparently I'm mistaken.

I remembered my glibc example a little wrong. The parts I'm thinking of are bits/string2.h, from the #define strncmp (which uses strlen and builtin_constant_p) and ultimately calls down into the #define'd __strcmp_gc which in turn dereferences string[0] through string[3]. And consider a user's program:

static unsigned char BOM[] = { 0xEF, 0xBB, 0xBF };
bool test() { return strncmp(..., (char *) BOM, 3); }
The only thing preventing that from reading off the end of BOM is a contract between builtin_constant_p and strlen (really builtin_strlen through more macros) that it won't return true for any pointers unless builtin_strlen can analyze them.

By the way, if glibc simply didn't #define strncmp, gcc would recognize the function by name and do all the right optimizations itself, better than it can with these builtin_constant_p-using macros in the way.

llvm-based tools vs. clang-based tools

Posted May 11, 2013 5:57 UTC (Sat) by nlewycky (subscriber, #63373) [Link]

"The Google compiler team has built a tool around LLVM that shows a novel use of the compiler toolkit. The tool looks for common problems that appear as certain patterns in the LLVM intermediate representation and can map them back to the C++ code to show the programmer where they made an error. This shows things that you can do that we haven't seen before, he said."

This paragraph is mixing two sorts of tools. There are tools based on Clang's ASTs, which are normally static tools, such as the compiler warnings, the AST matchers, and clang mapreduce. None of these involve matching on LLVM IR; rather, they work by matching patterns in your C++ code and can thus point to the problems in the written C++ code.

Then there are tools based on LLVM IR which are (so far) all dynamic tools, address/thread/memory sanitizer, but they don't really map back to C++ code. They produce reports similar to valgrind.

We don't have any tools that match patterns in LLVM IR and then point out bugs in the C++ code. You can match LLVM IR and point out bugs in the final binary, or you can match clang ASTs and point out bugs in the incoming source code. Trying to go from LLVM IR back up to the original C++ code is hard, not designed for, and we don't do it. (In essence, you're relying on the correctness of optimized debug info, which LLVM doesn't do yet.)

LFCS: The LLVMLinux project

Posted May 12, 2013 16:58 UTC (Sun) by Lennie (subscriber, #49641) [Link]

One of the weirdest things I've seen LLVM being used for is Emscripten. Emscripten is a way to compile anything that can be compiled with LLVM into JavaScript. They've compiled things like OpenGL games with Emscripten to run in the browser with WebGL.

They even went a step further as a fun side project and compiled LLVM itself:

http://badassjs.com/post/39573969361/llvm-js-llvm-itself-...

LFCS: The LLVMLinux project

Posted May 14, 2013 23:08 UTC (Tue) by marcH (subscriber, #57642) [Link]

> Turning the arrays into pointers is an easy way to fix the problem, but some maintainers don't want to take changes just to support a different compiler, he said,

Then "some" maintainers are not very clever.

Supporting an additional compiler is effectively supporting an additional static analysis tool (in one form or another), which means finding more bugs. Tested and verified multiple times. Briefly mentioned in the article above. If and when that comes for cheap then you should certainly celebrate how lucky you are and pay the price immediately.


Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds