PIC code
A quick PIC howto
All userspace code in Linux/MIPS is PIC code. It is currently not possible to mix non-PIC object files and PIC object files when linking. Therefore any object you want to link into a userspace program has to be generated as PIC.
Every PIC function is invoked with its own address in register $t9, that is register $25. Using the address in $t9, the callee can compute the address of the GOT (global offset table). This looks like this:
    function:
            .set    noreorder
            .cpload $25
            .set    reorder
.cpload computes the address of the GOT. The assembler requires .cpload to be used in a noreorder section, which is why it is wrapped in the two .set pseudo-ops. Also note that .cpload is only needed in functions which reference global data or addresses. That is, the assembler equivalent of
    int add(int a, int b) { return a + b; }
doesn't need to do a .cpload, but
    int var;
    int *get_ptr(void) { return &var; }
would need to do a .cpload.
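To make the .cpload computation described above more concrete, here is a minimal C sketch of the arithmetic, assuming the usual lui/addiu/addu expansion of .cpload $25. The function names and the displacement values are hypothetical, chosen only for illustration. The real displacement is the _gp_disp value the linker fills in.

```c
#include <stdint.h>

/* High half as used by a lui/addiu pair. addiu sign-extends its 16-bit
 * immediate, so the high half is rounded up when bit 15 of the low half
 * is set. */
static uint32_t hi16(uint32_t value)
{
    return (value + 0x8000u) >> 16;
}

/* Low half, interpreted as the signed 16-bit immediate of addiu. */
static int32_t lo16(uint32_t value)
{
    int32_t low = (int32_t)(value & 0xffffu);
    return (low >= 0x8000) ? low - 0x10000 : low;
}

/* Model of the three instructions assumed for ".cpload $25":
 *   lui   $gp, %hi(_gp_disp)
 *   addiu $gp, $gp, %lo(_gp_disp)
 *   addu  $gp, $gp, $25
 * t9 is the function's own address; gp_disp is the link-time distance
 * from that address to the value $gp should hold (hypothetical here). */
uint32_t cpload_model(uint32_t t9, uint32_t gp_disp)
{
    uint32_t gp = hi16(gp_disp) << 16;      /* lui   */
    gp += (uint32_t)lo16(gp_disp);          /* addiu */
    gp += t9;                               /* addu  */
    return gp;
}
```

For example, cpload_model(0x00400000, 0x0001fff0) yields 0x0041fff0: the high half is rounded up to 2 and the low half contributes -16, so the sum still lands on t9 plus the displacement.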
Next there is the .cprestore operation. The global pointer is kept in the $gp register (aka $28), and in this (o32) PIC calling convention a called function is not required to preserve it. So if a function makes calls, $gp has to be saved in the stack frame and reloaded after each call. The assembler does that automatically when .cprestore is used: it stores $gp at that point and reloads it after every jal. .cprestore takes one argument, which is the offset into the stack frame where $gp is stored. Putting all this together we get:
    foo:
            .set    noreorder
            .cpload $25
            .set    reorder
            subu    $29, $29, 32
            .cprestore 16
But if you assemble this, the assembler will complain about a missing .frame pseudo-op and about missing .ent and .end. Once we provide those, we get:
    foo:
            .ent    foo
            .frame  $29, 32, $31
            .set    noreorder
            .cpload $25
            .set    reorder
            subu    $29, $29, 32
            .cprestore 16
            ...
            .end    foo
.ent foo and .end foo only mark the beginning and end of the code for function foo. The arguments of .frame are, in order: the frame pointer, which usually is just the stack pointer $29; the size of the stack frame, which is 32 here; and $31, the return address register.
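The effect of the frame setup and .cprestore 16 can be pictured with a toy C model. This is only a sketch under stated assumptions: ToyCpu, foo_model and clobbering_callee are hypothetical names, the "stack" is a plain byte array, and the reload after the call mirrors the sw gp,16(sp) / lw gp,16(sp) pair visible in the disassembly further down.

```c
#include <stdint.h>
#include <string.h>

enum { FRAME_SIZE = 32, GP_SLOT = 16 };  /* values from the foo example */

typedef struct {
    uint32_t gp;          /* models $28 */
    uint32_t sp;          /* models $29, as an index into stack[] */
    uint8_t  stack[128];  /* toy stack memory */
} ToyCpu;

/* Hypothetical callee that leaves a different value in $gp. */
static void clobbering_callee(ToyCpu *cpu)
{
    cpu->gp = 0xdeadbeefu;
}

void foo_model(ToyCpu *cpu)
{
    cpu->sp -= FRAME_SIZE;                                /* subu $29, $29, 32 */
    memcpy(&cpu->stack[cpu->sp + GP_SLOT], &cpu->gp, 4);  /* .cprestore 16: sw $gp, 16($sp) */
    clobbering_callee(cpu);                               /* jal ... */
    memcpy(&cpu->gp, &cpu->stack[cpu->sp + GP_SLOT], 4);  /* reload after the call: lw $gp, 16($sp) */
    cpu->sp += FRAME_SIZE;                                /* addiu $29, $29, 32 */
}
```

Starting from a ToyCpu with gp set to some value and sp set to 128, foo_model leaves both gp and sp unchanged even though the callee overwrote gp in between.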
So let's complete this into a full hello world program:
            .globl  main
    main:
            .ent    main
            .frame  $29, 32, $31
            .set    noreorder
            .cpload $25
            .set    reorder
            subu    $29, $29, 32
            .cprestore 16
            la      $4, hello
            jal     printf          # note: overwrites $31; a complete program would save and restore it
            addiu   $29, $29, 32
            jr      $31
            .end    main

            .data
    hello:
            .asciz  "Hello world\n"
Note that the assembler code itself hasn't changed much; we just had to throw in a bunch of pseudo-ops. The -KPIC option also needs to be passed to the assembler, but gcc already does that by default. The big difference to non-PIC code becomes visible when disassembling with objdump -d --reloc:
    00000000 <main>:
       0:   3c1c0000    lui     gp,0x0
                        0: R_MIPS_HI16  _gp_disp
       4:   279c0000    addiu   gp,gp,0
                        4: R_MIPS_LO16  _gp_disp
       8:   0399e021    addu    gp,gp,t9
       c:   27bdffe0    addiu   sp,sp,-32
      10:   afbc0010    sw      gp,16(sp)
      14:   8f840000    lw      a0,0(gp)
                        14: R_MIPS_GOT16        .data
      18:   00000000    nop
      1c:   24840000    addiu   a0,a0,0
                        1c: R_MIPS_LO16 .data
      20:   8f990000    lw      t9,0(gp)
                        20: R_MIPS_CALL16       printf
      24:   00000000    nop
      28:   0320f809    jalr    t9
      2c:   00000000    nop
      30:   8fbc0010    lw      gp,16(sp)
      34:   03e00008    jr      ra
      38:   27bd0020    addiu   sp,sp,32
      3c:   00000000    nop
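As you can see, la and jal are macro instructions, and the assembler has expanded them into machine instructions in a rather different way than for non-PIC code: la becomes a load from the GOT followed by an addiu, and jal becomes a load of printf's address from the GOT into $t9, a jalr through $t9, and a reload of $gp from the stack frame.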
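To read the instruction words in the listing above, a small field decoder helps. The following C sketch splits a 32-bit MIPS instruction word into its standard fields. MipsFields and mips_decode are hypothetical names used only for this illustration.

```c
#include <stdint.h>

typedef struct {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21 */
    unsigned rt;      /* bits 20..16 */
    unsigned rd;      /* bits 15..11 (meaningful for R-type only) */
    unsigned funct;   /* bits 5..0   (meaningful for R-type only) */
    int32_t  imm;     /* bits 15..0, sign-extended (I-type only) */
} MipsFields;

/* Split one instruction word into its fields. */
MipsFields mips_decode(uint32_t word)
{
    MipsFields f;
    int32_t low = (int32_t)(word & 0xffffu);

    f.opcode = (word >> 26) & 0x3fu;
    f.rs     = (word >> 21) & 0x1fu;
    f.rt     = (word >> 16) & 0x1fu;
    f.rd     = (word >> 11) & 0x1fu;
    f.funct  = word & 0x3fu;
    f.imm    = (low >= 0x8000) ? low - 0x10000 : low;
    return f;
}
```

Decoding 0x8f990000 gives opcode 0x23 (lw) with rs 28 ($gp), rt 25 ($t9) and an immediate of 0, which is the GOT offset that the R_MIPS_CALL16 relocation fills in later. Decoding 0x0320f809 gives opcode 0 with rs 25, rd 31 and funct 9, that is jalr through $t9 with the return address in $31.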
If you wonder why this overhead is accepted: PIC code can be loaded at any address without patching the code itself, so the same code pages can be shared instead of copied. That can save huge amounts of memory compared to the non-PIC code model when a binary is loaded multiple times.
See also
Though SGI has migrated away from MIPS processors for their systems, their IRIX 5 documentation is a wealth of information on MIPS programming. Note that IRIX was using SGI's proprietary toolchain, so not everything in these documents is directly applicable to Linux/MIPS.