In Mac OS X 10.6, Apple introduced a syntax and runtime for using blocks (more commonly known as closures) in C and Objective-C. Both were later back-ported to Mac OS X 10.5 and the iPhone by Plausible Labs. However, there is very little documentation on blocks; even Apple's documentation that comes with 10.6 is very vague on what memory management rules apply. I hope that, with the help of this guide, you can avoid all the trial and error that I went through.
This is sort of a bottom-up tutorial, going from the absolute basics in C, and moving towards higher abstractions in Objective-C. I doubt that reading just the Objective-C part will do you much good; at least, read the chapter on memory management in C first.
Have fun!
Joachim Bengtsson
Blocks are like functions, but written inline with the rest of your code, inside other functions. They are also called closures, because they close around variables in your scope. (They can also be called lambdas). Let me demonstrate what this means:
#include <stdio.h>
#include <Block.h>
typedef int (^IntBlock)();
IntBlock counter(int start, int increment) {
    __block int i = start;
    return Block_copy( ^ {
        int ret = i;
        i += increment;
        return ret;
    });
}
int main() {
    IntBlock mycounter = counter(5, 2);
    printf("First call: %d\n", mycounter());
    printf("Second call: %d\n", mycounter());
    printf("Third call: %d\n", mycounter());
    Block_release(mycounter);
    return 0;
}
/* Output:
First call: 5
Second call: 7
Third call: 9
*/
counter is an ordinary C function that returns one of these fabled blocks. When you have a reference to one, you can call it, just as if it was a function pointer. The difference is that the block can use the variables in counter even after the call to that function has returned! The variables i and increment have become a part of the state of the block mycounter. I will go into more detail of how each part of this example works in chapter 4.
In Ruby, they are used in place of many control structures. The standard for loop is replaced by an ordinary function taking a block:
[1, 2, 3, 4].each do |i|
  puts i
end
# outputs the numbers 1, 2, 3, 4 on a line each
each is an ordinary function that takes a block: the block is the code between the 'do' and the 'end'. This is much more powerful than one might think, because with the ability to write your own control structures, you no longer need to depend on the evolution of the language to make your code terse and readable.
Functional programming has many such useful control structures, such as the map and reduce functions; the first maps each value in a list to another value, while the second reduces a list of values to a single value (e.g. the sum of the integers in the list). This is an entire topic in itself, and I encourage you to learn more about functional programming on Wikipedia, as I won't be covering it any further here.
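To make map and reduce a little more concrete, here is a minimal sketch in plain C. It uses ordinary function pointers rather than blocks, so it compiles without any block support. The names map_ints, reduce_ints, square and add are invented for this illustration and are not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: applies fn to every element of in and stores the
   results in out. This is the "map" idea. */
void map_ints(const int *in, int *out, size_t count, int (*fn)(int)) {
    for (size_t i = 0; i < count; i++)
        out[i] = fn(in[i]);
}

/* Hypothetical helper: folds the array into a single value, starting from
   initial. This is the "reduce" idea. */
int reduce_ints(const int *in, size_t count, int initial, int (*fn)(int, int)) {
    int acc = initial;
    for (size_t i = 0; i < count; i++)
        acc = fn(acc, in[i]);
    return acc;
}

/* Example callbacks to pass to the helpers above. */
int square(int x) { return x * x; }
int add(int a, int b) { return a + b; }
```

Mapping square over {1, 2, 3, 4} gives {1, 4, 9, 16}, and reducing that with add from 0 gives 30. With blocks, the callbacks could be written inline at the call site instead of as separate named functions.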
In Erlang, they are used as a concurrency primitive together with the 'light-weight process', instead of the thread. This simple example looks through an array looking for a match, but each element is tested in its own separate process; each element being tested at the same time.
some_function() ->
    lists:foreach(fun(Element) ->                                  % (1)
        spawn(fun() ->                                             % (2)
            if
                Element > 1 andalso Element < 4 ->                 % (3)
                    io:format("Found a match: ~p!~n", [Element]);  % (4)
                true -> true
            end
        end)
    end, [1, 2, 3, 4]),
    io:format("This line appears after the above statement, "
              "but still executes before the code in it.~n").      % (5)
% Outputs:
% This line appears after the above statement, but still executes before the code in it.
% Found a match: 3!
% Found a match: 2!
% OR with the lines swapped in any order, depending on scheduling
Erlang looks weird to the uninitiated, so I'll step through it for you. On the line numbered (1), we call the function lists:foreach with a block taking one argument as the first argument and, after the block's final end, a list of four numbers as the second argument (just as the method Enumerable#each takes a block argument in the Ruby example above). The block begins at the -> and goes on until the last end. All that first block does is spawn a new Erlang process (line (2)), again taking a block as an argument to do the actual test, but now THIS block (line (2) still) is executing concurrently, and thus the test on line (3) is done concurrently for all elements in the list.
In Cocoa, they can be used in both these ways, and more; for example, they are good for callbacks and delayed execution. Examples of these uses can be found at mikeash.com [4]. Apple's main reason for introducing blocks in C is most likely because they are perfect building blocks for concurrency. You can read more about Apple's Grand Central Dispatch in [5].
If you want to use blocks in applications targeting Mac OS X 10.5 or iPhone OS 2 or 3, you need to use the third-party GCC fork Plausible Blocks. From their site:
Plausible Blocks (PLBlocks) provides a drop-in runtime and toolchain for using blocks in iPhone 2.2+ and Mac OS X 10.5 applications. Both the runtime and compiler patches are direct backports from Apple's Snow Leopard source releases.
PLBlocks is provided by Plausible Labs.
Variables pointing to blocks take on the exact same syntax as variables pointing to functions, except that * is replaced by ^. For example, this is a function pointer to a function taking an int and returning a float:
float (*myfuncptr)(int);
and this is a block pointer to a block taking an int and returning a float:
float (^myblockptr)(int);
As with function pointers, you'll likely want to typedef those types, as it can get relatively hairy otherwise. For example, a pointer to a block returning a block taking a block would be something like void (^(^myblockptr)(void (^)()))();, which is nigh impossible to read. A simple typedef later, and it's much simpler:
typedef void (^Block)();
Block (^myblockptr)(Block);
Declaring blocks themselves is where we get into the unknown, as it doesn't really look like C, although they resemble function declarations. Let's start with the basics:
myvar1 = ^ returntype (type arg1, type arg2, and so on) {
    block contents;
    like in a function;
    return returnvalue;
};
This defines a block literal (from after = to and including }), explicitly mentions its return type, an argument list, the block body, a return statement, and assigns this literal to the variable myvar1.
A literal is a value that can be built at compile-time. An integer literal (the 3 in int a = 3;) and a string literal (the "foobar" in const char *b = "foobar";) are other examples of literals. The fact that a block declaration is a literal is important later when we get into memory management.
Finding a return statement in a block like this is vexing to some. Does it return from the enclosing function, you may ask? No, it returns a value that can be used by the caller of the block. See 'Calling blocks'. Note: If the block has multiple return statements, they must return the same type.
Finally, some parts of a block declaration are optional. These are:
myblock1 = ^ int (void) { return 3; }; // may be written as:
myblock2 = ^ int { return 3; };
myblock3 = ^ void { printf("Hello.\n"); }; // may be written as:
myblock4 = ^ { printf("Hello.\n"); };
// Both succeed ONLY if myblock5 and myblock6 are of type int(^)(void)
myblock5 = ^ int { return 3; }; // can be written as:
myblock6 = ^ { return 3; };
Calling blocks and returning values is as easy as with a function or a function pointer. If you take the value from calling the block, you will get the value returned with return. Example:
typedef int(^IntBlock)();
IntBlock threeBlock = ^ {
    return 3;
};
int three = threeBlock();
// Return values work just as in C functions. return needs to be explicit! This is not Ruby.
IntBlock fourBlock = ^ {
    4;
};
// Yields on compile:
// error: incompatible block pointer types initializing 'void (^)(void)', expected 'IntBlock'
// This is because we neither specified the return type,
// nor provided a return statement, thus implying void return.
Using variables in the closure scope is very straightforward if you're just reading them. Just use them, and they will be magically managed by your block. However, if you want to be able to modify a variable, it needs to be prefixed by the __block storage qualifier. Let's make the counter from the first example go forwards AND backwards:
#include <stdio.h>
#include <Block.h>
typedef int (^IntBlock)();
typedef struct {
    IntBlock forward;
    IntBlock backward;
} Counter;
Counter MakeCounter(int start, int increment) {
    Counter counter;
    __block int i = start;
    counter.forward = Block_copy( ^ {
        i += increment;
        return i;
    });
    counter.backward = Block_copy( ^ {
        i -= increment;
        return i;
    });
    return counter;
}
int main() {
    Counter counter = MakeCounter(5, 2);
    printf("Forward one: %d\n", counter.forward());
    printf("Forward one more: %d\n", counter.forward());
    printf("Backward one: %d\n", counter.backward());
    Block_release(counter.forward);
    Block_release(counter.backward);
    return 0;
}
/* Outputs:
Forward one: 7
Forward one more: 9
Backward one: 7
*/
Note how the blocks use increment without any extra work on our part, even though they run outside the MakeCounter function. However, we only read from it. We also use i, but we modify it from inside the blocks. Thus, we need the __block keyword with that variable.
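One way to picture the difference between the two kinds of captured variables is the toy model below, in plain C without blocks. It is only an analogy under stated assumptions, not how the block runtime is implemented: SharedInt, Step, make_steps, step_forward and step_backward are hypothetical names. The read-only increment is copied into each closure, while i lives in storage that both closures share.

```c
#include <stdlib.h>

/* Toy stand-in for a __block variable: storage shared by several closures. */
typedef struct {
    int value;
} SharedInt;

/* Toy stand-in for a block: a private copy of increment plus a pointer to
   the shared counter. */
typedef struct {
    int increment;   /* copied in when the closure is made; only read later */
    SharedInt *i;    /* shared, so a change made by one closure is seen by the other */
} Step;

int step_forward(Step *s)  { s->i->value += s->increment; return s->i->value; }
int step_backward(Step *s) { s->i->value -= s->increment; return s->i->value; }

/* Fills in two closures over the same shared counter. Returns the shared
   storage (or NULL if allocation fails); the caller frees it when done. */
SharedInt *make_steps(int start, int increment, Step *forward, Step *backward) {
    SharedInt *shared = malloc(sizeof *shared);
    if (shared == NULL)
        return NULL;
    shared->value = start;
    forward->increment = increment;
    forward->i = shared;
    backward->increment = increment;
    backward->i = shared;
    return shared;
}
```

Calling make_steps(5, 2, ...) and then stepping forward twice and backward once yields 7, 9 and 7, matching the output of the block version. The part the model leaves manual, allocating and freeing the shared storage, is what the __block qualifier and the block runtime take care of for you.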
Sending and taking blocks as arguments is again like doing so with function pointers. The difference is that you can define your block inline, in the call.
#include <stdio.h>
#include <Block.h>
void intforeach(int *array, unsigned count, void(^callback)(int))
{
    for(unsigned i = 0; i < count; i++)
        callback(array[i]);
}
int main (int argc, const char * argv[]) {
    int numbers[] = {72, 101, 108, 108, 111, 33};
    intforeach(numbers, 6, ^ (int number) {
        printf("%c", number);
    });
    printf("\n");
    return 0;
}
/* Outputs:
Hello!
*/
Notice how we call intforeach with an inline block, and how intforeach could have been an ordinary C function taking a function pointer and still be written the exact same way except for switching the ^ for a *. (If you're wondering about the numbers: they are the ASCII codes for the characters in "Hello!".)
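For comparison, here is a sketch of the function-pointer variant described above. The names intforeach_fp, collect_char and collected are made up for this illustration. Note what is lost: a plain function cannot be defined inline at the call site and cannot capture local variables, so the callback has to keep its state in file-scope variables.

```c
#include <string.h>  /* for strcmp, handy when checking the collected text */

/* Same loop as intforeach, but the callback is a plain function pointer. */
void intforeach_fp(const int *array, unsigned count, void (*callback)(int)) {
    for (unsigned i = 0; i < count; i++)
        callback(array[i]);
}

/* State for the hypothetical callback below. A block could have captured a
   local buffer instead of relying on file-scope variables. */
static char collected[16];
static unsigned collected_len = 0;

/* Appends one character code to the buffer, keeping it NUL-terminated and
   silently dropping input once the buffer is full. */
void collect_char(int number) {
    if (collected_len + 1 < sizeof collected) {
        collected[collected_len++] = (char)number;
        collected[collected_len] = '\0';
    }
}
```

Passing the same six numbers through intforeach_fp with collect_char leaves the string "Hello!" in collected.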
What does 'memory management' in the context of blocks mean? It doesn't mean storage for the actual code. The code of the block is compiled and loaded into the binary like any other function. The memory a block requires is that of the variables it has closed around; that is, any variables the block references need to be copied into the block's private memory.
So far, we have just assumed that the memory has somehow magically become part of the block, and will magically disappear. Unfortunately Apple haven't added a garbage collector to C, so that's not quite the case. However, to understand the following, you must have a basic understanding of stack and heap memory, and how they differ.
When you define a block literal, you create the storage for this literal on the stack. The variable pointing to this literal can still be considered a pointer, however. This means that the following code compiles, but does not work:
typedef int(^Block)(void);
Block blockMaker() {
    int a = 3; // (1)
    Block block = ^ { // (2)
        return a;
    };
    return block; // (3)
}
int main() {
    Block block2 = blockMaker(); // (4)
    int b = block2(); // (5)
    return 0;
}
This is basically what the code above does:
Block blockMaker() {
    int a = 3; // (1)
    struct Block_literal_1 *block;
    struct Block_literal_1 blockStorage = ...; // (2)
    block = &blockStorage; // (2b)
    return block; // (3)
}
At (1), the value 3 is allocated on the stack and named a, as expected. At (2) and the following block, the value of a is copied into the block literal. Then, also at (2) (2b in the second example), a pointer to the literal is in turn assigned to the variable block.
If you think of Block as just a pointer type, and the assignment in (2) as taking the address of a literal, you might see why (3) is invalid, but we'll get to that.
In (4), the variable block2 gets a reference to the literal in (2). However, both the variables a and block (together with block's copy of the value in a) have now fallen off the stack, as blockMaker has returned. When we call block2 in (5), we might segfault, get a corrupted value, or whatever; the behavior is undefined.
The same effect can be demonstrated without involving blocks:
int * intMaker() {
    int a = 3; // (1)
    return &a; // (2)
}
int main() {
    int *b = intMaker(); // (3)
    return 0;
}
intMaker returns a pointer to an object on the stack (1), which will disappear together with the rest of the state of the function call when intMaker returns (2).
How do we work around this problem? Simple: we move the block to the heap. The function Block_copy() takes a block pointer, and if it's a stack block, copies it to the heap, or if it's already a heap block, increases its retain count (like an immutable object in Cocoa would do). Exactly what happens is an implementation detail, but the thing to take away is that you should never return a block literal from a function, but rather a copy of it.
When we're done with the block, we just release it with Block_release(). Thus, the correct way to implement the blockMaker example is like so:
typedef int(^Block)(void);
Block blockMaker() {
    int a = 3;
    Block block = ^ {
        return a;
    };
    return Block_copy(block); // (1)
}
int main() {
    Block block2 = blockMaker();
    int b = block2();
    Block_release(block2); // (2)
    return 0;
}
Notice how we move the block to the heap in (1), and discard the block when we're done with it in (2). For anyone who has done Cocoa or CoreFoundation programming, this pattern should be familiar.
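The copy-then-release pattern can be pictured with a toy model in plain C, with no blocks involved. This is only a sketch of the behaviour described above (copy a stack closure to the heap, bump the count of a heap closure), not the actual runtime; Closure, closure_copy, closure_release, closure_call and closure_maker are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a block: captured state plus some bookkeeping. */
typedef struct Closure {
    int on_heap;    /* 0 for a stack closure, 1 for a heap copy */
    int refcount;   /* only meaningful for heap copies */
    int captured;   /* the closed-over value, like a in blockMaker */
} Closure;

/* "Calling" the closure just returns the captured value. */
int closure_call(const Closure *c) { return c->captured; }

/* Modelled on the described behaviour of Block_copy(): a stack closure is
   copied to the heap, a heap closure just gains another owner. */
Closure *closure_copy(Closure *c) {
    if (c->on_heap) {
        c->refcount++;
        return c;
    }
    Closure *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return NULL;
    memcpy(heap, c, sizeof *heap);
    heap->on_heap = 1;
    heap->refcount = 1;
    return heap;
}

/* Modelled on Block_release(): drops one owner and frees the heap copy when
   the last owner is gone. Returns 1 if the storage was freed, 0 otherwise. */
int closure_release(Closure *c) {
    if (!c->on_heap)
        return 0;
    if (--c->refcount > 0)
        return 0;
    free(c);
    return 1;
}

/* Safe counterpart of blockMaker: returns a heap copy, never the local. */
Closure *closure_maker(void) {
    Closure local = { 0, 0, 3 };
    return closure_copy(&local);
}
```

A caller would take the result of closure_maker(), use it through closure_call(), and hand it to closure_release() exactly once per copy, just as blockMaker's caller pairs Block_copy() with Block_release().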
(An aside: If you try to return a literal like so: Block foo() { int i = 3; return ^ { printf("Fail %d", i); }; }, the compiler will complain that you are trying to return a stack literal, which isn't possible.)
Be careful! You don't have to return from a function for something to fall off the stack. The following example is equally invalid ([2] and [4]):
typedef void(^BasicBlock)(void);
void someFunction() {
    BasicBlock block;
    if(condition) {
        block = ^ { ... };
    } else {
        block = ^ { ... };
    }
    ...
}
// Basically equivalent of:
void someFunction() {
    BasicBlock block;
    if(condition) {
        struct Block_literal_1 blockStorage = ...;
        block = &blockStorage;
    } // blockStorage falls off the stack here
    else
    {
        struct Block_literal_1 blockStorage = ...;
        block = &blockStorage;
    } // blockStorage falls off the stack here
    // and block thus points to non-existing/invalid memory
    ...
}
// Correct:
void someFunction() {
    BasicBlock block;
    if(condition) {
        block = Block_copy(^ { ... });
    } else {
        block = Block_copy(^ { ... });
    }
    ...
}
The really weird and wonderful thing about blocks is that blocks are actually Objective-C objects. Even if you create them from a C++ library, the specification says that every block must have the memory layout of an Objective-C object. The runtime then adds Objective-C methods to them, which allow them to be stored in collections, used with properties, and just generally work wherever you'd expect. These methods are:
-[Block copy]: Exactly the same as Block_copy(). This will give you a block on the heap with an owning reference, just like calling -[NSObject copy] on any other object.
-[Block retain]: This one is a bit weird, and almost the single reason why I wrote this entire guide. This calls Block_retain(). However, the conventions for Objective-C objects say that retain MUST return the same instance that it was called with. This means that retain cannot call copy; I'll get to why this is really, really annoying.
-[Block release]: Block_release().
-[Block autorelease]: That's right, blocks can be autoreleased!
Blocks in Objective-C have one more very important difference from blocks in C in handling variables that reference objects. All local objects are automatically retained as they are referenced! If you reference an instance variable from a block declared in a method, this retains self, as you're implicitly doing self->theIvar. An example is in order:
typedef void(^BasicBlock)(void);
@interface LogMessage : NSObject {
    NSString *logLevel;
}
@end
@implementation LogMessage
-(BasicBlock)printLater:(NSString*)someObject;
{
    return [[^ {
        NSLog(@"%@: %@",
            logLevel, // (1)
            someObject // (2)
        );
    } copy] autorelease]; // (3)
}
@end
Here's a method that simply returns a block that lets you print the given string, prefixed by the object's log level.
In (3), the block is copied, because as you remember, you can't return a block literal since it's on the stack. Still, we want to follow common Cocoa patterns and never return an owning reference from a method that isn't named copy, retain or alloc. This gives us the idiom [[^{} copy] autorelease], which is what you should always use when returning blocks.
Now, for the auto retaining magic, notice how we reference the argument object someObject in (2), and implicitly self in (1) (which really says self->logLevel). When the block is copied in (3), Block_copy notices that logLevel and someObject are objects, and retains them. When the block is released, it will free all its captured variables, and if they are objects, release them.
With even self being retained, this is a very easy way to accidentally create reference cycles and thus memory leaks. What if you want to avoid this behavior? Just give the variable __block storage (thanks to mikeash [4] for pointing this out). Example:
-(void)someMethod;
{
    __block TypeOfSelf *blockSelf = self;
    ^ {
        // Because blockSelf is __block, the following reference
        // won't retain self:
        blockSelf->myIvar += 3;
    };
    ...
}
Why does this work? Well, if the variable is __block, it can be changed from within a block. If it's an object pointer, this means changing the object pointer itself, not the object. If the block autoretains object pointers, what should happen if the pointer is changed, pointing to another object? The concept is so hairy that they chose the simplest solution: __block storage objects simply aren't autoretained.
Finally, one more memory management gotcha before we move onto syntax. Remember that blocks are objects? And objects are automatically retained? Yes, that's right, blocks also automatically retain blocks they refer to! (My research.) This means that the following code will work fine:
typedef void(^BasicBlock)(void);
// Returns a block that aborts the process
-(BasicBlock)doSomethingAsynchronous;
{
    BasicBlock cleanup = [[^{
        // Do some common cleanup needed in all cases
    } copy] autorelease];
    __block AsyncDownloader *downloader = [AsyncDownloader fetch:@"http://domain/some.file" options:$dict(
        @"success", [[^ {
            [downloader.dataValue writeToFile:@"some/path" atomically:NO];
            DisplayAlert(@"Download complete");
            cleanup();
        } copy] autorelease],
        @"failure", [[^ {
            DisplayAlert(@"Error: %@", downloader.error);
            cleanup();
        } copy] autorelease]
    )];
    return [[^ {
        [downloader abort];
        cleanup();
    } copy] autorelease];
}
Notice how downloader is declared __block; otherwise my referencing it in the callback blocks would retain it, and the downloader retains the blocks, thus creating a cycle.
$dict is a very handy macro which creates a dictionary from key-value pairs (the NSDictionary constructor is too verbose for my taste).
Notice how all three blocks reference the cleanup block: it is thus retained and properly memory managed until the referring blocks disappear.
This example also highlights a problem with stack block literals: collections, such as the NSDictionary above, retain their values, but -[Block retain] doesn't do anything on stack blocks! Thus, we must move the blocks to the heap before we can insert them into the dictionary. For example, to insert a few delayed actions into an array, you'd do it like this:
NSArray *someActions = [NSArray arrayWithObjects:
    [[^ { NSLog(@"Hello"); } copy] autorelease],
    [[^ { NSLog(@"World!"); } copy] autorelease],
    [[^ { NSLog(@"Awesome.");} copy] autorelease],
    nil
];
for (void(^block)() in someActions) {
    block();
}
/* Outputs:
2009-08-23 13:06:06.514 arrayofblocks[32449:a0f] Hello
2009-08-23 13:06:06.517 arrayofblocks[32449:a0f] World!
2009-08-23 13:06:06.517 arrayofblocks[32449:a0f] Awesome.
*/
In the syntax department, you might have two questions. How do I use non-typedef'd blocks as method arguments, and can I use them as properties? This example should make that clear:
#import <Foundation/Foundation.h>
#import <stdlib.h>
#import <time.h> // for time(), used to seed random()
@interface PredicateRunner : NSObject
{
    BOOL (^predicate)();
}
@property (copy) BOOL (^predicate)();
-(void)callTrueBlock:(void(^)())true_ falseBlock:(void(^)())false_;
@end
@implementation PredicateRunner
@synthesize predicate;
-(void)callTrueBlock:(void(^)())true_ falseBlock:(void(^)())false_;
{
    if(predicate())
        true_();
    else
        false_();
}
-(void)dealloc;
{
    self.predicate = nil;
    [super dealloc];
}
@end
int main (int argc, const char * argv[]) {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    srandom(time(NULL));
    PredicateRunner *pr = [[PredicateRunner new] autorelease];
    pr.predicate = ^ BOOL {
        return random()%2;
    };
    [pr callTrueBlock:^ {NSLog(@"Yeah");} falseBlock:^ {NSLog(@"Nope");} ];
    [pool drain];
    return 0;
}
Typedefs will make that much easier to read, though.
See [1] and [2] for details on how blocks work with C++ variables. In essence: Copy constructors and destructors are being called as expected. Read-only variables are const-copied into the block.
Code snippets, libraries and frameworks using blocks to simplify working with Objective-C are popping up here and there as Mac and iPhone developers are figuring Blocks out and applying them to their common tasks. I'll add them here as I find them. Post in the comments if there's something I'm missing!
void *theBlock = ^(NSString *keyPath, CALayer *self, NSDictionary *change, id identifier) {
self.position = CGPointMake(self.viewPoint.scale * self.modelObject.position.x,
self.viewPoint.scale * self.modelObject.position.y);
};
CALayer *theLayer = ...;
[theLayer addKVOBlock:theBlock forKeyPath:@"modelObject.position" options:0 identifier:@"KVO_IDENTIFIER_1"];
- (void)registerObservation
{
    [observee addObserverForKeyPath:@"someValue"
                               task:^(id obj, NSDictionary *change) {
                                   NSLog(@"someValue changed: %@", change);
                               }];
}
If you want to read more about blocks, the following links are great places to keep reading:
This site is versioned, so to see exactly what has changed, or to download the entire site with samples and all, check out the repository.