Here be dragons. This bug was the bane of my existence for two weeks. The dreaded EXC_BAD_ACCESS.
The trouble with this crash is it gives you basically zero information, and often the frame (backtrace) is invalid (so, less than zero information: wrong information).
So the first step is to look at the backtrace. Unfortunately, the backtrace contains only frames from system libraries and none of my own code, so this is going to be a long night…
Next step is debugging with NSZombies. Zombie is a special debugging mode that, instead of freeing objects when they are dealloced, replaces them with an object of type NSZombie that basically throws exceptions whenever you try to do anything with it. The easiest and most reliable way to turn on NSZombies is detailed over here. Sure enough, a run with Zombies enabled gives me one of these:
2009-03-30 02:30:36.172 appName[3997:20b] *** -[CALayer release]: message sent to deallocated instance 0x59bf670
So the class causing the error is (er, was, before it was Zombied) CALayer. Great.
This is going to be a really long night…
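For a sense of what produces a log line like the one above, here is a minimal sketch of an over-release. It is not the code from my app; it is a standalone Foundation program, assumed to be compiled without ARC (for example with `clang -fno-objc-arc -framework Foundation`) and run with the `NSZombieEnabled=YES` environment variable set.

```objc
#import <Foundation/Foundation.h>

// Minimal over-release demo (manual retain/release, no ARC).
// With NSZombieEnabled=YES in the environment, the second release
// should be caught by the zombie and logged instead of silently
// corrupting memory.
int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSObject *obj = [[NSObject alloc] init]; // retain count: 1
    [obj release];                           // retain count: 0, object deallocated
    [obj release];                           // over-release: message to a dead object

    [pool drain];
    return 0;
}
```

With zombies on, the second `release` should be reported in the same shape as the CALayer line above, only with `NSObject` as the class. Without zombies, the same program may crash, or may appear to work, which is exactly what makes this class of bug so hard to chase.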
Next step is trying to find out something about the object’s lifecycle. If I can figure out where this phantom CALayer is being created, that may give me some insight into what I am doing that is causing the overrelease. Fortunately, Apple has some great tools for this. Unfortunately, this bug was the perfect storm and managed to break all of them.
First up: malloc_history. malloc_history is a command-line tool that parses malloc stack logs. With a couple of environment variables set (MallocStackLogging and MallocStackLoggingNoCompact), the malloc library logs every allocation and free, complete with backtraces, to a file.
Three strikes, you’re out!
Second up: Instruments. Instruments has Zombie support, plus, if you check the “Record reference counts” checkbox, it can (in theory) record every retain/release call on every object. Hopefully, Instruments will help me figure out where this mythical CALayer is being over-released!
Four strikes, you’re out!
At this point, I was determined to get a reference-count-log, so I wrote a bunch of code to try to get the bug reproducing in the simulator. But the bug reproduces only when the user is taking a picture, and there’s no camera in the simulator. I had a hunch it was memory-related though, so I wrote a bunch of code to push fake camera view controllers onto the stack and take up lots of memory. Finally, I got an EXC_BAD_ACCESS and the reference log I had been waiting for…
Yeah… let’s just say that didn’t help at all. The object is referenced in exactly two places (one retain, one release, both in system libraries), and yet somehow the reference count jumps automagically from 1 to -1.
Fast forward a week: I made some significant changes to the codebase that (by chance) gave us a slightly better view into the problem. This time, instead of a CALayer getting over-released, it’s suddenly a UIView (probably the UIView that owns the CALayer that was formerly causing the problem). Reference-count-log:
Why in the world is an NSKeyValueCoding internal call decrementing the reference count smack-dab in the middle of the object’s lifecycle? Let’s look at the code:
IBOutlet UIView *viewCausingCrash;
//snip...
UIView *viewCausingCrash = [[[UIView alloc] initWithFrame:old.frame] autorelease];
[old removeFromSuperview];
[scroller addSubview:viewCausingCrash];
old = viewCausingCrash;
What in the world could be wrong with this code? The object is retained by scroller, so it won’t get dealloced. All the reference counts are balanced. Where is this call to NSKeyValueCoding coming from?
Well, it turns out Interface Builder is magic. Really, really magic.
Recall that when an object is awoken out of a nib file, it has a reference count of 1. Normally you can think of IB objects as “alloced” because they will pretty much always be valid for the lifetime of your ViewController, barring low-memory situations.
But. Suppose you have things set up like I do, where you have an IB “object” that is replaced programmatically by another object. In this case, I have an IB object, old, which is really just a placeholder for where viewCausingCrash is going to go, because I like dragging rects around in IB over hardcoding digits into the code. So IB hands me an object, and I pretty much throw that object away, by making old point to some new object.
Since IB is a good memory citizen, it is eventually going to try to release the object that it originally unarchived (to balance the unarchive’s +1 to the reference count). However, instead of turning to some internal pointer to figure out where to send the release message, it just uses whatever the IBOutlet is connected to at the moment. An IBOutlet which, at the point the release is made, is not retained by IB, because it points to a very different object. Oops.
To further complicate matters, IB sends the release a lot later than you would expect. The release doesn’t happen on viewDidUnload (where I would have found it many days sooner), but in fact the release happens when/if the view loads again. So if you modify the value of the IBOutlet pointer in your viewDidLoad method, who knows what will happen? It actually varies from platform to platform. The perfect heisenbug.
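To make the mechanism concrete, here is a rough model of it in plain Foundation code. This is a sketch under assumptions, not Apple's implementation: `ConnectOutletDirectly` is a hypothetical helper standing in for what the nib loader appears to do when it sets an outlet that has no setter method (retain the new value, release whatever the instance variable currently points at), and the plain `NSObject` instances stand in for views. It assumes manual retain/release (no ARC).

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the nib loader connecting an outlet that has
// no setter: retain the incoming object, release whatever the variable
// currently points at, then store the new pointer.
static void ConnectOutletDirectly(id *ivar, id newValue) {
    [newValue retain];
    [*ivar release];   // releases the CURRENT pointee, whoever owns it
    *ivar = newValue;
}

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    id outlet = nil;

    // First load: the loader connects the unarchived object to the outlet.
    id fromNib = [[[NSObject alloc] init] autorelease];
    ConnectOutletDirectly(&outlet, fromNib);

    // My code: swap the pointer by plain assignment, with no retain.
    // The only owner of `replacement` is whoever created it (think: scroller).
    id replacement = [[NSObject alloc] init];
    outlet = replacement;

    // Reload: the loader connects a fresh object and releases the outlet's
    // current value. That value is `replacement`, which the loader never
    // retained, so its single owner's reference is taken away here.
    id fromNibAgain = [[[NSObject alloc] init] autorelease];
    ConnectOutletDirectly(&outlet, fromNibAgain);

    // `replacement` is now deallocated; any later message to it from its
    // real owner is the EXC_BAD_ACCESS (or the zombie log line).
    NSLog(@"outlet now points at %p", outlet);

    [outlet release];  // balance the retain from the second connection
    [pool drain];
    return 0;
}
```

Note that the original `fromNib` object also leaks in this model, because the retain taken on first load is never balanced once the pointer to it is overwritten.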
For reference, the way IB’s outlet reconnections work is a whole topic in itself. See here for a discussion of how the memory management works on each platform. It’s just similar enough to convince you that it’s the same, and just different enough to cause you to lose hair. For instance, it looks like Mac OS retains everything first and then autoreleases only those objects that appear to IB as if they have a parent. Meanwhile iOS retains and then autoreleases everything, and then retains only those objects that appear to IB as if they have a parent. Maddeningly backwards. Seriously, go read the docs.
But here’s the fix, and apparently this design pattern works across all the hairy platforms and runtimes.
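The pattern boils down to giving every outlet a retained property and always going through it. The following is a sketch of that pattern rather than my exact code: `MyViewController` and `placeholderView` are hypothetical names, and it assumes manual retain/release and a UIKit version that still calls `viewDidUnload`. The numbered comments match the six touch points listed below.

```objc
// MyViewController.h (class and outlet names are hypothetical)
#import <UIKit/UIKit.h>

@interface MyViewController : UIViewController {
    UIView *placeholderView;                          // 1. variable declaration
}
// 2. IB connection: wire this outlet up in Interface Builder.
@property (nonatomic, retain) IBOutlet UIView *placeholderView; // 3. @property
@end

// MyViewController.m
@implementation MyViewController

@synthesize placeholderView;                          // 4. @synthesize

- (void)viewDidLoad {
    [super viewDidLoad];

    // Swap the IB placeholder for a programmatic view of the same frame.
    UIView *replacement =
        [[[UIView alloc] initWithFrame:self.placeholderView.frame] autorelease];
    [self.placeholderView.superview addSubview:replacement];
    [self.placeholderView removeFromSuperview];

    // Assign through the setter so the old view is released and the new
    // one is retained; never assign to the instance variable directly.
    self.placeholderView = replacement;
}

- (void)viewDidUnload {
    self.placeholderView = nil;                       // 6. viewDidUnload
    [super viewDidUnload];
}

- (void)dealloc {
    [placeholderView release];                        // 5. dealloc
    [super dealloc];
}

@end
```

Because the outlet now has a setter, the nib loader goes through it too, so every retain on the outlet is matched by a release on the same object, no matter how many times the view is loaded and unloaded.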
Is this a pain? Yes. Now I have to interact with every IBOutlet no less than SIX FREAKING DIFFERENT TIMES. (Variable declaration, IB connection, @property declaration, @synthesize, dealloc, and viewDidUnload). Yuck. But it did fix the bug.