Monday, August 09, 2010

Thread-based thread-level Mach exception handler and debugging

Apple implemented an awesome feature in their version of gdb in Xcode 3.2 (Snow Leopard is the minimum OS requirement) for those (I hope few) of us[1] who (1) have implemented a thread-based Mach exception handler, (2) actually use it with a mask for EXC_BREAKPOINT to conditionally self-handle breakpoints, and (3) still want to be able to debug the result: thread dont-suspend-while-stepping.

The problem is that when gdb is stepping (either because of a user-specified command or in an attempt to get to a safe-to-read-objc state), it freezes all the other threads in your task. Normally, this is exactly what you want: isolation from the other aspects of your program while you inspect and complete something under the watchful eye of the debugger.[2] However, if one of those frozen threads is the one set up to handle thread-level exceptions by looping on receiving messages from a Mach port, things go wrong. Its Mach port gets sent the exception message first (gdb’s handler is a task-level exception handler, so it only gets second shot at such exceptions), the suspended thread never wakes up to handle it, and gdb deadlocks.

Enter thread dont-suspend-while-stepping. You tell gdb not to bother suspending your exception thread, and now it is awake to receive the exception message, deal with it, and respond back to the kernel. As long as your handler does not swallow a temporary breakpoint or single-step exception that it did not set itself, the Mach exception message is then sent on to gdb’s handler, and gdb resumes control nicely.

If you want to do this automagically, you can set a breakpoint on the thread entry function of your exception-handling thread, have it prevent its own suspension, and then continue. The general command to run would be:
thread dont-suspend-while-stepping on -port ((mach_port_t)pthread_mach_thread_np((pthread_t)pthread_self()))
Here pthread_self() returns the current pthread, and pthread_mach_thread_np() returns the Mach thread port for a given pthread.
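As a rough sketch of the automagic setup described above, something like the following could go in a .gdbinit file or be typed at the gdb prompt. The function name exception_thread_main is hypothetical; substitute the entry function of your own exception-handling thread. The break, commands, silent, and continue commands are standard gdb, and the dont-suspend-while-stepping line is the same one given above, so this assumes Apple’s gdb from Xcode 3.2 or later.

```gdb
# Stop when the exception-handling thread starts running.
# exception_thread_main is a placeholder for your thread's entry function.
break exception_thread_main
commands
  # Don't print the usual breakpoint-hit message.
  silent
  # We are stopped in the exception thread, so pthread_self() is that thread.
  thread dont-suspend-while-stepping on -port ((mach_port_t)pthread_mach_thread_np((pthread_t)pthread_self()))
  # Resume the program without waiting for user input.
  continue
end
```

Because the breakpoint commands run while the new thread is the stopped thread, the expression resolves to the exception thread’s own Mach port, which is what lets the thread exempt itself.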
----
[1] At the moment, I suspect that the people who are actually doing this are:
  • us, i.e., Mac CoreCLR
  • Java
  • Flash
It’d be interesting to know if there are others, especially if there’s any likelihood you’ll find yourself in a browser process space. There are some unfortunate interactions that occur when these apps either stomp on each other’s exception handler registrations or try to forward messages along the “chain” of handlers. To wit, if you’re calling thread_set_exception_ports, you’re definitely doing it wrong, and if you’re calling thread_swap_exception_ports, you’re merely very likely doing it wrong: less so if you’re doing it on entry to and exit from your special code, much more so otherwise.
[2] It’s not a panacea: if you have complicated timing issues between two threads, you’ll need to be a little more inventive.
