Cyrus' New Completely Useless Blog

filed the dissertation Lisp

At Berkeley, when you file your dissertation, you get a lollipop that says "PhinisheD" on it.

Here's Olivia's impression of what I would look like when I filed:

[Image: cyrusphd]

If you're interested in reading about an atlas of spatial patterns of gene expression in Drosophila melanogaster, the dissertation has been filed and can be found here.

The appendix describes some of the Common Lisp packages used in creating the atlas. Hopefully a paper describing these in more detail will appear soon.

Hmm... the search engines don't seem to be aware that dissertation == thesis. Perhaps this will help.

x86-64/macos SBCL porting progress Lisp

Well, after many months I finally decided to dust off the x86-64/macos SBCL port. After a couple of days and some invaluable telepathic debugging help from Juho Snellman, I was finally able to get through make-target-2 and get a full core up and running. But, of course, there are still some problems. First, in order to compile make-target-2.lisp, I had to set sb-ext:*evaluator-mode* to :interpret. Second, there are still some rather major bugs that cause the system to drop into LDB far too often. But at least there's some progress. I'll post a patch in the next few days and, with any luck, get these bugs fixed and the code into the tree before too long.

mach exception handlers for SBCL Lisp

Well, it's been far too long since I've updated this page to reflect the status of threads and whatnot on SBCL/MacOS. In the meantime, the lutex stuff has landed on the trunk and the threads stuff has been cleaned up a bit, but it is still somewhat unstable and doesn't pass the threads tests without catching an illegal instruction error. To address this, I have, with some major help from Alastair Bridgewater, added an experimental feature that lets SBCL use mach exception handling instead of (just) BSD-style signals.

The good news, irrespective of threads, is that this fixes the long-standing "CrashReporter" problem that many have complained about, and it makes it possible to use GDB with SBCL. Previously, GDB choked on SBCL's strategy of using mprotect to protect memory in non-exceptional cases, because GDB could not step across the resulting EXC_BAD_ACCESS (the mach exception equivalent of a SIGBUS or SIGSEGV). Using mach exception handlers gets around this. Anyway, this has been checked in to the SBCL trunk but, at least for the moment, should be considered experimental and must be enabled by building with the :mach-exception-handler feature. Oh, and there's no PPC port for this yet.

As for threads, they are still not quite there, but certainly seem better and I have been using them for development work for some time. Hopefully the added debuggability will help in tracking down the remaining issues.

SBCL MacOS/x86/threads update Lisp

Well, it's getting closer. The lutex branch no longer kernel panics, thanks to mutex locks around the i386_set_ldt calls, and the garbage collection-caused memory corruption seems to be fixed. So it builds, builds itself, and all tests pass, usually. Unfortunately, it's the "usually" that is the problem. About 10% of the time, the threads test hangs with a thread waiting for a mutex lock that it's never going to get. I'm not sure if the problem is a subtle race condition in the code or if there are problems with MacOS' pthreads mutex/condition variable implementation, but it happens often enough that there is definitely some sort of problem somewhere. Hopefully this will get merged onto the head before too long, after the 0.9.13 release. Oh, and slam.sh still doesn't work.

frobbing the EIP in a mach exception handler Lisp

A while back I asked something to the effect of "how do you frob the EIP in a mach exception handler?" Well, the answer is that you get the MACHINE_THREAD_STATE (i386_THREAD_STATE in this case) via thread_get_state, adjust the eip in the thread_state_t, and then call thread_set_state to get the changes to take effect. I wasn't calling thread_set_state before and assumed that this would behave like a Unix signal handler where you can just adjust the machine context and everything happens automatically. In the case of mach exception handlers, you need to explicitly set the state.

Mach Exception Handlers and SBCL Lisp

SBCL makes extensive use of POSIX signals for such tasks as garbage collection, error handling, and ensuring that atomic operations are executed atomically. MacOS X supports POSIX-style signals, but there are some problems with Apple's BSD-style, as they call it, signalling mechanism. The main problem, as far as SBCL is concerned, is that MacOS X's signal implementation and GDB interact in a way that renders GDB basically useless for debugging SBCL. SBCL protects memory pages with mprotect and then uses a signal handler (usually for SIGBUS or SIGSEGV; SIGBUS in the case of MacOS X) either to adjust the memory protection mode and take appropriate action or to signal an error. On MacOS, the faulting access raises a mach exception (EXC_BAD_ACCESS), which is then caught by the kernel and causes a SIGBUS to be sent to the offending process. Unfortunately, GDB can't be used to continue past the offending mach exception, so the process just keeps raising the exception and never delivers the signal to the listening process or moves the program counter past the offending instruction.
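To make the mprotect-plus-signal-handler strategy concrete, here is a minimal sketch in plain POSIX C. It is not SBCL's code: the names guarded_page, on_fault and count_write_faults are made up for illustration, and it assumes (as is common in practice, though POSIX does not promise it) that calling mprotect from a signal handler is safe.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict -std modes */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state for the sketch: one page that starts out read-only. */
static char *guarded_page;
static size_t page_size;
static volatile sig_atomic_t fault_count;

/* Expected faults on the guarded page are counted and the page is made
 * writable; the kernel then restarts the faulting store. Any other fault
 * is a real crash, so the default action is restored and the fault refires. */
static void on_fault(int sig, siginfo_t *info, void *context)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)guarded_page;
    (void)context;

    if (addr < base || addr >= base + page_size) {
        signal(sig, SIG_DFL);
        return;
    }
    fault_count++;
    /* Assumption: mprotect is usable here even though POSIX does not list
     * it as async-signal-safe. */
    mprotect(guarded_page, page_size, PROT_READ | PROT_WRITE);
}

/* Writes twice to a protected page and returns how many faults the
 * handler saw, or -1 if setup failed. */
int count_write_faults(void)
{
    struct sigaction sa, old_segv, old_bus;
    volatile char *p;
    int faults;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, page_size, PROT_READ,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Install for both signals: which one arrives depends on the OS. */
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    fault_count = 0;
    p = guarded_page;
    p[0] = 1;   /* faults once; the handler unprotects the page */
    p[1] = 2;   /* page is writable now, so no second fault */
    faults = fault_count;

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(guarded_page, page_size);
    return faults;
}
```

Calling count_write_faults() should report a single fault: the first store trips the protection, the handler lifts it, and the second store goes straight through. That "expected fault" is exactly the kind of event that GDB cannot step across on MacOS.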

In addition to the SIGBUS debugging problems, the signalling mechanism of MacOS X on Intel poses another problem: MacOS' delivery of SIGTRAP signals is unreliable. It generally works, but only about 95% of the time. This is unacceptable given that SBCL uses SIGTRAP to signal that an operation that was supposed to be atomic has been interrupted and that appropriate action needs to be taken. We have worked around the SIGTRAP problems by using the UD2A instruction to cause a SIGILL signal to be delivered to the SBCL process. This works reliably, but causes the MacOS X on Intel code to differ from other Intel-based platforms.

Finally, Apple's crash reporter doesn't realize that we might be using memory protections and the associated SIGBUS signals for non-crashing, expected behavior, so it generates a crash log message or, depending on the Crash Reporter preferences, pops up a dialog on the screen.

These issues are enough to motivate me to consider using Mach exceptions instead of or, more likely, in addition to POSIX/BSD-style signals.

If only it were that simple. Gory details to follow...

How MacOS X makes the life of a Lisp (SBCL) Programmer Difficult Lisp

So, now that SBCL works on MacOS X/Intel, and given that MacOS is proving to be rather recalcitrant when it comes to running a threaded version of SBCL, I thought I would share my list of the top 10 reasons why the life of an SBCL developer is made needlessly difficult by the current state of MacOS X.

  1. No OS Source. The source to Darwin used to be available. Apparently that is no longer the case, at least for the Intel version. This is rather unfortunate. When developing for Linux one can dive down into the source to see if, for instance, user-provided thread stacks are available to be freed after a pthread_join or not. (Note: in this case, the source to the pthread libraries is in fact available, but, given the design of MacOS' Mach kernel, this is mostly glue around calls down to the Mach thread layer, the sources to which are not available).
  2. GDB can't step across an EXC_BAD_ACCESS/SIGSEGV. SBCL makes rather extensive use of signals (or, in Mach parlance, exceptions) in the "this is a somewhat unusual, but also expected event and should be appropriately dealt with"-sense, not the "this is an error, most likely caused by programmer error or system failure, maybe you should print an error message before you quit"-sense. One critical example of this is the use of memory protection to trigger a SIGSEGV (or SIGBUS depending on the architecture) to inform the system when bits of memory are being written to. This is a normal event and it triggers a Mach Exception (EXC_BAD_ACCESS) that GDB cannot step across; attempting to do so just causes the event to be refired. Setting GDB to "handle pass noprint" this type of exception just causes GDB to hang there. This makes GDB basically unusable, except for certain cases where you can attach to a running core, which will then proceed to work until an EXC_BAD_ACCESS is triggered again.
  3. INT3 traps are not reliably delivered. Another example of where a signalling mechanism is used to handle slightly unusual, but totally expected, cases is the use of the INT3 trapping mechanism. SBCL uses this for error handling and, especially, as part of the mechanism for achieving fast atomic operations without having to go to the kernel for a lock. INT3 traps basically work on MacOS, except when they don't, which is about 2-5% of the time. Basically the trap signal is just lost and is not reliably delivered to the signal handler. This causes all sorts of problems for a system that expects to get these traps and was the source of major headaches in the MacOS X/Intel porting effort.
  4. sem_init is not implemented. sem_open is implemented, but this takes a pathname and is a much more expensive call than creating an anonymous semaphore. The ironic thing is that the underlying Carbon and Mach APIs do support anonymous semaphores, along with the machinery (one presumes) that associates file system path names with Mach semaphores for use with sem_open. It would seem trivial to support sem_init.
  5. The mach semaphores are a private API. Here's a useful API for doing semaphore stuff, but for some reason it's a private API. One can use Carbon semaphores, and can link in the Carbon framework, but this seems rather unnecessary.
  6. Problems freeing a thread's stack(?). If I provide a stack for a thread to use with pthread_attr_setstack or pthread_attr_setstackaddr, I see major problems with the threads test suite if I free the stack. If I don't free the stack, the test suite is happier, until the kernel panics. See below.
  7. Kernel panics. I have seen quite a number of kernel panics caused by (one presumes) use or misuse of the pthreads and semaphore APIs. I could understand it if this were a KEXT or even a root process, but these are all user processes. No user process should be able to so easily cause kernel panics.
  8. No futexes. MacOS has a whole bunch of different locking APIs, none of which seem as nice as futexes. It would be great if the kernel (or a KEXT?) could provide futex support for MacOS. I'm assuming that it's not just SBCL and that other language environments, databases, and other highly-concurrent applications will take advantage of futexes and their efficient locking properties on Linux. It will be a shame if those applications are relegated to only using pthread condition variables and mutexes on MacOS X.
  9. No POSIX RT signals. The POSIX RT signalling stuff provides for much more reasonable behavior of the delivery of signals than the original POSIX stuff. We have had to jump through hoops to get the threaded version of SBCL as far as it is without RT signals. It would be great if future versions of MacOS X supported RT signals (I would be happy to trade Spotlight for RT signals, although I have a hard time seeing a guy in jeans and a black turtleneck running around a stage talking about how great POSIX RT signals are :) ).
  10. LDTs are not reused even after being freed. This is a rather obscure bug, and certainly one can manage one's own set of LDTs. But the LDT API provides a way to return these to the OS, and yet they are not recycled, so one runs out of LDTs after 0x2000 LDTs, or so, have been set up.
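As an illustration of item 4, one common way to approximate an anonymous semaphore when only sem_open is available is to create a named semaphore and unlink the name right away. This is a sketch, not what SBCL does: the helper names anonymous_sem_create and anonymous_sem_destroy are made up, the naming scheme is an assumption, and the approach still pays the cost of sem_open that the list complains about.

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns a semaphore whose name has already been
 * removed, or SEM_FAILED on error. The static counter is not thread-safe;
 * a real version would need to protect it. */
sem_t *anonymous_sem_create(unsigned int initial_value)
{
    static unsigned int counter;
    char name[32];
    int attempt;

    for (attempt = 0; attempt < 16; attempt++) {
        /* Build a short, process-specific name and insist on creating it. */
        snprintf(name, sizeof name, "/anon-%ld-%u", (long)getpid(), counter++);
        sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, initial_value);
        if (sem != SEM_FAILED) {
            /* Drop the name immediately; the semaphore itself stays
             * usable through the returned handle until sem_close. */
            sem_unlink(name);
            return sem;
        }
        if (errno != EEXIST)
            break;          /* a real error, not just a name collision */
    }
    return SEM_FAILED;
}

/* Releases the semaphore created above. */
void anonymous_sem_destroy(sem_t *sem)
{
    sem_close(sem);
}
```

The returned handle can be used with sem_wait, sem_trywait and sem_post as usual. A native sem_init would avoid both the name generation and the trip through the namespace.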

Anyway, perhaps this degree of systems inadequacy is present on all operating systems, and perhaps there is a certain amount of pain in making things work properly on any new OS/architecture environment. But when one compares the UNIX guts of OS X to the guts of, say, Linux or Solaris, OS X appears lacking and makes life difficult for the UNIX application programmer. I'm sure there are many benefits to Apple's microkernel architecture, but there's still a long way to go before it catches up to the other modern UNIXes, at least inasmuch as one treats MacOS as a UNIX, which is what is needed for the low-level parts of a sophisticated, cross-platform language environment such as Lisp in general and SBCL in particular. If these areas were addressed, it would be easy for SBCL developers (those who use SBCL, not just those who develop and maintain SBCL) both to develop cross-platform Lisp tools and to use MacOS X's sophisticated features, such as Cocoa, QuickTime, Aqua, Bonjour, etc., to develop first-class MacOS applications using SBCL.

On the positive side, this MBP is blazingly fast for SBCL development. It compiles all of SBCL in a hair longer than it takes my desktop x86-64 box on Linux, and in half the time of a 2x2GHz G5 desktop.

Resolving the x86/darwin (not-)trapping problem Lisp

Ok, I think I've got a fix for the stability problems I was seeing with x86/darwin SBCL. At first everything looked great, then I noticed some sporadic failures. Doing things like running SBCL while the system load was high seemed to exacerbate the problem and increase the likelihood of failing with a SIGSEGV. After much debugging, telepathic and otherwise, and thanks to the help of Juho Snellman, Alastair Bridgewater and the rest of the #lisp crew, it became apparent that the problem was that the SIGTRAP handler wasn't reliably being called. I made some test cases that showed this to be the case, independent of SBCL, and that also demonstrated that the problem exists with Mach exception handlers as well. So, now what?

Well, thankfully Andrew Pinski and Alastair Bridgewater both suggested using an x86 instruction that generates a SIGILL in place of the one that generates a SIGTRAP. Sure enough, that seems to ensure that the signal handler is reliably called. This means that SBCL on x86/darwin finally seems to work, for real this time. Knock on wood... I'll commit the changes and roll a binary for public consumption sometime in the next day or so.

Thanks to everyone who helped me debug and work around this problem. It would be great if Apple would consider making sure that the SIGTRAP handler is reliably called when an 0xCC instruction is encountered on x86. I've got testcases if you want them.

Also, if anyone knows how to get at and modify the EIP inside of a mach exception handler, let me know. I suppose digging through the GDB sources should provide some insight.

Experimental x86/darwin support in SBCL source tree Lisp

Experimental support for x86/darwin has been added to the SBCL source tree. No need to use my patch anymore, just grab the latest from source. Of course one has to cross-build at this point. I'll put up a binary release shortly that will enable folks to grab the release and the source and build it themselves without resorting to a cross-build.

x86/darwin patch update Lisp

Ok, this patch seems to work pretty well. Feedback and more testing welcome.

There's one remaining issue, which is that we should consider using a sigaltstack so that signal handlers get a stack that is properly aligned (to 16 bytes, per the ABI). Currently, we don't do so and this might leave the door open for bad things to happen.
