Cyrus' New Completely Useless Blog

Interesting article by John Timmer Science

In this article, John Timmer offers some interesting commentary on the recent Science articles by Pevzner and Shamir and by Robeva and Laubenbacher. Worth a read.

New hunchentoot-auth, hunchentoot-vhost, hunchentoot-cgi and nuclblog releases Lisp

To mark the milestone of finally bringing hunchentoot-auth, hunchentoot-vhost, hunchentoot-cgi and nuclblog into the present, such that they work properly with the hunchentoot-1.0 release, I've rolled up the following packages:

Of course these are probably best gotten from the git repos, but for those of you who like released versions, I figured I'd roll new ones since it had been quite some time (almost two years in some cases!).

hunchentoot and sb-ext:run-program Lisp

Well, after months of instability with my hunchentoot-based webserver, I finally, once again, got around to trying to figure out what the source of the instability was. I had come to blame SBCL's sb-ext:run-program functionality as I was able to fairly reliably crash the server using apachebench. I was also seeing sporadic crashes, somewhat randomly, after the server had been up for a week or so. So, this was a pretty strong hint that it had something to do with sb-ext:run-program. Folks who were much more knowledgeable than I about the SBCL internals, including Francois-Rene Rideau and Gabor Melis, looked at cleaning up possible sources of race conditions and generally robustifying sb-ext:run-program, but none of the fixes seemed to make the situation better. Compounding my difficulties was the fact that I was running the server on FreeBSD, which doesn't see quite the level of SBCL testing/hacking that, say, Linux does, so I thought it possible that there might be a bug either in the way SBCL handles signals on FreeBSD or in FreeBSD itself.

Finally, I got around to replicating, roughly, my setup on another computer, in this case a MacOS box which, when subjected to the same stressful conditions, gave me a helpful error message that said something about being unable to open a pipe or perhaps that there were too many open pipes. This got me thinking "wait a minute, I'm just calling the program via sb-ext:run-program and getting a stream to read data back from the program; who's closing the stream and getting rid of the process?" Then it dawned on me that perhaps nobody was, and perhaps these processes were sticking around, consuming scarce resources, like pipes, and, eventually, causing the server to crash. Sure enough, waiting for the process to finish and then closing the process cleared up my problem.

I should point out that SBCL's sb-ext:run-program has an argument that seems relevant here, which is the :wait argument. One can specify :wait t, which will wait until the process has finished. This seemed to work in some cases, but fail in others. Eventually, it occurred to me that it was failing in the cases where the output was larger than in the cases where it was succeeding. I think what was going on was that the external program was writing data to the stream, which would fill up some buffer; the program then blocked waiting for that data to be read, which wasn't going to happen until after the process returned. There could be something else going on here, but it seems to me that :wait t, while somewhat in spirit what I want, isn't going to do it for me. In this case, I'm just launching a process and expecting to get some data back from it; this isn't, say, a window manager that's going to live on for the life of the SBCL process, or beyond. So, since :wait t didn't do what I needed, I was back to :wait nil. Now that I had figured out I needed to close the process, I came up with:

(defmacro with-input-from-program ((stream program program-args environment)
                                   &body body)
  "Creates a new process running PROGRAM, using PROGRAM-ARGS as the
list of arguments to the program and ENVIRONMENT as its
environment. Binds the variable STREAM to an input stream from which
the output of the process can be read and executes BODY as an
implicit progn."
  #+sbcl
  (let ((process (gensym)))
    `(let ((,process (sb-ext:run-program ,program
                                         ,program-args
                                         :output :stream
                                         :environment ,environment
                                         :wait nil)))
       (when ,process
         (unwind-protect
              (let ((,stream (sb-ext:process-output ,process)))
                ,@body)
           ;; Cleanup: wait for the child to exit, then close its
           ;; streams so the pipes are released.
           (sb-ext:process-wait ,process)
           (sb-ext:process-close ,process)))))
  #-sbcl
  `(error "Not implemented yet!"))

which I can use à la with-input-from-string to read the data from the external process:

(with-input-from-program (in path nil env)
  (loop for line = (chunga:read-line* in)
     until (equal line "")
     do (destructuring-bind
              (key val)
            (ppcre:split ": " line)
          (setf (hunchentoot:header-out key) val)))
  (let ((out (flexi-streams:make-flexi-stream
              (tbnl:send-headers)
              :external-format tbnl::+latin-1+)))
    (copy-stream in out 'character)))

and now the server seems a lot happier.
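
The buffer problem with :wait t suggests a general ordering rule: read everything the child writes first, and only then wait for it. Here is a minimal sketch of that ordering for the case where the whole output fits in a string. The name run-and-collect is hypothetical (it is not part of any of the packages above), and I'm assuming line-oriented text output and a full path to the program, since sb-ext:run-program doesn't search PATH by default.

```lisp
;; Hypothetical helper, not from the original code: drain the child's
;; output completely *before* waiting on it, so a full pipe buffer can
;; never block the child while we are blocked in PROCESS-WAIT.
(defun run-and-collect (program args)
  "Run PROGRAM with ARGS; return its output as a string and its exit code."
  #+sbcl
  (let ((process (sb-ext:run-program program args
                                     :output :stream
                                     :wait nil)))
    (unwind-protect
         (let ((output
                 (with-output-to-string (out)
                   ;; READ-LINE returns NIL at end of file, i.e. once the
                   ;; child has closed its end of the pipe.  Note that
                   ;; WRITE-LINE always terminates the last line.
                   (loop for line = (read-line (sb-ext:process-output process)
                                               nil nil)
                         while line
                         do (write-line line out)))))
           ;; All output has been consumed, so waiting is now safe.
           (sb-ext:process-wait process)
           (values output (sb-ext:process-exit-code process)))
      ;; Release the pipe file descriptors even on a non-local exit.
      (sb-ext:process-close process)))
  #-sbcl
  (error "RUN-AND-COLLECT is only sketched for SBCL."))
```

Something like (run-and-collect "/bin/ls" '("-l" "/tmp")) should then return the listing and the exit code as two values. Unlike the macro, this buffers the whole output in memory, so it's only a reasonable choice when the output is small.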

Chemistry Science

So now that I have more than a passing interest in chemistry (and therefore cheminformatics/chemi-informatics/chemoinformatics/whatever-you-call-it), I'd like to see what the state of the art is for representing chemical information and if there are any decent libraries for working with these representations. At first glance, I'm in luck. There's CML, the Chemical Markup Language, there's the Blue Obelisk set of projects for open source/open data/open standards in chemistry, and there's the CDK, the Chemistry Development Kit. This all sounds promising. Let's dive in.

Working in reverse order, let's start with CDK. Google CDK and it shows up as the first hit -- even before the Cyclin Dependent Kinases. This is good. Now let's follow the link. Uh oh. It's a link to sourceforge. That's alright, we'll click through and hope for the best. Ah, not only is it on sf.net, but it's a wiki site: http://apps.sourceforge.net/mediawiki/cdk/index.php?title=Main_Page.

Ugh. Alright, I'll try to overcome my biases and keep plugging away. It's not that all wikis are bad, but rather that, IMO, they are a poor substitute for a properly designed web site for a project. They certainly have a place, but the idea that all web content gets wiki-ized can lead to some rather difficult-to-follow web pages, again, IMO. The wikipedia example is a good counterexample to my claim, but, most other wiki sites don't have the complete, polished feel of wikipedia. In any event, let's keep plugging ahead with CDK.

Ah, here we go. Two publications in the peer-reviewed literature. This should help give us an overview of what CDK has and where it is going. One is in the Journal of Chemical Information and Modeling (although it seems that when the article was published it was called the Journal of Chemical Information and Computer Sciences). Alright, sounds promising. Click through the DOI link, which takes us to a page of the American Chemical Society, the world's largest scientific society. Surely, this being a paper about an open-source toolkit and ACS being a society for the betterment of society, this is going to be an open-access journal, right? Or at least an open-access publication in a mixed-access journal, right? Click on the link to get the PDF... Get PDF -- WRONG! $30 for 48 hours of access. Damn. Ok, well, let's get the other paper, there were two on the CDK wiki. The next one is in something called Current Pharmaceutical Design. Uh oh. This doesn't sound promising. And, sure enough: "The full text electronic article is available for purchase. You will be able to download the full text electronic article after payment. $55.10 plus tax." This isn't getting us anywhere. At least CDK is open source. Let's go get the source. Well, first let's browse the documentation anyway.

Click over to the documentation link on sourceforge: http://apps.sourceforge.net/mediawiki/cdk/index.php?title=Documentation. How is the documentation different than what I was looking at before, or what's the difference between the documentation and the main page? Who knows. In any case, this looks promising: "A great source of CDK documentation or introductory reading is the CDK News, the quarterly newsletter of the CDK team." Click through. Ok, there's a picture of the (presumably) most recent issue, which is a link to the table of contents and a note about getting the PDF: "The full issue can be downloaded as PDF from http://sf.net/projects/cdk/". Hmm... Ok, click through that... And we're back on sf.net. Oh wait, that's not a link. Just some text. Cut the URL and paste into the nav bar in the browser... Hmm. Now we're at another sf.net page. So far we've got the "main" page, the "documentation" page and now the, presumably, "project" page. But now that we're there, we see that there is no mention of CDK News on this page. Damn. Alright, let's start clicking and see what we find. Ok, under the "Download -- Browse All Packages" link we get to a page that has CDK News on it. Maybe now we're getting somewhere. Click through that and we have a nice list of the various "Releases" of CDK News (this seems like an abuse of the Release mechanism, if you ask me -- these aren't different versions of the same thing, rather distinct issues, all of which should live on, but, OK, I think I see what they did there). Let's start at the beginning. Click on 1/1. Hmm... That just expands the HTML a bit to show another link for cdknews1.1.pdf. Ok, click that. Ah yes, now I remember why I hate sf.net. Do I get the PDF in my browser? NO! I get a window with a whole bunch of orange reminding me of the name of the web site I now hate so much, some links to "share" the project (whatever the hell that means!), for related stuff and for forums (yet another set of pages to not get the info I want?), some google ads and some guy who looks like he hasn't slept in a week carrying stacks of cash, presumably, in an ad for nortel. Oh, and a link telling me who's providing this oh-so-handy mirror. Oh, and I almost forgot, some nice sponsor links! As opposed to the ads, I suppose. Where's my damn PDF? Who knows. Ah, this is helpful. Please use this "direct link": http://voxel.dl.sourceforge.net/sourceforge/cdk/cdknews1.1.pdf. Why the hell didn't they just give me that link in the first place??? Ok, finally got CDK News 1.1. I suppose a (direct) link off of the CDK home page (or at least one of them) would have been too helpful for irritating newbies like myself. Ok, time to go read the first CDK News, since I can't get the peer-reviewed articles about the project. More later.

retrospectiff Lisp

The TIFF image file format has been around a long time, and lisp even longer. Yet, I couldn't find any common lisp libraries for reading TIFF images. Perhaps there's one out there I missed, but the only one I could find was my previous attempt, tiff-ffi, which consists of some wrapper functions around FFI calls to libtiff. I wanted a native common lisp TIFF library that wouldn't require the libtiff library so, at Robert Strandh's urging, I wrote retrospectiff.

The retrospectiff git repository can be found here.

Currently, there is no support for writing TIFF files, and only a fraction of the TIFF image formats are supported, but RGB and ARGB images, both uncompressed and with LZW compression, can be read. Grayscale support should come next and, hopefully, support for writing TIFF files before too long.

Enjoy.

Fortune General

From tonight's Kirin takeout: "You will be showered with good luck before your next birthday." Well, let's hope so anyway.

git migration Lisp

Ok, so I've finally gotten around to moving (at least some of) my projects over to git. The good news, besides having the code in a modern version control system, is that the repos are now publicly accessible. The list of projects can be found at:

http://git.cyrusharmon.org/cgi-bin/gitweb.cgi

Yet more hunchentoot-{auth,vhost} and nuclblog Lisp

Ok, a number of bugs and design flaws have been fixed. One can now be logged into multiple blogs on the same server with different user names and can log in and out of one without affecting the status of the other blogs. There's also some internal API cleanup for the blog handler functions. Oh, also the realm stuff in hunchentoot-auth was simplified and nuclblog now does a better job of keeping track of the information regarding which ports to use.

Comparative Analysis of Spatial Patterns of Gene Expression in Drosophila melanogaster Imaginal Discs Computational Biology

The slides from my recent RECOMB2007 talk Comparative Analysis of Spatial Patterns of Gene Expression in Drosophila melanogaster Imaginal Discs can be found here.

SBCL/x86-64/darwin Lisp

After some more help from Juho Snellman in tracking down some nasty bugs, including one in the debugging code down in print.c, I was able to get sbcl/x86-64/darwin up and running without the sb-ext:*evaluator-mode* hacks. This experimental version has been checked into the SBCL tree as version 1.0.3.16 and most of the tests pass. There are still test failures in float.pure.lisp, debug.impure.lisp, foreign-stack-alignment.impure.lisp and run-program.impure.lisp.

Test reports on x86-64/darwin (and other platforms to make sure I didn't break anything) are most welcome.
