A Better Getmail IDLE client

Updates

(2013-10-15) And like that I've broken it again. Fixing the crash on IMAP disconnect actually broke IMAP disconnect handling. The problem here is that IMAPClient's exceptions are not documented at all, so a time-based thing like IDLE requires some guessing as to what IMAPClient will handle and what you need to handle yourself. This would all be fine if there were a way to get Gmail to boot my client after 30 seconds so I could test it easily.

I've amended the code so that any time it would call _imaplogin() it explicitly discards the IMAPClient object after trying to log it out, and recreates it. As near as I can tell this is the safe way to do it, since the IMAPClient object opens a socket connection when created, and doesn't necessarily re-open it if you simply re-issue the login command.
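As a rough illustration of that discard-and-recreate pattern (not the code from the Gist), here is a minimal sketch. The names reconnect and make_client are hypothetical; make_client stands in for whatever builds the connection, for example a lambda wrapping imapclient.IMAPClient with your host. Only the login() and logout() methods of the client are used.

```python
def reconnect(old_client, make_client, username, password):
    """Throw away old_client and return a freshly logged-in replacement.

    make_client is a hypothetical zero-argument factory, e.g.
        lambda: imapclient.IMAPClient("imap.example.com", ssl=True)
    Creating a new object forces a new socket instead of reusing a
    connection that may already be dead.
    """
    if old_client is not None:
        try:
            old_client.logout()
        except Exception:
            # The old connection may already be gone. We are discarding
            # the object either way, so a failed logout is not fatal.
            pass
    client = make_client()
    client.login(username, password)
    return client
```

Passing the factory in, rather than hard-coding the constructor, also makes it possible to exercise the reconnect path with a fake client instead of waiting for the server to drop the connection.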

There's an ongoing lesson here that doing anything that needs to stay up with a protocol like IMAP is an incredible pain.

(2013-10-14) So after 4 days of continuous usage I'm happy with this script. The most important thing it does is crash properly when it encounters a bug. I've tweaked the Gist a few times in response (a typo meant imaplogin didn't recover gracefully) and added a call to notify_mail on exit which should've been there to start with.

It's also becoming abundantly clear that I'm way too click-happy with publishing things to this blog, so some type of interface to show my revisions is probably in the future (along with a style overhaul).

Why

My previous attempt at a GetMail IDLE client was a huge disappointment, since imaplib2 seems to be buggy for handling long-running processes. It's possible some magic in hard terminating the IMAP session after each IDLE termination is necessary, but it raises the question of why the idle() function in the library doesn't immediately exit when this happens - to me that implies I could still end up with a zombie daemon that doesn't retrieve any mail.

Thus a new project - this time based on the Python imapclient library. imapclient uses imaplib behind the scenes, and seems to enjoy a little bit more use than imaplib2, so it seemed a good candidate.

The script

Dependencies

The script has a couple of dependencies, most easily installed with pip:

$ pip install psutil imapclient

Get it from a Gist here - I'm currently running it on my server, and naturally I'll update this article based on how it performs as I go.

Design

The script implements a Unix daemon, and uses pidfiles to avoid concurrent executions. It's designed to be stuck in a crontab file to recover from crashes.
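To make the pidfile idea concrete, here is a minimal sketch of the check a cron-launched copy could run at startup. It is an illustration under assumptions, not the script's actual logic: acquire_pidfile and pid_is_running are hypothetical names, and it uses os.kill(pid, 0) from the standard library rather than psutil. It is Unix-only and has a small race between the check and the write.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (Unix only)."""
    try:
        os.kill(pid, 0)          # signal 0 checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True              # exists, but belongs to another user
    return True


def acquire_pidfile(path):
    """Return True if we now own the pidfile, False if another copy is alive."""
    try:
        with open(path) as f:
            old_pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        old_pid = None           # no pidfile, or unreadable contents
    if old_pid is not None and old_pid != os.getpid() and pid_is_running(old_pid):
        return False
    # Either nothing was running or the pidfile was stale: take it over.
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return True
```

With a check like this, the crontab entry can fire every few minutes: if the daemon is alive the new copy exits immediately, and if it has crashed the stale pidfile is replaced and the daemon starts again.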

I went purist on this project since I wanted to avoid as many additional frameworks as possible and work mostly with built-in constructs - partly as just an exercise in what can be done. At the end of the day I ended up implementing a somewhat half-baked messaging system to manage all the threads based on Queues.

The main thread, being the listener for signals, creates a "manager" thread, which in turn spawns all my actual "idler" threads.

Everything talks through Queue.Queue() objects and blocks on the get() method, which sleeps until a message arrives instead of burning CPU in a polling loop. The actual idle() function, being blocking, runs on its own thread and posts "new mail" events back to the idler thread, which then invokes getmail.
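A stripped-down sketch of that message loop might look like the following. It is not the Gist's code: the STOP sentinel, the idler signature and the fetch callback (standing in for the call that runs getmail) are all made up for illustration, and it uses the Python 3 module name queue where the original uses Queue.

```python
import queue       # this module is called Queue in Python 2
import threading

STOP = object()    # hypothetical sentinel telling an idler to exit


def idler(name, inbox, fetch):
    """Block on inbox; call fetch(name) for every "new mail" event."""
    while True:
        event = inbox.get()      # sleeps here until something is posted
        if event is STOP:
            break
        if event == "new mail":
            fetch(name)          # the real script would invoke getmail


def run_demo():
    """Start one idler, post one event, shut it down, return what ran."""
    fetched = []
    inbox = queue.Queue()
    worker = threading.Thread(
        target=idler, args=("account-1", inbox, fetched.append))
    worker.start()
    inbox.put("new mail")
    inbox.put(STOP)
    worker.join(timeout=5)
    return fetched
```

Because every thread only ever waits on its own queue, shutting down is just another message, and nothing needs to poll a flag on a timer.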

The biggest challenge was making sure exceptions were caught in all the right places - imapclient has no way to cleanly kill off an idle() process, so a shutdown involves causing the idle_check() call to raise an exception.
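One way to structure the blocking side, sketched under assumptions: the loop below treats any exception out of idle_check() as the end of the IDLE session and reports it on a queue, leaving it to the owner to decide whether that was a requested shutdown or a real failure. The function name watch_idle and the event tuples are hypothetical, and how the exception gets triggered is deliberately left out. The client only needs the idle() and idle_check(timeout=...) methods that IMAPClient provides.

```python
import queue


def watch_idle(client, events, timeout=30):
    """Run the blocking IDLE loop and post what happens to a queue.

    client is expected to behave like an IMAPClient that is already
    logged in with a folder selected. The loop never exits normally;
    it ends only when idle_check() raises.
    """
    client.idle()
    try:
        while True:
            responses = client.idle_check(timeout=timeout)
            if responses:                    # empty list means a timeout
                events.put(("new mail", responses))
    except Exception as exc:
        # Shutdown and genuine failure both arrive here as exceptions.
        events.put(("idle ended", exc))
```

The important property is that the thread never dies silently: whatever kills the IDLE loop ends up as a message the rest of the program can see and act on.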

I kind of hacked this together as I went - the main thing I really targeted was trying to make sure failure modes caused crashes, which is hard to do with Python threading a lot of the time. A crashed script can be restarted; a zombie script doing nothing looks like it's correctly alive.
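The reason this is hard is that an uncaught exception in a Python thread only ends that thread, while the rest of the process carries on. A minimal sketch of one way around it, with hypothetical names crash_reporting and wait_for_failure, is to wrap every thread target so failures land on a queue the main thread is watching:

```python
import queue
import threading
import traceback


def crash_reporting(target, errors):
    """Wrap a thread target so an uncaught exception is reported, not lost."""
    def runner(*args, **kwargs):
        try:
            target(*args, **kwargs)
        except BaseException:
            # Hand the traceback to whoever is watching the errors queue.
            errors.put(traceback.format_exc())
    return runner


def wait_for_failure(errors, timeout=None):
    """Return the first reported traceback, or None if the timeout passes."""
    try:
        return errors.get(timeout=timeout)
    except queue.Empty:
        return None
```

The main thread can then block in wait_for_failure(), log whatever comes back and exit with a non-zero status, so cron sees a dead process and starts a fresh one instead of leaving a zombie behind.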

Personal thoughts

Pure Python is not the best for this sort of thing - an evented IMAP library would definitely be better but this way I can stick with mostly single file deployment, and I don't want to write my own IMAP client at the moment.

Of course IMAP is a simple enough protocol in most respects, so it's not like it would be hard but the exercise was still interesting. But if I want a new project with this, I would still like to tackle it in something like Haskell.