Paul Makepeace ;-)

July 5, 2005

MT-Blacklist switch to mod_perl

Posted in: Movable Type

MT-Blacklist is dog slow when run as a standard CGI script, even on a fast machine, with (in our case) 3,500+ entries. The time goes on parsing the YAML blacklist: about 10s here. That 10s pause is passed on to commenters, who often bang the Submit button twice, presumably (and quite reasonably) wondering WTF is happening.

MT-Blacklist isn't mod_perl friendly. I don't really understand people writing web apps in Perl without expressly thinking "I need to write this as a mod_perl or even Apache::Registry app" but hey ho. I managed to get MT-BL mostly working under mod_perl by doing something quite sneaky; see below.

The problem is that this is old, unsupported software, so there's a question over the ongoing benefit of these hacks. Even MT-BL's author appears to be trialling life without MT-BL in favour of the apparently excellent SpamLookup, an MT3 phenomenon (which incidentally includes many features I'd done prototypes for last year. Brad Choate's gone beyond that even; cool!).

In summary, where this goes from here is really down to the hosted users.

How it works

[Skip to "Where next, then?" if tech ain't for you.]

The problem is that MT-BL uses CGI's HTML generation code as well as the usual query/POST parsing. Apache::Request does the latter in a drop-in stylee but not the former. So we need to intercept calls and see if A::R can do it, else pass over to CGI.

We make a stub class,

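    package paulm::Request;
    use strict;
    use CGI;
    use Apache::Request;
    our $AUTOLOAD;

    sub AUTOLOAD {
        my ($sub) = $AUTOLOAD =~ /.*::(.*)/;
        return if $sub eq 'DESTROY';   # never forward object destruction
        if (Apache::Request->can($sub)) {
            no strict 'refs';
            # [the line that hands $sub over to Apache::Request is missing from this listing]
        } elsif (CGI->can($sub)) {
            # HTML generation and anything else A::R lacks falls back to CGI
            shift; CGI->$sub(@_);
        } else {
            die "Can't do $sub\n";
        }
    }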

Then in blacklist.cgi a one-liner:

    bless $app->{query}, 'paulm::Request' if $ENV{MOD_PERL};
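
For the curious, here is a minimal, self-contained sketch of the same try-one-class-then-fall-back dispatch. It is not the MT-BL patch itself: Demo::Fast, Demo::Full and Demo::Request are made-up stand-ins for Apache::Request, CGI and the stub class, so it runs without Apache.

```perl
use strict;
use warnings;

# Stand-in for Apache::Request: only knows how to read parameters.
package Demo::Fast;
sub param { my ($class, $name) = @_; return "fast:$name" }

# Stand-in for CGI: reads parameters and also generates HTML.
package Demo::Full;
sub param { my ($class, $name) = @_; return "full:$name" }
sub h1    { my ($class, $text) = @_; return "<h1>$text</h1>" }

# The stub: try the fast class first, then fall back to the full one.
package Demo::Request;
our $AUTOLOAD;

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $self = shift;
    my ($sub) = $AUTOLOAD =~ /.*::(.*)/;
    return if $sub eq 'DESTROY';
    for my $class (qw(Demo::Fast Demo::Full)) {
        if (my $code = $class->can($sub)) {
            return $class->$code(@_);
        }
    }
    die "Can't do $sub\n";
}

package main;

my $q = Demo::Request->new;
print $q->param('author'), "\n";   # handled by Demo::Fast
print $q->h1('Comments'), "\n";    # falls through to Demo::Full
```

Both stand-ins define param, and the order of the class list decides which one wins, which is the whole point of checking the cheaper class first.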

There's more though: so far there's no benefit, as the YAML is still parsed on every request. We need to do the parsing once in the Apache parent process so each child inherits a copy. While we're at it, pull in various modules in the parent so they're preloaded and shared too.

        use lib qw(/home/mt/cgi-bin/lib /home/mt/cgi-bin/extlib);
        use MT;
        use Yaml; # MTBL's broken idea of what YAML is
        use jayallen::Blacklist;
        $jayallen::Blacklist::_cache->{blacklist} = jayallen::Blacklist::_getBlacklist();

We're still not out of the woods. For reasons I haven't quite figured out (this was a quick hack after all), that internal blacklist array's first element is futzed with on each request. The effect of this is devastating: Perl initiates a copy-on-write for the whole array, blowing away the shared memory (although we still save on the initial load). Before long we have half a dozen Apache children scampering around with 30MB of RSS each. Dammit!

Further, we need to add some code to check whether another process has written to the blacklist, and if so reload it. Either that or do some IPC/shmem/Cache::Cache/etc tricks. At this point it's feeling less like a quick cheeky hack and more like a Real Project, which on senescent software is a questionable use of time.
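
One cheap way to do the "did someone else write to it?" check is to compare the file's modification time on each request. This is only a sketch under assumptions: it pretends the blacklist is a plain file with one entry per line, and %cache, load_blacklist and current_blacklist are hypothetical names, not anything in MT-BL.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical per-process cache: parsed entries plus the mtime of the
# file they were parsed from.
my %cache = (mtime => 0, entries => []);
my $loads = 0;

# Stand-in parser: one entry per line. The real (slow) loader would go here.
sub load_blacklist {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't read $path: $!\n";
    chomp(my @entries = <$fh>);
    close $fh;
    $loads++;
    return \@entries;
}

# Return the cached entries, reparsing only if the file changed on disk.
sub current_blacklist {
    my ($path) = @_;
    my $mtime = (stat $path)[9];
    die "Can't stat $path: $!\n" unless defined $mtime;
    if ($mtime != $cache{mtime}) {
        $cache{entries} = load_blacklist($path);
        $cache{mtime}   = $mtime;
    }
    return $cache{entries};
}

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "spam.example\n";
close $fh;

current_blacklist($path) for 1 .. 3;   # parsed once, then served from cache
print "loads after 3 requests: $loads\n";

# Simulate another process appending an entry a little later.
open my $out, '>>', $path or die "Can't append to $path: $!\n";
print {$out} "junk.example\n";
close $out;
my $later = time + 10;
utime $later, $later, $path;           # force a visibly newer mtime

my $entries = current_blacklist($path);
print "loads after the write: $loads, entries: ", scalar @$entries, "\n";
```

The stat on every request is far cheaper than a reparse, but mtime only has one-second resolution, so two writes within the same second can be missed.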

Where next, then?

Next moves are properly investigating MT3; other blog software; getting MT-BL to serialise its BL (I think it oughta; no idea how?); or backporting some of SpamLookup, though I don't even know if the license permits it (not that I'm particularly worried: I've already written a load of DNSBL code and even run a DNSBL). I think I'll let the users decide...
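
On the serialising option, one possible route (an assumption on my part, not something MT-BL is known to do) is Perl's core Storable module: parse once, write the parsed structure out in binary form, and read that back on later requests. The data layout and file name here are made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

# Hypothetical parsed blacklist; the real structure inside MT-BL may differ.
my $blacklist = [
    { pattern => 'spam\.example', type => 'regex'  },
    { pattern => 'junk.example',  type => 'string' },
];

my $file = tempdir(CLEANUP => 1) . '/blacklist.sto';

# Pay the parsing cost once, then write the parsed structure out.
nstore($blacklist, $file);

# Later requests read the binary form back instead of reparsing.
my $restored = retrieve($file);
print scalar(@$restored), " entries restored\n";
print "first pattern: $restored->[0]{pattern}\n";
```

nstore writes in network byte order, so the file stays readable if it is ever moved between machines.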

The upsides are that I know even more about the MT code-base, fixed a couple of my minor misconceptions about mod_perl, have my own MT2 running under mod_perl, which is really nice, and had the chance to write a brief flash of non-boring Perl :) It's the little things in life...

Posted by Paul Makepeace at July 5, 2005 19:58