Anger management 2

Currently, I’m angry about: FreeBSD’s SMP support in combination with walrus.boinkor.net (the machine hosting this blog, lemonodor, sbcl boinkmarks and a few other things).

The reason for this anger is that every now and then (e.g., yesterday and this morning), it stops working. This is very interesting to watch (the first few times, anyway): the machine answers ICMP ECHO requests, and it is possible to open TCP connections. What doesn’t work is spawning new processes. That means no HTTP request, no shell command line, and not much else ever gets processed. It also means that already-running processes like top(1), IRC bouncers, etc. continue to work until they try something stupid like fork(2).
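For the record, here is roughly how one could tell this state apart from an ordinary outage. It is only a sketch, not what actually runs on walrus: the function names (`tcp_connect_ok`, `spawn_ok`, `classify`) and the timeouts are made up for illustration. One caveat: the timeout in `spawn_ok` only covers the child after it has started, so if fork(2) itself blocks in the kernel, the probe simply hangs too (which an outside monitor could treat as the same symptom).

```python
import socket
import subprocess
import sys


def tcp_connect_ok(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def spawn_ok(timeout=5.0):
    """Return True if a trivial child process can be started and exits cleanly.

    Assumption: starting another Python interpreter is a fair stand-in for
    "can this machine still create processes at all".
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", "pass"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def classify(tcp_ok, can_spawn):
    """Map the two probe results onto the symptoms described above."""
    if tcp_ok and can_spawn:
        return "healthy"
    if tcp_ok:
        return "network up, process creation stuck"
    if can_spawn:
        return "processes fine, network down"
    return "down"
```

The spawn check has to run from a process that is already alive on the machine (much like the lingering top(1)), while the TCP check can run from anywhere.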

top(1) output is also very interesting to watch: it shows processes that never leave the “sbwait” state. My current explanation is that there is a race condition that leads to a coarse-grained lock never being unlocked. A Big Giant Deadlock!
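To make the "lock never being unlocked" theory concrete, here is a toy version of that failure mode in user space. It is in no way FreeBSD kernel code, and the race is replaced by a deterministic bug (an early return that skips the release) so that it reproduces every time; `giant`, `buggy_worker`, and `try_work` are hypothetical names.

```python
import threading

# Hypothetical stand-in for a single coarse-grained lock.
giant = threading.Lock()


def buggy_worker(fail):
    """Take the lock, and on the failure path forget to release it."""
    giant.acquire()
    if fail:
        return  # bug: leaves the lock held forever
    giant.release()


def try_work(timeout=0.2):
    """Try to do anything that needs the lock; report whether it got through."""
    if not giant.acquire(timeout=timeout):
        return "stuck"
    giant.release()
    return "ok"
```

Before the bad path runs, `try_work()` gets through. Once `buggy_worker(True)` has run in some thread, every later caller waits on the lock and never gets it, even though the thread that took it is long gone. Anything that never needs the lock carries on as if nothing happened.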

My previous prime suspect was the Adaptec driver, but that doesn’t seem right: after walrus starts misbehaving, data is still written to and read from disk just fine. Having run out of ideas, I am now preparing to replace both the hardware (with something that contains only one CPU) and the operating system running on it (with something that is easier to maintain).

This will hopefully lead to a set of problems that is less frustrating to deal with.