Fix for random frequent freezes in GTK applications (Debian 'testing')

Sometimes software configuration problems are solved thanks to a deep insight into how things work and methodical experimentation. At other times they are solved through a shotgun approach in which hunches, sheer luck and brute force are combined to get rid of a problem without ever truly understanding its cause. This leaves no useful information behind for avoiding the problem in the future, but maybe we don't care. The following describes such a situation.

After upgrading some packages, notably xorg, in Debian ('testing'), most GTK applications (e.g. Firefox, SeaMonkey, Eclipse, ...) started freezing intermittently, with just the infamous unresponsive UI symptom (aka freeze, hang). After some Googling I noticed that adding the --sync command line option alleviated the problem, and for a while I just used this as a crude but effective workaround.

I got motivated to look for a real fix when I ran into the same symptoms while trying to run a hosted Eclipse workbench (a PDE JUnit test suite). Here the hosted workbench would freeze almost immediately after startup (the window rendered, but without the menu bar, after which no more repaint events were processed). Without the --sync option it happened every time, and it still happened very often with the option supplied.

I observed in the JDT debugger that the lock-up would always occur when the main thread was calling a GTK/GDK function - not always the same function (sometimes gdk_flush, sometimes gdk_pixbuf_render_to_drawable, sometimes the main event loop). Stepping through the code would cause the problem to disappear (a tell-tale sign of a timing bug; those are always the best ones). To make it even more "interesting", sometimes Eclipse would crash altogether with a "BadLength (poly request too large..." error from Xlib.

Intuitive remedies, such as upgrading the GTK/GDK packages, xorg or xlib, did not help on their own, although the idea turned out to be (almost) right.

The breakthrough came after running ldd libswt-pi-gtk-3735.so (this is the SWT library used by Eclipse, which in turn depends on GTK), then determining which package each of the prerequisite libraries shown by ldd belonged to, and upgrading those packages one after another. I didn't take care to repeat the test after each upgrade, so I cannot say which one made the difference, but here is the list of packages that I replaced or tried to replace - after which the problem disappeared:

alsa-oss
binutils
gdk-imlib11
libatk1.0-0
libc-dev-bin
libc0.1-dev
libcairo2
libcairo2-dev
libfontconfig1
libfreetype6
libgdk-pixbuf-dev
libgdk-pixbuf-gnome2
libgdk-pixbuf2
libgdk-pixbuf2.0-0
libgdk-pixbuf2.0-dev
libglib2.0-0
libgtk2.0-0
libgtk2.0-common
libnspr4-0d
libpango1.0-0
libpango1.0-common
libpango1.0-dev
libpth20
libpthread-stubs0
libpthread-stubs0-dev
libx11-6
libx1106
libxcb1
libxcomposite1
libxdamage1
libxext6
libxfixes3
libxi6
libxrender1
libxtst6
xorg
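
The ldd-to-package step described above can be automated. The following is only a rough sketch of how that might look, not the procedure I actually followed: it assumes a Debian system with ldd and dpkg on the PATH, and the function names (parse_ldd_output, parse_dpkg_search, packages_for_library) are made up for illustration. It relies on "dpkg -S <path>" printing lines of the form "package[:arch]: /path", and it silently skips any library path that dpkg cannot attribute to a package.

```python
import subprocess
import sys


def parse_ldd_output(text):
    """Return the resolved library paths from ldd output (lines with '=>')."""
    paths = []
    for line in text.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader itself, which has no '=>'
        target = line.split("=>", 1)[1].split()
        if target and target[0].startswith("/"):
            paths.append(target[0])  # skips 'not found' and address-only entries
    return paths


def parse_dpkg_search(line):
    """Extract package names from one 'dpkg -S' line such as 'pkg:arch: /path'."""
    owners, _, _ = line.partition(": ")
    # Several owners are comma-separated; drop any ':arch' qualifier.
    return [name.split(":")[0].strip() for name in owners.split(",")]


def packages_for_library(library):
    """Map every library that 'library' links against to its Debian package."""
    ldd = subprocess.run(["ldd", library], capture_output=True, text=True, check=True)
    packages = set()
    for path in parse_ldd_output(ldd.stdout):
        found = subprocess.run(["dpkg", "-S", path], capture_output=True, text=True)
        if found.returncode == 0:  # non-zero means dpkg knows no owner for the path
            for line in found.stdout.splitlines():
                packages.update(parse_dpkg_search(line))
    return sorted(packages)


if __name__ == "__main__" and len(sys.argv) > 1:
    for package in packages_for_library(sys.argv[1]):
        print(package)
```

Saved under a name of your choosing (say ldd_packages.py, a hypothetical file name), it could be run as "python3 ldd_packages.py libswt-pi-gtk-3735.so", and the resulting package names upgraded one at a time with the usual package tools, ideally re-running the failing test after each upgrade, which is exactly what I neglected to do.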
