Sometimes software configuration problems are solved thanks to a deep insight into how things work and methodical experimentation. At other times they are solved through a shotgun approach in which hunches, sheer luck and brute force are combined to get rid of a problem without ever truly understanding its cause. This leaves no useful information behind for avoiding the problem in the future, but maybe we don't care. The following describes such a situation.
After upgrading some packages, notably xorg, in Debian ('testing'), most GTK applications (e.g. Firefox, SeaMonkey, Eclipse, ...) started freezing intermittently, showing just the infamous unresponsive UI symptom (aka hang). After some Googling I noticed that adding the --sync command line option (which makes the X calls synchronous) alleviated the problem, and for a period of time I just used this as a crude but effective workaround.
I got motivated to look for a real fix when I ran into the same symptoms while trying to run a hosted Eclipse workbench (a PDE JUnit test suite). Here the hosted workbench would freeze almost immediately after startup (the window rendered, but without the menu bar, after which no more repaint events were processed). It always happened without the --sync option, but also very often despite it being supplied. I observed in the JDT debugger that the lock-up always occurred while the main thread was calling a GTK/GDK function, though not always the same one (sometimes gdk_flush, sometimes gdk_pixbuf_render_to_drawable, sometimes the main event loop). Stepping through the code would cause the problem to disappear (a tell-tale sign of a timing bug; those are always the best ones). To make it even more "interesting", sometimes Eclipse would crash altogether with a "BadLength (poly request too large..." error from Xlib. Intuitive remedies, such as upgrading the GTK/GDK packages, xorg or xlib, were not successful, although this idea turned out to be (almost) right.
The breakthrough came after running ldd libswt-pi-gtk-3735.so (this is the SWT library used by Eclipse, which in turn depends on GTK), then determining which package each of the prerequisite libraries shown by ldd belonged to, and successively upgrading each of them. I didn't take care to repeat the test after each upgrade, so I cannot say which package was the culprit, but here is the list of packages that I replaced or tried to replace, after which the problem disappeared:
alsa-oss binutils gdk-imlib11 libatk1.0-0 libc-dev-bin libc0.1-dev libcairo2 libcairo2-dev libfontconfig1 libfreetype6 libgdk-pixbuf-dev libgdk-pixbuf-gnome2 libgdk-pixbuf2 libgdk-pixbuf2.0-0 libgdk-pixbuf2.0-dev libglib2.0-0 libgtk2.0-0 libgtk2.0-common libnspr4-0d libpango1.0-0 libpango1.0-common libpango1.0-dev libpth20 libpthread-stubs0 libpthread-stubs0-dev libx11-6 libx1106 libxcb1 libxcomposite1 libxdamage1 libxext6 libxfixes3 libxi6 libxrender1 libxtst6 xorg
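The lookup step described above (run ldd, then find the package owning each listed library) can be sketched as a small script. This is only an illustrative sketch, not the commands actually used at the time: it assumes a Debian system where ldd and dpkg -S are available, and the function names parse_ldd_output and owning_packages are made up for this example.

```python
import re
import subprocess
import sys

# Matches ldd lines of the form "libfoo.so.0 => /usr/lib/libfoo.so.0 (0x...)".
# Lines without a resolved absolute path (vdso, "not found") do not match.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def parse_ldd_output(text):
    """Return the resolved library paths from ldd output, in order, deduplicated."""
    paths = []
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(2) not in paths:
            paths.append(match.group(2))
    return paths


def owning_packages(paths):
    """Ask dpkg which package owns each path; paths dpkg cannot resolve are skipped."""
    packages = set()
    for path in paths:
        result = subprocess.run(["dpkg", "-S", path], capture_output=True, text=True)
        if result.returncode != 0 or not result.stdout:
            continue
        # dpkg -S prints "pkg[:arch][, pkg2...]: /path"; keep only the package names.
        owners = result.stdout.splitlines()[0].rsplit(": ", 1)[0]
        for owner in owners.split(", "):
            packages.add(owner.split(":")[0])
    return sorted(packages)


if __name__ == "__main__":
    # Usage (hypothetical script name): python ldd_packages.py libswt-pi-gtk-3735.so
    ldd = subprocess.run(["ldd", sys.argv[1]], capture_output=True, text=True, check=True)
    for package in owning_packages(parse_ldd_output(ldd.stdout)):
        print(package)
```

The printed package names could then be passed to the package manager for upgrading one at a time, re-running the failing test after each upgrade to narrow down the culprit (the step that was skipped here).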