Kleknev: a coarse-grained profiler for build systems

Monday 11 March 2013, 19:12

When I was preparing the Tsukurimashou 0.7 release, I had to build the entire package several times from scratch, to verify that all the necessary pieces really were included in what I was preparing to ship. When I run the build on my development machine it normally re-uses a lot of previously-built components, only updating the parts I have recently changed. That kind of incremental compilation is one of the main functions of GNU Make. But if I'm shipping a package for others to use, it has to work on their systems which don't have a previous history of successful builds; so I need to verify that it will actually build successfully in such an environment, and verifying that means copying the release-candidate package into a fresh empty directory on my own system and checking that the entire package (including all optional features) can build there.

Tsukurimashou is a big, complicated package. It's roughly 92,000 lines of code, which may not sound like so much. For comparison, the current Linux kernel is about 15,000,000. Tsukurimashou's volume of code is roughly equivalent to an 0.99 version of Linux (not clear which one - I couldn't find numbers I trusted on the Web just now, and am not motivated to go downloading old kernel sources just to count the lines). However, as detailed in one of my earlier articles, Tsukurimashou as a font meta-family is structured much differently from an orthodox software package. Things in the Tsukurimashou build tend to multiply rather than adding; and one practical consequence is that building from these 92,000 lines of code, when all the optional features are enabled, produces as many output and intermediate files and takes as much computation as we might expect of a much larger package. A full build of Tsukurimashou maxes out my quad-core computer for six or eight hours, and fills about 4G of disk space.

So after a few days of building over and over, it occurred to me that I'd really like to know where all the time was going. I had a pretty good understanding of what the build process was doing, because I created it myself; but I had no quantitative data on the relative resource consumption of the different components, I had no basis to make even plausible guesses about that, and quantitative data would be really useful. In software development we often study this sort of thing on the tiny scale, nanoseconds to milliseconds, using profiling tools that measure the time consumption of different parts of a program. What I really wanted for my build system was a coarse-grained profiler: something that could analyse the eight-hour run of the full build and give me stats at the level of processes and Makefile recipes.

I couldn't find such a tool ready-made, so I built one.

Cycle counting: the next generation

Wednesday 30 January 2013, 18:44

Here are the slides (PDF) and an audio recording (MP3, 25 megabytes, 54 minutes) from a talk I gave today about one of my research projects. You'll get more out of it if you have some computer science background, but I hope it'll also be accessible and interesting to those of my readers who don't. I managed to work in Curious George, Sesame Street, electronics, XKCD, the meaning of "truth," and a piece of software called ECCHI. I plan to distribute the "Enhanced Cycle Counter and Hamiltonian Integrator" publicly at some point in the future. Maybe not until after the rewrite, though.

Abstract for the talk:

It is a #P-complete problem to find the number of subgraphs of a given labelled graph that are cycles. Practical work on this problem splits into two streams: there are applications for counting cycles in large numbers of small graphs (for instance, all 12.3 million graphs with up to ten vertices) and software to serve that need; and there are applications for counting the cycles in just a few large graphs (for instance, hypercubes). Existing automated techniques work very well on small graphs. In this talk I review my own and others' work on large graphs, where the existing results have until now required a large amount of human participation, and I discuss an automated system for solving the problem in large graphs.

Where do I draw the line?

Monday 21 January 2013, 14:49

It's a very common pattern in the Han writing system that a character will be made of two parts that are themselves characters, or at least elements resembling characters, placed one above the other or one next to the other. For instance, 音 (sound) can be split into 立 (stand up) above 日 (day); and 村 (village) can be split into 木 (tree) next to 寸 (inch). This kind of structure can be nested, as in 語 (language). One can do a sort of gematria with the meanings, (what exactly is the deep significance of "village = tree + inch"?) but that's not the direction I'm interested in going today. Here's the thing: in the Tsukurimashou project, these two ways of constructing characters each correspond to a piece of code that's invoked many times throughout the system, and I thought it would be interesting to look at how often the different parameter values are used.

The fundamental attribution error

Saturday 12 January 2013, 16:23

Here's a quote.

We see a sloppily-parked car and we think "what a terrible driver," not "he must have been in a real hurry." Someone keeps bumping into you at a concert and you think "what a jerk," not "poor guy, people must keep bumping into him." A policeman beats up a protestor and we think "what an awful person," not "what terrible training." The mistake is so common that in 1977 Lee Ross decided to name it the "fundamental attribution error": we attribute people’s behavior to their personality, not their situation.

Photo gallery from CCCG 2012

Friday 10 August 2012, 21:00

Here's a photo gallery (salvaged from my old gallery software in July 2020) of my trip to the Canadian Conference on Computational Geometry in Charlottetown, PEI.

LCD monitor adventures

Monday 23 July 2012, 16:16

I haven't had very good luck with computer hardware, nor operating systems, in the last few months. I lost a hard drive in my main desktop computer at home, and had to replace that (no data loss because it waS RAIDed); the latest Arch Linux "upgrade" made my computer unbootable because the maintainers decided they had to move everything from /lib into /usr/lib and the documented procedure for doing the upgrade safely didn't cover oddball configuration cases like having GCC installed (because who would have that?); and now my LCD monitor is dying.

Photo gallery from SWAT 2012

Friday 6 July 2012, 21:00

Here's a photo gallery (salvaged from my old gallery software in July 2020) of my trip to the Canadian Conference on Computational Geometry in Helsinki.

Open letter to Joyce Bateman

Saturday 21 April 2012, 12:24

The latest evidence regarding the Conservative Party's fraudulent activities in the last Canadian federal election hits close to home for me because I live, and voted, in Winnipeg South Centre, one of the ridings subject to a court challenge by the Council of Canadians. The affidavit of Annette Desgagne is quite damning; people are calling it a smoking gun. One small ray of hope is that it's pretty clear this was all coordinated centrally, and probably by a small conspiracy. Much as I dislike the Conservative Party as an entity, I think there are some decent people within it, and it's likely a whole lot of them didn't know about the fraud and are as shocked by it as the rest of us. Most of the checks and balances of Canadian democracy have been emasculated under Harper, but it still remains that we can try appealing to the decent people within the Conservative Party to weed their own garden. Below is what I'm mailing to Joyce Bateman, CPC Member of Parliament for Winnipeg South Centre.

Arch installation thoughts

Tuesday 3 April 2012, 20:19

I had a request for some comments on Arch Linux, now that I've been using it a few days, and in particular the question of whether it is easy to install.

Switching to Arch and XFCE

Tuesday 27 March 2012, 12:32

I am, as the title implies, switching my home desktop system from Slackware Linux and KDE to Arch Linux and XFCE. You may see some minor disruptions here (in particular, the astrological chart generator may be down or unreliable) for the next few days. The switch is a pretty big production; I've been using Slackware for most of the last 20 years, and KDE for most of the time I've been using Slackware, and because I've been doing continuous incremental upgrades instead of full reinstalls, some parts of my system actually are that old.

There was no one big annoyance or disaster to make me want to switch, but my dissatisfaction with KDE has been gradually increasing for the last few years, and I decided it would be better to switch in a controlled way when I'm not fighting a fire, rather than wait until some kind of disaster forces me to switch under pressure. I'm still basically satisfied with Slackware, and I could have continued to use it, but I tried doing a similar KDE to XFCE switch on my laptop first to debug the process, and found that doing a complete reinstall of the underlying Linux distribution really makes the desktop change a lot pleasanter. Given that I'm doing a reinstall of the core Linux system, it seems like a good opportunity to also do the switch to Arch, which has some advantages over Slackware.