
Mountain-climbing addresses for code lines

Wednesday 25 November 2015, 03:59

I found an interesting problem while working on a test case generator for the Tsukurimashou Project. The thing is that I'd like to assign an identifying code, which I will call an address, to each line of code in a code base. It's to be understood that these addresses have nothing to do with machine memory addresses, and they need not be sequential; they are just opaque entities that designate lines of code. Anyway, I would like lines of code to keep the same addresses, at least probabilistically, when the program is modified, so that when I collect test information about a line of code I can still keep most of it after I update the software.
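To make the goal concrete, here is a minimal sketch of one naive way to get addresses that survive edits: derive each address from the text of the line rather than from its position. This is only an illustration of the requirement, not the scheme the article goes on to develop; the function name `content_addresses` and the hash-plus-counter design are assumptions made up for this example.

```python
import hashlib
from collections import Counter


def content_addresses(lines):
    """Give each line an opaque address derived from its text.

    Hypothetical scheme for illustration only: the address is a short hash
    of the stripped line text plus a counter that tells repeated lines apart.
    """
    seen = Counter()
    addresses = []
    for text in lines:
        key = text.strip()
        seen[key] += 1
        payload = f"{key}\x00{seen[key]}".encode("utf-8")
        addresses.append(hashlib.sha1(payload).hexdigest()[:12])
    return addresses


# Two versions of a tiny program; the new one has a line inserted.
old = ["a = 1", "b = 2", "print(a + b)"]
new = ["a = 1", "c = 3", "b = 2", "print(a + b)"]

old_addr = dict(zip(content_addresses(old), old))
new_addr = dict(zip(content_addresses(new), new))

# Addresses present in both versions: test data attached to them survives.
kept = set(old_addr) & set(new_addr)
print(len(kept), "of", len(old), "addresses survived the edit")
```

With plain line numbers, everything after the inserted line would have shifted. The obvious weakness of this naive version is that any change to a line's text, however small, gives it a brand new address.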

Kleknev: a coarse-grained profiler for build systems

Monday 11 March 2013, 19:12

When I was preparing the Tsukurimashou 0.7 release, I had to build the entire package several times from scratch, to verify that all the necessary pieces really were included in what I was preparing to ship. When I run the build on my development machine it normally re-uses a lot of previously-built components, only updating the parts I have recently changed. That kind of incremental compilation is one of the main functions of GNU Make. But if I'm shipping a package for others to use, it has to work on their systems, which don't have a previous history of successful builds; so I need to verify that it will actually build successfully in such an environment, and verifying that means copying the release-candidate package into a fresh empty directory on my own system and checking that the entire package (including all optional features) can build there.

Tsukurimashou is a big, complicated package. It's roughly 92,000 lines of code, which may not sound like so much. For comparison, the current Linux kernel is about 15,000,000. Tsukurimashou's volume of code is roughly equivalent to a 0.99 version of Linux (not clear which one - I couldn't find numbers I trusted on the Web just now, and am not motivated to go downloading old kernel sources just to count the lines). However, as detailed in one of my earlier articles, Tsukurimashou as a font meta-family is structured much differently from an orthodox software package. Things in the Tsukurimashou build tend to multiply rather than add; and one practical consequence is that building from these 92,000 lines of code, when all the optional features are enabled, produces as many output and intermediate files and takes as much computation as we might expect of a much larger package. A full build of Tsukurimashou maxes out my quad-core computer for six or eight hours, and fills about 4 GB of disk space.

So after a few days of building over and over, it occurred to me that I'd really like to know where all the time was going. I had a pretty good understanding of what the build process was doing, because I created it myself; but I had no quantitative data on the relative resource consumption of the different components, I had no basis to make even plausible guesses about that, and quantitative data would be really useful. In software development we often study this sort of thing on the tiny scale, nanoseconds to milliseconds, using profiling tools that measure the time consumption of different parts of a program. What I really wanted for my build system was a coarse-grained profiler: something that could analyse the eight-hour run of the full build and give me stats at the level of processes and Makefile recipes.

I couldn't find such a tool ready-made, so I built one.
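As a rough illustration of what "coarse-grained" means here, the sketch below times whole child processes and adds up wall-clock seconds per program. It is not a description of how Kleknev itself works; the helper names `timed_run` and `summarize` are invented for this example, and a real tool would also need to hook into Make and record more than wall time.

```python
import subprocess
import sys
import time


def timed_run(argv):
    """Run one command and record how long the whole process took."""
    start = time.perf_counter()
    result = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    return {"argv": argv, "returncode": result.returncode, "seconds": elapsed}


def summarize(records):
    """Total the wall-clock seconds per program, slowest first."""
    totals = {}
    for rec in records:
        name = rec["argv"][0]
        totals[name] = totals.get(name, 0.0) + rec["seconds"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Stand-in for build steps: run a trivial Python child process twice.
records = [timed_run([sys.executable, "-c", "pass"]) for _ in range(2)]
report = summarize(records)

for name, seconds in report:
    print(f"{seconds:8.3f} s  {name}")
```

The point of working at this level is that the unit of measurement is a process or a recipe, not a function call, so the overhead of measuring is negligible next to an eight-hour build.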

Ideographic Description Sequences: some thoughts

Monday 19 December 2011, 15:14

I went through a bit of a crunch to get Tsukurimashou 0.5 out the door before my year-end vacation. With that done, and at least 99 kanji to do before the next planned release, I have a chance to sit back and think about some longer-term and spin-off projects. Here are some ideas on kanji searching.

UPDATE: A prototype implementation of the system described here now exists as part of the Tsukurimashou project, and you can check it out via SVN from there. Packaged releases will be available eventually.

Building a build for something weird

Monday 12 December 2011, 22:39

Here are some thoughts on the Tsukurimashou build system. You can find the code, and some documentation of how to use the build system, in the package, but this posting is meant to look more generally at some of the issues I encountered while building a build for something weird.

The thing is, Tsukurimashou isn't a piece of software in the normal sense, but a package of fonts. It's written sort of like software, using programming languages, but the data flow during build doesn't look much like the data flow during build of the usual kind of software package. As a result, although it seemed like using Make was the thing I wanted to do, the way I've written my Makefile doesn't look much like what we might expect on a more typical software project. Working on it has forced me to see the structure of the project quite differently from the way I'd usually look at software, and maybe some of the ideas from that can be applied to other things.

Code refactoring by combinatorial optimization

Monday 5 December 2011, 14:56

I encountered an interesting problem on the Tsukurimashou project recently, and some inquiries with friends confirmed my suspicion that if anyone has solved it, they've done it in a language-specific way for ridiculous languages. It appears I have to solve it again myself. Here are some notes.