Monday 11 March 2013, 19:12
When I was preparing the Tsukurimashou 0.7 release, I
had to build the entire package several times from scratch, to verify that
all the necessary pieces really were included in what I was preparing to
ship. When I run the build on my development machine it normally re-uses a
lot of previously-built components, only updating the parts I have recently
changed. That kind of incremental compilation is one of the main functions
of GNU Make. But if I'm shipping a package for others to use, it has to
work on their systems, which don't have a previous history of successful
builds; so I need to verify that it will actually build successfully in such
an environment, and verifying that means copying the release-candidate
package into a fresh empty directory on my own system and checking that the
entire package (including all optional features) can build there.
Tsukurimashou is a big, complicated package. It's roughly 92,000 lines
of code, which may not sound like much: for comparison, the current
Linux kernel is about 15,000,000 lines. Tsukurimashou's volume of code is
roughly equivalent to a 0.99 version of Linux (it's not clear which one; I
couldn't find numbers I trusted on the Web just now, and I'm not motivated
to go downloading old kernel sources just to count the lines). However, as
detailed in one of my earlier
articles, Tsukurimashou as a font meta-family is structured much
differently from an orthodox software package. Things in the Tsukurimashou
build tend to multiply rather than add, and one practical consequence is
that building from these 92,000 lines of code, when all the optional
features are enabled, produces as many output and intermediate files and
takes as much computation as we might expect of a much larger package. A
full build of Tsukurimashou maxes out my quad-core computer for six or eight
hours, and fills about 4 GB of disk space.
So after a few days of building over and over, it occurred to me that I'd
really like to know where all the time was going. I had a pretty good
understanding of what the build process was doing, because I
created it myself; but I had no quantitative data on the relative resource
consumption of the different components, nor any basis for making even
plausible guesses about it, and such data would be really useful.
In software development we often study this sort of thing on the tiny scale,
nanoseconds to milliseconds, using profiling tools that measure the time
consumption of different parts of a program. What I really wanted for my
build system was a coarse-grained profiler: something that could analyse the
eight-hour run of the full build and give me stats at the level of processes
and Makefile recipes.
I couldn't find such a tool ready-made, so I built one.
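To make the idea of a coarse-grained profiler concrete, here is a minimal
sketch in Python. It is not the tool described in this post; the names
timed_run and summarize are hypothetical, and the sketch only assumes that
each build step can be launched as a separate command. It records
wall-clock time per labelled command and totals the time per label.

```python
import subprocess
import sys
import time
from collections import defaultdict


def timed_run(label, argv, log):
    """Run one command; append (label, wall-clock seconds, exit status) to log."""
    start = time.perf_counter()
    status = subprocess.run(argv).returncode
    log.append((label, time.perf_counter() - start, status))
    return status


def summarize(log):
    """Total the wall-clock seconds per label, largest total first."""
    totals = defaultdict(float)
    for label, seconds, _status in log:
        totals[label] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    log = []
    # Stand-ins for build steps (hypothetical labels); a real build would
    # run its own recipe commands here instead.
    timed_run("quick-step", [sys.executable, "-c", "pass"], log)
    timed_run("slow-step",
              [sys.executable, "-c", "import time; time.sleep(0.2)"], log)
    for label, seconds in summarize(log):
        print(f"{label:12s} {seconds:8.3f} s")
```

Grouping by label rather than by individual command is what makes the
report coarse-grained: many invocations of the same kind of recipe
collapse into one line of the summary.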
Thursday 7 March 2013, 08:36
I'm very happy to announce the release of version
0.7 of Tsukurimashou, my Japanese-language font project. That is a
link to the release page for the source code package on SourceForge.JP; see
also the complete list
of downloadable files and the project home page. This has
been almost nine months in the making, and as I said on Twitter, the yak
hair is thick on the floor. Release notes below the cut.
Wednesday 30 January 2013, 18:44
Here are the
slides (PDF) and an audio recording (MP3, 25 megabytes,
54 minutes) from a talk I gave today about one of my research projects.
You'll get more out of it if you have some computer science background, but
I hope it'll also be accessible and interesting to those of my readers who don't.
I managed to work in Curious George, Sesame Street, electronics, XKCD, the
meaning of "truth," and a piece of software called ECCHI.
I plan to distribute the "Enhanced Cycle Counter and
Hamiltonian Integrator" publicly at some point in the future. Maybe not
until after the rewrite, though.
Abstract for the talk:
It is a #P-complete problem to find the number of subgraphs
of a given labelled graph that are cycles. Practical work on this
problem splits into two streams: there are applications for counting
cycles in large numbers of small graphs (for instance, all 12.3
million graphs with up to ten vertices) and software to serve that
need; and there are applications for counting the cycles in just a few
large graphs (for instance, hypercubes). Existing automated techniques
work very well on small graphs. In this talk I review my own and
others' work on large graphs, where the existing results have until
now required a large amount of human participation, and I discuss an
automated system for solving the problem in large graphs.
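As an illustration of the problem statement only, the following Python
sketch counts the cycle subgraphs of a small graph by brute force. It is
not ECCHI and says nothing about how ECCHI works; the function names and
the adjacency-dictionary representation are assumptions made for this
example. The running time grows explosively with graph size, which is one
reason large graphs need other techniques.

```python
def count_cycles(adj):
    """Count the subgraphs of an undirected graph that are cycles.

    adj maps each vertex to the set of its neighbours; vertices must be
    mutually comparable. Each cycle is walked from its smallest vertex in
    both directions, so the raw count is halved at the end.
    """
    total = 0

    def extend(start, current, visited, length):
        nonlocal total
        for nxt in adj[current]:
            if nxt == start and length >= 3:
                total += 1  # closed a cycle with at least three vertices
            elif nxt > start and nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for s in adj:
        extend(s, s, {s}, 1)
    return total // 2


def hypercube(d):
    """Adjacency dictionary of the d-dimensional hypercube graph."""
    return {v: {v ^ (1 << b) for b in range(d)} for v in range(2 ** d)}


if __name__ == "__main__":
    k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
    print(count_cycles(k4))
    print(count_cycles(hypercube(3)))
```

For the complete graph on four vertices the count is 7: four triangles and
three 4-cycles.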
Monday 21 January 2013, 14:49
It's a very common pattern in the Han writing system that a character
will be made of two parts that are themselves characters, or at least
elements resembling characters, placed one above the other or one next to
the other. For instance, 音 (sound) can be split into 立 (stand up) above
日 (day); and 村 (village) can be split into 木 (tree) next to 寸 (inch).
This kind of structure can be nested, as in 語 (language).
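The nesting can be written down as a small tree. The sketch below is in
Python and is not Tsukurimashou's own code; the tuple representation and
the function names are assumptions for illustration. It uses the Unicode
ideographic description characters ⿰ (left-right) and ⿱ (top-bottom) as
operators, and my own reading of 語 as 言 next to 吾, with 吾 in turn
being 五 above 口.

```python
# Operators from Unicode's Ideographic Description Characters block.
LEFT_RIGHT = "\u2ff0"  # ⿰: first part to the left of the second
TOP_BOTTOM = "\u2ff1"  # ⿱: first part above the second


def describe(node):
    """Flatten a nested (operator, first, second) tuple to a prefix string."""
    if isinstance(node, str):
        return node
    operator, first, second = node
    return operator + describe(first) + describe(second)


def depth(node):
    """Number of nested splits; a plain character has depth 0."""
    if isinstance(node, str):
        return 0
    _operator, first, second = node
    return 1 + max(depth(first), depth(second))


sound = (TOP_BOTTOM, "立", "日")        # 音
village = (LEFT_RIGHT, "木", "寸")      # 村
language = (LEFT_RIGHT, "言", (TOP_BOTTOM, "五", "口"))  # 語

if __name__ == "__main__":
    for tree in (sound, village, language):
        print(describe(tree), depth(tree))
```

The two simple examples have depth 1, while the nested one has depth 2
because its right-hand part is itself split top to bottom.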
One can do a sort of gematria with the meanings (what exactly
is the deep significance of "village = tree + inch"?), but that's not the
direction I'm interested in going today.
Here's the thing: in the Tsukurimashou
project, these two ways of constructing characters each correspond to a
piece of code that's invoked many times throughout the system, and I thought
it would be interesting to look at how often the different parameter values
Saturday 12 January 2013, 16:23
Here's a quote.
We see a sloppily-parked car and we think "what a terrible
driver," not "he must have been in a real hurry." Someone keeps bumping into
you at a concert and you think "what a jerk," not "poor guy, people must
keep bumping into him." A policeman beats up a protestor and we think "what
an awful person," not "what terrible training." The mistake is so common
that in 1977 Lee Ross decided to name it the "fundamental attribution
error": we attribute people’s behavior to their personality, not their
Wednesday 2 January 2013, 10:26
The title is a song lyric; it means "the story that starts now," and
that's more or less where I feel I'm at. A lot has happened between
mid-November and now, and I'm hoping that this will mark a boundary or
change in the conditions around me.
Sunday 11 November 2012, 15:26
I've decided to stop using Arch Linux, because I believe in
The Arch Way.
I'm tempted to leave it at that, but more detail is below the cut.
Tuesday 2 October 2012, 10:15
I recently updated my OCR and Genjimon font packages, a process which
included merging them into the Tsukurimashou Project's build system as what
I'm calling "parasite" packages (IDSgrep has also become such). They now
come included automatically (but not built by default) when you download the
full Tsukurimashou package, or they can each be downloaded as a separate
distributed package. Some bugs are fixed, and Genjimon has two new styles
added, one of which is shown below.
Having the OCR package listed as a download on the SourceForge.JP site
immediately boosted Tsukurimashou's rankings, because 15 or 20 people
download it every day. I'm happy to have the added visibility, but I wish
that visibility could be coming from popularity of the main Tsukurimashou
project instead of this minor spinoff.
Tuesday 11 September 2012, 15:47
As part of my efforts to be ready for wherever my next employment takes
me, I've shifted my email home. For a long time my usual practice has been
for email to end up delivered to my home computer, which I log into remotely
from wherever I am. The way I see it is that my personal email is
mission-critical, and I don't want my email home to be on any computer I
don't control, especially not one belonging to an employer or to Google. I
have had content in my email subject to a court case before. The other side
in that case wasn't able to interrupt my email, because they had no right to
and it was all routed through systems controlled by people who understood
that. I'd like to keep things that way.
Running my own email service requires my home computer to be accessible
on the Net at all times, and I've now had a couple of adventures in which it
or its Net connection stopped working while I was away from home and I had
to switch to less useful backup systems. So, as of today, my email goes
to a leased server elsewhere on the Net. I can connect to it remotely
from wherever I have a good connection, even if my home computer doesn't.
This may be especially useful if, as seems quite possible, my current home
computer goes into storage for a while and I end up spending a lot of time
without an operational home computer of my own at any fixed location.
Monday 3 September 2012, 19:18
This is part II in a series. You can start from the
beginning, and you can pick up a package of Qucs files
to follow along on your own simulated workbench.