« The Terrible Secret of OpenTyp... | Home | The day a star is born »

Electric kerning

Thu 9 Dec 2010 by mskala Tags used: ,

It's not like I don't have enough projects to work on already. Nonetheless, I had an idea I thought was pretty cool, and I'm going to at least describe it here, whether I end up implementing it or not.

Okay, so: kerning. If you're setting type, you need to know where to put each glyph in relation to the previous one. In the old old days, it was easy because each glyph came on a little metal block and you'd just set them right next to each other and clamp them in place. But a computer (and, earlier, a phototypesetting machine) has the opportunity to make a decision. And if you just have a fixed bounding box for each glyph and set them side by side, you run into problems when you have situations like a capital A next to a capital V. "AV". Using the bounding boxes that would be correct for those letters in other contexts, you end up putting too much space between them. You need to squeeze them together so that part of the V projects into the space that the A, if it were next to something else, would reserve for itself. This squeezing together is called "kerning."

In the old days kerning was actually accomplished with special-shaped metal type blocks with cut-away sections so they could fit more closely together. With computer font technology, the basic idea is to store adjustments in the font file saying "when setting this pair of characters next to each other, add or subtract this much space." That requires a basically quadratic number of adjustments to be stored, all of which have to be designed by a human being. As the number of glyphs in the font increases, the amount of work for a human designer increases by the square of the number of glyphs. As a lazy person, I don't want to do quadratic work, and as a computer scientist, I think I shouldn't have to.

OpenType tries to make it easier with something called "class-based kerning" where glyphs can be grouped into classes and an adjustment applied between every member of this class and every member of that class. That would allow, for instance, applying the same kerning to every kind of accented A when followed by V, instead of having to treat them all separately. It saves space in the font file, and maybe it saves some human effort, but I'm not convinced that it saves much human effort because the designer still has to think through all the quadratic number of glyph pairs encompassed by a class pair. It would be better if the computer could somehow do the work automatically after receiving minimal direction from a human being. Preferably constant-size direction; I don't even want to do linear work myself.

We could tell the computer to choose kern distances so that the horizontal distance between glyphs is constant. I think that might work pretty well in, for instance, the "AV" case when we're talking about a sans-serif font. With a serif font, though, the issue is that the serifs don't count much toward our visual impression of where the edge of the glyph is, so if we position the glyphs to put the right amount of distance between the tip of the lower left serif on the "A" and the tip of the "V", that will actually be too much space between the main stems. (Wikipedia has a good illustration of this issue; I had thought to make my own images for this posting but it turned out to be too much work.) On the other hand, if we set the constant distance to make "AV" with serifs look good, then other pairs where there is a flattish part next to another flattish part with no serif, such as "00", may end up with too little space. There's also a side issue that computing the nearest-approach distance between two cubic splines isn't entirely trivial, numerically, but that's relatively minor. It would be cool if we could have a way of choosing the kern distance that well approximates the way human beings perceive the distance between glyphs: some perception of the actual distance of closest approach, but also giving more weight to the flat parts.

So the question is: is there a better way to do automatic kerning?

Please note I am not talking about "automatic" kerning in the sense of applying the adjustments that are coded into the font file, during the process of setting type. That's not really an interesting problem - you just follow the directions given by the font. I am talking about automatic kerning in the sense of automatically determining what adjustments should be coded into the font file, during typeface creation.

I imagined the glyphs as physical objects sliding around on the page. We want to kind of push them together until they are close enough, and we'll know they are close enough in some kind of way that has some kind of special treatment of corners and sharp points, as different from smoothly curved and flat parts. That kind of reminded me of fooling around with high-voltage electricity. Imagine cutting the glyphs out of metal, hooking them up to a Tesla coil, and bringing them closer and closer together until ZAP! a spark jumps across the gap! Maybe we could simulate that process and say "The correct amount of kerning is the amount that allows a spark at such-and-such voltage." That sounds like it would be fun. Doing it in real life would be more fun, of course, but would still involve a quadratic amount of human labour.

I read up on how spark gaps actually work, and started figuring out the math for how to simulate it. It's interesting, but quite complicated - there are several different physical effects that make a difference to the breakdown voltage of a spark gap, and it's also frustrating that I just finished packing up (for the move) almost all of my books including the old reference manuals that have the spark gap tables in them, and I couldn't find the information on the Net, and so on.

Then I realized I was going in the wrong direction anyway. The thing about sparks is that they prefer to jump from sharp points, or if no sharp points are available, from highly-curved surfaces. That's (part of) why Van de Graaf generators have their stereotypical spherical tops - the bigger the sphere the lower the curvature and the more charge you can push up there before it starts sparking. So in the kerning situation, this would mean that the serifs would cause the letters to be set further apart, not closer as we might desire.

But I still liked the electricity idea. Is there a way to use it that will produce the desired kind of results, some kind of way of pushing letters together, knowing when to stop, and having it stop sooner with wide flat parallel shapes, giving less weight but still some weight to sharp points like serifs? I think it can be done, using actually a simplified version of what I was imagining for the spark simulation (which might still be fun to build for other reasons - I'm imagining a high voltage game something like the famous Falling Sand Game).

Take your glyphs and rasterize them onto a grid, at some distance apart. So you've got a grid of pixels in three colours: paper, ink of one glyph, and ink of the other glyph. Now place an imaginary electrical resistor between each orthogonally adjacent pair of pixels. You've got a grid of resistors much like in the XKCD comic. The resistors will have three different values: Rii "ink to ink" when they connect two "ink" pixels, Rip for "ink to paper", and Rpp for "paper to paper". Now, also connect a resistor of value Rsi ("supply to ink") from the negative side of the power supply to each pixel of the left glyph, and from the positive side of the power supply to each pixel of the right glyph.

Calculate the overall resistence of this network. That can be done by turning it into a matrix of conductances (reciprocals of resistances) between the pixels, and solving the eigenproblem. It's a lot of number-crunching, but you can invoke a library to do it. It should be clear that in general, the resistance will be more the further apart the glyphs are, but exactly how much it is will depend on their shapes.

Hypothesis: there is some choice of the resistance values such that in a font with good kerning, this overall resistance will be about the same for all the kern pairs.

It should be possible to test that by examining some existing fonts that we think are well-designed. Furthermore, the obvious corollary is that once we have a choice of resistance values, we can use them to choose the kerning distance for a new font file, by moving the glyphs closer together or further apart until they all have the right resistance between them. Bingo: that's an algorithm for automatically kerning a font.

Will it actually produce appealing-looking results? I think there's some reason to think it will. Suppose Rii is pretty low, Rsi is a bit bigger, and Rpp is a fair bit more than either of them. That would correspond to basically conductive ink on basically insulating paper, and electricity being fed in uniformly over the area of the glyph, in such a way that a significant amount of the current in an ink pixel will come or go through the edges if possible, instead of passing directly to and from the supply. Now, in the vicinity of a long straight edge, what happens is you have a lot of electricity available at the edge because the current can spread out on the ink side of the edge and get to and from the supply over a large area. On the other hand, in the vicinity of a sharp point at the end of a stroke (such as a serif tip), the current all gets funnelled through a relatively thin ink line; more current per pixel with the same resistance per pixel means, by Ohm's law, more voltage drop, and more overall resistance. Thus, serifs have to be pushed closer together before their resistance gets small enough to trigger the algorithm saying "that's close enough!" As desired, this is a way of choosing a kern distance that will tend to put flat things further apart than pointy things - the opposite of what the spark-gap thing would have done.

There are some additional wrinkles to think about. The resistance between two glyphs (for a fixed font design) is determined by several things: the resistances Rii, Rip, Rpp, and Rsi; the size of the pixels; and of course the kerning distance. We'd run this calculation several times, maybe do a binary search, to find the kerning distance such that the answer is as close as possible to the desired overall resistance.

But actually, we don't need all those parameters. Using pixels is an approximation; we'd really rather do it at infinite precision on theoretically ideal curves, except that would be too hard (double integrals all over the place). So instead of specifying a pixel size, we'll say "as small as possible"; the limit for infinitely small pixels would be the desired infinite-precision value. Do we have to scale the resistances as we change the pixel size?

In the case of Rii and Rpp, the answer is "no." Imagine halving the pixel size. What we're basically doing is doubling the number of resistors in series (which doubles the overall resistance) but also doubling the number in parallel (which halves it). Over a uniform grid, then, changing the grid size doesn't change the resistances between macroscopic objects. It's a fact, incidentally, the people measure the resistance of surfaces in "Ohms per square". It doesn't matter what size square - it's the same for squares of any size because of the argument just described. This isn't true if you're talking about the resistance between individual grid points as in the XKCD problem; the glyphs or probes or whatever have to cover a larger number of grid points as the grid shrinks.

I think it's safe to neglect Rip entirely. If we imagine that it stays constant, which it should by the above argument, then as the grid size shrinks, along the perimeter of a glyph we've got a number of Rip resistances in parallel with each other and in series with everything (the current must pass out of one glyph and into the other, once each, on every path). If the resistances don't change with the grid size - and since they're the same kind of thing as Rii and Rpp which don't scale, the Rips shouldn't scale either - then that parallel combination goes to zero resistance in the limit of a fine grid. So the overall resistance, if we hope to get a good approximation of the infinitely small grid, actually does not depend on Rip. In the implementation I'd probably set it equal to one or the other of Rii and Rpp, or maybe their average.

Argument on the other side: imagine that paper and ink are basically perfect conductors but there is some special resistance the current experiences just as it crosses the border between the two, and we want to simulate it. The overall resistance should not depend on grid size, so if we believe in magic edge resistance, we must specify Rip in conductance per distance, and scale the actual values of the imaginary resistors up as the grid size scales down, to keep the overall resistance from depending on grid size. I think this isn't physically plausible, though.

I do want there to be some resistance from the supply to the ink, though. Otherwise why bother making the ink conductive at all? That resistance has to scale with the grid because when we halve the grid cell side length, there will be four times as many Rsi resistances. So we need to make each of them four times as large. Instead of being specified in Ohms per resistor like Rii and Rpp, we need to specify the supply resistance in terms of conductance per unit area. This does mean that the kerning distance will end up being different, in proportion to glyph size, for glyphs of different size; and it should be (two big "O"s are not set the same fraction of their diameter apart as two little "o"s), so for the moment I'm going to call that a feature.

Having eliminated Rip and the grid size, overall resistance now depends only on the Rii, Rpp, glyph shape, kern distance, and (1/Rsi*area).

The general idea of taking this further would be to try to find parameter values such that for one or more existing well-kerned fonts, the overall resistance seems to be nearly constant. If such parameters could be found, then the could be used to kern new fonts.

I know how to do the resistance calculation - it involves a fair bit of arithmetic but setting up the problem isn't too hard. I'm not sure about the parameter fitting; I think a simple hill-climbing with numerically evaluated derivatives would work, but it may be possible to derive a better method analytically; and it might be necessary to use a better method in order to be able to get answers in reasonable time. Both phases are more work than I can really afford to spend right now, but it's kind of tempting to start blocking it out anyway.


Most interesting. You can try it with a good font, and if the electric analogy fails you can fall back on creating artisanal functions. Then you can ask for opinions on the Typophile forum,
or the Typesetter forum,

Incidentally, kerning was not done at the foundry in the days of metal type, except on rare occasions, Usually it was done in the print shop with a mitre - a sort of very sharp guillotine activated by a lever arrangement with a very high power ratio. What WAS done in the foundry was "overhang" - casting letters so their serifs were sticking out a little beyond the edge of the lead body. On this topic you might like http://blog.typoretum.co.uk/2009/07/07/kerning-in-letterpress-typesetting/
Axel - 2010-12-10 01:43
I think if you were two give all glyphs the same charge, rather than opposite, you would find that it solves the spark gap issue. If all glyphs have the same charge, they will tend to want to move away from each other. Putting some sort of like charged wall on either side of a set of glyphs to press them inward would let them reach a natural equilibrium. By having adjacent glyphs carry the same charge there wouldn't be an issue with sparking, and the tendency for charge to gather in the sharp edges would actually help keep proper distance between the glyphs.
Another issue is that glyphs move along one dimension, not two. This should make the calculation easier, since although forces still need to be calculated in two dimensions, the projection of the force only matters in one. In the case of AV, for example, there is almost certainly some force urging A downward and V upward, but because they are constrained it can be ignored.
hoppie - 2011-02-05 18:05
Ergh, "to", not "two". Eye hate when Aye due that.
hoppie - 2011-02-05 18:07
That's an interesting idea - it takes kerning from two glyphs at a time, to the entire line of text. Which is, I suppose, how it really should be. I don't know if the tendency of charge to gather in the sharp points would still be true when the charge is being pushed away by charges in other nearby letters, but I guess that'd be something worth testing either why a physical experiment, better understanding of the theory than I have, or a trustworthy simulator.

It's true that for the basic glyph-to-glyph kerning there's only one direction of movement; even if we started doing the same thing with vertically written scripts we'd still only have to deal with one dimension at a time. I'm not sure if that's true in all cases, though. Just in the last couple days there was a discussion on the XeTeX mailing list of proper positioning of accent marks above letters - a question with a two-dimensional answer. I'm also not sure how much we'd really save in computation by having this constraint anyway.

On the whole, though, I think if I had time to code on this set of ideas at all, I'd probably be more excited about working on the "high voltage sandbox game" idea.
Matt - 2011-02-07 21:36

(optional field)
(optional field)
Answer "bonobo" here to fight spam. ここに「bonobo」を答えてください。SPAMを退治しましょう!
I reserve the right to delete or edit comments in any way and for any reason. New comments are held for a period of time before being shown to other users.