On having a personal Web site

Quick! Imagine a ham radio operator!

Now, give yourself a gold star if the person you imagined was not a white male over the age of 65. It is my guess that very few of those gold stars are going to be handed out.

Probably a lot of you don't even know what a ham radio operator is.

I'm actually one myself, at least in theory. I still have the license because in Canada they don't expire anymore: the government office in charge of that figured out that handling the renewal paperwork cost more than the fees it collected. (That's a clue.) It has probably been more than a decade since I keyed up my rig. When I first got my license I was a kid and had no money to buy equipment. I had been reading electronics books and magazines from fifteen years earlier and I had romantic ideas of being a technical innovator, and saving money and earning the approbation of my peers by designing and building my own equipment.

I quickly discovered that even then, and even more so by the time I got to an age where I had real financial resources of my own, you couldn't actually save money by building your own. You could buy a commercial radio that was made with wholesale-priced components and then assembled by semi-enslaved Chinese workers; or you could buy components at more than retail (because electronic components aren't normally sold in small quantities, so you have to pay a middleman who breaks them down into small lots), assemble it yourself at great time investment, and worst of all end up with a device that will never be as good as the commercial one anyway because the technology even when I was 12, and all the more by the time I was 20, was already at the point where it couldn't really be built without the resources of that Chinese factory. Technical innovation was out of the reach of the basement amateur even with the money I didn't have, and okay, so I've acquired equipment, what can I do with it? Well, I can talk! To 65-year-old white men! That is, to the few of those willing to spend the money and develop the technical skills for a radio and license. And only talk about a narrowly limited list of subjects, because there are absurdly strict content controls on what's allowed to be transmitted in the ham bands.

Yeah, yeah, some of my best friends, and current readers, are white men over 65. That's not the point and you know it.

Meanwhile... the other teenagers were building dialup BBS networks, and then getting on the Net, and I could join that world with equipment I already had, and talk to my actual peers (including girls!) about whatever the fuck I wanted (including using the word "fuck," which is forbidden on the ham bands), and be a real technical innovator, and have other people use my stupid Turbo Pascal serial driver code, and so on. That, in a nutshell, is why I never got into ham radio. The part of ham radio that I was interested in peaked at about the time I was born, in the mid-1970s, and it was dying before I actually had the opportunity to get involved. Now, I think it's even more dead.

This here's not really an article about ham radio (which probably means all the comments I get will be about ham radio, but okay, whatever); it's an article about this Web site on which I am posting. The thing is that I'm no longer involved in ham radio because when they say "Tune in the world!" the world I wanted to "tune in" no longer exists; I no longer run a dialup BBS because that world no longer exists; and now I'm kind of thinking that maybe it's time for me to stop running a personal Web site of the kind I've been running, because I'm seeing indications that this world no longer exists either.

A look at the logs

Below is a list of pages that have received no traceable human traffic in the year to date (2010, January 1 through March 28). This was derived by taking the logs, filtering out all known spammers and robots, and also filtering out any clients that didn't send a referrer header. That last exclusion may be controversial, but it appears that such clients are often robots (possibly in disguise) and all it takes is one robot to hit all pages once, to spoil any stats that count non-referrer-sending clients. My thinking is that if a page gets any real traffic, real traffic includes a large enough fraction of referrer-senders that the page will probably get at least one referrer-sending visitor. But if you prefer, you can read this as a list of pages that get almost no traffic instead of pages that get absolutely no traffic.

I've also excluded pages that are intended for robot access (such as RSS feeds), internal pages not intended for public viewing (such as .php files of functions used by other pages), Bonobo Conspiracy (which uses its own URL scheme and is kind of a separate site), pages that don't exist (often the result of cracker attacks; less often, human typos), and "index" pages (which aren't real content). This list is just pages of real content. The list is sorted by URL.
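As a rough illustration of the filtering described above, here is a minimal Python sketch. It assumes the logs are in the common Apache "combined" format, and the robot and search-engine hint lists, the function name classify_pages, and the 200-only status check are all hypothetical stand-ins rather than the actual rules used for the counts in this article.

```python
import re
from collections import defaultdict

# Assumed "combined" log format:
# host ident user [time] "METHOD path proto" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical hint lists; a real robot/spammer list would be much longer.
ROBOT_AGENT_HINTS = ("bot", "spider", "crawler")
SEARCH_ENGINE_HINTS = ("google.", "bing.", "yahoo.")


def classify_pages(log_lines, all_pages):
    """Return (pages with no traceable human traffic,
    pages whose traceable traffic all came from search engines)."""
    referrers_by_page = defaultdict(set)
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # unparseable line
        if any(hint in match["agent"].lower() for hint in ROBOT_AGENT_HINTS):
            continue  # known robot
        if match["referrer"] in ("", "-"):
            continue  # no referrer header: treated as a possible robot
        if match["status"] != "200":
            continue  # skips requests for pages that don't exist
        referrers_by_page[match["path"]].add(match["referrer"].lower())

    no_traffic = sorted(p for p in all_pages if p not in referrers_by_page)
    search_only = sorted(
        p for p in all_pages
        if p in referrers_by_page
        and all(
            any(hint in ref for hint in SEARCH_ENGINE_HINTS)
            for ref in referrers_by_page[p]
        )
    )
    return no_traffic, search_only
```

Passing the list of real content pages as all_pages is what keeps RSS feeds, internal files, and index pages out of the result; a page only escapes the first list if at least one referrer-sending, non-robot client fetched it.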

That is 250 pages, out of a little under a thousand on the site. So a quarter of my pages aren't really being used at all. I haven't counted (and it's hard to know exactly what number to measure) but it appears that very many of the remaining pages that got some human traffic, nonetheless got very little human traffic. The list would probably be much longer if I set the cutoff at a little bit more than zero.

Now, here's a similar list of pages that did get traceable human traffic, but it was all from search engine results.

That is an additional 39 pages - actually somewhat fewer than I was expecting when I set out to do these counts. Again, the list would probably be longer if the "no eligible traffic" criterion weren't absolute.

Most of my traffic comes from robots hitting RSS feeds - and most of that consists of failed attempts to hit the exchange rate feeds, which no longer exist. A large fraction of my human traffic comes from search engine queries hitting the following four pages. Traffic to these four pages is almost entirely from search engines, except for the OCR fonts which also get some hits through links from Wikipedia.

I also get a significant amount of human traffic from third parties referring to What Colour are your bits? Pretty often someone posts it in a discussion thread on Reddit (where I would think regular readers must be sick of it by now) and each time that happens I get a few dozen hits.

There is a constant trickle of human traffic to The terrible secret of Livejournal, which is still occasionally referred to in people's discussions of social networks, and is linked from a page in TV Tropes. This amounts to an average of less than one hit per day.

Traffic to all other pages is insignificant.

Conditional probabilities

Some time ago I wrote the phrase "sex partner" in a posting here and a regular reader named Dan contacted me about it. I think that at first he honestly believed that that was a literal error; he could not believe that I might have actually intended to write it on purpose. It had to be a typo or spell-checking error or something, and he wrote to let me know that it had somehow accidentally gotten posted, because I'd surely want to correct such an error as soon as possible. In the ensuing discussion, after I told him that I had actually meant to write that, he told me that it was an unwise choice of words because it would alienate "most of the women on the planet!" That's about two billion women, and I think that much like Stephen Harper seeing the goat, Dan could actually see the crowd of two billion women.

Women standing in a packed crowd can fit maybe eight per square metre, so two billion of them would cover a square roughly 16 kilometres on a side. It's a large crowd.
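The arithmetic behind that figure can be checked in a few lines of Python. The only inputs are the two numbers already given above (two billion women, eight per square metre); the variable names are just for illustration.

```python
import math

WOMEN = 2_000_000_000   # crowd size from the text
DENSITY_PER_M2 = 8      # packing density from the text

area_m2 = WOMEN / DENSITY_PER_M2        # total area in square metres
side_km = math.sqrt(area_m2) / 1000     # side of an equivalent square, in km

print(f"area: {area_m2:.3g} square metres, side: {side_km:.1f} km")
```

The side comes out a little under 16 kilometres, which matches the rounded figure.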

Now, when you consider that most of the women on the planet cannot read English, and, to say the least, some of them also don't visit my Web site all that frequently... I did the math and came to the conclusion that Dan's estimate could not be off by anything less than seven orders of magnitude, arguably as many as ten. So, what was the mistake? The mistake was in counting conditional probability rather than plain probability. Dan was cooking up a wildly improbable scenario of a woman whose opinion I'd care about actually reading that paragraph on that page on my Web site and caring in a negative way about whether I uttered the words "sex partner," and then (correctly, but it's not an interesting question) estimating that the probability of her being alienated conditional on that wacky scenario was high. My own estimate was of the overall probability of that happening - extremely low because it's bounded above by the probability of the wacky scenario, including the negatively-correlated "whose opinion I'd care about" term. The high conditional probability at the end of the chain of terms doesn't matter because it cannot be greater than one, and it is basically multiplied by zero.
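To make the distinction concrete, here is a small sketch of the chain rule involved: the overall probability is the probability of the scenario times the conditional probability of the outcome given the scenario, so it can never exceed the probability of the scenario. The two numbers below are purely hypothetical and are not estimates taken from this article.

```python
def overall_probability(p_scenario, p_outcome_given_scenario):
    """P(scenario and outcome) = P(scenario) * P(outcome | scenario)."""
    for p in (p_scenario, p_outcome_given_scenario):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_scenario * p_outcome_given_scenario


# Hypothetical numbers, chosen only to illustrate the bound.
p_scenario = 1e-7                  # the wacky scenario happens at all
p_alienated_given_scenario = 0.9   # high, but it cannot exceed 1

p_overall = overall_probability(p_scenario, p_alienated_given_scenario)

# However high the conditional term is, the product is capped by p_scenario.
assert p_overall <= p_scenario
```

Raising the conditional term all the way to 1 changes the result by barely 10%; shrinking the scenario term by an order of magnitude shrinks the result by the same order of magnitude.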

Furthermore, he was balancing that conditional probability of something bad happening, against nothing - he actually said, more or less, there's no benefit at all to using that phrase, so even if you think the cost of using it (in risking the wacky scenario) is small, you're better off not doing so. I used it because I saw real benefits to using it in avoiding other bad things that were not wacky and unlikely, but in fact had already happened and seemed likely to happen again if not actively prevented. Comparing the large risk against the small risk, the choice was clear to me.

The take-away lessons: when evaluating costs and benefits you have to look at probabilities, not just conditional probabilities; and you have to look at both costs and benefits, not just look at one and ignore the other.


From that point of view, it occurs to me that maybe in the past when looking at my own Web site, I haven't been seeing it accurately. One argument for keeping a page online, even if it isn't a very good page, is that it costs nothing to keep it, and there's always the chance that someone could get a significant benefit from the page, draw attention to it, and it could "go viral" and attract a lot of the attention I crave. I now think there may be serious errors on both sides of that comparison: keeping a page does have a real cost, and the chance of getting significant attention to a page has to be evaluated as an overall probability, not just conditional on the right person reading it.

If you look down that list of pages that didn't get traffic, you'll notice that many of them are old. Many aren't accessible through the navigation from the front page of my site - you can recognize the multicoloured placeholder graphics from where I removed the Project Wonderful boxes in an earlier update - and that highlights one of the major costs of keeping pages online: they either have to be kept current with the structure of the site, or else they decay into cruft.

When I mentioned that cost in a discussion on Facebook, a reader named Gord asked by way of clarification if the issue was that my homemade content management system was too hard to maintain. That may be a contributing factor - other content management software might be easier to maintain - but it's not the real issue. The real issue on this point is that content itself requires maintenance. The software will inevitably change, requiring content conversions; links will go bad; information will fall out of date; and every page I keep today will require attention from me in the future, repeatedly, as long as I keep it. In computer-science terms, the cumulative maintenance requirement of a page is ω(1), growing without bound over time; there is no amount of maintenance I do on a page that is ever the last maintenance I do on that page, except the "maintenance" of deleting the page entirely.

Pages that link to third-party sites often fall victim to link decay: the third-party sites inevitably go down or change their organization so that the links break. There are automated ways to detect when that happens, but doing something intelligent when it does happen requires human attention.
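The detection half can be sketched with nothing but the Python standard library. This is one possible approach under stated assumptions, not a description of any particular tool: the names LinkCollector, url_is_alive, and find_broken_links are made up for this example, and it uses HEAD requests, which some servers reject even when the page is fine.

```python
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute http(s) link targets from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def url_is_alive(url, timeout=10):
    """Return True if a HEAD request succeeds with a non-error status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers bad URLs.
        return False


def find_broken_links(html, is_alive=url_is_alive):
    """Return the third-party links in `html` that fail the liveness check."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.links if not is_alive(url)]
```

A checker like this only reports that a link fails. It cannot tell whether the target moved, died, or was replaced by something unrelated, which is the part that still needs a person.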

There are other costs associated with keeping things online, too. I change my mind sometimes. Not every statement I posted in those never-visited pages is something I still agree with today. Without wishing to deny that I said those things, it remains that I wouldn't necessarily want a new reader today to think that it's all my current point of view. If people routinely read one page on my site and then clicked through to other pages, they'd get a good idea of where I stand today and it wouldn't be a problem. But the logs reveal that readers do not do that - they come in from a link or search engine, maybe read the first page they hit (usually, they don't even read it, I think) and then they never read anything else. So keeping a lot of stuff I haven't even read myself recently means I'm publishing a lot of stuff under my name and masthead that maybe I don't really want to be taken as representative of my current position.

Even if I still agree with my old opinions, many of the pages on my site provide information that isn't opinion but factual, and some of that factual information (especially regarding technical and legal issues) is simply no longer true. Having it online means people can find it, think it's still true, and then I've done them a disservice. If I want to prevent that, I'm faced with ongoing checking and updating of whatever I keep.

Keeping low-value pages provides ammunition for spammers and mockers. Stats are unreliable on this Web BBS rant, for instance, because so much of the traffic to that page consists of spammers that I don't think I can trust any of it to be really human. That issue might be predictable for that page because it contains the character sequence "bbs" in the URL, so spam robots may be programmed to attempt postings there; but I'm at a loss to explain why, for instance, this otherwise unremarkable link posting from eight years ago is also one of the biggest spam targets on the site. (Note that the page it highlights is dead.) Anything I post can eventually become a spam target. There've also been a couple of incidents of disreputable persons trolling through my site to find the 0.5% most controversial pages on it from the point of view of their own value systems, and then posting lists of the results on Web BBSes in attempts to discredit me. If people are going to do that, I'd rather they did it with the documents I actually consider important, current, and worth fighting over, rather than with a random selection of things I've ever said in any context in all of history.

There is a risk of distraction or confusion. I don't know how big an issue this is because I think most people visit my site "on a mission," with a goal in mind; they either find what they're looking for and go away, or they don't find it, and go away. Very few people start out to visit my site without a goal in mind and then just read whatever's there. However, if anyone does do that, it seems reasonable that they only have a limited amount of time and attention for it. It's a crying shame, then, for them to spend that time on a throw-away link page, while pages that really are good (let's say Broken John and Hekka, from the "no human traffic" list) go unnoticed. Since these readers who are not on a mission are some of my most valuable readers, it seems important to steer them in any way possible to the content I'd most like them to see. The existence of lower-value content may interfere with that.

The 289 pages listed above easily represent an hour of my life each. Very likely more. Am I getting 289 hours' worth of satisfaction from them when nobody visits?


People in my position in life are expected to have some sort of presence on the Web. People type "Matthew Skala" into Google and expect to find my contact information, my list of academic publications, and so on. It wouldn't really be fair, let alone a good career move, to drop off the Web entirely and have everyone looking for me go after the other Matthew Skala instead. Given that I pretty much have to have some kind of Web presence, it's nice to have my Web presence be on server space under my own control instead of space belonging to my employers. So for that kind of thing - what I call the Yellow Pages entry - it seems like I really had better have some kind of Web site of my own.

The bigger benefit of running a Web site is that I have a lot of ideas I want people to think about. I want attention for my ideas. I also want personal attention directed at me. Those two things go together - I'm not willing to change my ideas to be ones with which it would be easier to attract attention, because then they wouldn't be my ideas anymore and it wouldn't be attention for me.

Thinking it over recently it occurred to me that what I really want, and wish my Web site could provide, is something very much like Agatha Heterodyne's superpower: the fully general ability to get others' participation in activities that I can't do alone, whether it's fixing the coffee machine or anything else. Without that ability I'm in the position of Dr. Merlot, a highly intelligent non-Spark relegated to riding the coattails of his betters. Of course, outside the comic book universe of Girl Genius, there are no Sparks - I'm the closest thing you'll find in real life - and that particular superpower doesn't exist as an inherent ability someone can be born with.

If the site is going to bring me participants for my projects, that means the site has got to have readers, but they also have to be a particular kind of reader: they have to be readers who also do things other than read. For instance, readers who write too. And for my ideas to have a large influence, the ideas have to reach a large number of people - either by a large number of people reading my site directly, or by the readers I do have conveying those ideas to others, and then those others doing the same, through an arbitrary number of generations.

That does occasionally happen, both with pages on my site and pages on other people's sites. It's a known part of the way the Web works. When it happens people say that "the page has gone viral," but that's a dangerously misleading term because it isn't really something the page does at all. The real subject of the verb should be the readers - when a page goes viral, that means the readers have propagated it virally.

Viral propagation is only a little bit predictable. It's even less controllable. But it's a fact that out of the thousand or so pages on my site, there are maybe ten for which it has happened. So, there's a number we can feed into our cost-benefit analysis: if I keep posting pages the way I have, at my current level of effort, I can expect one percent of them to attract real attention. That's one benefit.

It's also become clear to me that viral propagation is extremely context-dependent. It really doesn't matter how good a page is or isn't. The pages on my site that have been virally propagated are better-than-average pages, but certainly not my best and absolutely not the ones I'd choose if I could choose 1% of my pages to be virally propagated. Let me emphasize the languaging: viral propagation happens. It happens to pages. It is not something that pages do. The subject of the verb is the propagation itself (it occurs), or the readers (they propagate the page). It is not a function of the page.

So what is it a function of? What determines it, such that if we control this thing, we could control the process? I said "context," and I think that's the answer. If my link gets onto a high-traffic Web site, that doesn't mean I get much or any traffic myself. My experiences with banner advertising validate that - people don't click ads (first approximation), and the exceptions to that are people who click the ads, but then don't propagate the resulting pages. On the other hand, if someone writes a Web log posting about something I said, then someone who reads that posting has a much higher chance of doing the same. If you receive an "RTed tweet" [sic] you're much more likely to "RT" [sic] it than if you found the same page with a search engine query. The way you are introduced to my ideas makes a big difference to how likely you are to propagate them. There are what I've called higher-order effects (see also the audio), too: it's a k-win lottery, where you're more likely to link to my page after you've seen several links to it, with the result that if it does propagate virally, there will be a phase during which it exhibits greater than exponential growth. (Note I mean exponential growth in the mathematical sense, not just "fast.")

So the benefit I'm chasing isn't just that people will read my ideas, nor even just that they'll mention my ideas in their own spaces, but that they will mention my ideas in the way that causes others to do the same. That's the only route to viral propagation. The chance of this happening is a significant part of analysing the benefits of my Web site.

It would be nice to think that if anything on my site attracts a lot of attention, then some of that will spill over to other pages. That's the justification for posting something like the paper weight page: I don't get much satisfaction from knowing that someone somewhere found out through my efforts that a sheet of paper weighs 4.5 grams, but the idea is that if that page gets ten thousand visitors, maybe some of them will also click around to other parts of my site and read something like Stupid and Entitled, which is actually valuable and important.

That last benefit, of attention from one page spilling over to another, is what I'd previously have put forward as the biggest benefit to running my site at all. If I post lots of stuff, some of it (I can't predict what) will attract viral attention, and then I'll have the opportunity to apply that attention to projects of my choice. The more stuff, the more attention, obviously: we're comparing a positive benefit against zero, and so it's a win to post rather than not post, even if the amount of attention that comes under my control per posting is very small, especially on the majority of postings that never get viral propagation. The trouble is, it's a conditional probability - it's conditional on the reader being the kind of reader whose attention is transferable from one page to another - and in practice, based on the 13 years' worth of logs I've been looking at, it appears to not actually happen often enough to be significant. Viral readers seem especially bad for that: they don't care about anything else except the one page that they are propagating. Maybe I shouldn't be spending so many resources on the possibility of someone who comes here for one page staying to see something completely different.

Personal Web sites

The Web as people know it today is down to a few large Web sites, most of them commercial: the Facebooks, the Wikipedias, the Googles. Note that a lot of people think their browser's location bar is Google and vice versa (software connections that do exist between the two don't help with that); a lot of people think Google is the Web; and there was that hilarious story of the people who couldn't distinguish a Web log article about Facebook's login process from the Facebook login process itself. (Read the comments!)

Then, and tied into that world of large Web sites, there's the back end of the Web, the actual information you look for through those big Web sites. That includes "profiles" on sites like Facebook; it includes Web logs; it includes medium-sized informational sites like BBSes, Web comics, and the home pages that act as glorified Yellow Pages entries for commercial enterprises. Many of those things can exist from a technical point of view either as individual sites or hosted as parts of bigger sites and there's little practical difference. People visit those "on a mission" from Google search results; they don't link to each other much, and if they do, people don't follow those links.

Where do personal Web sites fit in? I'm not sure they fit in at all. At the time I started mine, which was more than ten years ago, it was typical that people would navigate the Web by hopping from link to link, among small sites, many of which were run by individuals. In fact, the usual method of "surfing" was very much like what people do today within Wikipedia or TV Tropes. Typical flow might be from my page to Dr. Greenson's Gastrointestinal and Liver Pathology Home Page Extravaganza to some other individual site to yet another individual site. Now, I think every second page tends to be Google: you do a search, read a result, go back to the results, read another result, go back to the results, and so on. Or else you're remaining within a gated community like Facebook or a Web BBS.

So here's the thing: for my site to operate the way I'd really like it to, it has to be part of a community. I want to be part of the general community of personal Web sites that link to each other, and I want to get the visitors who are doing that non-search-engine-directed site-to-site surfing. Such visitors are actually a lot more important than the superficial "viral" visitors who propagate links through systems like Twitter; the reason to hope for viral visitors is that they'll attract the more serious visitors who actually do things. When I post something like the audio postings, I want to get participation and involvement from the other people who are doing similar things. Maybe the real problem is that those other personal Web sites don't actually exist, and the visitors doing the kind of browsing I want, don't exist. The community I want to be part of doesn't exist.

Note that my site has been online since 1997, and some of its content is years older than that, grandfathered in from pre-Web media. It's reasonable to guess that the nature of the Web may have changed in thirteen years. Other communications media have changed a lot on similar time scales.

It's looking to me like personal Web sites as I imagine mine, are ham radio. They're BBS networks. They're out of date; they're not where the cool kids are; they're a vanishingly small, and vanishing, slice of the World Wide Web as it exists in 2010; and if I'm hanging my hopes on being a player in the world of personal home pages, then at 33 I'm the grey-haired gentleman shaking a cane and yelling at the kids to stay off my lawn.

That doesn't mean I can't or shouldn't attempt to run a personal home page; but it does mean that I have to accept being relegated to the back end of Google searches and other goal-directed visits, and my site should reflect that. It should be something like the glorified Yellow Pages entry.

Lessons for this site

If this site is to succeed in the modern Web, then, I think there are a few important conclusions I can draw.

  • Ongoing maintenance requirements for existing content should be reduced to the absolute minimum possible.
  • The navigation should be what goal-directed visitors expect.
  • It should only provide content I'm willing to maintain and consider important.
  • It should rank well in the search engine results for the queries used by the people I want as visitors, to the extent such people exist.
  • Content directed to an audience that doesn't exist should be minimized.

On maintenance requirements: it's probably time to switch from my current homemade content management code to something that someone else wrote. That way I don't have to spend my time on maintaining and improving the code. On the other hand, it means I have to spend time fixing whatever's broken about someone else's code (inevitably, there will be something) and that can be worse than just writing my own the way I wanted it to begin with. Also, it means at least one more iteration of moving all my content to new code - and as can be seen in the lists above, I never even completed the last one. However, periodic switches to new code every few years seem to be inevitable anyway, so that last point may not be too significant.

Right now I'm thinking seriously about using PivotX for the next iteration. I'm not thrilled by its use of JavaScript and cookies, but it looks like I may be able to configure it into a form I could live with, and (the important win) it could reduce ongoing maintenance requirements. However, as I said before, different software will not eliminate the fundamental issue that content itself needs to be maintained.

I think switching to some sort of ready-made content management software would probably also help with navigation. People think they know what "blogs" are and how to use them; if they visit my site and it looks like a "blog" they'll have less trouble with it than they have with my current site where it's organized by the content itself and nobody is quite sure what to make of that. My hope is that a change to a more conventional appearance might improve the conditional probability of someone viewing a second page after they view the first one they hit.

It remains that I think a lot of content has to be cut. My first inclination is to just wipe the site clean, switch to new software, and post back old items as if they were new postings, as and when I feel like it. Maybe do some URL rewriting so that existing third-party links will point to the new locations of stuff. There are very few existing third-party links though (that's part of the problem) so there's not much actual work that needs to be done supporting them.
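The URL rewriting mentioned above could be as simple as a lookup table consulted before the site gives up on a request. The sketch below is only an illustration: every path in it is a made-up placeholder, the function name resolve_old_url is hypothetical, and answering 410 Gone for deliberately removed pages is one possible design choice rather than something any particular software does by default.

```python
# Hypothetical old-to-new mapping for pages that get reposted.
MOVED = {
    "/archives/2004/example-essay.html": "/entry/example-essay",
    "/archives/2005/another-page.html": "/entry/another-page",
}

# Hypothetical set of old pages that were removed on purpose.
CUT = {
    "/archives/2002/example-link-post.html",
}


def resolve_old_url(path):
    """Map a requested path to an (HTTP status, redirect target) pair.

    301 = moved permanently, so existing third-party links keep working.
    410 = gone, meaning the page was removed deliberately.
    404 = not a path this table knows about.
    """
    if path in MOVED:
        return 301, MOVED[path]
    if path in CUT:
        return 410, None
    return 404, None
```

With few third-party links to support, a table like this stays short, and the 301 entries only need adding as old items are actually reposted.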

I don't know if I'd really go that far. Another alternative would be something like what I did last time I switched to new site code: install the new software so that it controls the front page of the site, leave the old directories of stuff intact but subject to link rot, and move stuff over as and when I feel like it. I know if I do that, though, I'll end up with a lot of that old crufty stuff kicking around for a long time, and people complaining that they can't find my old stuff. I'd rather have the situation be "I cut it on purpose" instead of "It exists, but you can't find it, because I haven't updated it."

I think search engine ranking isn't really a big concern. My existing site has actually done pretty well for that; also, users who come in from search engines tend to be low-value users for the reasons I've described. I don't see this being a big issue in the future either.

On minimizing content aimed at non-existent audiences, that may be the hardest part: I have to think a lot more than I have in the past about who I can reasonably expect to read my site, and not waste time catering to the audience I wish existed, but doesn't. At the same time I have to balance that against catering to an audience I might like that does exist but isn't reading my site yet - and figure out how to get them here.


Steve C
My first thought is that you have identified a common need that could meet all of your goals if you could figure out a way of filling it. Namely hands free content maintenance. If you can figure out a way so that "there is no amount of maintenance I do on a page that is ever the last maintenance I do on that page" is no longer true then that idea will go viral. Plus it will be focused on personal websites... exactly the type of audience you are most interested in attracting.

Of course accomplishing that wouldn't be easy but if you can pull it off the web will beat a path to your server.
Steve C - 2010-03-30 15:45
I think the answer may be "wiki"; but that may be a cure worse than the disease.

You're right, though, that solving this problem for real would be important both to me and to others.
Matt - 2010-03-30 15:56
Steve C
Other (hopefully helpful) comments on your new website:

- I noticed that your molecular modeling page has disappeared in the website revamp.
- Your pages often don't have good titles. This is especially true for the two articles you referred to above: "Broken John and Hekka" and "Stupid and Entitled". I especially liked the latter article and read it when you first posted it. I think it's still quite profound. But the title doesn't entice someone to read it.
- Your better pages are often long and contain multiple ideas. I believe your goals of getting users to click around and explore your site would be better met by breaking up longer pages. For example this page could be at least 3 pages. (Stats/logs, probability/cost/benefit, and lessons of maintaining a personal website.) The more you can link to your own current work the more people will go deeper into your website.
- To preview a comment you have to type in the b-word answer. But if you do so, it appears in the entry box for the "post comment" step. So there is a problem: a human sees the form correctly filled out, while the software is still waiting for the data. Steve C - 2010-03-30 16:07
Yes, I deleted the "molecular models" article. It falls into "not the core mission."

On titles and splitting up pages: both ideas are worth consideration. I don't think titles are a big part of how people get to my pages, but maybe there's something better I could do with them. Right now I face the issue that PivotX doesn't seem to handle titles (as in the HTML TITLE tag) all that well and I have to get it doing what I want; but that software question is separate from the higher-level consideration of what the titles actually should be. On breaking up pages, I don't like it when other Web sites do that. However, they do it because it encourages click-throughs, and maybe that's what I need to do even if it seems like a sneaky trick.

I don't understand the issue you're describing with previewing and the anti-spam question. However, I note that when I visit these pages, it pre-fills the spam question box with the correct answer (and that is in the HTML source, it's not just my browser doing it for me). That seems odd. I don't know if maybe it's because I am logged in and it's detecting my login cookie, or what. So: what, step by step, are you doing and seeing that isn't what you expected? There's still a lot I don't understand about PivotX's spam protection - that's part of the price of using code I didn't write myself. Matt - 2010-03-30 17:25
Steve C
I enter in a comment. To preview it, I hit "preview comment". Spamquiz box left blank. Result: Preview of comment shown. Error displayed, "The Spamquiz answer was not correct... blah blah".
Expected result: Either (a) error OR (b) preview. Not both. (This is minor and not exactly what I referred to above.)

Separate visit:
I enter in a comment. Spamquiz box answered correctly. To preview it, I hit "preview comment". Result: Preview of comment shown. No error. Result = Expected so all's good.
Same visit:
Decide I like what I wrote, Spamquiz box answered from before with correct answer that's already in there. I hit "Post Comment". Result: Preview of comment shown, plus error "The Spamquiz answer was not correct..." and most importantly comment NOT POSTED. It LOOKS like it was posted though. I confirm it wasn't posted by refreshing the main page and not seeing the comment in the sidebar. Expected result: (a) Post or (b) Spamquiz left blank rather than pre-filled out.

To post my comment after I preview it I have to manually delete the correct answer from the Spamquiz box, then manually type it back in. Steve C - 2010-03-30 20:30
Steve C
I don't consider breaking up pages to be a "sneaky trick". Far from it. As a user I consider it valuable formatting.

In regard to "core mission" and excluding stuff that doesn't fit... is that in regard to the site transfer or just in general? I can totally understand if you decide such-and-such article isn't worth continued maintenance, especially in a site overhaul. But are you purposefully excluding your own content based on some set of other criteria? If the latter, then I think you are making a mistake.

If any content "goes viral" it will be because it simply does, not because you want it to. "How much does a sheet of paper weigh?" is a perfect example. I can't imagine that post is part of anyone's "core mission" however anyone on the planet might define a "core mission". But you say it's one of your top traffic posts.

I'm just saying, be cautious that you don't throw away something unimportant to you but unexpectedly important to your goals. Steve C - 2010-03-30 20:51
On the spam quiz points: I can confirm that it lets you preview (but complains) if you try to preview without entering the answer. That may not be ideal, but I don't think it's a big problem. One could argue that since previewing is harmless, it should allow you to do that as easily as possible.

I wasn't able to reproduce the other, more serious problem you describe, where it shows the correct answer in the box but then gives an error as if the question hadn't been answered. Axel reported something similar on another entry, though, so it wasn't just your own configuration. He then said it had apparently been fixed; and in the intervening time I did upload a new version of the software, trying to fix something else. So for the moment my guess is that the new version happened to fix that bug as well. Let me know if you see it again.

The spam quiz authentication seems quite complicated and unstable. I may end up having to dig into the code myself to get it running the way I want. I'd prefer to avoid that, though. I already note that it has problems with internationalization - I can only enter one question and answer site-wide, which is why it now appears in both English and Japanese on all pages; and I'm not convinced that the error messages have translations (still to be checked).

To make things even more complicated, it apparently sets a cookie with the answer once you've given it correctly. That may have something to do with the behaviour of it appearing automatically in the box without you entering it by hand. And some browsers may or may not do auto-fill of such things as well. Matt - 2010-04-01 11:21
As for keeping around stuff that doesn't seem important to me because it *might* be valuable, either now or in the future, without my knowing or predicting it: well, see the points discussed at length in this entry. I feel that for the past however many years I've been following the policy you recommend, trying to keep everything under the sun on the grounds that it's cheap to do so, and all traffic is good traffic.

I don't think that policy has worked for me. One big part of the problem is that all traffic is not necessarily good traffic. Large numbers of visitors to a page, especially if they come in from search engines, don't translate to large numbers of visitors to other pages. Note that my real goal isn't large numbers of visitors overall - it's large numbers of participating visitors for the pages I do think are important. Another part of the problem is that keeping content is less cheap than it may appear.

So: I'm trying a different policy now, one in which I will have much less total content overall, and won't necessarily keep content around forever. That does mean deleting content in a substantive way. In that case, why preserve the "paper weight" page? Well, note I cut it to one sentence... it seems a shame to throw out all the traffic that it does get, but at least I can drastically reduce the maintenance cost of it. (Just last week I got email about the old, long version of the page, complaining that the links in it were dead. That's a waste of resources.)

On the molecular model page in particular, it got some good participation when I first posted it. Your own comments were especially valuable. But it wasn't continuing to get participation; and I could see in the logs that it was starting to attract hits from the high school crowd looking for how-to information that wasn't there. My thought is that at some point in the future when I've actually *built* some models I may post a new page, with photos, and that could be valuable. I don't think the page I removed was helpful to my goals.

It is possible that this new policy won't work either. In that case I can switch back to the old policy - either by uploading the backup I made of the site just before I put in the new software, or by just changing the way I use the new site - or I can switch to some third policy. At this time, though, I want to see what happens when I don't keep material I think distracts from my goals. Matt - 2010-04-01 12:12
Steve C
I thought when you were talking about the maintenance of old content you were referring to pre-2010 stuff. I see you've given a great deal of thought to what to include and what to cut from your site, even for the more recent stuff. That's all I wanted to confirm for that particular page.

BTW this post is unnecessary but I'm testing the bug. The bug seems to be squashed. Steve C - 2010-04-01 23:23
Vilhelm S
My typical web reading behaviour is to find someone's personal page or (nowadays) blog, and compulsively click through all the entries for the next couple of days. But I guess I'm far from typical.

It is indeed sad to see traditional modes of communication disappear. Even without going back as far as BBSs, it seems Livejournal at some point was really the place to be, but I suspect starting one now would be pointless.

On a separate note, "応答しますください" should be "応答してください". (Or perhaps just "答えてください" -- 応答 seems technical.) Vilhelm S - 2010-04-10 12:40
Livejournal peaked in the first few months of 2005, at least as measured by "number of accounts updated in the last N days." Chart at http://pics.livejournal.com/pyrop/pic/0002qhpf/g15 . Now it doesn't even get mentioned in most people's lists of popular social networking sites.

Thanks for the note on the Japanese translation. Matt - 2010-04-11 07:27

