Quick! Imagine a ham radio operator!
Now, give yourself a gold star if the person you imagined was not a white male over the age of 65. It is my guess that very few of those gold stars are going to be handed out.
Probably a lot of you don't even know what a ham radio operator is.
I'm actually one myself, at least in theory. I still have the license because in Canada they don't expire anymore: the government office in charge of that figured out that handling the renewal paperwork cost more than the fees they collected. (That's a clue.) It has probably been more than a decade since I keyed up my rig. When I first got my license I was a kid and had no money to buy equipment. I had been reading electronics books and magazines from fifteen years earlier and I had romantic ideas of being a technical innovator, saving money, and earning the approbation of my peers by designing and building my own equipment.
I quickly discovered that even then, and even more so by the time I got to an age where I had real financial resources of my own, you couldn't actually save money by building your own. You could buy a commercial radio that was made with wholesale-priced components and then assembled by semi-enslaved Chinese workers. Or you could buy components at more than retail (because electronic components aren't normally sold in small quantities, so you have to pay a middleman who breaks them down into small lots), assemble them yourself at great time investment, and worst of all end up with a device that would never be as good as the commercial one anyway, because the technology even when I was 12, and all the more by the time I was 20, was already at the point where it couldn't really be built without the resources of that Chinese factory. Technical innovation was out of the reach of the basement amateur, even with the money I didn't have. And okay, suppose I've acquired equipment: what can I do with it? Well, I can talk! To 65-year-old white men! That is, to the few of those willing to spend the money and develop the technical skills for a radio and license. And only about a narrowly limited list of subjects, because there are absurdly strict content controls on what's allowed to be transmitted in the ham bands.
Yeah, yeah, some of my best friends, and current readers, are white men over 65. That's not the point and you know it.
Meanwhile... the other teenagers were building dialup BBS networks, and then getting on the Net, and I could join that world with equipment I already had, and talk to my actual peers (including girls!) about whatever the fuck I wanted (including using the word "fuck," which is forbidden on the ham bands), and be a real technical innovator, and have other people use my stupid Turbo Pascal serial driver code, and so on. That, in a nutshell, is why I never got into ham radio. The part of ham radio that I was interested in peaked at about the time I was born, in the mid-1970s, and it was dying before I actually had the opportunity to get involved. Now, I think it's even more dead.
This here's not really an article about ham radio (which probably means all the comments I get will be about ham radio, but okay, whatever); it's an article about this Web site on which I am posting. The thing is that I'm no longer involved in ham radio because when they say "Tune in the world!" the world I wanted to "tune in" no longer exists; I no longer run a dialup BBS because that world no longer exists; and now I'm kind of thinking that maybe it's time for me to stop running a personal Web site of the kind I've been running, because I'm seeing indications that this world no longer exists either.
A look at the logs
Below is a list of pages that have received no traceable human traffic in the year to date (2010, January 1 through March 28). This was derived by taking the logs, filtering out all known spammers and robots, and also filtering out any clients that didn't send a referrer header. That last exclusion may be controversial, but it appears that such clients are often robots (possibly in disguise) and all it takes is one robot to hit all pages once, to spoil any stats that count non-referrer-sending clients. My thinking is that if a page gets any real traffic, real traffic includes a large enough fraction of referrer-senders that the page will probably get at least one referrer-sending visitor. But if you prefer, you can read this as a list of pages that get almost no traffic instead of pages that get absolutely no traffic.
I've also excluded pages that are intended for robot access (such as RSS feeds), internal pages not intended for public viewing (such as .php files of functions used by other pages), Bonobo Conspiracy (which uses its own URL scheme and is kind of a separate site), pages that don't exist (often the result of cracker attacks; less often, human typos), and "index" pages (which aren't real content). This list is just pages of real content. The list is sorted by URL.
- Broken John and Hekka
- Journal of Intercelestial Fly Fishing (1) "A curious catch"
- Journal of Intercelestial Fly Fishing (2) "Lady Midnight"
- Kansas City Standard Blues
- Notes War
- The Other channels
- Sad-Eyed Joe
- Merry Halloween, Z39.50
- Software Engineering Bromides
- First Nations of the Net
- Why Web BBSes suck
- A dream of virtual modems
- The Geneva Convention
- Supreme Court on same-sex marriage
- Technical information on the banknote block
- Bloomberg story on China/U.S. balance of payments and copyright
- Copyright: the worthless fiat good that could save America
- Paper on Japanese IP balance of payments
- Another prediction: No joy for the Japanese copyright cartel
- Stick to copyright, please!
- Frame-dropping for extra commercials
- Debate on C-48
- Motion M-431
- CRTC calls for comments on Internet retransmission, HDTV
- Public comments on HDTV
- CRTC and Justice deadline heads-up
- CRTC Broadcasting Public Notices RSS
- 2002-38 comments part 1
- 2002-38 comments part 2
- BPN 2002-38 reply comments posted
- BBC to open up its TV archives
- Reception should be free-for-all
- Zhu v. Merrill Lynch HSBC
- LawMeme v. PHP-Nuke
- Gator license agreement analysed
- Actual text of the Boucher letter
- Parliamentary Research Branch publications
- Copyright claimed on silence
- First printed book was public domain
- Copyright claimed on yoga
- Paris Hilton, privilege holder
- Jack Layton on file sharing
- Study of the Net's impact on music sales
- Felten's unified theory of filesharing
- Current Heritage Minister vows to undo the work of previous Heritage Minister
- MPAA-funded copyright indoctrination program in American schools
- Petition for Users' Rights
- Wendy Lill demands an end to peer-to-peer
- Former Malaysian PM asserts IP claim on students
- Creative Commons Canada license launch
- Ashcroft declares war on intellectual property infringement
- Users' rights petition tabled
- Bev Oda demands copyright capitulation
- Users' Rights petition score card
- Lots of reading material on C-60
- C-60 comes up in Question Period
- First response to Digital Copyright Canada petition
- ZDNet on lyrics sites
- Access Copyright offers to let you pay to be indoctrinated
- Appropriation Art
- Coverage of C-36
- Bill C-36: Library and Archives of Canada
- More on C-36: We're already there
- Copyright extensions being removed from C-36
- Bev Oda questions C-60
- Bev Oda questions C-60
- Bill C-60: DMCA du Canada
- MPs' package on C-60
- Lynda Williams interview on copyright, C-60
- DMCA du Canada
- Théberge v. Galerie d'Art du Petit Champlain Inc.
- The news from Eldred
- Eldred transcript online
- Harvard College v. Canada
- US Supreme Court 7-2 in favour of copyright extension
- Strawberry Shortcake v. Gabe and Tycho
- Supreme Court agrees to hear Schmeiser/Monsanto canola patent case
- Lex Informatica on Tariff 22 hearing
- Copyright Board issues BML decision
- Schmeiser seed-patent case going to Supreme Court
- Report from the Schmeiser hearing
- CCH Canadian Ltd. v. Law Society of Upper Canada
- Lessig: How I Lost the Big One
- Federal Court: uploading is not reproduction
- Schmeiser loses, but does not pay
- SOCAN v. CAIP
- About this Jibjab business
- Cato Institute urges repeal of DMCA
- Canadian Federation of Students open letter against copyright extension
- Canadian Music Creators Coalition
- Ireland amends term extension
- The Inducement Devolves into Unlawful Child Exploitation Act
- More than you wanted to know about the INDUCE act
- News on the Eastern Front
- AMV study looking for creators, viewers
- Japanese Web site operators arrested for posting manga scans
- Copyright is "most lobbied area"
- "Authorised" Peter Pan sequel author chosen
- Recipe copyright: not such a tasty idea
- Hilary Rosen doesn't get it
- Toronto consultation meeting (26 March 2002)
- Thoughts on Thoughts on Music
- WIPO advertises position of "Head"
- Which Canadian Supreme Court Justice are you?
- Justice Committee studies Supreme Court appointments
- Submission of Matthew Skala in response to BPN 2002-32
- Submission of Matthew Skala in response to BPN 2002-38
- Reply comment of Matthew Skala in response to initial comments on BPN 2002-38
- Reforming the DNS
- Bill to create .kids toned down to .kids.us
- Your name isn't your own
- WHOIS to get all, like, serious
- CIRA board elections
- Reforming the DNS
- ICANN cut secret deal on .travel approval
- Syabu inventors beware!
- ISP licensing bill lives again
- Comparison of Bills C-396, C-424, and C-234
- The dead horses of Parliament
- Parliament returns
- The House of Commons: Day 2
- Kill a Krook for Khrist
- More sex and pornography
- And the horse you rode in on
- The Contraventions Act and other news
- Lots of small stuff
- Point of privilege, Mr. Speaker
- More little stuff
- Intellectual property comes up in Parliament
- Let's hear it for them thar vigilante men!
- I don't think they mean "screen"
- Parliament returns
- April 10 Hansard
- April 11 Hansard
- Hansard from April 12 and 15
- A busy day in the House
- Privilege, health, and ISP liability for child pornography
- Sex and violence in Parliament
- Cuba joins Canada in attempting to ban free speech
- A couple of slow days
- Two medical bills
- Several days' debates
- Watch for the tell-tale signs of corruption
- Tuesday in the House
- Lots of financial fun
- Three days in the House
- Monday and Tuesday in the Commons
- House catch-up
- Last three days in the House
- Hansard in RSS
- The Speech from the Throne
- Tuesday in the House
- So, what about that House of Commons, huh?
- So, what else about that House of Commons?
- So, any more news from the House of Commons?
- So, are we finished catching up to the House of Commons yet?
- Parliamentarians behaving badly
- Let's actually improve things for a change!
- Friday in the House
- Some business is taken care of
- Bill C-11 (Internet retransmission) gets Royal Assent
- Last of the 2002 Hansard
- Tuesday through Friday in the House
- Prostitution, the Internet, addiction, campaign finance reform, and the Twelve Days of Giving
- Revenge of the Private Members' Bills
- Public service, proportional representation, the budget, and more budget
- Big pile o' Hansard
- Shake the jar! See if they'll fight!
- House of Commons roundup, 28 February to 24 March
- House of Commons roundup, 25 March to 2 April
- House of Commons roundup, 3-11 April
- Communications privacy, crime, terrorism, SARS
- Legislation by numbers
- House of Commons roundup, 29 April-7 May
- Terrorism, super-pot, blank media levy
- Abortion and the budget
- Budget, budget, missile defense
- Pestilence, copyright, labour, and war
- House of Commons blast from the past (part I)
- House of Commons blast from the past (part II)
- What's new in the House?
- C-36 passes, the Alliance continues to annoy me, and so on
- Parliament back in session
- New C-12: arrest without warrant, computer searches
- Free networking
- Another patent motion
- U.S. Govt. moves to reduce drug patent abuse
- UK to outsource patent tasks to Denmark and Netherlands
- U.S. sticky about drug patents
- Trademark on traditional knowledge?
- The rich world's patents abandon the poor to die
- Protecting traditional knowledge
- Patent filed on SARS
- Drug firms urged to relax patent laws
- US Army patents biological/chemical weapon
- Industry Committee considers drug patents
- DE Technologies patents ebusiness
- Patents on numbers are old news
- Microsoft patents XML for word-processing documents in New Zealand
- Microsoft's Canadian WPML patent application
- Virtual property... is patented!
- More Monsanto plant-patent controversy
- Canadians reject patents on life
- Russia accuses USA of piracy
- European authorities revoke patent on Indian traditional wheat
- Software patents instead of copyright
- Petition calls for ban on genetic DRM
- Wrigley's gets patent for biodegradable chewing gum
- European Parliament rejects software patent directive
- Google patents search results - or not
- A few more degrees of chill
- Repressive European wiretap laws come to Canada
- Submission on "lawful access"
- Lawful access slides and links
- Toronto cop demands access to encryption keys, data retention, warrantless searches
- Lawful access comment summary published
- Another reason not to use MSWord for legal documents
- Computer monitoring as probation condition
- Britain testing biometric ID cards
- Life imitates rec.humor.funny
- Xerox printer-ID information
- Bill C-359 against identity theft
- Panel recommends keeping court documents off Net
- Intellectual Privacy Canada
- Protecting lawful private communications: Submission of Matthew Skala
- Manitoba to require technicians to snoop?
- U.S. judge jails journalist
- Slashdot places Web bugs in RSS feeds
- Is spam obscene?
- Anti-spam law slips by me
- Marketing is out of control
- ISP licensing lives again under guise of anti-spam
- Spammers say: vote Conservative!
- Another thing wrong with attention bonds
- Your name isn't your own
- Fuckedcompany fucked by Ford
- Spectators banned for drinking wrong cola
- Wizards and yowies and trademarks, oh my!
- Intellectual property for preservation of status symbols
- Supreme Court overturns inmate voting ban
- Avi Rubin describes his experiences as election judge
- Gambling more secure than voting
- Party positions on tech issues
- Diebold electronic voting system horribly insecure
- Pornography not just for boys anymore
- Pornography not just for boys anymore
That is 250 pages, out of a little under a thousand on the site. So a quarter of my pages aren't really being used at all. I haven't counted (and it's hard to know exactly what number to measure), but it appears that a great many of the remaining pages, which did get some human traffic, nonetheless got very little of it. The list would probably be much longer if I set the cutoff at a little bit more than zero.
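The filtering behind the list above can be sketched in a few lines of Python. This is only a minimal sketch, not the script actually used to produce the list. It assumes the server writes logs in the common "combined" format, and the names `LOG_RE`, `ROBOT_MARKERS`, `is_traceable_human`, and `pages_without_traffic` are hypothetical. The other exclusions described earlier (feeds, internal files, index pages) are assumed to be handled by whoever builds the `content_pages` list.

```python
import re

# Combined Log Format, as written by Apache and many other servers (an assumption).
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical robot markers; a real list of known robots and spammers is much longer.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_traceable_human(hit):
    """Apply the two filters from the text: drop robots, drop missing referrers."""
    agent = hit["agent"].lower()
    if any(marker in agent for marker in ROBOT_MARKERS):
        return False
    return hit["referrer"] not in ("", "-")


def pages_without_traffic(log_lines, content_pages):
    """Return content pages no traceable human visitor requested, sorted by URL."""
    visited = set()
    for line in log_lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        hit = match.groupdict()
        if hit["status"] == "200" and is_traceable_human(hit):
            visited.add(hit["path"])
    return sorted(set(content_pages) - visited)
```

Raising the cutoff above zero would mean counting hits per page instead of collecting a set, which is where the list would start to grow.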
Now, here's a similar list of pages that did get traceable human traffic, but it was all from search engine results.
- Eight knights of the dragon
- Will McCarthy and the Screaming Avocados
- The delivery man and his death
- A Halloween observance
- He lei nalu ho`okahi
- It only takes NAND
- Bonus Marks
- PimpMaster C explains dollar cost averaging
- PimpMaster C explains the efficiency frontier
- PimpMaster C explains market and limit orders
- PimpMaster C explains triple witching hour
- Probatio diabolica
- Photoshop blocks images of banknotes
- Felten on banknote recognition
- Elect Barney
- US Senator proposes to make a law respecting an establishment of religion
- PVR makers cripple own products
- Animemusicvideos.org forced to take down some AMVs
- Anime music music
- L. M. Montgomery Copyright Term Extension Act
- Bill C-365: VoIP free-for-all
- CRTC requires VoIP providers to provide 911
- C-506: ISP licensing is back again
- Divorce, procreation, Iraq
- Dead bills resurrected and pushed through
- On free networking
- Drug companies demand further extension of patents
- Websense receives broad patent on censorware
- RPTC oil-well torpedo lawsuits
- ISP ordered to break privacy in gay blood donor case
- MS Word revision-control strikes again!
- Doctors to be required to report gunshot wounds
- Some revelations about WoW spyware
- Google 302 exploit explained
- On Google and honest auctions
- Google "hacker" honeypots explained
- Gimmie one reason: Kitchener-Waterloo candidates round-up
That is an additional 39 pages - actually somewhat fewer than I was expecting when I set out to do these counts. Again, the list would probably be longer if the "no eligible traffic" criterion weren't absolute.
Most of my traffic comes from robots hitting RSS feeds - and most of that consists of failed attempts to hit the exchange rate feeds, which no longer exist. A large fraction of my human traffic comes from search engine queries hitting the following four pages. Traffic to these four pages is almost entirely from search engines, except for the OCR fonts which also get some hits through links from Wikipedia.
I also get a significant amount of human traffic from third parties referring to What Colour are your bits? Pretty often someone posts it in a discussion thread on Reddit (where I would think regular readers must be sick of it by now) and each time that happens I get a few dozen hits.
There is a constant trickle of human traffic to The terrible secret of Livejournal, which is still occasionally referred to in people's discussions of social networks, and is linked from a page in TV Tropes. This amounts to an average of less than one hit per day.
Traffic to all other pages is insignificant.
Some time ago I wrote the phrase "sex partner" in a posting here and a regular reader named Dan contacted me about it. I think that at first he honestly believed that that was a literal error; he could not believe that I might have actually intended to write it on purpose. It had to be a typo or spell-checking error or something, and he wrote to let me know that it had somehow accidentally gotten posted, because I'd surely want to correct such an error as soon as possible. In the ensuing discussion, after I told him that I had actually meant to write that, he told me that it was an unwise choice of words because it would alienate "most of the women on the planet!" That's about two billion women, and I think that much like Stephen Harper seeing the goat, Dan could actually see the crowd of two billion women.
Women standing in a packed crowd can fit maybe eight per square metre, so two billion of them would cover a square roughly 16 kilometres on a side. It's a large crowd.
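For anyone who wants to check the crowd arithmetic, here is a quick sketch. The only inputs are the two figures given above; the variable names are made up for the example.

```python
import math

women = 2_000_000_000      # the "two billion women" figure from the text
density_per_m2 = 8         # packed-crowd density assumed in the text

area_m2 = women / density_per_m2        # 250 million square metres
side_km = math.sqrt(area_m2) / 1000     # side of a square with that area

print(f"{side_km:.1f} km on a side")    # prints: 15.8 km on a side
```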
Now, when you consider that most of the women on the planet cannot read English, and, to say the least, some of them also don't visit my Web site all that frequently... I did the math and came to the conclusion that Dan's estimate could not be off by anything less than seven orders of magnitude, arguably as many as ten. So, what was the mistake? The mistake was in counting conditional probability rather than plain probability. Dan was cooking up a wildly improbable scenario of a woman whose opinion I'd care about actually reading that paragraph on that page on my Web site and caring in a negative way about whether I uttered the words "sex partner," and then (correctly, but it's not an interesting question) estimating that the probability of her being alienated conditional on that wacky scenario was high. My own estimate was of the overall probability of that happening - extremely low because it's bounded above by the probability of the wacky scenario, including the negatively-correlated "whose opinion I'd care about" term. The high conditional probability at the end of the chain of terms doesn't matter because it cannot be greater than one, and it is basically multiplied by zero.
Furthermore, he was balancing that conditional probability of something bad happening, against nothing - he actually said, more or less, there's no benefit at all to using that phrase, so even if you think the cost of using it (in risking the wacky scenario) is small, you're better off not doing so. I used it because I saw real benefits to using it in avoiding other bad things that were not wacky and unlikely, but in fact had already happened and seemed likely to happen again if not actively prevented. Comparing the large risk against the small risk, the choice was clear to me.
The take-away lessons: when evaluating costs and benefits you have to look at probabilities, not just conditional probabilities; and you have to look at both costs and benefits, not just look at one and ignore the other.
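Both lessons can be made concrete with a small sketch. Every number in it is invented purely for illustration and none comes from the discussion above. The point is only the shape of the calculation: the overall probability is the scenario's probability times the conditional probability, so it can never exceed the scenario's probability, and the resulting cost has to be weighed against the cost of the alternative rather than against nothing.

```python
def overall_probability(p_scenario, p_outcome_given_scenario):
    """P(outcome) when the outcome can only happen inside the scenario."""
    return p_scenario * p_outcome_given_scenario


def expected_cost(probability, magnitude):
    """Probability-weighted cost, in arbitrary units."""
    return probability * magnitude


# Hypothetical numbers, chosen only to show the structure of the argument.
p_scenario = 1e-7                   # the improbable scenario happens at all
p_alienated_given_scenario = 0.9    # high conditional probability, as conceded

p_alienated = overall_probability(p_scenario, p_alienated_given_scenario)

risk_of_using_phrase = expected_cost(p_alienated, 100.0)
risk_of_avoiding_phrase = expected_cost(0.2, 100.0)   # a likelier, recurring harm

print(p_alienated <= p_scenario)                       # True: bounded by the scenario
print(risk_of_avoiding_phrase > risk_of_using_phrase)  # True for these made-up inputs
```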
From that point of view, it occurs to me that maybe in the past when looking at my own Web site, I haven't been seeing it accurately. One argument for keeping a page online, even if it isn't a very good page, is that it costs nothing to keep it, and there's always the chance that someone could get a significant benefit from the page, draw attention to it, and it could "go viral" and attract a lot of the attention I crave. I now think there may be serious errors on both sides of that comparison: keeping a page does have a real cost, and the chance of getting significant attention to a page has to be evaluated as an overall probability, not just conditional on the right person reading it.
If you look down that list of pages that didn't get traffic, you'll notice that many of them are old. Many aren't accessible through the navigation from the front page of my site - you can recognize the multicoloured placeholder graphics from where I removed the Project Wonderful boxes in an earlier update - and that highlights one of the major costs of keeping pages online: they either have to be kept current with the structure of the site, or else they decay into cruft.
When I mentioned that cost in a discussion on Facebook, a reader named Gord asked by way of clarification if the issue was that my homemade content management system was too hard to maintain. That may be a contributing factor - other content management software might be easier to maintain - but it's not the real issue. The real issue on this point is that content itself requires maintenance. The software will inevitably change, requiring content conversions; links will go bad; information will fall out of date; and every page I keep today will require attention from me in the future, repeatedly, as long as I keep it. In computer-science terms, the maintenance requirement of a page is omega(1); there is no amount of maintenance I do on a page that is ever the last maintenance I do on that page, except the "maintenance" of deleting the page entirely.
Pages that link to third-party sites often fall victim to link decay: the third-party sites inevitably go down or change their organization so that the links break. There are automated ways to detect when that happens, but doing something intelligent when it does happen requires human attention.
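The automated half of that job is the easy half. As a rough sketch using only the Python standard library, something like the following would flag dead links; the function names `http_status` and `broken_links` are hypothetical, and a HEAD request is an assumption that some servers won't honour. Deciding what to do with each flagged link is still the part that needs a human.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return None        # DNS failure, refused connection, malformed URL, ...


def broken_links(links, fetch_status=http_status):
    """Return (url, status) pairs for links that look dead."""
    broken = []
    for url in links:
        status = fetch_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

The `fetch_status` parameter exists so the checker can be exercised without touching the network, for example by passing a dictionary's `get` method.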
There are other costs associated with keeping things online, too. I change my mind sometimes. Not every statement I posted in those never-visited pages is something I still agree with today. Without wishing to deny that I said those things, it remains that I wouldn't necessarily want a new reader today to think that it's all my current point of view. If people routinely read one page on my site and then clicked through to other pages, they'd get a good idea of where I stand today and it wouldn't be a problem. But the logs reveal that readers do not do that - they come in from a link or search engine, maybe read the first page they hit (usually, they don't even read it, I think) and then they never read anything else. So keeping a lot of stuff I haven't even read myself recently means I'm publishing a lot of stuff under my name and masthead that maybe I don't really want to be taken as representative of my current position.
Even if I still agree with my old opinions, many of the pages on my site provide information that isn't opinion but factual, and some of that factual information (especially regarding technical and legal issues) is simply no longer true. Having it online means people can find it, think it's still true, and then I've done them a disservice. If I want to prevent that, I'm faced with ongoing checking and updating of whatever I keep.
Keeping low-value pages provides ammunition for spammers and mockers. Stats are unreliable on this Web BBS rant, for instance, because so much of the traffic to that page consists of spammers that I don't think I can trust any of it to be really human. That issue might be predictable for that page because it contains the character sequence "bbs" in the URL, so spam robots may be programmed to attempt postings there; but I'm at a loss to explain why, for instance, this otherwise unremarkable link posting from eight years ago is also one of the biggest spam targets on the site. (Note that the page it highlights is dead.) Anything I post can eventually become a spam target. There've also been a couple of incidents of disreputable persons trolling through my site to find the 0.5% most controversial pages on it from the point of view of their own value systems, and then posting lists of the results on Web BBSes in attempts to discredit me. If people are going to do that, I'd rather they did it with the documents I actually consider important, current, and worth fighting over, rather than with a random selection of things I've ever said in any context in all of history.
There is a risk of distraction or confusion. I don't know how big an issue this is because I think most people visit my site "on a mission," with a goal in mind; they either find what they're looking for and go away, or they don't find it, and go away. Very few people start out to visit my site without a goal in mind and then just read whatever's there. However, if anyone does do that, it seems reasonable that they only have a limited amount of time and attention for it. It's a crying shame, then, for them to spend that time on a throw-away link page, while pages that really are good (let's say Broken John and Hekka, from the "no human traffic" list) go unnoticed. Since these readers who are not on a mission are some of my most valuable readers, it seems important to steer them in any way possible to the content I'd most like them to see. The existence of lower-value content may interfere with that.
The 289 pages listed above easily represent an hour of my life each. Very likely more. Am I getting 289 hours worth of satisfaction from them when nobody visits?
People in my position in life are expected to have some sort of presence on the Web. People type "Matthew Skala" into Google and expect to find my contact information, my list of academic publications, and so on. It wouldn't really be fair, let alone a good career move, to drop off the Web entirely and have everyone looking for me go after the other Matthew Skala instead. Given that I pretty much have to have some kind of Web presence, it's nice to have my Web presence be on server space under my own control instead of space belonging to my employers. So for that kind of thing - what I call the Yellow Pages entry - it seems like I really had better have some kind of Web site of my own.
The bigger benefit of running a Web site is that I have a lot of ideas I want people to think about. I want attention for my ideas. I also want personal attention directed at me. Those two things go together - I'm not willing to change my ideas to be ones with which it would be easier to attract attention, because then they wouldn't be my ideas anymore and it wouldn't be attention for me.
Thinking it over recently it occurred to me that what I really want, and wish my Web site could provide, is something very much like Agatha Heterodyne's superpower: the fully general ability to get others' participation in activities that I can't do alone, whether it's fixing the coffee machine or anything else. Without that ability I'm in the position of Dr. Merlot, a highly intelligent non-Spark relegated to riding the coattails of his betters. Of course, outside the comic book universe of Girl Genius, there are no Sparks - I'm the closest thing you'll find in real life - and that particular superpower doesn't exist as an inherent ability someone can be born with.
If the site is going to bring me participants for my projects, that means the site has got to have readers, but they also have to be a particular kind of reader: they have to be readers who also do things other than read. For instance, readers who write too. And for my ideas to have a large influence, the ideas have to reach a large number of people - either by a large number of people reading my site directly, or by the readers I do have conveying those ideas to others, and then those others doing the same, through an arbitrary number of generations.
That does occasionally happen, both with pages on my site and pages on other people's sites. It's a known part of the way the Web works. When it happens people say that "the page has gone viral," but that's a dangerously misleading term because it isn't really something the page does at all. The real subject of the verb should be the readers - when a page goes viral, that means the readers have propagated it virally.
Viral propagation is only a little bit predictable. It's even less controllable. But it's a fact that out of the thousand or so pages on my site, there are maybe ten for which it has happened. So, there's a number we can feed into our cost-benefit analysis: if I keep posting pages the way I have, at my current level of effort, I can expect one percent of them to attract real attention. That's one benefit.
It's also become clear to me that viral propagation is extremely context-dependent. It really doesn't matter how good a page is or isn't. The pages on my site that have been virally propagated are better-than-average pages, but certainly not my best and absolutely not the ones I'd choose if I could choose 1% of my pages to be virally propagated. Let me emphasize the languaging: viral propagation happens. It happens to pages. It is not something that pages do. The subject of the verb is the propagation itself (it occurs), or the readers (they propagate the page). It is not a function of the page.
So what is it a function of? What determines it, such that if we control this thing, we could control the process? I said "context," and I think that's the answer. If my link gets onto a high-traffic Web site, that doesn't mean I get much or any traffic myself. My experiences with banner advertising validate that - people don't click ads (first approximation), and the exceptions to that are people who click the ads, but then don't propagate the resulting pages. On the other hand, if someone writes a Web log posting about something I said, then someone who reads that posting has a much higher chance of doing the same. If you receive an "RTed tweet" [sic] you're much more likely to "RT" [sic] it than if you found the same page with a search engine query. The way you are introduced to my ideas makes a big difference to how likely you are to propagate them. There are what I've called higher-order effects (see also the audio), too: it's a k-win lottery, where you're more likely to link to my page after you've seen several links to it, with the result that if it does propagate virally, there will be a phase during which it exhibits greater than exponential growth. (Note I mean exponential growth in the mathematical sense, not just "fast.")
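The difference between exponential and greater-than-exponential growth can be shown with a toy model. This is an illustrative assumption, not the k-win lottery itself and not a model taken from the logs. In plain exponential growth the ratio between successive link counts is constant; if each existing link also makes the next link more likely, the ratio itself climbs for a while. The functions `simulate` and `growth_ratios` and all the rates are hypothetical.

```python
def simulate(steps, rate):
    """Grow a link count for `steps` rounds; rate(n) is the growth rate at n links."""
    counts = [1.0]
    for _ in range(steps):
        n = counts[-1]
        counts.append(n * (1 + rate(n)))
    return counts


def growth_ratios(counts):
    """Ratio of each count to the one before it."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


# Plain exponential growth: every round multiplies the count by the same factor.
constant = simulate(10, lambda n: 0.5)

# Reinforced growth: the rate rises with the number of links already out there,
# up to a cap, so the super-exponential phase is only a phase.
reinforced = simulate(10, lambda n: min(0.5 + 0.01 * n, 2.0))

print(growth_ratios(constant))    # the same ratio every round
print(growth_ratios(reinforced))  # a ratio that keeps increasing over these rounds
```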
So the benefit I'm chasing isn't just that people will read my ideas, nor even just that they'll mention my ideas in their own spaces, but that they will mention my ideas in the way that causes others to do the same. That's the only route to viral propagation. The chance of this happening is a significant part of analysing the benefits of my Web site.
It would be nice to think that if anything on my site attracts a lot of attention, then some of that will spill over to other pages. That's the justification for posting something like the paper weight page: I don't get much satisfaction from knowing that someone somewhere found out through my efforts that a sheet of paper weighs 4.5 grams, but the idea is that if that page gets ten thousand visitors, maybe some of them will also click around to other parts of my site and read something like Stupid and Entitled, which is actually valuable and important.
That last benefit, of attention from one page spilling over to another, is what I'd previously have put forward as the biggest benefit to running my site at all. If I post lots of stuff, some of it (I can't predict what) will attract viral attention, and then I'll have the opportunity to apply that attention to projects of my choice. The more stuff, the more attention, obviously: we're comparing a positive benefit against zero, and so it's a win to post rather than not post, even if the amount of attention that comes under my control per posting is very small, especially on the majority of postings that never get viral propagation. The trouble is, it's a conditional probability - it's conditional on the reader being the kind of reader whose attention is transferable from one page to another - and in practice, based on the 13 years' worth of logs I've been looking at, it appears not to happen often enough to be significant. Viral readers seem especially bad for that: they don't care about anything except the one page that they are propagating. Maybe I shouldn't be spending so many resources on the possibility of someone who comes here for one page staying to see something completely different.
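That conditional probability is the kind of thing that can be estimated directly from logs. The following is a minimal sketch in Python, assuming the log has already been reduced to (visitor, page) pairs in time order. That format, the function name spillover_rate, and the sample paths are all hypothetical, and reliably identifying a "visitor" in real logs is a harder problem than this pretends.

```python
def spillover_rate(hits, entry_page):
    """Estimate P(visitor views another page | visitor arrived at entry_page).

    hits: iterable of (visitor_id, page) tuples in chronological order.
    Returns None when nobody arrived at entry_page.
    """
    first_page = {}    # visitor_id -> first page they requested
    saw_other = set()  # visitors who went on to a different page
    for visitor, page in hits:
        if visitor not in first_page:
            first_page[visitor] = page
        elif page != first_page[visitor]:
            saw_other.add(visitor)

    arrivals = [v for v, p in first_page.items() if p == entry_page]
    if not arrivals:
        return None
    stayed = sum(1 for v in arrivals if v in saw_other)
    return stayed / len(arrivals)


# Hypothetical sample data, not taken from real logs.
sample_hits = [
    ("a", "/paper-weight"),
    ("b", "/paper-weight"),
    ("a", "/essay"),
    ("c", "/paper-weight"),
    ("d", "/essay"),
    ("b", "/paper-weight"),  # a reload of the same page does not count
    ("e", "/paper-weight"),
]
rate = spillover_rate(sample_hits, "/paper-weight")
```

In this toy data four visitors arrive at the entry page and only one of them goes on to anything else, so rate comes out to 0.25. The number that matters is that fraction, not the raw count of arrivals.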
Personal Web sites
The Web as people know it today is down to a few large Web sites, most of them commercial: the Facebooks, the Wikipedias, the Googles. Note that a lot of people think their browser's location bar is Google and vice versa (software connections that do exist between the two don't help with that); a lot of people think Google is the Web; and there was that hilarious story of the people who couldn't distinguish a Web log article about Facebook's login process from the Facebook login process itself. (Read the comments!)
Then, and tied into that world of large Web sites, there's the back end of the Web, the actual information you look for through those big Web sites. That includes "profiles" on sites like Facebook; it includes Web logs; it includes medium-sized informational sites like BBSes, Web comics, and the home pages that act as glorified Yellow Pages entries for commercial enterprises. Many of those things can exist from a technical point of view either as individual sites or hosted as parts of bigger sites and there's little practical difference. People visit those "on a mission" from Google search results; they don't link to each other much, and if they do, people don't follow those links.
Where do personal Web sites fit in? I'm not sure they fit in at all. At the time I started mine, which was more than ten years ago, it was typical that people would navigate the Web by hopping from link to link, among small sites, many of which were run by individuals. In fact, the usual method of "surfing" was very much like what people do today within Wikipedia or TV Tropes. Typical flow might be from my page to Dr. Greenson's Gastrointestinal and Liver Pathology Home Page Extravaganza to some other individual site to yet another individual site. Now, I think every second page tends to be Google: you do a search, read a result, go back to the results, read another result, go back to the results, and so on. Or else you're remaining within a gated community like Facebook or a Web BBS.
So here's the thing: for my site to operate the way I'd really like it to, it has to be part of a community. I want to be part of the general community of personal Web sites that link to each other, and I want to get the visitors who are doing that non-search-engine-directed site-to-site surfing. Such visitors are actually a lot more important than the superficial "viral" visitors who propagate links through systems like Twitter; the reason to hope for viral visitors is that they'll attract the more serious visitors who actually do things. When I post something like the audio postings, I want to get participation and involvement from the other people who are doing similar things. Maybe the real problem is that those other personal Web sites don't actually exist, and the visitors doing the kind of browsing I want don't exist either. The community I want to be part of doesn't exist.
Note that my site has been online since 1997, and some of its content is years older than that, grandfathered in from pre-Web media. It's reasonable to guess that the nature of the Web may have changed in thirteen years. Other communications media have changed a lot on similar time scales.
It's looking to me like personal Web sites, as I imagine mine, are ham radio. They're BBS networks. They're out of date; they're not where the cool kids are; they're a vanishingly small, and vanishing, slice of the World Wide Web as it exists in 2010; and if I'm hanging my hopes on being a player in the world of personal home pages, then at 33 I'm the grey-haired gentleman shaking a cane and yelling at the kids to stay off my lawn.
That doesn't mean I can't or shouldn't attempt to run a personal home page; but it does mean that I have to accept being relegated to the back end of Google searches and other goal-directed visits, and my site should reflect that. It should be something like the glorified Yellow Pages entry.
Lessons for this site
If this site is to succeed in the modern Web, then, I think there are a few important conclusions I can draw.
- Ongoing maintenance requirements for existing content should be reduced to the absolute minimum possible.
- The navigation should be what goal-directed visitors expect.
- It should only provide content I'm willing to maintain and consider important.
- It should rank well in the search engine results for the queries used by the people I want as visitors, to the extent such people exist.
- Content directed to an audience that doesn't exist should be minimized.
On maintenance requirements: it's probably time to switch from my current homemade content management code to something that someone else wrote. That way I don't have to spend my time on maintaining and improving the code. On the other hand, it means I have to spend time fixing whatever's broken about someone else's code (inevitably, there will be something) and that can be worse than just writing my own the way I wanted it to begin with. Also, it means at least one more iteration of moving all my content to new code - and as can be seen in the lists above, I never even completed the last one. However, periodic switches to new code every few years seem to be inevitable anyway, so that last point may not be too significant.
I think switching to some sort of ready-made content management software would probably also help with navigation. People think they know what "blogs" are and how to use them; if they visit my site and it looks like a "blog" they'll have less trouble with it than they have with my current site where it's organized by the content itself and nobody is quite sure what to make of that. My hope is that a change to a more conventional appearance might improve the conditional probability of someone viewing a second page after they view the first one they hit.
It remains that I think a lot of content has to be cut. My first inclination is to just wipe the site clean, switch to new software, and post back old items as if they were new postings, as and when I feel like it. Maybe do some URL rewriting so that existing third-party links will point to the new locations of stuff. There are very few existing third-party links though (that's part of the problem) so there's not much actual work that needs to be done supporting them.
I don't know if I'd really go that far. Another alternative would be something like what I did last time I switched to new site code: install the new software so that it controls the front page of the site, leave the old directories of stuff intact but subject to link rot, and move stuff over as and when I feel like it. I know if I do that, though, I'll end up with a lot of that old crufty stuff kicking around for a long time, and people complaining that they can't find my old stuff. I'd rather have the situation be "I cut it on purpose" instead of "It exists, but you can't find it, because I haven't updated it."
I think search engine ranking isn't really a big concern. My existing site has actually done pretty well for that; also, users who come in from search engines tend to be low-value users for the reasons I've described. I don't see this being a big issue in the future either.
On minimizing content aimed at non-existent audiences, that may be the hardest part: I have to think a lot more than I have in the past about who I can reasonably expect to read my site, and not waste time catering to the audience I wish existed, but doesn't. At the same time I have to balance that against catering to an audience I might like that does exist but isn't reading my site yet - and figure out how to get them here.