There's a classic adventure game called Paranoia which is
set in an extremely repressive Utopian futuristic world
run by The Computer, who
is Your Friend. Looking at a recent LawMeme
posting and related discussion, it occurred to me that the concept of
colour-coded security clearances in Paranoia provides a good metaphor for
a lot of copyright and intellectual freedom issues, and it may illuminate
why we sometimes have difficulty communicating and understanding the
ideologies in these areas.
UPDATE: Some follow-up material about this article has been posted in the "colour" category on this Web site. You might want to read that stuff too.
An article based on this one and its follow-ups, by me, Brett Bonfield, and Mary Fran Torpey, appeared in the 15 February 2008 issue of LJ, Library Journal.
In Paranoia, everything has a colour-coded security level (from Infrared up to Ultraviolet) and everybody has a clearance on the same scale. You are not allowed to touch, or have any dealings with, anything that exceeds your clearance. If you're a Red Troubleshooter, you're not allowed to walk through an Orange door. Formally, you're not really supposed to even know about the existence of anything above your clearance. Anyone who breaks the rules is a Commie Mutant Traitor, subject to the death penalty.
Much of the game revolves around the consequences of the security levels.
For instance, Friend Computer might assign a team of Red Troubleshooters
to re-paint a hallway that ought to be Orange but was painted Yellow by
mistake, or rather by the Commie Mutant Traitors. It's quite likely in
such a case that the Troubleshooters will all end up shooting each other
for treason against Friend Computer, since none of them are allowed to
touch the paint, go near the hallway, or talk about their mission, and
they're all charged with enforcing the rules on one another.
In intellectual property and some other fields we're very interested in information, data, artistic works, a whole lot of things that I'll summarize with the term "bits". Bits are all the things you can (at least in principle) represent with binary ones and zeroes. And very much of intellectual property law comes down to rules regarding intangible attributes of bits - Who created the bits? Where did they come from? Where are they going? Are they copies of other bits? Those questions are perhaps answerable by "metadata", but metadata suggests to me additional bits attached to the bits in question, and I'd like to emphasize that I'm talking here about something that is not properly captured by bits at all and actually cannot be, ever. Let's call it "Colour", because it turns out to behave a lot like the colour-coded security clearances of the Paranoia universe.
Bits do not naturally have Colour. Colour, in this sense, is not part of the natural universe. Most importantly, you cannot look at bits and observe what Colour they are. I encountered an amusing example of bit Colour recently: one of my friends was talking about how he'd performed John Cage's famous silent musical composition 4'33" for MP3. Okay, we said, (paraphrasing the conversation here) so you took an appropriate-sized file of zeroes out of /dev/zero and compressed that with an MP3 compressor? No, no, he said. If I did that, it wouldn't really be 4'33" because to perform the composition, you have to make the silence in a certain way, according to the rules laid down by the composer. It's not just four minutes and thirty-three seconds of any old silence.
My friend had gone through an elaborate process that basically amounted to performing some other piece of music four minutes and thirty-three seconds long, with a software synthesizer and the volume set to zero. The result was an appropriate-sized file of zeroes - which he compressed with an MP3 compressor. The MP3 file was bit-for-bit identical to one that would have been produced by compressing /dev/zero... but this file was (he claimed) legitimately a recording of 4'33" and the other one wouldn't have been. The difference was the Colour of the bits. He was asserting that the bits in his copy of 433.mp3 had a different Colour from those in a copy of 433.mp3 I might make by means of the /dev/zero procedure, even though the two files would contain exactly the same bits.
Now, the preceding paragraph is basically nonsense to computer scientists or anyone with a mathematical background. (My friend is one; he'd done this as a sort of elaborate joke.) Numbers are numbers, right? If I add 39 plus 3 and get 42, and you do the same thing, there is no way that "my" 42 can be said to be different from "your" 42. Given two bit-for-bit identical MP3 files, there is no meaningful (to a computer scientist) way to say that one is a recording of the Cage composition and the other one isn't. There would be no way to test one of the files and see which one it was, because they are actually the same file. Having identical bits means by definition that there can be no difference. Bits don't have Colour; computer scientists, like computers, are Colour-blind. That is not a mistake or deficiency on our part: rather, we have worked hard to become so. Colour-blindness on the part of computer scientists helps us understand the fact that computers are also Colour-blind, and we need to be intimately familiar with that fact in order to do our jobs.
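A minimal sketch of that Colour-blindness, in Python. The two functions are hypothetical stand-ins (one for reading /dev/zero, one for a "performance" with the volume at zero), and the buffer size is arbitrary; nothing here reproduces my friend's actual process.

```python
import hashlib


def silence_from_dev_zero(n_bytes: int) -> bytes:
    """Hypothetical stand-in for reading n_bytes from /dev/zero."""
    return bytes(n_bytes)


def silence_from_muted_performance(n_bytes: int) -> bytes:
    """Hypothetical stand-in for playing some other piece with the volume at zero."""
    samples = [(i * 37) % 256 for i in range(n_bytes)]  # arbitrary made-up "music"
    volume = 0
    return bytes(sample * volume for sample in samples)


a = silence_from_dev_zero(1024)
b = silence_from_muted_performance(1024)

# Neither comparison can see which procedure produced which buffer.
same_bytes = a == b
same_digest = hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest()
print(same_bytes, same_digest)
```

Any test you could write takes only the bits as input, so it has to give the same answer for both buffers.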
The trouble is, human beings are not in general Colour-blind. The law is not Colour-blind. It makes a difference not only what bits you have, but where they came from. There's a very interesting Web page illustrating the Coloured nature of bits in law on the US Naval Observatory Web site. They provide information on that site about when the Sun rises and sets and so on... but they also provide it under a disclaimer saying that this information is not suitable for use in court. If you need to know when the Sun rose or set for use in a court case, then you need an expert witness - because you don't actually just need the bits that say when the Sun rose. You need those bits to be Coloured with the Colour that allows them to be admissible in court, and the USNO doesn't provide that. It's not just a question of accuracy - we all know perfectly well that the USNO's numbers are good. It's a question of where the numbers came from. It makes perfect sense to a lawyer that where the information came from is important, in fact maybe more important than the information itself. The law sees Colour.
Suppose you publish an article that happens to contain a sentence identical to one from this article, like "The law sees Colour." That's just four words, all of them common, and it might well occur by random chance. Maybe you were thinking about similar ideas to mine and happened to put the words together in a similar way. If so, fine. But maybe you wrote "your" article by cutting and pasting from "mine" - in that case, the words have the Colour that obligates you to follow quotation procedures and worry about "derivative work" status under copyright law and so on. Exactly the same words - represented on a computer by the same bits - can vary in Colour and have differing consequences. When you use those words without quotation marks, either you're an author or a plagiarist depending on where you got them, even though they are the same words. It matters where the bits came from.
I think Colour is what the designers of Monolith are trying to challenge, although I'm afraid I think their understanding of the issues is superficial on both the legal and computer-science sides. The idea of Monolith is that it will mathematically combine two files with the exclusive-or operation. You take a file to which someone claims copyright, mix it up with a public file, and then the result, which is mixed-up garbage supposedly containing no information, is supposedly free of copyright claims even though someone else can later undo the mixing operation and produce a copy of the copyright-encumbered file you started with. Oh, happy day! The lawyers will just have to all go away now, because we've demonstrated the absurdity of intellectual property!
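The mixing step itself is easy to sketch. This is only an illustration of exclusive-or on two equal-length byte strings, not Monolith's actual file format or code; the names xor_bytes, claimed, and public are assumptions made up for the example.

```python
import os


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))


claimed = b"bits someone claims copyright on"   # hypothetical encumbered file
public = os.urandom(len(claimed))               # hypothetical public file

mixed = xor_bytes(claimed, public)      # the "mixed-up garbage"
recovered = xor_bytes(mixed, public)    # undoing the mix with the same public file

print(recovered == claimed)
```

XOR is its own inverse, which is all the undo step relies on.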
The fallacy of Monolith is that it's playing fast and loose with Colour, attempting to use legal rules one moment and math rules another moment as convenient. When you have a copyrighted file at the start, that file clearly has the "covered by copyright" Colour, and you're not cleared for it, Citizen. When it's scrambled by Monolith, the claim is that the resulting file has no Colour - how could it have the copyright Colour? It's just random bits! Then when it's descrambled, it still can't have the copyright Colour because it came from public inputs. The problem is that there are two conflicting sets of rules there. Under the lawyer's rules, Colour is not a mathematical function of the bits that you can determine by examining the bits. It matters where the bits came from. The scrambled file still has the copyright Colour because it came from the copyrighted input file. It doesn't matter that it looks like, or maybe even is bit-for-bit identical with, some other file that you could get from a random number generator. It happens that you didn't get it from a random number generator. You got it from copyrighted material; it is copyrighted. The randomly-generated file, even if bit-for-bit identical, would have a different Colour. The Colour inherits through all scrambling and descrambling operations and you're distributing a copyrighted work, you Commie Mutant Traitor.
To a computer scientist, on the other hand, bits are bits are bits and it is absolutely fundamental that two identical chunks of bits cannot be distinguished. Colour does not exist. I've seen computer people claim (indeed, one did this to me just today in the very discussion that inspired this posting) that copyright law inescapably leads to nonsense conclusions like "If I own copyright on one thing, and copyright inherits through XOR, then I own copyright on everything because everything can be obtained from my one thing by XORing it with the right file." That sounds profound only if you're a Colour-blind computer scientist; it would be boring nonsense to a lawyer because lawyers are trained to believe in and use Colour, and it's obvious to a lawyer that the Colour doesn't magically bleed to the entire universe through the hypothetical random files that might be created some day. You could create the file randomly, but you didn't. Maybe you could create a file identical to the complete works of Shakespeare by XORing together two files of apparently random garbage. "Why, so can I, or so can any man;" but that doesn't mean that I am William Shakespeare.
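The arithmetic behind the "I own everything" argument can be sketched in a few lines of Python. The file contents and the helper name key_that_maps are hypothetical; the only fact used is that XOR is its own inverse.

```python
def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))


def key_that_maps(source: bytes, target: bytes) -> bytes:
    """Return the 'right file': XORing it with source yields target."""
    return xor_bytes(source, target)


my_file = b"my one copyrighted thing"                            # hypothetical
target = b"anything else at all".ljust(len(my_file), b" ")       # hypothetical

key = key_that_maps(my_file, target)
print(xor_bytes(my_file, key) == target)
```

Such a key exists for every target of the same length, which is exactly why the argument proves nothing to a lawyer: the key that could exist is not the key anybody actually made.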
This idea of Colour is a problem for communication between those of us who work in the world of computers, where Colour does not exist, and those of us who work in the law, where Colour exists and is important. Lawyers will ask computer scientists questions about how to determine the Colour of bits (like "How can Friend Computer prevent the Commie Mutant Traitors from making illegal copies of files, while still allowing loyal Troubleshooters to use disk-copying equipment?"), and computer scientists will find it difficult to say anything in response that the lawyers can comprehend - because a big part of computer science is about understanding that Colour does not exist. Someone who cares a lot about what Colour the bits are, and spends a lot of resources on trying to answer that question, is a dangerous idiot if not a Commie Mutant Traitor. In intellectual property law the Colour of bits exists and is of absolutely paramount importance. A computer scientist who won't tell what Colour the bits are is being deliberately unhelpful, and a computer scientist who denies the very existence of Colour (as any conscientious computer scientist must eventually do) is a dangerous idiot and/or a Commie Mutant Traitor.
There are several ways we could try to avoid the issue. Computer scientists who want to try to be helpful may say, "Okay, you, the lawyer, are a dangerous idiot, but I have to work with you or be thrown in jail as a Commie Mutant Traitor as happened to Dmitry Sklyarov, so I'll try to address your concerns. You say there is some special property of some bits and we need to know which bits have this property. Fine. We'll attach tags to the files to say what Colour they are." In the copyright realm, that's the "rights management information" solution. It's what they do with DVDs (region coding), VHS tapes (Macrovision), Adobe eBooks ("you may not read this file aloud"), CDs (SCMS), and many other formats. The trouble is, if we (as computer scientists) are intellectually honest about it, we'll have to admit that it can't really work.
The tags are just more bits. You can write a tag that says "this is an Orange tag", but it will be made out of bits and so it can't really have a Colour because Colour does not exist. It will just be a Colour-less tag saying "this is an Orange tag". It will be subject to all the consequences of the fact that Colour does not exist - such as the fact that the tag could be stripped out somewhere down the line. The computer scientists are aware of that; we have to be, because knowing about the non-existence of Colour is what makes us computer scientists in the first place.
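A sketch of why a tag cannot carry Colour, assuming a made-up tag format (a "COLOUR=" header line in front of the payload). No real rights-management scheme looks like this; the point is only that whatever the format, the tag is made of bits.

```python
TAG_PREFIX = b"COLOUR="  # hypothetical tag format, invented for this sketch


def attach_tag(payload: bytes, colour: str) -> bytes:
    """Prepend a one-line tag naming the claimed Colour."""
    return TAG_PREFIX + colour.encode("ascii") + b"\n" + payload


def read_tag(blob: bytes):
    """Return the Colour named in the tag, or None if there is no tag."""
    if not blob.startswith(TAG_PREFIX):
        return None
    header, _, _ = blob.partition(b"\n")
    return header[len(TAG_PREFIX):].decode("ascii")


def strip_tag(blob: bytes) -> bytes:
    """Drop the tag line, leaving only the payload."""
    if not blob.startswith(TAG_PREFIX):
        return blob
    _, _, payload = blob.partition(b"\n")
    return payload


payload = b"some bits"
tagged = attach_tag(payload, "Orange")
stripped = strip_tag(tagged)                   # the tag is gone
relabelled = attach_tag(stripped, "Infrared")  # or replaced with another one

print(read_tag(tagged), read_tag(stripped), read_tag(relabelled))
```

Nothing in the payload changes when the tag is removed or rewritten, so the tag records a claim about Colour rather than the Colour itself.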
What we are doing with rights management information is simulating Colour in a computer-sciencey way. But lawyers will seize on the possibility of doing this kind of simulation and say, "See! You admit it! You can recognize the Colour of bits after all!" and then conclude from there that all the other rules they want to make (such as "Red Troubleshooters may not walk down Orange hallways") are meaningful in the computer science realm. They'll say "You can recognize the Colour of bits after all!" rather than "Colour exists after all!" because the idea of Colour not existing in the first place is not within their imagination. The "fact" that Colour is something real is so fundamental to law that it can't be challenged. Of course Colour exists. We lawyers think about Colour so much that we think we can see it. Why can't you? Maybe there is something wrong with your eyes. As computer scientists, we need to make clear that Colour simulated by Colour-less tags saying "this is an Orange tag" and such, is still only a simulation. The properties that Colour is supposed to have do not automatically come with the tags, because those properties are Colour, the tags are bits, and bits do not have Colour. Even bits that talk about Colour do not have Colour themselves. There is no such thing as Colour.
Another thing computer scientists will try to do is to treat Colour as a function (in the strict mathematical sense of "function") of the bits - maybe an uncomputable function (in the strict mathematical sense of "uncomputable"), maybe intractable, but a function nevertheless. We either do that because we mistakenly believe that Colour really is a function, or because we're a little more sophisticated, we know that it's not a function, but we think that we can fake it closely enough with a function to get the lawyers off our backs. Either way, the idea is that we should be able to look at bits and somehow determine, from the bits themselves, what Colour they ought to be.
Treating Colour as a function is almost the same as attaching tags to the bits - the difference is that when the Colour is a function of the bits, we don't have to worry about the tags being detached; on the other hand, when the Colour is a function of the bits, we can never have more than one possible Colour for a given sequence of bits. Monolith depends on exploiting this problem: it assumes that one file can only ever have one Colour, asserts that the Colour of its output file is the "you may copy this" Colour because of the (correct) claim that fixing any other single unchangeable Colour would raise legal problems, and then follows the logic to a claim that it can produce what would otherwise be an illegal copy of the copyrighted input, without breaking copyright law. One Colour per file was never one of the lawyers' rules of Colour; it's merely a consequence of "Colour is a function", and Colour being a function is just something we computer people decided to believe because functions make sense to our training and Colour doesn't. Colour is not actually a function at all.
Trying to infer the Colour from the bits may seem like an okay thing to do as long as bits are tied to physical objects. You can examine a paper document and determine whether it is an original or a photocopy. You can probably examine something purporting to be a photograph and determine whether it is a photograph of a real scene, or something more complicated. But even in the analog realm, determining Colour by examination is not always possible. You can't determine by looking at a photograph of two people having sex whether they consented to the sex or not, let alone whether they consented to the making of the photograph. That's a Colour distinction that is not a function of the bits that make up the photograph - and it's true even of analog photographs.
Other important questions which you may or may not be able to answer by examining a photograph are "Are those things actually humans, or some kind of simulation?" and "How old are they?" Those questions may have been difficult with analog; they become even more difficult with digital. It is easy to imagine that someone could render by innocent means (drawing or ray tracing or whatever) an image bit-for-bit identical to an image that has the Colour (presumably Pink) of illegal child pornography. In that case, depending on your view of such things, it may matter where the bits came from to the determination of whether they are Pink (illegal) or Green (legal). Identical bits may have different Colour.
Child pornography is an interesting case because I find myself, and I think many people in the computing community will find themselves, on the opposite side of the Colourful/Colour-blind gap from where I would normally be. In copyright I spend a lot of time explaining why Colour doesn't exist and it doesn't matter where the bits came from. But when it comes to child pornography, I think maybe Colour should make a difference - if we're going to ban it at all, it should matter where it came from. Whether any children were actually involved, who did or didn't give consent, in short: what Colour the bits are. The other side takes the opposite tack: child pornography is dangerous by its very existence, and it doesn't matter where it came from. They're claiming that whether some bits are child pornography or not, and if so, whether they're illegal or not, should be entirely determined by (strictly a function of) the bits themselves. Legality, at least under the obscenity law, should not involve Colour distinctions.
I think computer scientists could actually understand Colour a lot better than we do, because there are places in computer science where Colour does matter. I already mentioned the idea of quoting and plagiarism - identical words are or are not okay to use without quote marks in an academic paper depending on their Colour. Those of us with degrees are able to follow the rules for that because people who aren't get kicked out of school before finishing their degrees. That's a general academic application of Colour.
If you've any exposure to metrology - not "meteorology", I mean the science of measurement - you'll be familiar with the idea of tracing the pedigree of standards. Down in the chemistry lab they have a big jar of buffer solution with a label asserting that it not only has a pH of exactly 7.00, but that its pH is "traceable" to such-and-such primary standard, through a chain that probably terminates at the National Bureau of Standards in Boulder, Colorado, USA. That's Colour. Not only do you know the pH of the buffer solution, but you know where it came from. Someone other than the National Bureau of Standards might be able to produce a buffer solution that is just as good and just as accurately 7.00 pH. If you have a sample of good pH 7.00 buffer solution it might be indistinguishable from the real traceable standard solution; but it wouldn't really be the traceable solution unless it had the intangible Colour to make it authentic.
The computer science applications of Colour seem to be mostly specific to security. Suppose your computer is infected with a worm or virus. You want to disinfect it. What do you do? You boot it up from original write-protected install media. Sure, you have a copy of the operating system on the drive already, but you can't use that copy - it's the wrong Colour. Then you go through a process of replacing files, maybe examining files, swapping disks around and carefully write-protecting them; throughout, you're maintaining information on the Colour of each part of the system and each disk until you've isolated the questionable files and everything else is known to be the "not infected with virus" Colour. Note that developers of Web applications in Perl use a similar scorekeeping system to keep track of which bits are "tainted" by influence from user input.
When we use Colour like that to protect ourselves against viruses or malicious input, we're using the Colour to conservatively approximate a difficult or impossible to compute function of the bits. Either our operating system is infected, or it is not. A given sequence of bits either is an infected file or isn't, and the same sequence of bits will always be either infected or not. Disinfecting a file changes the bits. Infected or not is a function, not a Colour. The trouble is that because any of our files might be infected including the tools we would use to test for infection, we can't reliably compute the "is infected" function, so we use Colour to approximate "is infected" with something that we can compute and manage - namely "might be infected". Note that "might be infected" is not a function; the same file can be "might be infected" or "not (might be infected)" depending on where it came from. That is a Colour.
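This kind of scorekeeping can be sketched in Python. The class and function names are hypothetical, and this is not how Perl's taint mode is implemented; it only shows a "might be infected" flag that is carried alongside the bits and depends on where they came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracked:
    bits: bytes
    might_be_infected: bool  # recorded next to the bits, not derived from them


def from_install_media(bits: bytes) -> Tracked:
    """Bits read from write-protected original media are trusted."""
    return Tracked(bits, might_be_infected=False)


def from_suspect_disk(bits: bytes) -> Tracked:
    """Bits read from the possibly infected drive are not."""
    return Tracked(bits, might_be_infected=True)


def combine(first: Tracked, second: Tracked) -> Tracked:
    """Conservative rule: the result is suspect if either input is."""
    return Tracked(first.bits + second.bits,
                   first.might_be_infected or second.might_be_infected)


clean = from_install_media(b"kernel")
suspect = from_suspect_disk(b"kernel")

print(clean.bits == suspect.bits)                       # identical bits
print(clean.might_be_infected, suspect.might_be_infected)
print(combine(clean, suspect).might_be_infected)
```

The flag is deliberately pessimistic: the suspect copy may well be fine, but we cannot compute that, so we track its origin instead.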
But the "might be infected" Colour is clearly a fictional thing we create to help us approximate a tricky function. It's still easy to argue that Colour doesn't really exist. I've saved until last what I think is the best example of a Colour in computer science, and I think even the most hardline mathematicians will have to agree that even though this isn't a function and cannot be represented in bits, it's something real that we have to be able to think about and care about.
Random numbers have a Colour different from that of non-random numbers. The question of how to determine whether numbers are random or not by looking at them is one of the recurring flame wars of sci.crypt. You can't do it. Here's a number: 2. Was that a random number? Well, maybe I got it by rolling a die (a random generator); or maybe I got it by counting my legs (probably not random). If I give you a file of supposedly random bits, there's no way you can tell whether they are randomly generated or not. The same file could have been generated by a quantum-mechanical random source, monkeys on typewriters, or by encrypting some well-known non-random file with some scheme that may or may not be generally known.
There are statistical tests you can do; for instance, if you look at the file and discover that it contains a copy of the works of Shakespeare, then it doesn't look much like you would expect randomly generated numbers to look. But it could still be randomly generated. The test tells you whether the file has the statistical properties expected from randomly generated files, not whether the file really is randomly generated or not. It's not even correct to say "the probability of this being from a random generator is very low" because that's not true - it either was or was not randomly generated, that's not open to probability. At best you could say "If we ran a random generator to produce a file this size, the probability of it generating this file would be very low", which sounds almost the same, but is not.
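The last statement can be made concrete, under the assumption that "random generator" means a source that emits every file of a given length with equal probability. The sample strings below are made up for illustration.

```python
from fractions import Fraction


def probability_of_exact_output(data: bytes) -> Fraction:
    """Chance that a uniform random source emits exactly these bytes."""
    return Fraction(1, 2 ** (8 * len(data)))


shakespeare = b"To be, or not to be"
garbage = bytes(b ^ 0xA5 for b in shakespeare)  # same length, looks like noise

p_shakespeare = probability_of_exact_output(shakespeare)
p_garbage = probability_of_exact_output(garbage)

print(p_shakespeare == p_garbage)
```

Under that assumption the generator is exactly as unlikely to produce the noisy-looking string as the Shakespeare line. A statistical test reports a property of the bits; it says nothing about which process actually ran.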
Note my terminology - I spoke of "randomly generated" numbers. Conscientious cryptographers refuse to use the term "random numbers". They'll persistently and annoyingly correct you to say "randomly generated numbers" instead, because it's not the numbers that are or are not random, it's the source of the numbers that is or is not random. If you have numbers that are supposed to come from a random source and you start testing them to make sure they're really "random", and you throw out the ones that seem not to be, then you end up reducing the Shannon entropy of the source, violating the constraints of the one-time pad if that's relevant to your application, and generally harming security. I just threw a bunch of math terms at you in that sentence and I don't plan to explain them here, but all cryptographers understand that it's not the numbers that matter when you're talking about randomness. What matters is where the numbers came from - that is, exactly, their Colour.
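The entropy loss can be sketched with a toy source: one uniformly distributed byte, and a made-up filter that rejects bytes with "too many" zeros or ones. Both the source model and the looks_random rule are assumptions chosen for illustration, not a real randomness test.

```python
import math
from collections import Counter


def shannon_entropy_bits(outcomes) -> float:
    """Shannon entropy, in bits, of the empirical distribution of outcomes."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_random(byte: int) -> bool:
    """Hypothetical filter: keep only bytes with 3, 4, or 5 one-bits."""
    return 3 <= bin(byte).count("1") <= 5


all_bytes = list(range(256))                     # every outcome equally likely
kept = [b for b in all_bytes if looks_random(b)]

raw_entropy = shannon_entropy_bits(all_bytes)
filtered_entropy = shannon_entropy_bits(kept)

print(len(kept), raw_entropy, filtered_entropy)
```

Throwing away the outputs that do not "look random" leaves fewer possible values, so an attacker has less to guess, even though every surviving value passes the filter.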
So if we think we understand cryptography, we ought to be able to understand that Colour is something real even though it is also true that bits by themselves do not have Colour. I think it's time for computer people to take Colour more seriously - if only so that we can better explain to the lawyers why they must give up their dream of enforcing Colour inside Friend Computer, where Colour does not and cannot exist. Maybe then they'd stop trying to shoot us as Commie Mutant Traitors.
John from 212.44.43.80 at Wed, 25 Jan 2006 10:27:17 +0000:
As a Computer Person I appreciate the clarity of the legalese and legal thought.
Especially the real estate analogy - I think that's very strong in terms of communication between the two professions (after all most people have some experience of real estate - and even proving that you live there / own it...)
Sean Lynch from 71.198.156.248 at Wed, 19 Apr 2006 01:54:38 +0000:
It's an interesting explanation, but I think that it may be more elucidating to say that Colour is not a property of the bits at all, but a property of the people moving those bits around, their intent, and the process by which they obtained those bits.
Assuming 4'33" met the definition of a work, someone distributing a 4'33" length MP3 file made by catting /dev/zero could indeed be prosecuted, because why in the world would they be distributing such a file otherwise?
Likewise, if someone is distributing a Mono file, it really doesn't matter that the Mono file looks like nonsense or can be used (along with this copyrighted work) to obtain a freely available work. The fact is that person is enabling others to obtain a copyrighted work without permission, and there really isn't any other plausible reason for them to be distributing that Mono file. The other work is already available in un-obfuscated form, so it can't be for purposes of providing access to the public domain work.
Demokritos from 69.62.156.11 at Wed, 21 Jun 2006 20:08:59 +0000:
Very interesting article.
Now what I am more concerned about is that, given the prevalent DRM/TPM techniques, won't a whole lot of previously 'colourless' public domain information become unusable and illegal, since it has no metatags nor colour function to imply it is in the public domain? And how about the expiration of copyright: how will that be implemented, if at all?
And just for the kicks of it, this comment has no colour whatsoever.
solinym from 67.9.118.217 at Wed, 09 Aug 2006 05:16:38 +0000:
"The fact is that person is enabling others to obtain a copyrighted work without permission, and there really isn't any other plausible reason for them to be distributing that Mono file. The other work is already available in un-obfuscated form, so it can't be for purposes of providing access to the public domain work."
So by extension, any file in an encrypted file system must be not public-domain, because said files are already available in non-encrypted form?
The infection analogy is somewhat off; files that have the same contents on the hard disk as on the original media are not infected or tainted. However, we cannot use the tools until they are compared against the media, so we boot off something trusted and expand the circle of trust until we reach the data files on the hard disk itself. Something similar is used in Trusted Computing to boot into a trusted state; each stage verifies the next before loading it. The provenance of the data is not the question; the question is still about the contents. Provenance is irrelevant.
A closer analogy comes with trojans; current wisdom typically runs along the lines of "don't execute any attachment if you don't trust the source". The problem is that it's generally impossible for a computer to reason about an arbitrary program, something CS majors know as the halting problem. Think of it this way; if hash(x) is a one-way function of x, like a cryptographic hash or "fingerprint", then does the following algorithm halt or not?
x = "start"
while x != "finish":
    x = hash(x)
halt
You can't know that unless you perform the algorithm. Actually you can know it won't halt if you hit a loop, but this demonstrates the point enough, and trivial modifications can break the loop.
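A toy version of the loop above, with loop detection added, can be run in Python. It assumes a deliberately tiny hash (SHA-256 truncated to two bytes, under the made-up name tiny_hash) so that a repeated state is guaranteed to show up quickly; with a full-width hash the same search would be hopeless in practice.

```python
import hashlib


def tiny_hash(x: bytes, width: int = 2) -> bytes:
    """Hypothetical small hash: SHA-256 truncated to `width` bytes."""
    return hashlib.sha256(x).digest()[:width]


def iterate_until(start: bytes, target: bytes, width: int = 2):
    """Apply tiny_hash repeatedly; report (reached_target, steps_taken)."""
    seen = set()
    x = start
    steps = 0
    while x != target:
        if x in seen:            # a repeated state means we are stuck in a loop
            return False, steps
        seen.add(x)
        x = tiny_hash(x, width)
        steps += 1
    return True, steps


# "finish" is six bytes long and every hashed state is two, so it is never reached.
never_reached, steps_to_loop = iterate_until(b"start", b"finish")

# A target that really is on the path is found by walking there.
on_path = tiny_hash(tiny_hash(b"start"))
reached, steps_to_target = iterate_until(b"start", on_path)

print(never_reached, reached, steps_to_target)
```

Either way the answer comes from performing the iteration, not from inspecting the program text.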
So what we do instead is talk about the source of the program. But how can we determine the author of a program? The program doesn't know where it came from. Our friend who sent it to us may not be security-conscious and may be forwarding it from someone untrustworthy. Or this email was sent by a virus that he accidentally ran. And Authenticode showed us that even if the program was written by someone trustworthy, the way it handles input data has to be secure. Trustworthy people sometimes make mistakes, they are not always secure.
Mathematicians and cryptographers know colour; there are many systems which are "proven secure" which fail in practice because the proof involved an assumption that wasn't valid. There are varying levels of proof; for example, some rely on the "random oracle" assumption, and are generally considered weaker than those that don't. An assumption is an assumption when the prover does not know whether it is true or not; if he knew, it would merely be a step in the proof. So assumptions are unknown quantities, and they are usually shown to be false, rarely to be true. When a person provides a new proof reaching an old conclusion but with fewer assumptions, or weaker assumptions, that's a step forward because of colour. All logical proofs implicitly assume that True != False. Many algorithmic complexity computer science proofs assume that P != NP.
Vilhelm Sjoberg from 195.224.237.163 at Fri, 18 Aug 2006 17:55:49 +0000:
My suggestion for a name for this concept would be "provenance". In art terminology, the provenance of a painting is a chronological account of who originally created it and all subsequent owners. If you consider the painting itself as data (pixel values recorded using oil-and-pigment based storage technology), then the provenance is a colour of that data.
You get the same slightly counter-intuitive effects here, because the commercial value of an art object is determined very much by the circumstances of its creation. Any skilled oil painter can create an almost indistinguishable copy of a Rembrandt painting, and anyone at all can create an almost indistinguishable copy of some modern artworks, but only the original is worth anything. If an assumed Rembrandt painting is discovered to be a fake, it suddenly drops in value, even though the picture itself is just as beautiful as it always was -- because it was produced in the wrong way.
Behdad from 24.43.244.143 at Mon, 11 Sep 2006 03:31:24 +0000:
Hi there. Sorry for the delay. I posted my comments to your article on my blog page:
http://mces.blogspot.com/2006/08/links-and-notes.html#115794429745640421
Gwern from 129.21.115.80 at Sun, 11 Mar 2007 03:05:57 +0000:
Just a small bit on your Perl comment: Haskell also uses a similar idea of tainting for handling IO - for example, strings might have a type of String when created normally and functionally, but when they come from a user or an outside source, they get tainted and are no longer String but IO String. Which is kind of neat although occasionally annoying.
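To make the tainting idea concrete, here is a minimal sketch in Python rather than Perl or Haskell. The names `Tainted`, `from_user`, `sanitize`, and `run_query` are hypothetical, chosen only for illustration; the point is that outside data gets a different type, and only an explicit check converts it back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A string that arrived from outside the program (hypothetical wrapper)."""
    value: str


def from_user(raw: str) -> Tainted:
    # Everything entering from outside is wrapped, so it is no longer a plain str.
    return Tainted(raw)


def sanitize(t: Tainted) -> str:
    # Only an explicit check turns a Tainted value back into a plain str.
    if not t.value.isalnum():
        raise ValueError("refusing to untaint suspicious input")
    return t.value


def run_query(name: str) -> str:
    # Refuses anything still wearing the Tainted wrapper.
    if not isinstance(name, str):
        raise TypeError("run_query needs an untainted str")
    return f"SELECT * FROM users WHERE name = '{name}'"
```

Passing `from_user("alice")` straight to `run_query` raises `TypeError`; passing it through `sanitize` first succeeds. The wrapper carries no information that is present in the characters themselves, which is what makes it a small model of Colour.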
Psy-Kosh from 69.3.205.20 at Sun, 11 Mar 2007 07:09:26 +0000:
You said:
"It's not even correct to say "the probability of this being from a random generator is very low" because that's not true - it either was or was not randomly generated, that's not open to probability."
It may not be open to frequentist probability, but a Bayesian statistician would be perfectly happy talking about the probability of it having been produced by a random source. Bayesians consider probability to be a valid representation of belief or confidence, not just of frequencies.
(I've seen mathematical arguments along the lines that, given some really basic constraints on what a set of formal rules for dealing with belief must have (like internal consistency, and other basic stuff), any set of rules would have to be equivalent to probability theory.)
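As a rough illustration of the Bayesian reading, the sketch below updates a belief that a bit string came from a fair random source rather than from one alternative source. The alternative (a source emitting '1' with probability 0.9), the 50/50 prior, and the function name are all assumptions made up for the example.

```python
def posterior_fair(bits: str, prior_fair: float = 0.5, p_one_biased: float = 0.9) -> float:
    """Posterior belief that `bits` came from a fair coin, given one biased alternative."""
    ones = bits.count("1")
    zeros = len(bits) - ones
    # Likelihood of the observed string under each hypothesis.
    like_fair = 0.5 ** len(bits)
    like_biased = (p_one_biased ** ones) * ((1 - p_one_biased) ** zeros)
    # Bayes' rule: weight each likelihood by its prior and normalize.
    numerator = prior_fair * like_fair
    return numerator / (numerator + (1 - prior_fair) * like_biased)
```

With no data the posterior equals the prior; ten '1' bits in a row push the belief in the fair source well below 1%, while an alternating string pushes it above 99%. The number describes the observer's confidence, not a property of the bits.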
Matthew Skala from 67.158.72.8 at Sun, 11 Mar 2007 09:18:36 +0000:
Psy-Kosh - The point you raise is similar to one Behdad talked about in comments on his own Web log; you might like to see what he had to say about it. He posted the link above, and I'll repeat it here:
http://mces.blogspot.com/2006/08/links-and-notes.html#115794429745640421
If you scroll down on that page, you can also read my reply to him (as "Anonymous").
Matthew Skala from 67.158.72.8 at Sun, 11 Mar 2007 16:47:48 +0000:
Note for Reddit users - some of you complained about being redirected to a spam page when trying to view this one. That was my fault, I'm afraid. I use scripts that attempt to recognize and punish referrer-log spammers. Before I set them up I was getting thousands of hits per day, enough for the traffic alone to be an issue, from scumbags trying to attract hits to their own sites by inserting bogus entries in my referrer logs. Since I use my referrer logs a lot, it was quite annoying. So I introduced a system that attempts to detect referrers that hit my site a lot, checks them to see if they *really* refer to my site, and if not, puts them on a black list. Once on the black list, those referrers will be randomly redirected to each other.
Unfortunately, my system (which has successfully seen me through high-traffic links from sites like Slashdot and Digg) decided that reddit.com was a spam site; so any hits detected as coming from reddit.com got redirected to other things that had been detected as spam sites. The host programming.reddit.com was whitelisted, so it wasn't redirected; and people whose browsers for whatever reason sent something else as referrer, or no referrer, didn't see the problem. I think what happened was it stripped off the ?query part of the URL before checking it, and the URL with no query didn't point to somewhere that linked to my site.
I've now manually whitelisted reddit.com, and will investigate making changes to the script to make it more bulletproof.
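A sketch of the suspected failure mode, not the actual script (which is not shown here): if the referrer is reduced to its query-free form before being fetched, the checker may end up looking at a different page from the one that really carried the link. The function name and the URL are hypothetical.

```python
from urllib.parse import urlsplit


def strip_query(referrer: str) -> str:
    """Drop the ?query part of a URL, as the checker is suspected to have done."""
    return urlsplit(referrer)._replace(query="").geturl()


# Hypothetical referrer: the link lives on the page selected by the query string.
original = "http://example.com/info?id=1abc"
checked = strip_query(original)  # the URL the checker would actually fetch
```

Here `checked` is `http://example.com/info`, which need not contain any link back to the site even though the original page did.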
Psy-Kosh from 69.3.205.20 at Sun, 11 Mar 2007 19:32:48 +0000:
Matthew: Just to clarify, I do not consider the Bayesian vs Frequentist issue to invalidate the rest of what you said. You raise some good and interesting points. I was simply commenting on that one specific aspect.
I still found the rest to be interesting, and it gave me something to think about. I wonder if type systems could be viewed as applying something analogous to color.... Of course, under the hood it's just bits, but the concept of types and type systems does seem to be attempting, from what I understand, to be a formal version of what you're calling color. Perhaps some sort of limited color emulator could be built out of some of the higher quality type systems?
As far as the Bayesian vs Frequentist debate goes, it does not matter for the main points in your article, but it does matter for things like hypothesis testing, so it's more than just metaphysical. The differences lead to different mathematical methods that are not equivalent.
Matthew Skala from 129.97.79.144 at Sun, 11 Mar 2007 21:18:40 +0000:
On using type systems to keep track of Colour - yes, that sounds similar to the tainting mechanism I described in Perl and Gwern described in Haskell. There are no doubt other languages that offer similar features, and a sufficiently configurable language could be extended to handle other interesting properties besides "tainting." DRM systems attempt much the same thing when they attach watermarks and rights management information. Essentially, there's metadata being kept alongside the data throughout its life in the system. It seems to me that that sort of thing works quite well as long as we can keep the bits inside the system, and know that we have correct metadata when they come in.
The trouble comes when the system has to run in the outside world. For Perl and Haskell programs it's fine because they only care about keeping the metadata correct within the context of the program itself, and while it's running. DRM systems face a harder problem because they also care about what's going on in the outside world - issues like "Am I, the software implementation of DRM, actually running on a virtual machine right now?" Those kinds of issues are what make really working DRM theoretically impossible.
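The inside/outside distinction can be sketched in a few lines. This is an illustration under assumed names (`Labelled`, `export`, `reimport`), not a model of any real DRM system: the label survives every operation the program controls and disappears the moment the raw bytes leave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labelled:
    """Bytes plus a provenance label kept alongside them inside the program."""
    data: bytes
    origin: str


def concat(a: Labelled, b: Labelled) -> Labelled:
    # Inside the system the label can be propagated faithfully.
    return Labelled(a.data + b.data, f"{a.origin}+{b.origin}")


def export(x: Labelled) -> bytes:
    # Leaving the system: only the bits go out, the label stays behind.
    return x.data


def reimport(raw: bytes) -> Labelled:
    # Coming back in: nothing in the bytes says where they came from.
    return Labelled(raw, "unknown")
```

A round trip through `export` and `reimport` returns identical data with the origin reset to `"unknown"`, which is the gap a DRM system would have to close in the outside world.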
Karl Fogel from 67.121.245.47 at Mon, 12 Mar 2007 02:55:35 +0000:
Beautiful metaphor! Perhaps more comprehensible to computer scientists and technically-trained lawyers than to the general public, but for its audience it really sets the issues out in stark relief...
Regarding watermarking: one of the unintended consequences of DRM and watermarking (especially "transactional watermarking") is that they interfere with the computer's function as a plagiarism-detection device. When you have the raw data flowing freely around a system, it's easy to automatically notice when the bit substring over here is the same as the one over there. It's even relatively easy to compensate for certain standard transformations -- e.g., the addition of whitespace in a text file, a linear transformation applied to an audio file, etc -- while doing such comparisons.
DRM and some kinds of watermarking a) make it difficult for arbitrary code to get access to the raw bits, and b) create new kinds of differences between bit substrings that would otherwise be the same or similar, thus increasing the work involved in any comparison.
DRM and watermarking should thus not be thought of as benign from a data point of view. Making formerly comparable bitstrings incomparable interferes with users' (and artists') ability to identify what's coming from where.
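A minimal sketch of the kind of comparison described above, assuming text data and whitespace as the only transformation to compensate for. The `fingerprint` helper and the bracketed per-copy mark are hypothetical stand-ins, not a real watermarking scheme.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash of the text after collapsing all runs of whitespace to single spaces."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


plain = "It was a dark and stormy night."
reflowed = "It was a dark\n   and stormy night."
# Hypothetical per-copy watermark: any inserted mark changes the bits being hashed.
marked = "It was a dark and stormy night. [copy 1234]"

same_after_reflow = fingerprint(plain) == fingerprint(reflowed)
same_after_marking = fingerprint(plain) == fingerprint(marked)
```

Reflowing the text leaves the fingerprint unchanged, while the per-copy mark makes two copies of the same work compare as different unless the comparison code also knows how to strip the mark.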
mathguy from 121.45.251.157 at Thu, 03 May 2007 11:54:47 +0000:
Nicely written article. I have to take issue with your last example however, I don't think it's the best analogy of the lot.
The main problem is that mathematically, a truly random number(*) doesn't make any sense. The only thing that makes sense is a _sequence_ of numbers which pass a certain collection of statistical tests. So when you talk of the number 2 alone, it's not random in any sense. You'd have to talk about a _sequence_, because randomness in part involves checking that the following numbers aren't linked to the first one in obvious deterministic ways.
You then go on to invoke the idea that the number 2 (or some sequence) might be generated from some truly random physical source, but that unless you can trace the source, it would be impossible to verify the "randomness" of the sequence. Again, this is flawed:
The reason is that there simply do not exist truly random physical sources in the universe, except for those that we _mathematically_ define to be random. However, such mathematical definitions are _always_ phrased entirely in terms of observable properties of the output sequences(**). Why is this a problem for your analogy? It means that the provenance or "Colour" is irrelevant to whether the sequence is considered random, for only the sequence itself matters (ie by looking at the means, the correlations, etc. in practice, or by computing the Kolmogorov complexity of the sequence in principle).
So in this sense, your analogy for Colour cannot work because random sequences are rigorously defined to be sequences of any provenance whatsoever which must merely "look" right.
(*) I'm really talking about realizations of random numbers, but this is harmless. There is a mathematical concept of random variable which maps to another idea of randomness, but it has no physical reality unless you subscribe to the quantum mechanical hypothesis of multiple alternative worlds. And in that case, you're still stuck with the single world you're in.
(**) The mathematical definitions avoid provenance precisely because of the difficulty of making sense of the concept as part of a logical definition. If you're going to call numbers random because they come from a random source, you've only moved the logical goal towards defining a random source: either it's an arbitrary (and meaningless label), or you have to phrase the definition in terms of the output sequences, which is circular at that point.
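For readers who have not seen such tests, here is a sketch of two very simple ones, a frequency (monobit) check and a lag-1 agreement check. Both are simplified illustrations rather than a standard test suite, and both look only at the output sequence, never at where it came from.

```python
import math


def monobit_z(bits: str) -> float:
    """How many standard deviations the count of '1's lies from n/2."""
    n = len(bits)
    ones = bits.count("1")
    return abs(ones - n / 2) / math.sqrt(n / 4)


def lag1_z(bits: str) -> float:
    """How many standard deviations the count of equal adjacent pairs lies from half."""
    pairs = len(bits) - 1
    agree = sum(a == b for a, b in zip(bits, bits[1:]))
    # For independent fair bits each adjacent pair agrees with probability 1/2.
    return abs(agree - pairs / 2) / math.sqrt(pairs / 4)


alternating = "01" * 500   # perfectly balanced, but obviously deterministic
constant = "1" * 1000      # fails even the frequency check
```

The alternating string has a monobit score of exactly zero yet a lag-1 score above 30, so it passes one test and fails the other. That is why a collection of tests is needed, and also why any string that happens to pass them all is accepted whatever its origin.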
j h woodyatt from 17.206.50.218 at Thu, 03 May 2007 22:50:10 +0000:
I think the analogy to the <i>Paranoia</i> RPG only makes more sense when you recognize that Troubleshooters, in addition to being assigned a mission to help Friend Computer eliminate all the Commie Mutant Traitors, <i>are also assigned roles in one or more factions of the Commie Mutant Traitors</i>.
Computer scientists all know that it's insane to think that Colour is a computable property of bits, but the Law forbids us from recognizing the fact of that. So, we end up scragging one another's clones in a festival of insanity, which only drives the lawyers more insane in the process of trying to keep us from doing it.
This is what happens when you can't be an effective Troubleshooter unless you're also a Commie Mutant Traitor.
Max Rabkin from 196.209.253.3 at Mon, 07 May 2007 08:47:25 +0000:
Very interesting article. Your point about "Random Numbers" reminded me of http://xkcd.com/c221.html (XKCD 221: "Random Number"), which has the following source code:
int getRandomNumber()
{
    return 4; // chosen by fair dice roll
              // guaranteed to be random
}
BlueNight from 71.213.139.190 at Sat, 12 May 2007 21:39:54 +0000:
MSkala: Thanks for the comic, and the essay here. I am a big fan, and I initially came here from a link from xkcd.
This post inspired one of my own.
http://triessentialism.blogspot.com/2007/05/real-world-applications.html
You see, "Colour" (the metaphor you used here) does not exist in the Physical world. It only exists in the mind, that is, Logically and Emotionally. It can be simulated and encoded, but "Colour" is not essentially a Physical thing.
However, I assert that the Logical and the Emotional do exist, and can affect each other through nexus points such as the human mind, and to a lesser extent, animal minds and computers. Only when those other two types of things are held to exist is Law meaningful. Only in the realm of the Moral, which is where Law exists, does "Colour" also exist.
In other words, Colour exists because people say it does. It exists in the realm of things that exist by recognition.
I also am interested in justification of software and music piracy in light of the Todd Goldman scandal, in which everybody's brother decried the violation of Shmorky's intellectual property rights. (So such things DO exist after all!)
By the way, Triessentialism is an ontology wherein there are three main types of things, the Physical, the Logical, and the Emotional, and three corresponding contexts/universes where those three types of things each exist. I've been exploring it for six years, and it has helped me tremendously.
One of its implications is that money is a socially-defined form of energy. I call it Econodynamics. It explains inflation as value-loss, analogous to waste heat.
BlueNight from 71.213.139.190 at Sat, 12 May 2007 22:16:55 +0000:
I've just searched for "econodynamics" for the first time ever, and apparently someone else came up with it first (and developed it a lot farther than I have). However, I came up with it independently, through my own philosophical researches, so it's not plagiarism. An amazingly coincidental illustration of provenance ("Colour").
Padraic from 75.54.80.232 at Thu, 31 May 2007 15:42:49 +0000:
Hey, I followed a link here from the Official Paranoia Blog (www.costik.com/paranoia/), just FYI.
The interesting thing here is that so-called 'hackers' (I subscribe to an older definition of hacking: The use of insufficient tools and/or training to achieve a goal in terms of computer programming. Usage: "It's okay for a hack, but a bit kludgey.") have, since the publication of the "Hacker's Manifesto" used the ridiculously meaningless expression "information wants to be [alternatively, "should be"] free" as a defense of their theft.
This shows that while computers do not in and of themselves recognize the concepts of ownership, legality, or the origin of any given data, those that program and/or operate them do. True, in order to represent this recognition, additional data must be fabricated and appended to the existing data, but that is a result of working with a device that operates in the manner of a computer with regards to data (this includes CD/DVD players, VCRs, or any other communication/storage medium).
There are certain facts that, in this circumstance, need to be taken into account, but, as you have accurately described here, are not intrinsic to the data itself:
1) All data has a source.
2) It is possible to copyright any set of data that has not already had a copyright applied to it.
3a) Once a set of data has a copyright applied to it, ownership of that data is not in and of itself a violation of copyright law. (www.copyright.gov/circ1.html)
3b) Transfer and/or duplication of the aforementioned set of data is regulated by copyright. The owner of the copyright determines who can and cannot distribute, copy, or modify the copyrighted work, and under what conditions.
4) Copyright refers only to the data in question, in whole or in part. Derivative works are handled by Trademark laws, not Copyrights.
Once again, we run into a problem when trying to apply this information to a computer file. Theoretically, it is possible to create a completely original file using individual bytes from other files. If, for example, I wanted to write this comment, but my Q key didn't work, I could copy and paste the letter Q from another text file obtained ANYWHERE.
If I obtained the letter Q from, say, an online computer game manual, I would TECHNICALLY be guilty of copyright violations, as I would be reproducing that file in part without permission of the copyright holder.
In order to accurately reflect this legal reality, every single byte (more accurately, every single bit) of a file should have that "colour" attribute associated with it.
For every 1 or 0, an additional X bytes stating "This bit is copyright 2007 XXXXXXXXXX". (I can just hear the rending of hair as other programmers realize the amount of work this would take, not to mention the amount of storage space wasted thereby.) Additionally, do we apply the same colour attribute to the bits that make up that tag? And who holds the ownership of those? Ideally, those bits would be owned by the person who created this system.
Not feasible, not practical, and if you consider the iterative nature of the tag I've described, not actually possible.
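The regress can be made concrete with a little arithmetic. A sketch, assuming every tagged bit needs a fixed-size tag and that the tag's own bits need tags in turn; the 256-bit tag (a 32-byte notice) is an arbitrary figure chosen for illustration, since the comment only says "X bytes".

```python
def stored_bits(payload_bits: int, tag_bits: int, levels: int) -> int:
    """Total bits stored when each bit at every level carries a tag of tag_bits bits."""
    total = payload_bits
    layer = payload_bits
    for _ in range(levels):
        # Each bit in the previous layer needs its own tag.
        layer *= tag_bits
        total += layer
    return total


one_byte_tagged_once = stored_bits(8, 256, 1)   # payload plus one layer of tags
one_byte_tagged_twice = stored_bits(8, 256, 2)  # the tags are tagged as well
```

One byte with a single layer of tags already occupies 2056 bits, and each further layer multiplies the newest layer by another 256, so the total never settles. That is the sense in which the scheme is not merely wasteful but cannot be completed.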
So long as information exists, there will be those who wish to obtain that information. So long as there are those who wish to obtain that information, there will be those who will distribute it. Whether due to idealism, rebelliousness, or greed, someone out there is going to go painting all those white hallways black. (For you non-Paranoia-players out there, white is the colour associated with ULTRAVIOLET Security Clearance - the highest - and black, the colour of the lowest - INFRARED).
It is a factor of human nature that may be accounted for, but that is impossible to prevent. If X wants a copy of Y and he doesn't want to pay Z for it, he will take any steps reasonable to obtain it for Q -- Q being an expenditure of time, effort and money that is less than that represented by Z.
The trick is to make Z so ridiculously cheap/easy that the effort and expenditure involved in Q is no longer worthwhile. Not to impose additional restrictions on Z, making Q look that much more appealing.
no-richard from 166.165.149.240 at Fri, 27 Jul 2007 06:11:46 +0000:
Apparently, our bits *do* have a colo(u)r, and it's yellow:
http://www.seeingyellow.com/
Who'da thunk?
Peter da Silva from 15.203.233.76 at Mon, 30 Jun 2008 17:42:55 +0000:
I don't think that "color doesn't exist" is a useful phrase. We know that things like "chain of evidence" and "chain of custody" exist. We know how to tell if they exist or not.
The thing is that we can't tell what they are from looking at some bits, we have to go through all the same real-world bookkeeping and proofs that we have to go through to show that this knife buried in Joe's backyard is a murder weapon and this other knife buried in Fred's backyard isn't, even if you can't examine the knives to tell which is which, because the murder was two years ago and the one buried in Fred's backyard had roots older than that grown around it, and the police officer who dug it up will testify to that...
This thing you call color, it's got nothing to do with bits. It applies to all kinds of things without being an intrinsic attribute of them. The handkerchief that Elvis tossed to your grandmother wasn't changed into something other than a handkerchief by Elvis, it's objectively no different from any other 50 year old handkerchief, but it's still "Elvis colored".
It's not that "color doesn't exist", it's that there's no practical test you can apply to tell if it exists or not. The universe doesn't record the path anything took through time and space in the thing itself, except indirectly, so all you can do if you want to tell what that path was is to keep track of that path outside the thing you're tracking.
aeschenkarnos from 220.245.180.133 at Tue, 01 Jul 2008 13:54:59 +0000:
Nice analogy.
One of the interesting assertions of the Forces of Evil, as manifested in the RIAA and fellow travellers, is that they claim copyright not only in a work, but in all sufficiently-close representations of the work.
If I rip a track from a CD, now I have a 6 million bit MP3 file, which has Colour "Copyright (c) Whoever Inc". If I flip (almost) any one of those bits, it is still (for any realistic purpose) the same MP3 file. If I flip (almost) any two bits, the same applies. And so on. I have to flip a fairly large number of bits, *probably* close together, to create a file so deviant from the original MP3, that the copyright assertion cannot reasonably be said to apply. Let's guess that to be 2 million of the bits. So that means that for any given track, roughly (6*10^6)!/(6*10^6-2*10^6)! of the variations of it that could be ripped and post-processed, are still asserted as copyright by the RIAA. As a consequence, any given ripping and distributing infringes the copyright not only of the "perfect" song, but of all insufficiently imperfect copies of the song.
And there we have it: a line of reasoning which might perhaps go some way towards justifying the amount of damages claimed for an infringement.
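To get a feel for the size of that set, here is a rough sketch using the comment's guessed figures (6 million bits, 2 million flips). It counts distinct files that differ from the original in exactly k positions, which is the binomial coefficient C(n, k); that differs from the ordered count in the comment, and it ignores the "close together" condition, so treat it as an order-of-magnitude illustration only.

```python
import math


def log2_binomial(n: int, k: int) -> float:
    """log2 of C(n, k), computed with lgamma so that huge n does not overflow."""
    ln = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return ln / math.log(2)


N_BITS = 6_000_000      # file size guessed in the comment
MAX_FLIPS = 2_000_000   # flip threshold guessed in the comment

# Number of binary digits needed just to write down the count of variants.
digits_in_count = log2_binomial(N_BITS, MAX_FLIPS)
```

The result is about 5.5 million, meaning the count of variants at that distance is itself a number roughly 5.5 million bits long, nearly as long as the file.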
Captain Smokeblower from 168.103.67.105 at Sat, 26 Jul 2008 05:08:22 +0000:
"In intellectual property and some other fields we're very interested in information, data, artistic works, a whole lot of things that I'll summarize with the term "bits". Bits are all the things you can (at least in principle) represent with binary ones and zeroes."
Matt: I believe we in CS are always interested in the information carried by the bits so I have difficulty with your text summarizing the meaning with the term bits. It's information you represent in ones and zeros and each one and each zero is a bit. It's not just the pattern of bits (00000101), but does that pattern represent five apples or the character 'e,' that's critical. I can properly operate on those bits only if I know what information they represent. The information they carry is for me the Color/Colour of the bits. It's one of the most common mistakes in programming to perform a function on bits of different color. Oh, it might be fun to add the contents of file a.mpg to the contents of file b.mpg and send the results through software for playing MPEG files -- once. I suggest you differentiate between the 'bits' and the information they carry. For this reason I disagree that in the world of computers people don't care about the 'Colour of bits.'
I hope I'm not misunderstanding your paper. I realize it's about establishing a common ground/language for discussing copyright law and computer files and applaud your effort.
Matt from 129.97.79.144 at Mon, 28 Jul 2008 15:29:40 +0000:
Ah, spoken by a true Pascal user.
Variables have types, and if you treat data of one type as if it were of another type (for instance, by trying to add two pointers as if they were integers) you *may* get garbage. However, there's more to Colour than just data type, and the rules of data type are not absolute. Many important uses of computers depend on deliberately ignoring data type distinctions.
If you get a phone and have conversations on it in English, and then one day you want to have a conversation in French with your phone, do you need a new phone? No... the phone just delivers the sound of your voice, whatever language it's in. Similarly, if you make a ZIP file out of a bunch of Word documents, and then one day you want to include some JPEG images in your ZIP file instead, do you need a different compressor? No... the compressor just puts files into the archive, whatever type they happen to be. You can make rules to keep from confusing yourself - for instance, Word probably won't load a JPEG file, at least not as a document - but those are just convenient fictions to make things easier for humans.
You *shouldn't* add two pointers as integers, but you certainly *can*. There is nothing about pointers that makes it impossible for you to add them as integers. If you want to design your business model around the idea that nobody will be able to treat bits as flat bits, then you lose. Maybe some programming languages elevate that "it's a bad idea" advice to an enforced rule, but programmers can opt to use other languages that don't.
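Python does not expose pointers, but the same point can be shown by reinterpreting the bytes of one type as another. A minimal sketch using the standard `struct` module; the function name is made up for the example.

```python
import struct


def float_bits_as_int(x: float) -> int:
    """Reinterpret the 4 bytes of a 32-bit float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


# The same 32 bits are the float 1.0 or the integer 0x3F800000, depending on who asks.
as_int = float_bits_as_int(1.0)

# The same 8 bits are the number 101 or the letter 'e', depending on who asks.
pattern = 0b01100101
as_char = bytes([pattern]).decode("ascii")
```

Nothing in the bytes objects to being read either way; the "type" lives entirely in the code that chooses how to read them.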
tola from 195.80.104.178 at Mon, 17 Nov 2008 09:54:27 +0000:
Suppose two people make identical .txt files, let's say both have one newline in them. Then they are bit-identical but each file has a different owner :)
Ashley from 24.137.73.33 at Sat, 14 Feb 2009 01:13:21 +0000:
Try replacing the word "Colour" in your essay with the word "Context" or perhaps "History" or "Narrative" and you'll see something interesting.
Matt from 216.59.242.133 at Sat, 14 Feb 2009 05:13:40 +0000:
See the followup article, Ashley.
Martin Maney from 69.17.22.215 at Sun, 08 Mar 2009 14:45:05 +0000:
Interesting, and almost convincing... but your own real world examples are so wrong that it makes me question the whole exposition (even more than I was, that is).
The buffer solution story sounds almost as though you think that the jar comes from the standards bureau. In fact, it is almost certainly mixed up (and refreshed) locally: what's traceable is that the tools and materials used to measure the pH can trace their accuracy back to the official standard. And in any context where that matters (student labs probably don't care more than that the label says it's okay to trust it), it's no magic Colour that's important, it is the documentation - clearly and explicitly metadata - that grounds belief in the accuracy of the buffer's pH.
The example about the virus-infected machine is so completely off base I hardly know where to start. It's not that the on-disk copy of things has the wrong Colour, it's that *the bits are different*. It's complicated by the fact that the bits that have been changed may do a good job of disguising themselves if you're using the altered bits to run the machine, but that's the reason you don't trust the OS on the disk: its bits are (or may be) CHANGED. No mystical Colour needed or wanted.
You claim that CS people have a hard time seeing what lawyers do in this mystical property, but you yourself seem to demonstrate that lawyers certainly can be prone to imagine this stuff where it plainly isn't.
Matt from 69.63.60.29 at Sun, 08 Mar 2009 15:22:14 +0000:
Interesting that you think I'm a lawyer. Actually, I'm a computer scientist: I was earning a PhD in CS when I wrote the above (and have since completed the degree), and I'm currently employed as a researcher.
Too bad you didn't understand what I said. I said: the buffer solution is traceable to the NBS. That doesn't mean they mixed it, although it also almost certainly is *not* locally produced either. Labs generally buy such stuff from a commercial vendor. The important thing is a chain of verification back to the primary standard: NBS has accurate weights and measures, those get used to produce accurate amounts of chemicals of known purity for making a standard solution (which might well be a standard of something else, not "pH 7.00 buffer"), that standard gets compared against other things to make sure the other things are accurate, someone mixes a pH 7.00 buffer and tests its accurate relationship to the standard solution, and so on. There is a chain of responsibility and accurate measures that goes all the way back to the NBS. The solution itself was not manufactured by NBS. That's not the business NBS is in; their job is to be the first link in the chain. Someone could also mix up a batch of buffer solution without having that chain of responsibility and accurate measures, and the result could be atom-for-atom indistinguishable from the real traceable solution, but it wouldn't be the real traceable solution because it would be the wrong Colour.
The virus infection scenario has confused a lot of people, and we ended up cutting it to one sentence in the magazine version because it was too hard to explain clearly, but just to make another try: no, I do NOT mean that the bits are necessarily different. It is important to my point that two files can be bit-for-bit identical but I still will need to treat them differently because of where they came from. Almost all the files on the possibly-infected machine are in fact unchanged, but I don't know which ones, and I have no way of accurately knowing as long as I'm using the possibly-infected machine to do the tests. So I declare "every file on this machine is possibly-infected," boot from a separate, external source of files, and keep track of which files came from outside and which came from the possibly-infected machine. After I have examined a file and know it to be identical to a file that came from outside, then yes, I can declare it to be just as good as if it came from outside, even if it didn't actually come from outside. But I can't start examining files without at least temporarily using the artificially-imposed "where did it come from?" distinction.
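The bookkeeping described in that scenario might be sketched as follows. The class name, the two origin labels, and the use of SHA-256 for the comparison are assumptions for illustration; the sketch also assumes it is being run from the trusted external boot environment, since that is what makes the comparison itself believable.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TrackedFile:
    data: bytes
    origin: str            # "trusted-media" or "suspect-disk"
    cleared: bool = False  # set only after comparison against a trusted copy


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def clear_if_matches(candidate: TrackedFile, reference: TrackedFile) -> bool:
    """Mark a suspect file as good only if it matches a copy from trusted media."""
    if reference.origin != "trusted-media":
        return False  # comparing against another suspect file proves nothing
    if digest(candidate.data) == digest(reference.data):
        candidate.cleared = True
    return candidate.cleared
```

Two files with identical bytes still start out in different states because of the `origin` field, and a suspect file becomes usable only after the comparison has been made.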
A CS topic I encountered recently, which might or might not be of interest, is the "extensionality" issue in some logic programming systems. If you have two things that are identical for operations like unification, can they be distinguished in any way or are they really always the same thing? If identical things are always the same thing, they're "extensional"; terms in Prolog are like that. Some other systems, perhaps systems that allow destructive modification, have "intensional" things. You can have A=B and then change A without changing B because they're stored separately. A system I'm working with right now called the Attribute Logic Engine, or ALE, allows the programmer to specify that some types shall be intensional and others shall be extensional. If we were all ALE users, the point of this article could be reduced to "Most things in the computer are extensional, but non-CS people tend to think of things in the real world as being intensional, and they expect computers to be the same."
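A loose analogy in Python, not ALE: `==` asks whether two values have the same content, while `is` asks whether they are the same stored object. Mutable lists behave like the "intensional" case, in that two equal lists can still be told apart and changed independently.

```python
def intensional_demo() -> tuple[bool, bool, bool]:
    a = [1, 2, 3]
    b = [1, 2, 3]
    equal_before = a == b   # same content
    same_object = a is b    # but stored separately
    a.append(4)             # destructive modification of a only
    equal_after = a == b    # b did not change along with a
    return equal_before, same_object, equal_after
```

The function returns `(True, False, False)`: the two lists were equal, were never the same object, and stopped being equal once one of them was modified.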
I encourage you to read the follow-up article, which is linked from my name on this comment.
ha ha from 80.229.83.223 at Sun, 08 Mar 2009 18:55:27 +0000:
I get your point, but you are wrong. John Cage's Silence will have an identifier embedded within it to enable the music publishing service to count royalty payments from radio etc. So the tracks will not be identical. As it happens, I don't believe any radio station has tried to play the track yet, but I may be wrong!
Keith Douglas from 142.206.2.14 at Mon, 30 Mar 2009 20:48:41 +0000:
As someone who works in software and yet was trained in philosophy of science and technology, this is an interesting topic. In particular, there happens to be a large literature in metaphysics (what philosophers do, not the squishy new age stuff) about essentiality of origin, with all the same considerations. A molecule-for-molecule copy of "great work of art" is not one if created by accident (say, as a quantum fluctuation), according to many, because it did not originate from "great artist".
It might be useful if one wants to help to resolve this matter to make use of some of this philosophical material. Not that philosophers agree on the solutions, but I dare say it might be neat to investigate all the same. Incidentally, this topic also quickly runs into a topic which I have actually researched, similarity in computer programs (which also has legal ramifications). Another interesting and fantastically difficult problem ...
Keith Douglas from 142.206.2.14 at Mon, 30 Mar 2009 20:49:29 +0000:
Correction on my website URL.
Viray from 24.184.242.122 at Thu, 15 Oct 2009 14:03:02 +0000:
I recently read somewhere that Pi should not be converted to binary and that there are illegal prime numbers. But I've been listening to the number 0.1234567891011121314151617181920...
(especially in Binary) since before there were CD's!
It's public domain and has everything possible including Elvis
singing Madonna songs in every language and NEED NOT BE COPIED
although I occasionally demonstrate by using part of it as
a file that sings a song about a pocket calculator.
And indeed I also have built a "pocket calculator" that plays
all possible musical sounds from the digits of that number.
I also can easily translate sounds from my dreams and
imagination into musical number math to play on it.
It also can format the sound as video and make interesting
patterns on a screen, but not as MP3 players do,
because it puts the same bits into the speaker as it
does on the screen, where they ALWAYS HAVE "COLOUR".
It is absolute logical synaesthesia and could also
generate specific images with whatever sound the image bits
are EQUAL TO.
I first did this before CD's, so I have a unique priority
with the concept of listening to UNRECORDED SOUND AND IMAGES
using absolutely-not-random bit numbers and by other like
non-electronic means, as this is so simple I could make it
of clockwork and still make it sing any song.
Any way of counting is public domain and not random and
eventually makes every file. Computers cannot make random
numbers at all. They have to count in a weird way that
looks random like Pi, but Pi is not random and a way of
getting the digits simply from counting (them) has been
discovered. Every song has several prime numbers that
sound like it. Songs that Elvis never sung in Klingon
also have prime numbers that sound like them.
Songs from the future have prime numbers that sound like them.
Home Taping killed the music mob, and that number exorcised them.
Was there a big deal about Home Taping BEFORE CD's?
I thought that portable radio tape recorder stereo boxes
and walkmans came around that time also.
VISTA aka WINDOWS 7 still the greatest
failure of all time because it's all DRM, which is just a huge
expensive mean prank to take your lunch money and bully
you into listening to cRAP music on a useless un-programmable
machine that makes you push mouse buttons all day as a job
and force you to let evil programs mess up an un-write-
protectable disk drive*, and who needs to download when all
possible 4 minute songs can be heard by listening to every
number less than 4 million one bits in a row, which isn't
even necessary as any of them can be calculated almost as
fast as clearing those bits to zeros on a USEFUL COMPUTER,
such as one that Microsoft no longer supports which booted
instantly and said READY (your wish is its command) and you
could just tell it what to do and take the rest of the day off
while it would even MAKE SOMETHING without giving you carpal
tunnel syndrome. On such a machine I first played and
audiovisualized the never-ending musical number.
*Microsoft created viruses by removing the write protect switch
from disk drives. Before then, they had a program disk that was
write protected and data disks were not. Viruses were impossible.
It was always common sense to protect data with the disk switch.
My life has a soundtrack and it's MY LIFE, MY SOUND,
and NONE OF MUSIC BUSINESS.
Napsters never STOLE a single CD. MP3's ARE NOT CD's. Duh.
I can play a song with less than only one screenful of
program and data... WITH HUMAN SOUNDING VOICES.
It's impossible for an individual to "Pirate"
or infringe because it's a Corporate Offense.
Individuals can copy books in a library.
Individuals can record anything analog or digital.
Individuals can build an invention from a Patent.
Individuals can and SHOULD ALWAYS backup important data.
Individuals cannot profit from counterfeiting or plagiarism.
Individuals can't rob ships either, in their own home.
Individuals with no contracts can sing in public if they want.
Public performance = free speech, NOT A COPY.
How much do individuals get paid WHENEVER they sing a song?
HOW MUCH DO ALL MUSICIANS GET PAID FOR EACH CD SOLD?
WHO ALWAYS GETS RICH IN THE MUSIC BUSINESS, AND DOING WHAT?
SONY BMG tried to steal my music,
and that's how they KNOW about 0.123,
if not because I suggested it as a defense for (or
alternative to) Napster. I have hundreds of alternatives
to 0.123 but they are much less obvious. For example,
certain fractals also have ALL POSSIBLE MUSIC in them.
And of those I released, I released to Public Domain.
00000.12345
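The claim that 0.123456789101112... contains every finite digit string can be checked on a small scale. What follows is a minimal sketch in decimal rather than binary; the function name `first_position` and the `max_n` cutoff are made up for illustration and are not taken from the comment above. It counts upward, concatenating the integers, and reports the zero-based offset after the decimal point at which a given digit string first appears.

```python
def first_position(pattern: str, max_n: int = 1_000_000) -> int:
    """Return the zero-based offset of the first occurrence of `pattern`
    in the digits 123456789101112... (the part after "0.").

    `max_n` is an arbitrary safety cutoff on how far to count.
    """
    if not pattern.isdigit():
        raise ValueError("pattern must be a non-empty string of decimal digits")

    total = 0   # number of digits emitted before the current integer
    tail = ""   # last len(pattern) - 1 digits already emitted
    for n in range(1, max_n + 1):
        s = str(n)
        window = tail + s  # a new match must overlap the digits of n
        idx = window.find(pattern)
        if idx != -1:
            return total - len(tail) + idx
        total += len(s)
        tail = window[-(len(pattern) - 1):] if len(pattern) > 1 else ""
    raise LookupError("pattern not found before max_n")


if __name__ == "__main__":
    for p in ["1", "10", "11", "2009"]:
        print(p, first_position(p))
```

The offset needed to locate a string is generally about as long as the string itself, so pointing into the number saves nothing over just writing the string down.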
atom from 69.55.237.145 at Mon, 30 Nov 2009 23:43:40 +0000:
first, this is a great essay for bridging the gap between the reality of modern law and the reality of modern technology. thanks!
hypothetical situation: let's say that one distributes one or both of the following files: 1) a JPG that's under a CC-by license (essentially one step removed from public domain) and 2) a file full of "random" bytes without any specified license, publisher or "owner". however, if those two files are XORed, the result "just happens to be" a copy of "plane crazy" starring mickey mouse (copyright 1928 walt disney productions, and has enjoyed no less than 11 copyright extensions which have kept it from entering the public domain). let's further assume that the disney corporation perceives this to be a threat to their empire, and pursues redress under relevant copyright law.
it's my suspicion that disney would prevail in demonstrating that the file full of "random" bytes is actually a derivative of a copyright that they control (although they would likely assert "ownership" i believe the term "control" is more technically accurate; copyright is a monopoly right, despite efforts to distort it into a property right). whether we think in terms of color, context, history, etc, the case would, IMHO, run the same course.
next, let's look at something that's technically almost identical to monolith: pad - http://www.madore.org/~david/misc/freespeech.html
while technically very similar in function to monolith (nearly identical?) the author of pad has proposed some novel ideas for "proving" that two or more seemingly "random" files are actually innocent and non-infringing. so, instead of using monolith's example of two files, let's say we need 5 files to construct "plane crazy". no matter which of those files disney and their army of lawyers tries to take action against, ALL of them can (in theory) be "proven" to be either deterministically generated or required as part of *legitimate* (regardless of practical) distribution methods.
i'd like to think that the courts would have to err on the side of free speech and the right to distribute works in impractical ways. there's no single file that's a "smoking gun" of infringement, and removing all of the files, or any subset of them, would be insane (i'm not saying it wouldn't be a decision of a court (after all, the judge has to decide between loyalty to friend computer and being executed for being a commie mutant traitor), just that it's insane, in addition to being infeasible considering jurisdictional issues).
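The multi-file XOR construction described above can be sketched in a few lines of Python. This is a minimal illustration of the general idea only, not the actual behaviour of monolith or pad; the names `split_into_shares` and `combine_shares` are hypothetical. All shares but the last are fresh random bytes, and the last is chosen so that XORing everything together gives back the original.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_shares(work: bytes, n: int) -> list[bytes]:
    """Split `work` into n byte strings whose XOR is `work`."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are random; none of them depends on `work` at all.
    pads = [secrets.token_bytes(len(work)) for _ in range(n - 1)]
    # The last share is work XOR pad_1 XOR ... XOR pad_(n-1).
    last = reduce(xor_bytes, pads, work)
    return pads + [last]


def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the original bytes."""
    return reduce(xor_bytes, shares)


if __name__ == "__main__":
    shares = split_into_shares(b"plane crazy", 5)
    print(combine_shares(shares))
```

Because XOR is commutative, nothing in the bits marks any one share as "the" derived file; which file was computed from the work is a fact about how the files were made, not about their contents.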
noko from 76.85.196.138 at Sun, 06 Dec 2009 00:50:20 +0000:
For every bit string B and every copyrighted work C, there exists an encoding scheme in which B represents C. However, this does not imply that the owner of C also has copyright on B, because only a bit string together with the information necessary to decode it can constitute a "copy" of C.
Suppose you were to encrypt a movie with 256-bit AES, so that it looks like a perfectly random and meaningless bit string. Furthermore, you had your eyes closed and typed randomly when you selected the movie and when you created the password, so that even you cannot decode the file and cannot know the contents of the file. Then you distribute the file widely. This would not be copyright infringement, because it would be impossible for anyone to access the contents or know the contents of the encrypted file.
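The first claim, that any bit string B can "represent" any work C under some encoding, is easy to make concrete with XOR. The following is a minimal sketch under the assumption that the "encoding scheme" is just a key the same length as C; `derive_key` and `decode` are hypothetical names used only for illustration.

```python
from itertools import cycle


def derive_key(b: bytes, c: bytes) -> bytes:
    """Return a key K such that decode(b, K) == c.

    If b is shorter than c it is repeated; if longer, the extra is ignored.
    """
    if c and not b:
        raise ValueError("b must be non-empty when c is non-empty")
    return bytes(x ^ y for x, y in zip(cycle(b), c))


def decode(b: bytes, key: bytes) -> bytes:
    """Apply the 'encoding scheme' given by `key` to the bit string b."""
    if key and not b:
        raise ValueError("b must be non-empty when key is non-empty")
    return bytes(x ^ y for x, y in zip(cycle(b), key))


if __name__ == "__main__":
    b = b"any bits at all"
    c = b"some work that happens to be longer than b"
    key = derive_key(b, c)
    print(decode(b, key) == c)
```

The key here is as long as C and is computed from C, so all of the information about the work lives in the decoding scheme rather than in B, which is consistent with the point that B alone is not a copy.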
John Beattie from 80.177.223.120 at Thu, 07 Jan 2010 14:37:09 +0000:
As social beings, Colour impacts engineers too. Your example of the PC with a virus reminded me of this passage in Zen and the Art of Motorcycle Maintenance (in Chapter 14, where Pirsig meets up with DeWeese). It is the last paragraph of the passage which is really relevant, where a machine which hasn't been properly checked is a 'down machine', even if it is in fact in perfect working order. The rest is for context.
"Peace of mind isn't at all superficial, really," I expound.
"It's the whole thing. That which produces it is good maintenance; that which disturbs it is poor maintenance. What we call workability of the machine is just an objectification of this peace of mind.
The ultimate test's always your own serenity. If you don't have this when you start and maintain it while you're working you're likely to build your personal problems right into the machine itself."
They just look at me, thinking about this.
"It's an unconventional concept," I say, "but conventional reason bears it out. The material object of observation, the bicycle or rotisserie, can't be right or wrong. Molecules are molecules. They don't have any ethical codes to follow except those people give them. The test of the machine is the satisfaction it gives you. There isn't any other test. If the machine produces tranquillity it's right. If it disturbs you it's wrong until either the machine or your mind is changed. The test of the machine's always your own mind. There isn't any other test."
DeWeese asks, "What if the machine is wrong and I feel peaceful about it?"
Laughter.
I reply, "That's self-contradictory. If you really don't care you aren't going to know it's wrong. The thought'll never occur to you. The act of pronouncing it wrong's a form of caring."
I add, "What's more common is that you feel unpeaceful even if it's right, and I think that's the actual case here. In this case, if you're worried, it isn't right. That means it isn't checked out thoroughly enough. In any industrial situation a machine that isn't checked out is a `down' machine and can't be used even though it may work perfectly. Your worry about the rotisserie is the same thing. You haven't completed the ultimate requirement of achieving peace of mind, because you feel these instructions were too complicated and you may not have understood them correctly."
dave from 76.167.130.194 at Sun, 10 Jan 2010 01:18:38 +0000:
The most telling point in this article is the association of the concept of colour coded security from the game Paranoia with the concept of copyright.
In the game, the interpretation of colour which lends it the power that makes this association appropriate is created by a machine which is completely insane. This interpretation is deliberately designed (by the game designers) to almost always result in injury, death and damage. If the idea of "Colour" used above is rooted in a concept which was formed from insanity and seems to naturally result in chaos, then it seems like a poor starting point from which to argue convincingly about the merits of copyright.
There are several other serious factual errors in the post, but I think the core problem needs to be fixed before any comments on the examples can be productive.
Matt (mskala) at Sun, 10 Jan 2010 01:28:21 +0000:
"If the idea of "Colour" used above is rooted in a concept which was formed from insanity and seems to naturally result in chaos, then it seems like a poor starting point from which to argue convincingly about the merits of copyright."
On the contrary, the basic irrationality of Colour is what makes it an especially good analogy for copyright.
Jonathan from 209.150.207.108 at Fri, 20 Jan 2006 00:18:54 +0000:
As someone from the legal side of the tracks, I'm extremely impressed with this explanation.
It seems to me the key point is that the things you identified as "Colour" are not intrinsic properties of the buckets of bits we're talking about.
We use "metatags" on physical objects all the time to identify non-intrinsic properties. Consider "ownership." You can't determine who owns a physical object by examining it any more than you can determine whether a binary file was copied by examining it.
Note that physical-object "metatags" have the same problems as binary-information metatags. How do we know who owns a physical object? Most of the time we rely on physical control. When I buy a printer at Worst Buy, they let me take it out of the store (and out of their control.) Of course, I can loan my printer to you; that changes control, but not ownership. (In lawyerese, possession creates a rebuttable presumption of ownership.)
For real estate it's trickier. You can't take it with you when you buy it. So we rely on documents, which of course can be forged, destroyed, or modified, just like a metatag that says "This binary file is orange" can.
This may be a way to explain the problem with DRM to a lawyer. There's no way to make metatags intrinsic to a binary file, any more than there's a way to make ownership intrinsic to a physical object. You can put a fence around it, put it in a locked box, install locks on the doors, but there's no way the legal system can actually prevent some Commie Mutant Traitor from changing the metatag.