All you're "based off" are belong 2 us
Sat 30 Oct 2010 by mskala Tags used: dares, linguistics
I saw a Web BBS posting recently in which the poster, who was a foreigner learning English as a second language, asked "Which is correct - 'based off' or 'based off of'?" The person asking the question can probably be forgiven because they don't know any better, and at least were smart enough to ask, but if you know me you'll probably be able to guess that the general agreement among the answers, that "based off of" is incorrect and you should say "based off" instead, caused me to consider the merits of a tri-provincial killing spree.
I will not apologize for being a prescriptivist. There are some usages that would be wrong even if all the other native speakers of English used them; and "based off" (with or without an "of") is such a usage. I'm willing to accept "different than" as an issue of formalism, and acceptable in speech or informal writing even though I do not use it myself; I'm willing to (very grudgingly) grant that persons from the United States of America may be allowed to say "anyways" as a regional dialect thing, even though it makes them sound illiterate; but "based off" is just completely unacceptable.
Nonetheless, from a scientific perspective and from the point of view of "know the enemy," it may be interesting to look seriously at the questions of who does say "based off," and when they started.
First impressions
When I mentioned this issue to my mother, a former editor, she expressed surprise that anybody said "based off" - she didn't recall hearing or seeing it. On the other hand, when I read the TV Tropes Wiki, my impression is that "based off" is almost universal there, with "based on" a rare exception. I change it whenever I see it, but given the rate of editing it seems a certainty that instances of "based off" and "based off of" are being added to TV Tropes faster than I remove them.
A little bit of Web searching turns up a few other people expressing views like mine about "based off," and the general consensus seems to be that it's become popular only in the last five years or so, and primarily among young people. I heard a colleague (a linguist to boot) use it and also "anyways" in a formal academic lecture recently, but that seems to be unusual; it's mostly the uneducated who say "based off."
Why did they start, and why did it grow? At this point it's probably impossible to really know, but my guess is that it's tied up in the metaphor built into the phrase - which is probably also why I find this one much more annoying than other equally incorrect phrases. When you say "based on," you are invoking a metaphor. Saying "based off" discards that metaphor in favour of a syntactic analogy to an unrelated word; so saying it reveals you don't really know what the word "base" means.
Suppose you're erecting something like a statue. You don't want it to fall over. So you put it on top of something solid and stable - like a big block of stone. That solid, stable thing is the base of the statue. More generally, a base (as a noun) is a place of security: for instance, soldiers hope to be safe while they are in their fortified "base"; and a runner in baseball can't be tagged out while he is "on base." The same root that gives us the noun "base" also gives us, for instance, the adjective "basic" - used for things that are simple and strong and provide bases for other things. Then "base" can also be used as a verb to indicate putting something on a base - you "base" your statue on the block of stone; or (extending the metaphor to intangibles) you can "base" an argument on a concept, or create a new piece of artistic work "based on" some earlier work - that is, using it as a source of ideas or inspiration. The meaning of the phrase "based on" is, well, based on this metaphor of putting objects on top of other objects for stability. Using the preposition "on" is fundamental to the metaphor.
Suppose the statue is not on the base but off of the base. Well, then it's not based at all; it is baseless; it's not stable or secure; it is the opposite of based. In fact, we use the phrase "off base" to describe something that has no base or foundation - or a baseball runner who is vulnerable to the tag. The usage of "based off" where "based on" would be correct is off base.
But I think the purported justification for "based off" comes from a different line of thinking that goes like this: the speaker neither knows nor guesses the metaphor of solidly supported erections that links "based on" to the noun "base" meaning "place of security." Instead, they think of "to base" as a verb with no history, that means "to imitate" with no connection to any noun meaning of "base." Then they think it is a sort of politer version of "to rip off"; and so they think it should take the same preposition(s). Someone who would write "The Witch Hunter Robin opening is ripped off of a Madonna music video," thinks it's okay to write "The anime Suzumiya Haruhi no Yuuutsu is based off of a series of light novels." The latter anime is copied, it's imitative, but not in an unethical way - KyoAni had permission to use the source material and did so openly, so the anime is not a rip-off, but it's still a copy.
Note that that story nicely explains why some people say "based off of" instead of just "based off." The phrase "ripped off of" makes sense to me because "ripped off" is the verb (it's related to the noun "rip-off") and "of" is a preposition for linking an indirect object to that verb. Trying to hammer "base" into the hole left by "rip" when you don't want to suggest plagiarism, yields "based off of."
So, yes, this does imply that if I were forced to accept one of "based off" or "based off of," I might prefer "based off of"; "based off" seems to be a further corruption from there. That is not the right answer to the language learner's question in my opening paragraph, though!
Tracking the cancer's growth
On the Web in general, it's common practice to use Google's number-of-hits estimates to examine questions like this - xkcd has based some amusing cartoons on that technique - but we can do better. A linguist at Brigham Young University has assembled something called COCA, the "Corpus Of Contemporary American English," and put up a search engine for not only it, but several other major corpora of English. These "corpora" are samples of language built specifically for linguistic research. Exactly what kind of samples they are, and how they're constructed, depends on the particular corpus - each one does things a little differently - but it's typical that a corpus will attempt to contain a randomly selected representative sample of the entire language, for instance including both spoken and written words, fiction and nonfiction, a variety of levels of discourse, within defined boundaries of dialect and historical time period, and so on. So counting occurrences in one of these is kind of like doing that Google search, except like actually scientific and stuff!
The COCA pushes you to register and prove that you're a "real" researcher in order to expand its per-user-per-day usage limits, and I don't particularly like that. It still does give me a bit of a thrill, though, that I actually do have the relevant qualifications. So I did the secret handshake, created an account, and ran some searches. I stuck the results in a Gnumeric spreadsheet and you can download it if you want to (or go to the site, register and do searches yourself, and get even more data than I included in my spreadsheet) but I'll summarize the interesting bits here.
COCA
The COCA, which is 410 million words attempting to sample uniformly from American English from 1990 to 2010, includes 56016 instances of "based on" and 17 of "based off"; 8 of those 17 are "based off of." Twelve of the 17 instances of "based off," including 7 of the 8 instances of "based off of," are in the "spoken" section of the corpus. There is one "based off of" in each of 1993, 1996, and 1998. All the other instances of "based off" (with or without an "of") are in 2004 or later.
So in this corpus, it looks like we can say:
- "based on" is much more popular;
- "based off of" seems to be older than "based off";
- "based off" without the "of" has occurred only since 2004;
- "based off" and "based off of" are primarily spoken, not written.
Just as a reaction from my experience: I think "based off" is far more popular than its occurrences in this corpus suggest, and I think that highlights something about this corpus: it's a corpus of educated language. It's mostly published writings, such as books, newspapers, and magazines; and even the spoken component is transcripts of radio and television, where the speakers are, to be blunt, not idiots. If you go on the Net and read writings like Wikipedia, where many of the contributors actually are idiots, I think you'll see a lot more "based off"s. Corpora of uneducated language do exist, and it would be interesting to look at them for this.
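If you want to try the same kind of count on a text sample of your own, here is a minimal sketch in Python. It follows the counting convention used for the COCA figures above, where the "based off" total includes the "based off of" instances. The function name and the sample sentences are made up for illustration; this is not how the BYU search engine works internally, and plain pattern matching like this cannot tell different senses of "based" apart.

```python
import re

def count_phrases(text):
    """Count three phrases in text, ignoring case.

    As in the COCA figures, the "based off" count includes
    every instance of "based off of".
    """
    lowered = text.lower()
    return {
        "based on": len(re.findall(r"\bbased on\b", lowered)),
        "based off": len(re.findall(r"\bbased off\b", lowered)),
        "based off of": len(re.findall(r"\bbased off of\b", lowered)),
    }

# Hypothetical sample text, invented for this example.
sample = (
    "The film is based on a novel. The show is based off of a comic. "
    "The game is based off a film. Based on what I saw, it is fine."
)

counts = count_phrases(sample)
for phrase, n in counts.items():
    print(f"{phrase}: {n}")
```

The word-boundary markers keep "based off" from matching inside something like "based office," but they do nothing about meaning, so any serious count would still need a human to read the hits.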
COHA
The same search engine also offers access to the "Corpus Of Historical American English" or COHA, which is 400 million words covering 1810 to 2009, split roughly evenly per year. In that corpus, there's a total of 15306 instances of "based on" and 2 of "based off," both of which are in the year 2005. There are no instances of "based off of."
TIME
BYU's search engine also offers access to a database of TIME magazine articles, 100 million words from 1923 to 2006. It contains 7460 instances of "based on," none of "based off" or "based off of."
BNC
Finally, there's the British National Corpus, which is 100 million words of British English. It covers from the 1980s to 1993; the data is not so nicely broken down by year. It contains 11467 instances of "based on," none of "based off" or "based off of"; but before concluding that "based off" is uniquely American, we should bear in mind that it didn't start appearing in the American data at all until 1993, and became a lot more prevalent starting around 2004-2005; so it could simply be too new to appear in the British data even if British idiots are using it as much today as are American idiots.
Pretty picture
I was interested in "based off," but there are too few occurrences of it in the corpora (probably because, as I said, these are corpora of educated language) to make a nice chart. So instead I'll show you this chart I made of the prevalence of "based on" per million words. It looks like "based on" started to become popular in the 1830s - and prescriptivist Web loggers were probably as enraged by it then as I am by "based off" today.
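The per-million-words figure is just a count divided by the corpus size in millions. As a rough sketch, the Python below applies it to the whole-corpus totals quoted in the sections above. It does not reproduce the per-decade breakdown behind the chart, which is not given in the text, and the corpus sizes are the rounded ones quoted here, so treat the output as approximate.

```python
# (corpus size in words, "based on" count, "based off" count),
# using the rounded sizes and the totals quoted in the text.
CORPORA = {
    "COCA": (410_000_000, 56016, 17),
    "COHA": (400_000_000, 15306, 2),
    "TIME": (100_000_000, 7460, 0),
    "BNC": (100_000_000, 11467, 0),
}

def per_million(count, corpus_words):
    """Return occurrences per million words of corpus text."""
    return count * 1_000_000 / corpus_words

# Map each corpus name to (rate of "based on", rate of "based off").
rates = {
    name: (per_million(on, words), per_million(off, words))
    for name, (words, on, off) in CORPORA.items()
}

for name, (on_rate, off_rate) in rates.items():
    print(f"{name}: based on {on_rate:.2f}, based off {off_rate:.3f} per million words")
```

Normalizing this way is what makes corpora of different sizes comparable; raw counts alone would make the larger corpora look as if they used every phrase more.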
UPDATE: I just spotted a "based off" in the December 10 Something Positive. It's spreading.
12 comments
Owen - 2010-10-31 10:55
In the metaphorical sense usually conveyed by "based on", the usage "based off" seems (to me) inspired by phrases like "runs off" or "comes off" which might arise in other metaphorical phrases. "A comes off B"...bananas come off a banana tree...health care comes off Communism... Of course it is strange that "to be based" would replace "to come or run", since the meanings are almost diametrically opposed. But 180° turns are not rare in language.
Axel - 2010-10-31 11:35
Matt - 2010-10-31 11:42
The meaning of prepositions depends entirely on context, and can change the meaning of verbs.
A banana comes of a banana tree, not off.
def0 - 2010-10-31 12:16
My pet peeves are of the you're/your, it's/its, and to/too/two variety. I see this, and correct it via comments, two or three times per day on Facebook. More infuriating is that it's usually found on posts from my young cousins. These kids are the fucking future of this country! People who don't use the Oxford Comma also piss me off, because it's so obvious.
A friend of mine argued a few months ago that he thought your in place of you're would just become acceptable usage in American English in a decade. That means in two or three decades, it'll be considered acceptable in Canada. The thought makes my blood boil.
Steven R. Baker - 2010-10-31 14:45
And I'd say a banana comes *from* a banana tree. To me, coming "of" suggests cause and effect of intangible events rather than the source of a physical object.
Prepositions do convey information. You can write two sentences differing only by a preposition, and have them both be meaningful and in standard syntax, but with different meanings (for the sentence). Consider the difference between doing something "for" or "to" someone. Then it makes little sense to me to say prepositions have "no meaning"; maybe a preposition alone, divorced from context, has very little useful semantic value, but it remains important to the meaning of a sentence in such a way that part of the sentence's meaning can be reasonably attributed to the preposition.
Matt - 2010-10-31 17:03
Prepositions convey information but they don't have meaning in the sense that "permanganate" or "red" have meaning. They convey information because they belong to conventional forms. English "in the box" and French "dans la boîte" convey the same information, as do "on the box" and "sur la boîte", so one has the impression that in=dans and indicates that the following noun contains something, while on=sur and indicates that the following noun is surmounted by something. But English "on the street" is translated by "dans la rue". We translate text, not words; it is the text that conveys meaning. There are exceptions, of course; most scientific words are strictly of one meaning - and even then context is important, or else you won't understand why astrophysicists sometimes call oxygen a metal.
Axel - 2010-10-31 22:31
Steve - 2010-11-01 04:16
This whole thing reminds me of another possibly related English usage that I managed to get added to the old alt.usage.english list of contronyms maintained by Mark Israel: "out of", with the meaning "in". This has become entirely mainstream - "he's based out of Portland", meaning he's based "in" Portland. It works for only a few verbs, and doubtless the origin explains that. So:
He works out of Portland.
He's based out of Portland.
The software is supported out of Portland.
*He lives out of Portland.
All but the last mean the activity goes on "in" Portland. The last means he lives somewhere outside the city, but it's a doubtful construct - you'd say "outside of Portland". And there are borderline cases: What proportion of women work out of their homes in 2010?
But we can see pretty clearly that these all came from the notion that's found most closely in the "supported out of" sentence, that is of some activity flowing from Portland out to somewhere else. Think of travelling salesmen, telemarketing, support, and so on.
So other than a general resemblance, the missing piece to the argument is the transition sentence between "based on" and "based off [of]".
Oh and by the way, Brits generally say "in the street" rather than "on the street".
Tony H. - 2010-11-09 20:12
Also, there's more to these phrases than what linguists might call the surface realization: not all sentences that contain the word "based" followed immediately by the word "off" are examples of the "based off" construction I'm objecting to. Axel's example of the sailor based off Portland seems to be like that - it's using "based" in a different sense, and even if that happens to result in the word "off" occurring immediately after "based," it's not the same construction. You can't recognize it correctly by substring matching. Amusing similar example: is "it's" *always* incorrect as a possessive? Many of us would say "yes" - until we have to talk about the eponymous villain of the Stephen King novel IT.
Matt - 2010-11-11 12:09
def0 - 2010-10-31 02:24