Ambiguity and fear of ambiguity are common arguments for a variety of grammatical as well as editorial choices. For example, some people insist that since shouldn’t be used like because (as in “since you’re here so early, let’s build the trebuchet we’ve been planning”), because since could also mean “from that time forward”. The fear is that readers or listeners will commit to that latter reading and find it confusing — if not impossible — to switch tracks to the former reading.
Now, in the case of since, it’s actually rare that both meanings are reasonable for long enough to cause confusion; differences in the type of constituent or verb tense following the since tend to quickly disambiguate the sentence. But in other cases, ambiguity can be real and persistent:
(1) Since I was young, I went to church with my Mom [...]
In rare cases, the ambiguity can even be such that a reader can’t confidently determine which is intended, and in even rarer cases, the difference is meaningful. To insure against this confusion, some writers eschew the “because” meaning of since completely.
And that sounds like a good idea, except for one thing: there’s a flip side to the problem. So long as a substantial fraction of the linguistic community continues to permit the ambiguous form, it doesn’t matter whether you personally avoid the ambiguity; the ambiguous situation arises from unambiguous usage as well. In this case, it’s that ambiguity can arise even in the time-based usage of since:
(2) Since I was young, I have understood how right Benito Juárez, the outstanding Mexican patriot, was when he said: “Respecting others’ rights is the way to peace.”
Even if you never use the “because” meaning, your reader (probably) doesn’t know that. When they get to “Since I was young…”, they still might think that you’re using the “because” form. Again, this is probably only a temporary ambiguity. But it’s as much an ambiguous setting as the one that everyone complains about, so to avoid ambiguity, it also needs eschewed.
Here’s another example, from the cover of a book I’m reading:
The book is on Walter O’Malley, a former owner of the Brooklyn Dodgers, and the one who moved them to Los Angeles back in the 1950s. The front cover of the book, pictured above, reads “The True Story of Walter O’Malley, Baseball’s Most Controversial Owner, and the Dodgers of Brooklyn and Los Angeles”.
Now, if there were no such thing as the Oxford comma in this world, this subtitle would be unambiguous — baseball’s most controversial owner would clearly be an apposition referring to Walter O’Malley. (This is, by the way, the intended reading.) But because the Oxford comma exists, this could be a list. That’s the case even if neither the writer nor the reader ever uses the Oxford comma. The possibility of the Oxford comma will still color the interpretations.
I see two lessons here for usage in general. The first is that your writing and speaking do not exist in a vacuum. The principles of usage on which you make your usage decisions ought to take account of how other people use the language. It’s nice*, perhaps, if a writer insists that nauseous can only mean “inducing nausea”, but if no one else adheres to this rule, their readers probably won’t be able to recognize or use that principle in interpreting the writing. Common usage has an unavoidable influence on one’s readers and listeners.
The second is that ambiguity is not limited to contested usages. We tend to think of these debates about ambiguity as each influencing a particular choice or construction, but there’s almost always an overlooked construction that’s affected as well. If the fear of ambiguity is sufficient for a writer to avoid the ambiguous choice (e.g., the Oxford comma or because-since), then the fear of ambiguity also ought to cause the writer to avoid the ambiguity inducer (e.g., appositives in lists, time-since).
There are cases where that second avoidance is reasonable — I think I try to avoid appositives in lists, for instance — but in many situations, this would be tantamount to cutting the word out of the language. If both those senses of since are out, when could it be used? In cases like this, we really have to think hard about the intensity and importance of the ambiguity in the usage before deciding whether or not it’s tolerable. A blanket dictum against ambiguity is too broad a brush.
*: I’m, of course, using nice here in a sense somewhere between the rare “precise or particular in matters of reputation or conduct” and the obsolete “displaying foolishness or silliness” meanings.
Most people think of formal language as the ideal form, with less formal versions being a devolved, flawed, or generally worsened form of the formal language. It certainly sounds reasonable; formal language certainly feels harder to acquire and use consistently, for one. As a result, it’s a stance that many people (including me, prior to studying linguistics) take without even thinking about it: obviously, formal language is the language, and informal language is its cheap approximation.
In case I haven’t telegraphed it enough yet, I’d like to argue that this is incorrect. Informal language is not what you use when it isn’t worth the effort to use formal language, and informal language is not a strictly less governed system than formal language.
Since that might be butting up against ingrained opinion, let me start off with an analogy to levels of formality in another domain: fashion. Obviously, formal clothing like suits and ties and dresses can make people look really good for a gala event. But if you’re hoping to play a game of backyard football, they’re terrible, because they restrict your movement, and you’ll be unwilling to join into a dogpile because you’ll never get the blood and mud out. Similar problems arise if you’re working in a factory, doing dentistry, painting — the list goes on. Even just the fact that it’s summer now renders almost all of my formal clothing off-limits, lest I develop heatstroke.
Returning to formal language, we see many of the same points. Formal language can sound nicer than informal in some settings — oratory springs to mind. In other cases, whether or not it sounds nicer, it’s more appropriate. One wouldn’t, for instance, write an academic paper in informal English and expect it to be accepted. (Much as one wouldn’t wear a well-worn T-shirt to a job interview.) And because it tends to be the intelligent or successful who are most often in these “formal language required” settings, it’s unsurprising that formal language is believed to be the better form.
But informal language has its advantages. I’m hesitant to use singular they in formal writing, which at times forces me to concoct suboptimal versions of a sentence and pick one that I don’t like, only because the one that would sound best and most natural doesn’t feel formal enough. This need for formality slows me down and prevents me from saying what I’d really like to. Informal English is more flexible, and allows me to say what I mean more directly. Informal English isn’t a devolution because it lets me express myself better.
Another example is with contractions — and this also shows that informal English has its own rules apart from formal English. In most people’s forms of formal English, contractions are a no-no. But informal English allows both contractions and their uncontracted counterparts, the latter usually being used for emphasis. Consider these song lyrics:
“I didn’t see this coming,
no, I did not.”
I find the emphasis of the second line to be greatly reduced in the formal equivalent “I did not see this coming, no, I did not.” In fact, I occasionally find when I’m writing in formal English that the uncontracted version sounds too strident, but my hands are tied.
Stan Carey also talked about this earlier in the week, specifically in the context of song lyrics. Informal language of course thrives in song lyrics, but that doesn’t stop people grousing about it. But wouldn’t it be far worse to be stuck with formal songwriters, who report that they “can not get any satisfaction” or that you “are nothing but a hound”?
Stan’s post links to a January discussion by Geoff Pullum of what he called “Normal and Formal” language, and how the competent writer is the one who switches between them readily and appropriately, not the one who unfailingly aims for Formal. His use of “Normal” in place of “informal” is important. Informal language is normal. It’s how virtually all of us talk to each other, even the most highly educated or successful.
That’s part of why formal language can feel more difficult than informal. We use informal language constantly, and as a result it comes naturally to us. Formal language is rarer, and like tying a tie, it’s hard when you’re not used to doing it. Not only that, but it can end up feeling pretty unnatural when adhered to too closely. Pullum gives the example of a commenter who wanted him to write “whom are you supposed to trust” (instead of who), despite its stiltedness. He didn’t, and he was right.
Summary: Informal language is not a devolved version of formal language. It has rules that formal language doesn’t (e.g., choosing whether to use a contraction), and is in general more natural and readable than formal language. Informal language is, as Geoff Pullum puts it, normal language. This means that while formal language can be good and at times more appropriate than informal, it’s not always right, and it shouldn’t be treated as the ideal form of language.
A news story’s making the rounds this week that the members of the U.S. Congress have stopped talking at an 11th-grade level and have started talking at a 10th-grade level. This fits very neatly into the overall feeling that America is becoming ever more anti-intellectual, that Congress has become a group of petty and immature cliques who exist primarily to prevent each other from accomplishing anything, which is why the story has picked up steam. And perhaps these feelings are accurate, but this story doesn’t provide any evidence of it.
In short, the Flesch-Kincaid readability test that’s used in this analysis is completely inappropriate for the task.
I discussed this during the Vice-Presidential debates back in 2008, and Chad Nilep at the Society for Linguistic Anthropology and Mark Liberman at Language Log each talked about it in light of this new story. Here’s an updated set of arguments why the whole thing is nonsense.
How do we deal with speech errors? Speech has something that writing doesn’t have: disfluencies. Whether it’s a filled pause (uh, um, you know), a correction (We have — I mean, don’t have), an aborted phrase (I am a man with– I have goals), there’re lots of words that come through in speech that wouldn’t be in edited writing. Here’s an example from the 2008 debate, where Gwen Ifill said:
“The House of Representatives this week passed a bill, a big bailout bill — or didn’t pass it, I should say.”
That’s a sentence supposedly at the eighth-grade level. If we remove the mistakes & repetitions, we get a sentence that has now dropped a grade level. That’s the same drop that Congress supposedly has undergone. Maybe they just started editing the Congressional Record more tightly?
Grade levels aren’t based on content or ideas. The Flesch-Kincaid grade level calculation uses two statistics: syllables per word and words per sentence. These are imprecise stand-ins for what we really want, which is presumably the difficulty of the individual words and the complexity of the sentence structure. Word difficulty is going to be tied to their predictability in context, their frequency in the language, their morphological complexity, and other factors, all of which are loosely correlated with the number of syllables. Longer words will in general be more difficult, but there is a lot of noise in the correlation. Because we’re only using an estimate of the difficulty, our estimate of the grade level is inherently imprecise.
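To make those two statistics concrete, here is a minimal sketch of the calculation in Python. The constants (0.39, 11.8, 15.59) are the ones in the published Flesch-Kincaid grade level formula; the function name and the word, sentence, and syllable counts in the example are made up for illustration and do not come from any real transcript.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from three raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = words / sentences   # stand-in for sentence complexity
    syllables_per_word = syllables / words   # stand-in for word difficulty
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(grade, 2))  # 9.91
```

Nothing in the function looks at what the words mean or how the sentences are built. Three counts go in and a grade level comes out.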
There is no punctuation in speech. There are lots of different ways to punctuate a speech. Is a given pause supposed to indicate a comma, a semicolon, or a period? The difference between these can be substantial; Nilep’s post shows how punctuating the speech errors as sentences of their own drops a sentence from grade level 28(!) to 10.
The rhetorical style of a speaker also comes into play here. Suppose Senator X and Senator Y deliver the same speech. Senator X uses a staccato style, where each clause becomes its own sentence. Senator Y uses a more relaxed and naturalistic style, combining some clauses with semicolon-ish pauses. Because the reading level calculation is based largely on number of words per sentence, Senator Y is going to get a much higher grade level, even though the only difference is in the delivery, not any of the content.
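The Senator X and Senator Y scenario can be sketched with the same formula. The counts below are hypothetical: both senators say the same 120 words with the same 180 syllables, and the only thing that changes is how many sentences the transcriber decides the speech contains.

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts (published constants)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Made-up counts for one speech delivered two ways.
WORDS, SYLLABLES = 120, 180

# Senator X: staccato delivery, transcribed as 12 short sentences.
senator_x = fk_grade(WORDS, sentences=12, syllables=SYLLABLES)

# Senator Y: the same clauses joined by semicolon-ish pauses, 4 long sentences.
senator_y = fk_grade(WORDS, sentences=4, syllables=SYLLABLES)

print(round(senator_x, 2))              # 6.01
print(round(senator_y, 2))              # 13.81
print(round(senator_y - senator_x, 2))  # 7.8
```

Under these assumed counts, identical content lands almost eight grade levels apart, purely because of where the periods went.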
What does the grade level measure? The idea of grade-level estimation for writing was to give a quick estimate of how difficult a passage is to understand. The main readability scores were calibrated by asking people with known reading proficiency (as determined by a comprehension test or the grade level they were in) to read passages of various difficulty and to answer comprehension questions. The goal of the calibration was to get it so that if a piece of writing had a grade level of X, then people who read at the X level would be able to get some given percent of the comprehension questions right. Crucially, the grade level does not measure the content of the text, or the intelligence of the ideas it contains. In fact, for readability — the purpose the tests were developed for — a lower score is always better, assuming the same information is conveyed.
As I mentioned above, there’s a world of difference between reading and writing, so this calibration is probably invalid for speech. But if it were valid, then we’d probably want to see the level go down.
The designers knew grade levels were imprecise measures. In a 1963 paper, George Klare wrote:
“Formulas appear to give score accurate to, or even within, one grade-level. Yet actually they are seldom this accurate.”
In a 2000 paper, George Klare wrote:
“Typical readability formulas are statistical regression equations, not mathematical identities, and do not reach that level of precision.”
I mention the two quotes here because they span 40 years of readability research, and the point remains the same. Grade-level assessment is somewhat informative, but it’s not very precise. You can be reasonably certain that a child will understand a third-grade level story better than a twelfth-grade level one. It is not nearly so certain that a tenth-grade level and eleventh-grade level story will be distinguishable. In fact, the Kincaid et al. paper from 1975 that debuted the Flesch-Kincaid reading level calculation acknowledges its imprecision:
“Actually, readability formulas are only accurate to within one grade level, so an error of .1 grade level is trivial.”
Conclusions. So what we have here is a difference of 1 grade level (which is the edge of meaningfulness in ideal circumstances) when the reading level calculation is applied to speech, on which it is uncalibrated and in which we don’t have clear plans in place to account for the vagaries of punctuation and the issue of speech errors. Also, we have no data on the cause of the grade level decrease, whether it’s due to dumbing down, a push for clarity, or just new punctuation guidelines at the Congressional Record.
Which is to say, we have no reason to believe in this effect, nor to draw conclusions about its source, other than the unfortunate fact that we have a belief crying out to be validated.