
A news story’s making the rounds this week that the members of the U.S. Congress have stopped talking at an 11th-grade level and have started talking at a 10th-grade level. This fits very neatly into the overall feeling that America is becoming ever more anti-intellectual, and that Congress has become a group of petty and immature cliques who exist primarily to prevent each other from accomplishing anything, which is why the story has picked up steam. And perhaps these feelings are accurate, but this story doesn’t provide any evidence of it.

In short, the Flesch-Kincaid readability test that’s used in this analysis is completely inappropriate for the task.

I discussed this during the Vice-Presidential debates back in 2008, and Chad Nilep at the Society for Linguistic Anthropology and Mark Liberman at Language Log each talked about it in light of this new story. Here’s an updated set of arguments why the whole thing is nonsense.

How do we deal with speech errors? Speech has something that writing doesn’t have: disfluencies. Whether it’s a filled pause (uh, um, you know), a correction (We have — I mean, don’t have), or an aborted phrase (I am a man with– I have goals), lots of words come through in speech that wouldn’t appear in edited writing. Here’s an example from the 2008 debate, where Gwen Ifill said:

“The House of Representatives this week passed a bill, a big bailout bill — or didn’t pass it, I should say.”

That’s a sentence supposedly at the eighth-grade level. If we remove the mistakes & repetitions, we get a sentence that has now dropped a grade level. That’s the same drop that Congress supposedly has undergone. Maybe they just started editing the Congressional Record more tightly?

Grade levels aren’t based on content or ideas. The Flesch-Kincaid grade level calculation uses two statistics: syllables per word and words per sentence. These are imprecise stand-ins for what we really want, which is presumably the difficulty of the individual words and the complexity of the sentence structure. Word difficulty is going to be tied to predictability in context, frequency in the language, morphological complexity, and other factors, all of which are only loosely correlated with the number of syllables. Longer words will in general be more difficult, but there is a lot of noise in the correlation. Because we’re only using an estimate of the difficulty, our estimate of the grade level is inherently imprecise.
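For concreteness, here is a minimal Python sketch of the calculation. The coefficients (0.39, 11.8, 15.59) are the published Flesch-Kincaid ones, but the vowel-run syllable counter is a crude approximation of my own, so the exact scores are illustrative rather than authoritative. Run on Ifill’s sentence with and without its disfluencies, it reproduces roughly the one-grade drop mentioned above:

```python
import re

def count_syllables(word):
    # Crude heuristic: one syllable per run of consecutive vowels.
    # A real syllable counter would use a pronunciation dictionary.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Grade = 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# Ifill's sentence, as spoken and with the disfluencies edited out:
original = ("The House of Representatives this week passed a bill, "
            "a big bailout bill -- or didn't pass it, I should say.")
repaired = "The House of Representatives this week didn't pass a big bailout bill."

print(flesch_kincaid_grade(original))  # higher
print(flesch_kincaid_grade(repaired))  # roughly a grade level lower
```

Swapping in a dictionary-based syllable counter would shift the absolute scores, but the relative drop from editing out the disfluencies remains.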

There is no punctuation in speech. There are lots of different ways to punctuate a speech. Is a given pause supposed to indicate a comma, a semicolon, or a period? The difference between these can be substantial; Nilep’s post shows how punctuating the speech errors as sentences of their own drops a sentence from grade level 28(!) to 10.

The rhetorical style of a speaker also comes into play here. Suppose Senator X and Senator Y deliver the same speech. Senator X uses a staccato style, where each clause becomes its own sentence. Senator Y uses a more relaxed and naturalistic style, combining some clauses with semicolon-ish pauses. Because the reading level calculation is based largely on number of words per sentence, Senator Y is going to get a much higher grade level, even though the only difference is in the delivery, not any of the content.
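A toy implementation of the Flesch-Kincaid calculation makes the size of this effect concrete. The coefficients below are the published ones, but the vowel-run syllable counter and the decision to treat only periods, question marks, and exclamation points as sentence-enders are my own simplifying assumptions:

```python
import re

def count_syllables(word):
    # Crude heuristic: one syllable per run of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Grade = 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# Identical words; only the transcriber's punctuation differs.
staccato = "We saw the problem. We wrote the bill. We passed the bill."
combined = "We saw the problem; we wrote the bill; we passed the bill."

print(flesch_kincaid_grade(staccato))  # Senator X: low grade level
print(flesch_kincaid_grade(combined))  # Senator Y: several grade levels higher
```

The word choice and content are identical in the two versions; the roughly three-grade gap comes entirely from the words-per-sentence term.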

What does the grade level measure? The idea of grade-level estimation for writing was to give a quick estimate of how difficult a passage is to understand. The main readability scores were calibrated by asking people with known reading proficiency (as determined by a comprehension test or the grade level they were in) to read passages of various difficulty and to answer comprehension questions. The goal of the calibration was to get it so that if a piece of writing had a grade level of X, then people who read at the X level would be able to get some given percent of the comprehension questions right. Crucially, the grade level does not measure the content of the text, or the intelligence of the ideas it contains. In fact, for readability — the purpose the tests were developed for — a lower score is always better, assuming the same information is conveyed.

As I mentioned above, there’s a world of difference between reading and writing, so this calibration is probably invalid for speech. But even if it were valid, we’d probably want to see the level go down.

The designers knew grade levels were imprecise measures. In a 1963 paper, George Klare wrote:

“Formulas appear to give scores accurate to, or even within, one grade-level. Yet actually they are seldom this accurate.”

In a 2000 paper, George Klare wrote:

“Typical readability formulas are statistical regression equations, not mathematical identities, and do not reach that level of precision.”

I mention the two quotes here because they span nearly 40 years of readability research, and the point remains the same. Grade-level assessment is somewhat informative, but it’s not very precise. You can be reasonably certain that a child will understand a third-grade level story better than a twelfth-grade level one. It is not nearly so certain that a tenth-grade level and an eleventh-grade level story will be distinguishable. In fact, the Kincaid et al. paper from 1975 that debuted the Flesch-Kincaid reading level calculation acknowledges its imprecision:

“Actually, readability formulas are only accurate to within one grade level, so an error of .1 grade level is trivial.”

Conclusions. So what we have here is a difference of 1 grade level (which is the edge of meaningfulness in ideal circumstances) when the reading level calculation is applied to speech, on which it is uncalibrated and in which we don’t have clear plans in place to account for the vagaries of punctuation and the issue of speech errors. Also, we have no data on the cause of the grade level decrease, whether it’s due to dumbing down, a push for clarity, or just new punctuation guidelines at the Congressional Record.

Which is to say, we have no reason to believe in this effect, nor to draw conclusions about its source, other than the unfortunate fact that we have a belief crying out to be validated.

The terms descriptivism and prescriptivism get thrown around a lot, and it seems most everyone says one of the words with a sort of dripping scorn that wouldn’t be out of place on the word “Communist” in the Army-McCarthy hearings. For many people, the difference between the two is black and white: one is the right philosophy and the other the wrong one. One improves language and the other ruins it.

There are a precious few who are able to avoid this facile good-vs-evil characterization, but they are a minority. I think many of you readers fall into this category, although I can’t say I always live up to your example myself.

The problem’s only exacerbated by the fact that even those who haven’t genericized these terms don’t necessarily agree on their boundaries. For some descriptivists, anyone who corrects any error is a prescriptivist. For some prescriptivists, updating a dictionary is descriptivist madness.

Many prescriptivists seem to use the word descriptivist as a term of generic revulsion, as though its definition were little more than “someone who disagrees with me”. (Similar to the genericization of fascist in 60s-era political discourse, socialist in contemporary political discourse, or hipster in my own discourse.) And descriptivists do the same to prescriptivist. Again, I’m as guilty of this as anyone.

So I felt like trying my hand at laying out what I think of as the division between descriptivists and prescriptivists, and why one can be (and in fact ought to be) a little of both. Let me start off with a pithy summary of the debate between descriptivism and prescriptivism: is grammar something to be learned or something to be taught?

Descriptivism, in brief, is looking at what people say in a language and building up grammar rules from that. Prescriptivism, again in brief, is having a series of rules to tell you what should and should not be said. The difference in opinion between descriptivists and prescriptivists is often referred to as a “war”. I’m reluctant to say that’s overblown, because the gap between the two philosophies really exists and really is wide. But it’s based on a critical misconception: namely, that descriptivism and prescriptivism weigh in on the same matters.

They shouldn’t. A descriptivist philosophy is nothing more than saying that we need to be aware of the full range of allowable utterances in a language before we commit to its analysis. Descriptivism looks at what can possibly be said in a language. It’s at this level, for instance, that we can say that English is a Subject-Verb-Object word order language and not a Subject-Object-Verb language (like, say, Korean or Aymara), because virtually no one says I ball caught. This rule exists without explicit prescription.

A prescriptivist philosophy says that certain possible utterances are better than others. This sort of judgment may be based on aesthetics, clarity, prestige, or any other consideration. Here is where one can say that passive sentences should be avoided or that epic is gravely overused. But note that these rulings, unlike the descriptivist ones, do not determine validity of a sentence. Instead, these rulings tell what is a good usage, as opposed to a merely acceptable one. This is the critical difference between the -isms.

One can — and I believe must — be both a descriptivist and a prescriptivist in order to be a halfway decent language user. Descriptive knowledge lists your linguistic options, and prescriptive knowledge helps you decide between them. The trouble is that people struggle to keep the two separate. Prescriptions mutate from “X is worse than Y” to “X is invalid” (see, for instance, Stan Carey’s posts on “not a word”). Some committed descriptivists overreach as well, arguing for a pure descriptivist viewpoint that treats all utterances as equally valid. (However, this seems a much rarer stance than overprescription.)

Why this is so difficult to get a handle on is unclear to me, and I say this as someone who’s only now starting to get a handle on it. It’s obvious in other fields, like architecture. An architect needs to know both what can be done (e.g., the maximum load a given beam can support) and what should be done (e.g., the aesthetics of a building). There are (I presume) no “prescriptivist” architects who would insist that an ugly but structurally sound building is “not a building” in the way that linguistic prescriptivists insist that ain’t isn’t a word.

Maybe the difference is that in architecture, structural soundness is fairly black-and-white, based on calculations and tables, and universal, subject to the same physical laws anywhere on Earth. In language, there are no easy references, and what’s valid in one language need not be in another. There is no rule that says ain’t must or mustn’t be a word, only the usage data that we ourselves, the speakers of English, have generated. I would think that would make it easier to see that language is flexible, yet many prescriptivists overlook the available usage information and insist that language should behave in a way that is largely independent of how language does behave. And many of them only stiffen their resolve when this is pointed out.

I’ve got one final thought, and that’s the contradiction that this site’s motto is “Prescriptivism Must Die!” and yet here I am saying that prescriptivism is important. What I think should die is capital-P Prescriptivism, the reliance on prescriptions and proscriptions everywhere, the barring of perfectly standard English or dialectal English because of misunderstandings, historical accidents, and other foolishly constructed rules. It’s prescriptivism without descriptivism that must die, I suppose, but that’s more nuance than a motto can reasonably bear.

[I realized late in writing this post just how much it was inspired by Jonathon Owen’s post Continua, Planes, and False Dichotomies from October. If you haven’t read it, or forgot the details since the last time you read it, I strongly suggest you do, as it is in many ways a better version of this post.]

It’s National Grammar Day, so as usual, I’m taking the opportunity to look back on some of the grammar myths that have been debunked here over the last year. But before I get to that, let’s talk briefly about language change.

Language changes. There’s no question about that — just look at anything Chaucer wrote and it’s clear we’re no longer speaking his language. These changes aren’t limited to the periphery; they reach the very core of the language. Case markings that were once crucial have been lost, leaving us with subject/object distinctions only for pronouns (and even then, not all of them). Negation, tense marking, verbal moods: all of these have changed, and they continue to change now.

Some people take the stance that language change is in and of itself bad, that it represents a decline in the language. That’s just silly; surely Modern English is no worse than Old English in any general sense.

Others take a very similar, though much more reasonable, stance: that language change is bad because consistency is good. We want people to be able to understand us in the future. (I’m thinking here of the introductory Shakespeare editions I read in high school, where outdated words and phrases were translated in footnotes.)

So yes, consistency is good — but isn’t language change good, too? We weed out words that we no longer need (like trierarch, the commander of a trireme). We introduce new words that are necessary in the modern world (like byte or algorithm). We adapt words to new uses (like driving a car from driving animals). This doesn’t mean that Modern English is inherently better than Old English, but I think it’s hard to argue Modern English isn’t the better choice for the modern world.

Many writers on language assume that the users of a language are brutes who are always trying to screw up the language, but the truth is we’re not. Language users are trying to make the best language they can, according to their needs and usage. When language change happens, there’s a reason behind it, even if it’s only something seemingly silly like enlivening the language with new slang. So the big question is: is the motivation for consistency more or less valid than the motivation for the change?

I think we should err on the side of the change. Long-term consistency is nice, but it’s not of primary importance. Outside of fiction and historical accounts, we generally don’t need to be able to extract the subtle nuances from old writing. Hard though it may be to admit it, there is very little that the future is going to need to learn from us directly; we’re not losing too much if they find it a little harder to understand us.

Language change, though, can move us to a superior language. We see shortcomings in our native languages every time we think “I wish there was a way to say…” A language is probably improved by making it easier to say the things that people have to or want to say. And if a language change takes off, presumably it’s because people find it beneficial; if it’s widely adopted, presumably the benefit is compelling.

The benefits of consistency are fairly clear, but the exact benefit or motivation for a change is more obscure. That’s why I tend to give language change the benefit of the doubt.

Enough of my philosophizing. Here’s the yearly clearinghouse of 10 busted grammar myths. (The statements below are the reality, not the myth.)

Each other and one another are basically the same. You can forget any rule about using each other with two people and one another with more than two. English has never consistently imposed this restriction.

There is nothing wrong with I’m good. Since I was knee-high to a bug’s eye, I’ve had people tell me that one must never say “I’m good” when asked how one is doing. Well, here’s an argument why that’s nothing but hokum.

The S-Series: Anyway(s), Backward(s), Toward(s), Beside(s). A four-part series on words that appear both with and without a final s. Which ones are standard, and where?

Amount of is just fine with count nouns. Amount of with a count noun (e.g., amount of people) is at worst a bit informal. The combination is useful for suggesting that the pluralized count noun is best thought of as a mass or aggregation.

Verbal can mean oral. In common usage, people tend to use verbal to describe spoken language, which sticklers insist is more properly described as oral. But outside of certain limited contexts where light ambiguity is intolerable, verbal is just fine.

Twitter’s hashtags aren’t destroying English. I’ve never been entirely clear why, but many people insist that whatever the newest form of communication is, it’s going to destroy the language. Whether it’s the telegraph, the telegram, text messages, or Twitter, the next big thing is claimed to be the nail in English’s coffin. And yet, English survives.

Changing language is nothing at all like changing math. Sometimes people complain that allowing language to change due to common usage would be like letting the angles of a triangle sum to more than 180 degrees if enough people thought they did. This is bosh, and here’s why.

And a few myths debunked by others:

Whom is moribund and that’s okay. (from Mike Pope) On rare occasions, I run across someone trying very hard to keep whom in the language, usually by berating people who haven’t used it. But the truth is that it’s going to leave the language, and there’s no reason to worry. Mike Pope explains why.

Uh, um, and other disfluencies aren’t all bad. (from Michael Erard, at Slate) One of the most interesting psycholinguistic papers I read early in grad school was one on the idea that disfluencies were informative to the listener, by warning them of a complicated or unexpected continuation. Michael Erard discusses some recent research in this vein that suggests we ought not to purge the ums from our speech.

Descriptivism and prescriptivism aren’t directly opposed. (from Arrant Pedantry) At times, people suggest that educated linguists are hypocritical for holding a descriptivist stance on language while simultaneously knowing that some ways of saying things are better (e.g., clearer, more attractive) than others. Jonathon Owen shines some light on this by representing the two forces as orthogonal continua — much more light than I’ve shone on it with this summary.

Some redundant stuff isn’t really redundant. (from Arnold Zwicky, at Language Log) I’m cheating, because this is actually a post from more than five years ago, but I found it within the last year. (This is an eleventh myth anyway, so I’m bending rules left and right.) Looking at pilotless drones, Arnold Zwicky explains how an appositive reading of adjectives explains away some seeming redundancies. If pilotless drones comes from the non-restrictive relative clause “drones, which are pilotless”, then there’s no redundancy. A bit technical, but well worth it.

Want to see somewhere between 10 and 30 more debunked myths? Check out some or all of the last three years of NGD posts: 2011, 2010, and 2009.

I was reading through Stan Carey’s recent Macmillan Dictionary post on the 2011 Plain English Campaign awards, and he put together some disparate bits of thoughts that had been floating around my head for years now.

I’ve always felt sort of uncomfortable with the Plain English Campaign and other related groups that push for more straightforward writing. These groups, if you’re not familiar with them, look over various writing and call people out for unclear language, excessive wordiness, muddled explanations, and biased euphemisms. All in all, a good thing for someone to do, right? I’ve always felt like it was, especially on legal forms and important things like that. Yet at the same time, I’ve always felt a twinge of discomfort with it, and I never quite figured out why. I finally decided that it must be because of the latent prescriptivism in it, and the fact that I sometimes disagreed with the changes that the groups wanted to make.

But that’s an irrational stand. Surely, I’m not against prescriptions when they are focused and clearly improve the comprehensibility of writing, right? That would be insane. So, I had to wonder, what’s eating me about it?

Judging from his post, Stan has similar misgivings about Plain English, but he’s figured his out a bit better. Pointing out overnight tonight and temperatures really struggling as two examples the PEC has flagged as “weatherese”, Stan calls them inoffensive. Stan grants that overnight tonight is redundant, but that redundancy is mild and potentially useful.* I agree; mild redundancy is something that I believe is useful rather than harmful, as an error-correcting code in language.

But it’s temperatures really struggling that gets to the heart of my misgivings. Stan allows that this is “a bit vague and anthropomorphic”, and it is. It’s confusing if you have no other context, and you need to know this bit of our collective unconscious in which we think of the weather as trying to get warm rather than trying to get cold. (I imagine this directionality is not universal, but variable from culture to culture.**) As a result, if there is no other context, or you’re talking to someone who doesn’t share the same cultural knowledge, you probably should avoid temperatures really struggling.

But avoiding such usages has its own downsides. Language is interesting because it is both a tool and an art. Yes, we could always just say things the same way every time we talk, in whatever way is the most straightforward and least ambiguous. Or we could be a little laxer and permit variation, but ban metaphorical language, and it would probably be easier to get what people are saying. We could disavow sarcasm, because that’s hard to catch, particularly around people you don’t already know, or people like me who fail to have sufficient differentiation between their regular and sarcastic voices.

But we don’t want to, and I don’t think we should. Language is a fun thing, a way to make art every day, every minute. We read fiction because it’s not the newspaper. We have such a fetish for artistry in language that we store quotations, making whole books of words that someone else put together in the right way. Sometimes these quotations are stored because they’re so clear, but more often it’s because they’re not so clear. “Neither a borrower nor a lender be: For loan oft loses both itself and friend,” from Hamlet, is a great line, one that has become an idiom as a result. But it could have been said much more clearly as “Do not make a loan or take a loan, because loans ruin friendships.”

A reasonable contrarian may be saying something like, “Well, that’s Shakespeare, not the weather report,” and I don’t disagree. But these aren’t categorical differences; we don’t want to say that artistry is limited to plays and creative writing and whatnot. All writing is creative. The question is the balance between artistry and clarity.

I’m realizing this right now because I am occasionally babysitting my two-year-old nephew (actually first cousin once removed, but never mind). That means that I have to re-phrase things a lot, because I do tend to speak like I write, which to be charitable to myself, I’ll call flowery. When I say something with a lot of rare or long words, he just sort of stares at me, and I have to rephrase them in words that a two-year-old might know. But when I’m back to talking to other adults, that sort of obsessive clarity isn’t necessary, and would make me unpleasant to talk to.

Clarity, contrary to what many writing guides say, is not paramount. One should be as clear as necessary, but not always more. If a bit of anthropomorphism makes the writing more interesting and engaging, it may be worth the potential loss of clarity. The same if a spot of ambiguity enlivens the sentence, or a slight omission makes it flow better. The key is to know how clear your audience needs you to be. If they’re non-native speakers or still in diapers, clarity is king. If they’re academics, heave clarity overboard.***

So in the end, perhaps the source of my discomfort with the Plain English idea is nothing more than being wary of making clarity the major consideration instead of a major consideration. Clarity has its place, but there are other factors, and those may be more important depending on the purpose of your writing.


*: In my idiolect, it’s not redundant at all, because tonight can refer to any block of time between the next sunset and sunrise (most importantly, either to the time before or after I go to bed or both) and overnight could refer to any late night, not necessarily the next one.

**: One example of this sort of expectedly non-universal directionality is time. In most every culture, the past is thought of as being behind you, and the future in front of you. For the Aymara, however, the past is in front of you and the future behind you.

***: This is not entirely facetious. I once wrote a paper that my co-author worried was too clear; because it was easy to understand the algorithm we were presenting, it didn’t feel like it was a deep insight.


About The Blog

A lot of people make claims about what "good English" is. Much of what they say is flim-flam, and this blog aims to set the record straight. Its goal is to explain the motivations behind the real grammar of English and to debunk ill-founded claims about what is grammatical and what isn't. Somehow, this was enough to garner a favorable mention in the Wall Street Journal.

About Me

I'm Gabe Doyle, currently a postdoctoral scholar in the Language and Cognition Lab at Stanford University. Before that, I got a doctorate in linguistics from UC San Diego and a bachelor's in math from Princeton.

In my research, I look at how humans manage one of their greatest learning achievements: the acquisition of language. I build computational models of how people can learn language with cognitively-general processes and as few presuppositions as possible. Currently, I'm working on models for acquiring phonology and other constraint-based aspects of cognition.

I also examine how we can use large electronic resources, such as Twitter, to learn about how we speak to each other. Some of my recent work uses Twitter to map dialect regions in the United States.


