the Chomsky question

Someone emailed me to ask me about the recent Tom Wolfe takedown of Noam Chomsky in Harper’s Magazine. (It’s paywalled; you should buy a copy of the physical magazine, because the essay is worthwhile and so is the rest of the issue and because Harper’s is a fantastic magazine and deserves a few of your real dollars.) A more sensible person than I am would likely turn down this invitation for lack of expertise. But I go where eagles dare.

(This post is long. Really long. If that’s a problem, exercise your right not to read it.)

The relationship between the broader study of language, the subfield of syntax, and Noam Chomsky is complex, and everywhere Wolfe’s essay is entangled in those questions. I define myself as a linguist, and have for a long time, though there are surely some who wouldn’t deign to grant me that self-definition, and some of them are surely syntacticians. Within linguistics, it’s true to say that for a long time, syntax reigned as the most prestigious organ in the field, and it’s equally true to say that Chomsky’s stature loomed large in this cosmology. Like so many other things in small subcultures like academic fields, it can be hard to note where real trends end and the enculturated perception of those trends begins. So, for example, from my very limited vantage, the perception that syntacticians tend to look down on those who work in semantics seems to be the product less of that behavior actually being common than of the idea that this behavior is common. (But then, I’ve never been in an elite linguistics department.) It’s true, though, to say that many in the field of linguistics tend to see syntax in the way that many in the natural sciences see quantum physics – the place where the most essential questions are asked and answered, where researchers really engage with the basic matter of life.

I’m not one of them, or even close to them. I define myself in the broad field of applied linguistics because I am interested in the study of language in use by real human beings in real human situations, and I am interested in that study following certain systemized approaches to the generation of new knowledge. Those approaches will be familiar to anyone with an education in research methodology – I want my work to follow procedures that have been designed to maximize the potential to make true observations about the world around us, and in particular to avoid the constant human tendency to see patterns where there are none. I want to do these things even while I recognize the limitations in both myself and in those procedures. I am interested in speaking the language of p-values and effect sizes and different kinds of validity and statistics for reliability because I have been persuaded that this is important, if not for achieving the truth, then for avoiding certain kinds of untruth. Many of my peers in the humanities are not interested in these things, for various reasons, many of them valid. They work their work. I work mine, and this post is not the right place to spell out for you what I study and don’t.

In any event, I have in bits and pieces acquired the kind of skills that I need in order to do the kind of work I want to do, and I have reached a level of confidence where I no longer drape my self-definition with caveats about what I can and can’t do. Whether or not work like mine deserves to be called science is irrelevant to me. I’m just not interested in the question. But many linguists are very invested in this question indeed. Take this post on the Language Log by Geoff Pullum, in which he says that linguistics “is not a domain in which people’s off-the-top-of-the-head opinions and speculations have to be accepted: there is a science of linguistics, and over the past century it has made a wealth of factual discoveries about the human linguistic capability.” I agree, linguistics is a science, more or less, and I agree with the second sentence too. (I also, for the record, agree with his contention in that particular post, not that he’d care for my approval.) The touchiness here, the sense of protesting too much, is at once natural in an academic world where identification as a science has become necessary for a field’s very survival, and at the same time a problem for linguists of a certain stripe.

Wolfe is right to suggest that Chomsky’s approach is not science as usual. Wolfe makes great hay out of the fact that Chomsky’s inquiry does not require fieldwork. I think Wolfe overstates Chomsky’s aversion to (or disrespect for) fieldwork, but it is the case that the root idea of universal grammar means someone can claim to be doing linguistics work of the first order without leaving the office. And in particular, Chomskyan approaches tend to be disdainful of the notion of a sample in the traditional sense. One of Chomsky’s earliest scholastic wars was with the corpus linguists, researchers who believed (rather sensibly) that to study how language works, one would want to study lots of language – to gather as much of it as possible together to look for patterns and features. Central to Chomsky’s approach is that language is potentially infinite, that we can always string more and more words together in grammatical order by embedding more and more clauses together forever, and that there is no limit to the number of expressions that we can produce. There is thus, according to Chomsky, no use in assembling a corpus; any collection, no matter how large, is necessarily incomplete, and thus insufficient for scientific purpose. Chomsky, in his typical style, did not merely dismiss corpus linguistics as a tool for understanding the fundamental nature of language. He dismissed corpus linguistics.

It’s precisely that tendency of Chomsky, and his acolytes, to be so self-assured that makes criticism like Wolfe’s piece inevitable. It gets them into predictable trouble. Take this from Corpus Linguistics by McEnery & Wilson.

Chomsky: The verb ‘perform’ cannot be used with mass word objects: one can ‘perform a task’ but one cannot ‘perform labor.’
Hatcher: How do you know, if you don’t use a corpus and have not studied the verb ‘perform?’
Chomsky: How do I know? Because I am a native speaker of the English language.

Such arguments have a certain force– indeed one is initially impressed by the incisiveness of Chomsky’s observation and subsequent defense of it. Yet the quote also underlines why corpus data may be useful…. One can ‘perform magic,’ for example, as a check of a corpus such as the BNC reveals. Native-speaker intuition merely allowed Chomsky to be wrong with an air of absolute certainty.

Here we have the crux of it: the native speaker as sacrosanct object of linguistic inquiry. A native speaker has native speaker intuition; this intuition is inviolate because the speaker has a natural capacity for language; this natural capacity is the product of our genetic endowment. Like so much else in Chomsky’s work, this attitude is elegant. It’s also infuriating. An old joke among applied linguists is that to a Chomskyan linguist, a sample size of 10,000 is inadequate, but a sample size of 1 is ideal. It should go without saying: while reference to native speaker intuition may very well be a valid and useful way to investigate language, it is not scientific in the sense that many people understand the term. Linguistics may very well be a science – like I said, I’m not much interested in the question – but it is not science as usual, as popularly conceived. If you asked me whether the average linguist pursues their research according to what we typically define as the scientific method, my answer would be a straightforward “no.” But then, I’m not particularly invested in that framework.

It turns out that there are many insights that corpus linguistics can provide, of a far broader interest than whether one can perform magic. For example, it is entirely true that language is functionally infinite. It is also entirely true that actual language use is very constrained, that most people speak most of the time in preformed phrases that we repeat over and over again without knowing it. Our language capacity is infinite, our language use is limited; we have corpus linguistics to thank for that understanding. People assembled enough language to look at human discourse at large scale and found that, despite our theoretically infinite capacity to produce new language, we tend to rely on old language in actual human practice. That strikes me as a useful observation. We also have corpus and computational linguistics to thank for Google Translate, which is both a magnificently powerful system and a limited one. Yet such things are routinely waved away.

My own particular area of interest is assessment, assessment of learning. I am interested in how students learn, generally, and how we can fairly and reliably measure that learning for various purposes. Naturally, being a linguist, my experience has largely been in assessing student development in the language arts. Most of my past several years have been devoted to the question of how better to assess the ability of students to use language, particularly second language students, particularly in college, particularly in writing. In an internationalizing university, knowing who can use language adequately for their social, educational, and professional needs is very important. To investigate that seems like a natural task for researchers. And yet there are people in the Chomsky mold who dismiss it out of hand.

This is from the foreword to Chomsky’s recent book What Kind of Creatures Are We?, a generally brilliant book which strikes me as Chomsky’s recognition that his time on Earth grows short and that he wants to leave one last explanation of his project. Akeel Bilgrami writes,

[The theory of language] is not a theory about external utterances, nor is it, therefore, about a social phenomenon. The nomenclature to capture this latter distinction between what is individual/internal/intensional and what is externalized/social is I-languages and E-languages respectively. It is I-languages alone that can be the subject of scientific study, not E-languages.

Though it comes from a Chomsky surrogate rather than the man himself, this is high Chomskyanism taken to the point of self-parody: that which is not useful to the study of generative grammar is simply not worth studying. Never mind that, at my doctoral institution, whether or not a Korean computer scientist can adequately make himself understood to a class of undergraduates can determine whether he continues on in his PhD program or gets shipped back to Korea. Stakes have never been sufficient to move people wedded to the I-languages/E-languages distinction. There’s the object of serious study, and then there’s academic ambulance chasing for the proles.

I said at the outset that this post is a mistake. That’s because, to a certain stratum of linguist, I’m inherently unqualified to weigh in on this controversy. I am not a syntactician, and unfortunately, there are those syntacticians who feel that the field is simply too complex for the average mind to decipher. It happens that I’m actually pretty well informed on this stuff for an amateur – and yes, I define my interest in syntax on the amateur level. I have taken graduate coursework in syntax, and I’ve enjoyed close professional relationships with those actively involved in syntax research, and most importantly I read as much as I can on the subject. But I’m not a syntactician and I find much in the field baffling and thus there are some who would say that I simply can’t grasp the debate being had in Wolfe’s piece and elsewhere. This is not at all universal, of course; there are many in the field who are remarkably open and approachable and humble. But jeremiads like that of Wolfe draw strength from the imperiousness of a few. It should be enough that I’m informed and I care and I try my best for me to be able to speak, in limited and general ways, about controversies in syntax, but too many within the club will simply say “you just don’t understand the complexities of our models.”

The shame of it is that there’s great beauty in syntax, and that I know there’s beauty far too deeply hidden for me (or Wolfe) to see. It can be a discipline of such incredible elegance…. I’ll never forget the first time I really grasped spec-head movement, the way that the subject slides so perfectly into place. When a patient teacher shows you why English must include the “it” in the sentence “It seems to me,” when that “it” is totally semantically inert – a pronoun with no coindex, a reference to a thing which is not a thing, a word that exists for grammatical purposes alone in the purest sense – everything seems to snap into place.

But then you also learn the kludgy, weird, and decidedly inelegant sides of syntax, too. To learn about syntax is to go deep into a system that is by turns perfectly engineered and shockingly incomplete. For while you’ve got that beautiful subject formation I noted above, you’ve also got, for example, the so-called “unvoiced feature” PRO, an entirely theoretical idea that has no empirical referent at all. Not just doesn’t have a referent – to some people I’ve talked to, could have no empirical referent at all. (It reminds me of those religious types who argue that not only is evidence for God not available, but the very nature of faith means that no evidence can or should exist.) That imagining PRO exists makes the theory work is itself a problem with this branch of linguistics, and illustrates what I mean when I say that linguistics is not a science the way chemistry is a science. Call it what you like: the existence of theoretical structures that we can’t see but which are necessary to preserve our theories is not scientific in the typical sense.

Meanwhile, there are all kinds of tasks that we need to do which theoretical syntax has proven incapable of doing. Bilgrami is free to think that my interest in teaching an Iranian immigrant how to speak English so that he can secure a better life for himself is unscientific; I’m intent on doing it nonetheless. And while our limited pecking in the world of sample size and confidence intervals may lack the glamour of linguists at MIT decoding the most basic stuff of language through reference to their status as native speakers, applied linguists are trying to solve real-world problems that can’t wait. The fact is that Chomsky and his prodigies have been investigating UG for 60 years, and it remains stubbornly opaque. Ask a follower of Chomsky what the rules of universal grammar are, and they’re likely to say “that’s the wrong question.” Maybe it is, I dunno. But some other people will tell you that in those 60 years, all we’ve found that is genuinely universal are Merge and Move, which are maybe just Move, and anyway we still can’t say if there’s a critical period or when it ends and if a second language is learned or acquired…. What Chomsky and his followers have produced is impressive. Incredible, really. But too many people have students who need help, and we cannot wait for the theory of UG to be fully realized, if such a thing is possible.

There are many stories, among linguists, out there. And the one I hear most often, sometimes loudly, sometimes softly, is this: that it is entirely possible (in fact almost certain) that there is some genetic predisposition towards language in the human species, but that what is actually universal within language may be very limited – that the fundamental language capacity, suggested by the poverty of the stimulus, can exist without inscribing very much at all in terms of necessary or universal rules. My own story is that we’ve got these two remarkable intellectual achievements – Chomsky vs Norvig, theory vs empiricism, sentence trees vs Big Data, science vs engineering, MIT vs Google, the building of sophisticated models intended to draw out the essential structures of thinking vs the building of sophisticated algorithms intended to emulate human language use so perfectly we cannot tell the difference – and that both are incredibly valuable, but neither can talk to the other, and that to really advance we’ve got to bridge them. But I enjoy no prestige or power in either domain, and it seems that the divide is permanent, and that’s the tragedy right now. And, though I admire him as much as any living human, I must admit that part of Chomsky’s legacy is precisely this dismissal of the inexpert voice, because although he’s a genuine egalitarian, his remarkable self-possession has led to generations of linguists who police the boundaries of those who know against those who don’t. I’d fail any sentence tree test you might throw in front of me, and for that reason some influential people in linguistics would ignore anything I have to say, and anyway my project is decidedly not Chomskyan, as Bilgrami’s quote above suggests. So I might be in the market for a Chomsky reconsideration.

But Tom Wolfe is not the guy to do that.

Many people I know have found the essay entertaining, and they are entitled to do so. Myself, I find Wolfe’s… reliance on repeated… ellipses… to be an irritating affectation, as I do his constant use of undergraduate interjections (kablamo!). But the basic problem with this essay is both its failure to understand the broader world of contemporary linguistics and, more damningly, its utter lack of empathetic imagination, which is an essential quality in both the sympathetic profile and the truly devastating takedown alike.

In the style of Karl Rove, Wolfe attempts to represent Chomsky’s great strengths as an intellect as his weaknesses. Chomsky has long earned justified praise for his willingness to reconsider past ideas, his adaptability to new evidence. Strange, then, that Wolfe repeatedly paints him as a defensive figure, unable or unwilling to honestly respond to criticism. Chomsky is in fact famous for responding to emails from anyone, from the most established academic bigwig to the most random undergrad. To call him unwilling to change seems particularly bizarre. Indeed, it’s precisely his willingness to adapt to new evidence that has allowed him to remain relevant for all these years. No individual event was more important for Chomsky’s ascendance than his systematic demolition of B.F. Skinner’s behaviorist theory of language. I recommend you read Chomsky’s famous review of Skinner. It remains a model of academic ruthlessness precisely because of its intellectual empathy, Chomsky’s willingness to really engage with Skinner’s ideas by putting them in the best possible light. This is what Wolfe won’t, or can’t, provide: the weight of a fully sympathetic reading, the ability to parse the fine distinctions in the work of the person being indicted. I say this with absolutely no belief that Wolfe lacks sufficient expertise to critique Chomsky. This is not about credentials or background knowledge; I am not the linguist sniffing that Wolfe lacks access to the Byzantine theories of contemporary syntax. I am saying that Wolfe displays more than adequate understanding to pillory Chomsky and still fails to do so.

Wolfe represents Chomsky’s idea of recursion as a great leap forward for Chomsky, the completion of the project of generative grammar, the feather in his cap. (It wasn’t, and wasn’t really represented as such at the time, but this contention is essential for the kind of story Wolfe wants to tell. Wolfe has never been one to let the facts get in the way of the story. I hear they call this approach New Journalism.) Wolfe wants to tell you that this idea of recursion was thoroughly undermined; that Chomsky stuck to his guns; and that this shows Chomsky to have been left behind by modern cognitive science. And indeed, it was precisely this inflexibility that led to Skinner’s decline. As unfashionable as behaviorism is, it retains a certain brute logic that I find hard to dispute – that most animals repeat behaviors that are rewarded. Though people sometimes make this claim, Chomsky neither totally rejected all of Skinner’s work nor intended to. Skinner’s problem was that he tried to extrapolate this observation into too many domains, including language, and ended up looking like a fool for doing so, unwilling to admit the limits of his theory.

This is precisely not who Chomsky is, and yet this is the brush with which Wolfe seeks to tar him. Wolfe goes on the offensive about recursion, and then dings Chomsky for minimizing recursion in his later work. And yet this is precisely what we would hope an honest researcher would do! As this Language Log post suggests, Chomsky changed his presentation of the evidence in response to new information. To develop a theory, publish on that theory, and then respond to critical reactions by no longer highlighting that theory, seems to me to be an admirable willingness to adapt to new information. And yet Wolfe paints it as a kind of dishonesty, rather than as an honest reevaluation of the evidence. He has to, to preserve his story.

Perhaps Wolfe’s biggest problem is in his repeated overestimation of Chomsky’s contemporary reputation, or his reputation 10 or 25 years ago. For Wolfe’s essay to work the way that it works, it’s important for Chomsky to not just be a Goliath but to be an unchallenged Goliath, unchallenged except for Daniel L. Everett, whom Wolfe anoints as his official David. But think for a minute about human nature. Could it possibly be that any living figure in any field of human inquiry could loom as large as Chomsky has in linguistics without engendering a lot of resentment and contrary opinions? Chomsky has indeed enjoyed very unusual dominance in his field; he has also enjoyed that other kind of academic respect, which is constant and vocal disagreement. And, in fact, even absent Everett’s particular line of critique, there are plenty of people, including in syntax, who will tell you that Chomsky is wrong to varying degrees and with varying amounts of opprobrium. That he remains relevant, highly engaged, widely read, and deeply influential is indisputable, and astonishing given his age and the length of his career. But to suggest that he is generally unchallenged, or that he was a decade or two ago, is factually wrong.

Wolfe supposes that his line of attack, or Everett’s, is unique, but this is also not the case. As this post at the Faculty of Language says, it’s an old saw. I should say upfront that I admire Everett, and find a lot of his insights useful. I do not, however, find his work an effective dismissal of Chomsky’s. Chomsky is in fact quoted in Harper’s: he makes the (correct) point to Wolfe that the Piraha can acquire Portuguese with ease, in childhood, and that this demonstrates the fact that while the Piraha language may indeed be unique, the capacity for language of the Piraha people is in fact entirely unremarkable. Wolfe’s response is… nothing. Wolfe has spent thousands of words going after Chomsky, attacked him for paying insufficient attention to language itself, and still failed to understand that Chomsky’s interest is precisely in the language capacity, not the language. Wolfe attacks Chomsky’s project by asserting its basic identity and mistakes that for a rebuttal.

Wolfe then goes on to engage in conspiracy theory, arguing that Chomsky has undertaken a concerted effort to suppress Everett’s theories, that he refuses to respond to them publicly. Of course, Chomsky has responded, at length, as have others, so Wolfe crafts a story of marginalization and silencing born largely of the unfortunately glacial pace of academic publishing. Wolfe paints Everett as a figure cast out into the wilderness, when in fact he has been cast out into a position as Dean at a well-respected university in Boston and into the arms of the University of Chicago Press. I would very much like to suffer as Everett is suffering.

If you want my take on the likelihood of Chomsky’s theories to endure, well, I’m afraid I’m unqualified. I will only offer this: the notion that language is purely enculturated and social seems dead forever, and should be. That, for example, without a genetically encoded language capacity a few hundred linguistically deprived, impoverished Nicaraguan children could spontaneously generate a functioning human grammar in a few months, even as the adults in charge of them tried to actively suppress it, is absurd. The degree to which a genetic predisposition for language actively shapes human languages, the existence of a universal set of rules that undergird all languages, the ability of researchers to determine to what degree language is genetic or enculturated – these remain areas of active and essential human inquiry. To take Chomsky’s thoughts on these topics as gospel would be foolish; he’d be the first to tell you so. To ignore what he says on them would be idiocy.

Contrary to what you might have heard, that idea of purely social or cultural language rules was challenged before Chomsky. Whether the many consequences Chomsky has derived from the poverty of the stimulus argument are true is a far larger question, and will likely not have any kind of simple pleasing “all wrong” or “all right” outcome. Wolfe’s suggestion that Everett’s work has comprehensively “KO’d” Chomskyan linguistics is simply untrue. Wolfe is telling a story, a simple story of underdog vs. heavy, and that story is far too narratively satisfying to be true. Beyond that, you’ll have to ask someone far brighter than me to get satisfactory answers.

I don’t believe that I can achieve perspective on Chomsky, his place in linguistics, the relevance and currency of his beliefs, or the ultimate consequences of his legacy. He’s just too big. His political writing invites scrutiny on his academic work, both good and bad, and don’t imagine it plays no role in Wolfe’s disdain. His centrality in the field means that many questions are read through him and his work even when not appropriate. He inspires unthinking reverence; he prompts undeserved disdain. His ideas are respected, reviled, dominant, marginalized, orthodoxy, heresy, generative and useless, all at the same time. I can’t judge him. It’ll take history to do that.

But god, to be in that position! What Wolfe doesn’t seem to understand is that his thousands of words in Harper’s Magazine only solidify Chomsky’s place in Olympus. It’ll take decades to sort out his legacy, and a lot more than a white suit to tear him down.