chipping away

In my first ever Bloggingheads appearance, I told Conor Friedersdorf that I felt the traditional notion of meritocracy — that your economic outcomes are largely or solely the product of your work ethic and your talent, whatever talent is — was becoming empirically indefensible.

Take a look at this chart from this worthwhile piece by Matt O’Brien.

Students who grow up poor and graduate from college, in other words, are somewhat more likely to end up in the top 20% of all earners than students who grow up rich and drop out of high school. But they are no less likely to end up in the bottom 20% of all earners. Clearly, there is a college premium in these numbers; rich high school dropouts end up in the bottom 40% of earners a full 51% of the time, poor college grads only 33% of the time. But the advantage is far lower than you would believe given our national narratives about hard work and education being the ticket to the good life. The point isn’t to doubt the value of sending poor students to college — in fact, research suggests that they get the most benefit from a college degree — but to get real about the size and power of received economic advantage.

Here’s a related chart, from John Marsh’s excellent 2011 book Class Dismissed.

As we can see, parental income is hugely determinative of child income. More children born to parents from the bottom 20% of earners will end up in that quintile than will end up in the top three quintiles combined. Whatever is going on here, it is not a society where your economic outcomes are largely under your own control, no matter what Peter Thiel thinks.

Nor are educational outcomes immune. GPA by parent’s income band:

Chart via the AACU.

Dropout rate by family income:

Chart via the NCES (PDF).

This is true even concerning tests that are designed to measure “pure” ability.

Chart via The Wall Street Journal.

I am not the kind of person who thinks that every question is an empirical question or that the only way to answer questions usefully or truthfully is through a graph or numbers. Quite the contrary: I deeply believe in the need for humanistic and philosophical claims to truth, along with the empirical quantitative types of knowing that I frequently engage in when researching. The question is, what kind of claims are being made, and what kinds of evidence are appropriate to address them? The question of how much control the average individual has over his or her own economic outcomes is not a theoretical or ideological question. What to do about the odds, that’s philosophical and political. But the power of chance and received advantage — those things can be measured, and have to be. And what we are finding, more and more, is that the outcomes of individuals are buffeted constantly by the forces of economic inequality. Education has been proffered as a tool to counteract these forces, but that claim, too, cannot withstand scrutiny. Redistributive efforts are required to address these differences in opportunity.

In the meantime, it falls on us to chip away, bit by bit, at the lie of American meritocracy.

police aggression will fall inevitably on the poor and the brown, pumpkin edition

If you’d like yet-more proof that elite, educated, white, bourgie “leftist” culture online has ceased to have any political convictions and has instead become a set of vapid social and cultural postures adopted to borrow the righteousness we associate with politics for social gain, look no further than the reaction to this pumpkin festival situation. This very Gawkery Gawker post about it pretty much encapsulates the Twitter-obsessed tryhard consensus which dominates our media: police going wild is hilarious when the target is “bros.” There were many of those white-critiques-of-whiteness-that-preemptively-exclude-the-white-person-making-the-critique tweets, as well as rampant unfunny Twitter jokes. (Note: all Twitter jokes are unfunny Twitter jokes.) If you actually care about social equality between races and economic classes, this is all very, very dumb.

First: police violence and aggression is wrong no matter who it targets. Crazy!

Second: police violence and aggression against people we assume have social capital is a signal that those who we know don’t have social capital will get it far worse. If these cops feel that they have this much license to go wild against that white, largely-affluent crew, what do you think they’ll do when they pull over some working class black guy in a run-down car? Treating this as a barrel of laughs throws away a profound opportunity to include these types of people in a very necessary social movement against police violence, which poor people of color desperately need. Instead of using these moments as an opportunity for political coalition building, affluent, educated white people on Twitter use it as an opportunity for levity — precisely because they don’t fear the police and so feel no pressing need to take advantage of that opportunity.

The recent yen for concern trolling due process and free speech rights among our ostensibly-liberal social climber class is destructive in large part because in the main, it will not be the white affluent types that are both their brethren and the subject of their derision that suffer in a world without these rights. It will instead be racial minorities and the poor. Degrading civil liberties for affluent white bros might feel like a blow against them, but the negative effects will fall on the people you claim to speak for, in a flagrantly unequal society.

Of course, the purpose of adopting a left-wing posture on elite Twitter is not actually to advance a left-wing cause, but is rather to draft political resistance and moral righteousness into the pointless post-collegiate social competition between the overeducated whites who run our media. In that regard… mission accomplished, guys!

Update: Of course I understand — of course, I understand — that this is precisely the sort of situation where, if it were black people acting this way, Fox News would be flipping out about “black culture,” which is an indefensible position. And I understand the need to point out that hypocrisy. I’m just saying: let’s keep our eyes on the prize, that’s all. Let’s be careful.

yes, carceral feminism is A Thing

Amber A’Lee Frost critiques this (excellent) Jacobin post by Victoria Law about carceral feminism:

Here are the interpretations of the term I am able to come up with.

a) There is a feminist movement for whom imprisonment is the primary political project or cornerstone policy. I know of no such feminist subculture, and I consider myself pretty well-versed on even the more marginal feminisms. Of course there are feminists who support or fail to criticize incarceration, but that is generally part and parcel to a larger neoliberal politic, making them… liberal feminists. 

or

b) that feminists have an overwhelming and powerful presence in incarceration policy. How strange, that we would have such a concentrated power in such a specific sphere. You’d think we’d have at least passed equal pay by now. I was sure NOW even had bigger priorities than throwing people in jail.

So why is it that the Violence Against Women Act, a properly neoliberal piece of Clinton-era legislation, penned by meathead Joe Biden, is blamed on women? Well… why not? We blame them for everything else.

There’s a few problems here. First: feminists, and particularly feminist women, pushed very hard for the Violence Against Women Act. That’s just a historical fact, as Law points out. The point is not to blame those feminist women; there’s no value in doing that. The point is to demonstrate that the best intentions can result in ugly consequences. That’s an essential historical point and I wish Frost didn’t dismiss it.

More importantly, Frost is confusing a critique of a political tendency with a critique of a political philosophy. Carceral feminism is the tendency of self-identified feminists to become credulous about the emancipatory power of the violent apparatus of the state in their efforts to achieve feminist ends like reductions in violence against women. Of course nobody chooses the name “carceral feminist,” any more than people choose the name neoliberal. But in each case, the term aptly fits a destructive political and rhetorical practice. Mistaking a criticism of a tendency for a criticism of a philosophy is particularly damaging because almost nobody actually has a political philosophy. We instead have a collection of tendencies that we then knit together into something resembling a coherent philosophy out of self-protective and egotistical motives. What’s undeniable, in the present moment, is that many people who consider themselves leftists are betraying a breathtaking amount of trust in the police and prosecutors. They are doing so at precisely the same time that they are passionately animated against the police state in Ferguson, in New York City, and elsewhere. Many are capable of holding together these utterly incompatible positions because they don’t have a political philosophy, but rather a set of cultural and social customs that they confuse with a politics. The result is an incoherent denigration of the police state on one hand and the elevation of that same police state to the role of savior on the other.

Finally: I get that, to a lefty like Frost, “liberal feminist” is a critique that stings. But to the vast majority of feminists — exactly the people that need to be convinced that the police state is not their friend — “liberal feminist” is a badge of honor. You cannot use it to get them to examine the flat inconsistencies in their current political preferences. Saying that there is no such thing as a carceral feminist because there is already such a thing as a liberal feminist is like saying, in the mid-50s, that there is no such thing as a McCarthyist because there is already such a thing as an authoritarian. It’s abstracting away from a particular political crisis to a grand ideological point of almost no immediate political valence. Right now, some feminists are using the mantle of feminism to defend the processes and people that they correctly identify as the source of racism and misery in the black community. The term “carceral feminism” is as good a term as any to provoke a conversation about that condition.

At the end of the most well-intentioned law in the history of laws, there’s a cop. That’s what we’re talking about here. The rest is window dressing.

sorry for being so cranky

So I want to say that I’m sorry for being such a crabby patty lately. I’m not reversing course on any opinions I’ve shared, but I have been shorter with people than I intend, and I apologize for that.

As far as excuses go, I’m just busy and stressed. At the moment, I’m dissertating, on the job market, teaching a class, tutoring four hours a week, helping to run a massive assessment of our massive freshman composition program, taking a graduate seminar, editing a textbook, rating students for our oral English examination, doing research assistant work for a program in the Education department, working as Communications Editor for an online journal, representing my department in the graduate student senate, writing a piece for a magazine and a pitch for another magazine, reviewing three books, working on revisions for a journal article, and trying to keep the social fires burning. I’m not complaining. I love my life and I’ve made the choices to be in this position and I know there are some reading this who are far busier. I’m just trying to lay out why I’ve been a bit intemperate.

I’ll try to be better with that stuff in the near future. All of my writing is animated by emotion– I literally don’t know how to do it otherwise– but I can try and choose which emotions dictate how I work. So let’s get going.

reup: difficult reading can be reading for pleasure

Since Austin Kleon and his merry gang are up to their old tricks, defining what pleasure has to be for other people, I would like to reup this post I wrote about difficult reading, and why it can still have value even though absolutely everything in your culture, and every penny in capitalism’s coffers, tells you that the only art that should endure is that which constantly puts out its lips to be kissed.

Not every good thing in life is easy. Not everything that brings you immediate pleasure is good. There are other things to eat besides candy. Sometimes your deepest pleasures will be those that you have to work for the hardest. What you like is not what other people like. Your first instinct is not always right. Your definition of the right life is not right for everyone. What you want to do and what you should do are not always the same. The world does not exist to sate your appetites. You have to find the strength to live in a world where other people don’t make the same choices that you do. Not one word of your manifesto is truer or better than anyone else’s. You are not the only person in the world. You don’t get what you want in life. Grow up.

Update: The most tiring thing in the world for me is that we are all expected to live our lives so as not to offend other people’s insecurities, rather than having a social expectation that if your insecurity is illogical, it’s your job to get over it. No one is judging you. What will it take to get people to accept that?

orienteering, not orientation

I’m writing from the 10th Biennial Thomas R. Watson Conference at the University of Louisville, where the conference theme is Responsivity– responding to student need, to public desires, responding to the world outside of the academy. Public engagement and bringing our work to the wide world has become something of an obsession in the field, which is part of what makes the continued perception of English specifically and the humanities generally as insular or obscure so frustrating. More work to be done.

I just saw a good keynote from Jonathan Alexander of the University of California-Irvine. Alexander’s talk was inspired by the case of Ted Haggard, the megachurch pastor who had the ear of the Bush administration and was caught frequenting male “masseuses” regularly. Haggard initially denied everything, and then later admitted that he had engaged in sexual contact with other men many times. But what Haggard refused to do was to accept a simple definition by the media as a gay man. The effort to define him in this way was undertaken as forcefully by liberals as conservatives; Alexander quoted Jon Stewart of the Daily Show mocking Haggard by saying that he couldn’t “run from the gay.” As Alexander noted, this was coming from someone who thought of himself as criticizing only hypocrisy, as standing for gay people and gay rights, and yet his aggression and mockery ultimately had the opposite effect. Alexander asked us to think about why, in an era of rapid advancement for gay rights, we remain culturally disposed to push people into categories rather than to accept their questioning, their status as unfinished or undefined beings.

Alexander’s talk made me think about a student of mine, who was in the first upper-level course I ever taught. A college junior, he wrote a remarkable essay about his experience as a 17-year-old, telling some of his friends that he had been having sex with other men. His friends were eager to be positive and accepting of this information– too eager. His privately sharing this information led them to pressure him to come out on National Coming Out Day. To them, this was a matter of necessity, of him declaring pride in who he was. But at the time, he was not interested in sharing that information more widely. More importantly to him, he didn’t then (and still didn’t, at the time he wrote his essay) identify as gay. It wasn’t a label that he felt applied to him or his life. What was most striking about his essay was how adamantly his friends believed that, in pushing him to adopt an identity he was not yet ready to claim, they were honoring him. Their attitude appears to have been that this is just the way it works– you have sex with other men, you come out as gay.

They were teenagers, and so this attitude is very forgivable. But I think that this story reflects a broader reality of our current social attitudes towards sex between men. (And not, usually, sex between women, which is a whole other story.) We’ve made tremendous leaps forward in terms of acceptance and support for gay and lesbian men and women, particularly youth. But we seem to still need the comfort and structure of categories, and categories limit and constrain as much as they support.

I can already hear the annoyance from people like Andrew Sullivan who, not entirely unfairly, complain about the postmodern tendency to act as if there is no settled sexual identity for anyone at all. I don’t at all doubt the physiological differences that often place people comfortably in one sexual identity or another. Saying that there is a spectrum of sexual preference does not at all imply that human beings are equally distributed along the spectrum. A large majority of people appear to be situated comfortably on the heterosexual extreme, although given the still-prevalent reality of social disdain (or just awkwardness) about being queer, that may be more of a social artifact than we think. And many people seem to be equally comfortably situated on the homosexual side of the spectrum. But to say that is not to deny that there are also people who have never felt comfortable being placed in that way. What respecting people’s sexual and romantic autonomy requires is to respect their self-definition, in part because of the simple fact that only the individual can experience their own sexual desire.

Alexander spoke a bit about the word “orientation” and what it means. I want us to think, culturally, less about orientation as a fixed identity and more about orienteering – the process of figuring out where you are and where you’re heading. Many people will end up pointing definitively in one direction. But we’ve got to respect the people who continue to explore, even when that exploration seems to be a wandering path. And we should take care to remember that everybody comes to comfort with their sexual self at their own pace. We need to give people that time, and that space, without rushing to define them, or joining with Jon Stewart in his insistence that any exploration is a matter of running away.

various things I have been told about affirmative consent

I have been told, in all seriousness, by people who argue vociferously not only for the passage of the bill in question but that the bill in question is such an obvious good that those questioning it are deluded or acting on ulterior motives:

  • That the bill does not at all specify explicit verbal consent but simply affirmative consent, that this standard does not require you to ask your partner for permission for every stage of sexual activity a la the Antioch Rules, that a touch or look or other form of nonverbal communication is sufficient to meet the standard, and that this reliance on inexplicit nonverbal communication somehow avoids the supposed ambiguity that this policy is designed to remove
  • That the bill, or the broader movement for affirmative consent as a universal norm, does require explicit verbal consent, because of course saying “she said yes with her eyes” is exactly the sort of thing that rapists say, and that this “checklist” or “survey” approach is sexy, that this notion of sexy is not at all an imposition of a subjective and normative vision of sexual practice common to the elite educated class that pushes for this law but a universal truth that has legal and political weight
  • That this bill only refers to the world of college, that we’re trying to simply rewrite college consent rules and thereby create two contrasting standards of sexual consent, which is one of the most crucial legal and moral definitions of human society, that this bill is not intended in any way to affect the world of legal jurisprudence, and simply refers to a very specific and limited set of ad hoc, de facto, amateur courts set up by underqualified and overempowered college administrators, and that anyone who suggests it has consequences for the broader legal world of sexual consent for everyone is simply fearmongering
  • That of course the bill is part of a much larger movement to make affirmative consent the universal norm of sexual behavior, that in fact this bill is a “pilot program” to try it out, and that this is merely the first step in a long process to redefine consent
  • That no one is talking about getting rid of the presumption of innocence and due process, and how dare you slander anyone by suggesting that they are
  • That in fact this bill is a necessary “brute force” method to undermine due process, because the presumption of innocence is too high a bar to be cleared when it comes to prosecuting sexual assault, and the problem is so big that we need to get our hands dirty
  • That this bill will result in people changing their typical sexual practices and embracing this new standard of consent, which I will remind you our moral overlords think is very sexy, and so all people will engage in explicit consent for every sexual encounter, because that is what morality requires and the law will soon insist on
  • That this bill will of course not have a big impact on your sexual activity and that no one expects everyone to suddenly start changing their sexual behaviors
  • That people who don’t obey this new standard will be punished and held accountable
  • That the notion that people who don’t obey this new standard will be punished and held accountable is a laughable conspiracy theory
  • That this bill changes everything
  • That this bill changes nothing.

All of these opinions have been expressed to me by supporters of this bill. Sometimes they have been expressed by the same people. Sometimes they have been expressed in the same messages, at once. You would think that such passionate advocacy, voiced by people who are sure that the wisdom of this bill is so obvious that anyone who questions it must surely be a creep themselves, would result in a more unified message.

In fact the only thing these advocates seem sure of is that the burden of this bill will never fall on them. For they, surely, are not the kind of people who need to fear the police state, or failures of due process, or being one of the eggs that gets cracked in the making of an omelet. On that, their message is unified: they are the good people who will never have to worry.

income is more predictive than race for early college success

There’s a lot of meat in the 2014 High School Benchmarks Study from the National Student Clearinghouse Research Center, and I encourage you to dive in yourself. I’m just digging into it now. I want to flag something fairly simple: in terms of both college enrollment rates immediately following high school graduation and the crucial question of first-to-second year persistence, income level is more important than high- or low-minority status across all three identified locales (urban, suburban, and rural).

first fall enrollment

In all sectors, the gap between low- and high-minority schools is smaller than the gap between comparable low- and high-income schools, when it comes to immediate post-high school college enrollment.

first to second year persistence

Persistence rate from first to second year, which is just as important as enrollment rate, if not more so, follows a similar pattern. Again, in most sectors the gap between low- and high-minority schools is smaller than the gap between low-income and high-income schools. And in all sectors, high-minority, high-income schools outperform low-income, low-minority schools. When it comes to early college enrollment and success, the income gap is more powerful than the racial achievement gap. But because students from racial minorities have lower average parental incomes than white students, enrollment and persistence for racial minorities is depressed. Further, because college attendance is strongly predictive of adult employment and income level, the racial income gap is self-perpetuating: lower parental incomes keep black and Hispanic students out of college, and those students go on to have lower incomes as parents, keeping their children out of college. The need for a redistributive approach to solving this gap is clear.

Other tidbits:

1. There is essentially no difference in performance between high-minority schools across locales. There are large gaps, however, between low-minority schools across locales. Among high-income, low-minority schools, rural schools perform far worse than their urban and suburban counterparts. And the gap between low-income, low-minority urban schools and low-income, low-minority rural schools is a huge 11% in first-year enrollment and 6% in retention.

2. At the lower income level, high-minority schools outperform their low-minority counterparts in the suburban and rural locales. In fact, poor, predominantly white schools in rural areas do as badly as any other segment, or worse. In the cities, low-minority schools enjoy significant advantages over their high-minority counterparts.

3. These data point to a general reality of American education and inequality: failure is concentrated among poor minority students in cities and poor white students in rural areas. Meanwhile, rich students in cities and suburbs do well regardless of racial background, with white, rich, urban kids developing a large advantage over the rest of the country.

the burden of expanding the police state’s power to prosecute sex crimes will fall on the poor and the black

Ezra Klein has a piece out about affirmative consent laws that, in many ways, belongs in a time capsule. I can hardly imagine a document that is a better encapsulation of the performative morality of the educated media class that dominates our national conversation, wedded to the broken economics of online journalism, wrapped up in the peculiar pathologies of a group of people who speak of nothing but the vagaries of privilege while working to solidify that privilege at every turn.

It’s interesting to think of this issue from the perspective of data journalism, the supposedly ideology-free approach to life where empirical data is handed down to us from Zeus, wrapped in silk and untouched by the stain of ideology. Yet here, Klein arrives at a straightforwardly illogical conclusion and embraces it. Not that there’s much actual challenge here, of course. Ezra is no dummy. He knows privileged lefty culture too well to risk questioning this law. As he is surely aware, there is all kinds of risk in going against the grain when it comes to this issue, and essentially no reward. After all, as Shikha Dalmia recently found out the hard way, there is a price to be paid for failing to endorse the consensus here.

And as goes Ezra, so will go elite media. Klein has always been one of those interesting media figures, at once a weather vane and the weather. Brad DeLong once showed up in my comments to tell me that criticizing Klein is “career-limiting.” That amounts to essentially proving every criticism I’ve ever made about the media. But it’s not wrong.

In this whole fracas, I have found that the supporters of the law that I respect the most are the ones who admit that the purpose of affirmative consent laws is not to broadly change conventional sexual behavior but simply to remove the presumption of innocence when it comes to sex crimes. Almost no one I talk to about this issue thinks that every couple will start asking for explicit permission at every stage of every sexual encounter, and indeed many, such as Jezebel’s Erin Gloria Ryan, mock the notion that people who fail to follow the letter of the law will ever be prosecuted. (Indeed, my assumption is that the vast majority of people who advocate for these laws publicly do not practice the old Antioch Rules themselves and have no intention of starting, law or no law.) Instead, the purpose of this law is to effectively remove the presumption of innocence and shift the burden to the accused to prove that he sought and obtained consent in a sexual encounter. That the presumption of innocence is essentially the bedrock principle on which Western jurisprudence is built, thanks to a vast history of judicial abuse and overreach, is a fact too inconvenient to be mentioned in the realm of affective politics.

What I would like to point out today is that the police officers and prosecutors who will be given these expanded powers are the same ones that rule our current judicial system, a system of hideous racial and economic inequality that already imprisons Americans at an almost impossible rate and which has revealed itself again and again to be incapable of policing or prosecuting without deep racial inequality. I know these things because — well, because of people like Ezra Klein, because of sites like Vox, because of the self-same elite media that has been so aggressive in making the case for affirmative consent laws. To their immense credit, our media elites have taken to speaking far more plainly and critically about police misconduct. The people who would hand unprecedented license to police and prosecutors turn around and post photos from Ferguson vigils to their Facebooks.

Let’s focus specifically on the question of exoneration for crimes and for sex crimes. That false accusations of rape are rare is a commonplace in this discussion, and one that I agree with entirely. But this is also true: sexual assault convictions are unusually likely to later be overturned. In a study of 250 convictions that were later overturned, 89 percent were for sex crimes, despite the fact that only 10 percent of our prison population is serving time for such crimes. There’s no contradiction in saying that false rape accusations are rare but that sex crime convictions are more likely than others to later be overturned. And as is true in all aspects of the American criminal justice system, the embedded racism of that system plays a huge part. Black and Hispanic men make up about 14% of the American population, and yet they made up 70% of the exonerated convictions. As the linked piece points out, this percentage exceeds the already large overrepresentation of men of color in the overall prison population. As most false convictions will likely never be overturned, we can be sure that there are far more men of color currently serving time for sex crimes they did not commit. The causes are not hard to ascertain. A related study concerning the same data, closely examining 35 of these cases, found repeated instances of “witness mis-identification, coerced false confessions, flawed ‘scientific’ evidence, and official misconduct.”

It turns out that America’s racist and incompetent police forces and district attorneys don’t suddenly become competent and enlightened when it comes to prosecuting sex crimes. In fact, the opposite seems to be the case. The police state that will implement affirmative consent laws is the same one that kills poor black men with impunity. How could that possibly be surprising to anyone?

Now there is a wrinkle to all of this, which is the fact that the current California law pertains only to college campuses. As I’ve said before, it seems perverse and dangerous to create separate definitions of consent for one small group of people, but that’s the bill. College sexual assault is a horrific problem, and deserves attention. But like all violent crimes, sexual assaults are committed disproportionately against the poor and uneducated, precisely those who are least likely to be attending college. In this way, the educated social and cultural group that writes our elite media has, for weeks, turned a universal issue that hurts the poor more than anyone and made it an issue about a small slice of our population. With great gravitas, Ezra intones, “This is, in a way, the definition of what it means to be entitled: the rules are designed to protect you from dangers that barely exist at the expense of exposing others to constant threat.” That is one definition of entitlement. Another definition is when you have taken a national conversation on a crime that is more likely to happen to the poor and uneducated and devoted it for weeks to a group that is whiter, richer, and more educated than the country at large.

In any event, if we continue to treat affirmative consent as an issue that only pertains to college students, we are creating a definition of sexual consent that pertains only to a small sliver of our population which comes from particular demographic backgrounds. If we universalize affirmative consent, we unleash a lower standard onto police and prosecutors who have already demonstrated themselves to be incapable of avoiding racial or class prejudice. And I do not for a moment believe that the inequality that is endemic to our judicial system will spare people of color and working class students in our universities.

I would like for you to consider the Duke lacrosse rape case. Unlike some, I don’t think that this case demonstrates how rich white kids can’t get ahead in this society. In fact I think it proves the opposite. Yes, the accused suffered through an ugly situation. But they were exonerated, the prosecutor was not only subject to ethics charges but criminal prosecution and disbarment, the police launched an internal investigation, the accused pursued a large civil case, and a large swath of this country was gripped with righteous rage on their behalf. Meanwhile, hundreds of people with less social capital and privilege, many or most of them men of color, have been exonerated of sex crimes they did not commit, and not only have the prosecutors and police responsible faced no official sanction, in many cases they have never bothered to apologize. The discrepancy is not only not surprising, it’s banal: privilege endures, and it will endure in a world of affirmative consent. If you really believe that these laws will end up affecting the privileged frat boys that you’ve imagined, then I think you simply don’t understand America. This is still the America of Emmett Till.

This is reality: if we do indeed lower the burden of proof for the police state to prosecute sexual assault cases, that power will be handed to the same people who are responsible for Ferguson and the NYPD and all the rest. And this is reality: the burden of that increased power will inevitably fall on the poor and the black, because that is who the white police state prosecutes with greater zeal than any other. That is not conjecture. That is not a guess. We know that. We know that the police state is racist. We know that the police state targets the poor. We know that false convictions are far more likely to happen to black and Hispanic men. We know those things. Doing away with the presumption of innocence will not mostly hurt privileged white frat boys. It will hurt poor people and black people the way that our judicial system always does. So if you, like Klein, want to be breezy and loose in your talk about the consequences of a law that many or most admit is badly flawed, fine. But let’s count those costs like adults. Let’s talk about the prison state we have and not the one you wish we had. Let’s talk about this America and not the one that you’ve invented. Because in this America, we know what happens when you give prosecutors and police greater license. We know who they use that license against.

But hey. As long as we’re making an omelet, am I right?

Update: “By this reasoning, we can never criminalize anything, because the burden of criminalization will always fall harder on racial minorities and the poor.”

We can and have to pass laws criminalizing things when necessary. But when we do, we must weigh the potential benefits of that criminalization against the sure knowledge that in an unequal society riven by gender, class, and race inequalities, there will be additional burdens on those who lack social privilege. We have to be confident that the benefits outweigh the potential inequality in outcomes. So: when the supporters of a law are themselves calling it terrible, and when that law expands the ease of prosecuting cases that we know have already resulted in deep racial inequality, do you believe that we’ve met that standard?

“Evaluating the Comparability of Two Measures of Lexical Diversity”

I’m excited and grateful to share with you the news that my article “Evaluating the Comparability of Two Measures of Lexical Diversity” has been accepted for publication in the applied linguistics journal System. The article should appear in the journal in the next several months. I would like to take a minute and explain it in language everyone can understand. I will attempt to define any complex or unusual terms and to explain processes as simply as possible, but these topics are complicated so getting to understanding will take some time and effort. This post will probably be of interest to very few people so I don’t blame you if you skip it.

What is lexical diversity?

“Lexical diversity” is a term used in applied linguistics and related fields to refer to the displayed range of vocabulary in a given text. (Related terms include lexical density, lexical richness, etc., which differ based on changes in how these terms are used and defined — for example, some systems use weightings based on the relative rarity of words.) Evaluating that range seems intuitively simple, and yet developing a valid, reliable metric for such evaluation has proven unusually tricky. A great many attempts to create such metrics have been undertaken with limited success. Some of the more exciting attempts now utilize complex algorithmic processes that would not have been practically feasible before the advent of the personal computer. My paper compares two of them and provides empirical justification for a claim about their mechanism made by other researchers.

Why do we care about it?

Lexical diversity and similar metrics have been used for a wide variety of applications. Being able to display a large vocabulary is often considered an important aspect of being a sophisticated language user. This is particularly true because we recognize a distinction between active vocabulary, or the words a language user can utilize effectively in their speech and writing, and a passive vocabulary, or the words a language user can define when challenged, such as in a test. This is an important distinction for real-world language use. For example, many tests of English as a second language involve students choosing the best definition for an English term from a list of possible definitions. But clearly, being able to choose a definition from a list and being able to effectively use a word in real-life situations are different skills. This is a particularly acute issue because of the existence of English-language “cram schools” where language learners study lists of vocabulary endlessly but get little language experience of value. Lexical diversity allows us to see how much vocabulary someone actually integrates into their production. This has been used to assess the proficiency of second language speakers; to detect learning disabilities and language impairments in children; and to assess texts for readability and grade-level appropriateness, among other things. Lexical diversity also has application to machine learning of language and natural language processing, such as is used in computerized translation services.

Why is it hard to measure?

The essential reason for the difficulty in assessing diversity in vocabulary lies in the recursive, repetitive nature of functional vocabulary. In English linguistics there is a distinction between functional and lexical vocabulary. Functional vocabulary contains grammatical information and is used to create syntactic form, and contains categories like the articles (determiners) and prepositions; lexical vocabulary delivers propositional content and contains categories like nouns and verbs. Different languages have different frequencies of functional vocabulary relative to lexical. Languages with a great deal of morphology — that is, languages where words change a great deal depending on their grammatical context — have less need for functional vocabulary, as essential grammatical information can be embedded in different word forms. Consider Latin and its notorious number of versions of every word, and then contrast with Mandarin, which has almost no similar morphological changes at all. English lies closer on the spectrum to Mandarin than to Latin; while we have both derivational morphology (that is, changes to words that change their parts of speech/syntactic category, the way -ness changes adjectives to nouns) and inflectional morphology (that is, changes to words that maintain certain grammatical functions like tense without changing parts of speech, the way -ed changes present to past tense), in comparison to a language like Latin we have a pretty morphologically inert language. To substitute for this, we have a) much stricter rules for word order than a language like Latin and b) more functional vocabulary to provide structure.

What does this have to do with assessing diversity in vocabulary? Well, first, we have a number of judgement calls to make when it comes to deciding what constitutes a word. There’s a whole vast literature about where to draw the line to determine what makes a word separate from another, utilizing terms like word, lemma, and word family. (We have a pretty good sense that dogs is the same word as dog but what about doglike or He was dogging me for days, etc.) I don’t want to get too far into that because it would take a book. It’s enough to say here that in most computerized attempts to measure lexical diversity, such as the ones I’m discussing here, all constructions that differ by even a single letter are classified as different terms. In part, this is a practical matter, as asking computers to tell the difference between inflectional grammar and derivational grammar is currently not practical. We would hope that any valid measure of lexical diversity would be sufficiently robust to account for the minor variations owing to different forms.

So: the simplest way to assess the amount of diversity would simply be to count the number of different terms in a sample. This measure has been referred to in the past as Number of Different Words (NDW) and is now conventionally referred to as Types. The problem here is obvious: you could not reliably compare a 75-word sample to a 100-word sample, let alone a 750-word sample. To account for this, researchers developed what’s called a Type-to-Token Ratio (TTR). This figure simply places the number of unique words (Types) in the numerator and the number of total words (Tokens) in the denominator, to generate a ratio that is 1 or lower. The highest possible TTR, 1, is only possible if you never repeat a term, such as if you are counting (one, two, three…) without repeating. The lowest possible TTR, 1/tokens, is only possible if you say the same word over and over again (one, one, one…). Clearly, in real-world language samples, TTR will lie somewhere in between those extremes. If half of all your terms are new words, your TTR would be .50, for example.
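
To make the arithmetic concrete, here’s a minimal Python sketch of Types, Tokens, and TTR. The whitespace tokenizer and the toy sentences are my own illustration, not anyone’s published implementation:

```python
# A toy illustration of Types, Tokens, and TTR. Real studies make much more
# careful decisions about tokenization and what counts as a distinct word.

def ttr(text: str) -> float:
    """Return the Type-to-Token Ratio of a text."""
    tokens = text.lower().split()  # every running word is a Token
    types = set(tokens)            # every distinct word form is a Type
    return len(types) / len(tokens)

print(ttr("one two three four"))   # 1.0: no repetition at all
print(ttr("one one one one"))      # 0.25, i.e. 1/tokens: total repetition
print(ttr("the dog chased the cat and the cat ran"))  # 6 types / 9 tokens ≈ 0.67
```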

Sounds good, right? Well, there’s a problem, at least in English. Because language is repetitive by its nature, and particularly because functional vocabulary like articles and prepositions are used constantly– think of how many times you use the words “the” and “to” in a given conversation– TTR has an inevitable downward trajectory. And this is a problem because, as TTR inevitably falls, we lose the ability to discriminate between language samples of differing lengths, which is precisely why TTR was invented in the first place. For example, a 100-word children’s story might have the same TTR as a Shakespeare play, as the constant repetition of functional vocabulary overwhelms the greater diversity in absolute terms of the latter. We can therefore say that TTR is not robust to changes in sample size, and repeated empirical investigations have demonstrated that this sensitivity can apply even when the differences in text length are quite small. TTR fails to adequately control for the confounding variable it was expressly intended to control for.

A great many attempts have been made to adjust TTR mathematically– Guiraud’s Root TTR, Somer’s S– but none of them have worked.
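
To give a flavor of these corrections: Guiraud’s Root TTR, for instance, simply divides the number of Types by the square root of the number of Tokens, which slows the ratio’s decline without eliminating it. A quick sketch:

```python
import math

def root_ttr(tokens: list[str]) -> float:
    # Guiraud's Root TTR: Types / sqrt(Tokens). Slows, but does not stop,
    # the downward drift of the ratio as texts get longer.
    return len(set(tokens)) / math.sqrt(len(tokens))
```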

What computational methods have been devised to measure lexical diversity?

Given the failure of straightforwardly mathematical attempts to  adjust TTR, and with the rise of increasingly powerful and accessible computer programs for processing text, researchers turned to algorithmic/computational models to solve the problem. One of the first such models was the vocd algorithm and the metric it returns, D. D stands today as one of the most popular metrics for assessing diversity in vocabulary. For clarity, I refer to D as “VOCD-D” in this research.

Developed primarily by the late David Malvern, Gerard McKee, and Brian Richards, along with others, the vocd algorithm in essence assesses the change in TTR as a function of text length and generates a measure, VOCD-D, that approximates how TTR changes as a text grows in length. Consider the image below, which I’ve photographed from Malvern et al.’s 2004 book Lexical diversity and language development: Quantification and assessment. (I apologize for the image quality.)

[Figure: ideal TTR curves plotted over increasing token counts]

What you’re looking at is a series of ideal curves depicting changing TTR ratios over a given text length. As we move from the left to the right, we’re moving from a shorter to a longer text. As I said, the inevitable trajectory of these curves is downward. They all start in the same place, at 1, and fall from there. And if we extend these curves far enough, they would eventually end up in the same place, bunched together near the bottom, making it difficult to discriminate between different texts. But as these curves demonstrate, they do not fall at the same rate, and we can quantitatively assess the rate of downward movement in a TTR curve. This, in essence, is what vocd does.

The depicted curves here are ideal in the sense that they are artificial for the process of curve fitting. Curve fitting procedures are statistical methods to match real-world data, which is stochastic (that is, involves statistical distortion and noise), to approximations based on theoretical concepts. Real-world TTR curves are in fact far more jagged than this. But what we can do with software is to match real-world curves to these ideal curves to obtain a relative value, and that’s how vocd returns a VOCD-D measurement. The algorithm contains an equation for the relationship between text length, TTR, and VOCD-D, processes large collections of texts, and returns a value (typically between 40 and 120) that can be used to assess how diverse the vocabulary is in those texts. (VOCD-D values can really only be understood relative to each other.) The developers of the metric define the relationship between TTR, the token count N, and D at any point along a TTR curve as TTR = (D/N)[(1 + 2N/D)^(1/2) − 1].

Now, vocd uses a sampling procedure to obtain these figures. By default, the algorithm takes 100 random samples of 35 tokens, then 36 tokens, then 37, etc., until 50 tokens are taken in the last sample. In other words, the algorithm grabs 100 randomly-chosen samples of 35 words, then 36, etc., and returns an average figure for VOCD-D. The idea is that, because different segments of a language sample might have significantly different levels of displayed diversity in vocabulary, we should draw samples of differing sizes taken at random from throughout each text, in order to ensure that the obtained value is a valid measure. (The fact that lexical diversity is not consistent throughout a given text should give us pause, but that’s a whole other ball of wax.) Several programs that utilize the vocd algorithm also run through the whole process three times, averaging all returned results together for a figure called Doptimum.
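
Here’s a rough Python sketch of that procedure, assuming scipy for the curve fitting. It’s my own reconstruction for illustration, not the CLAN implementation, and it omits details like the three-run averaging behind Doptimum:

```python
# Estimate a VOCD-D style value: average TTRs over 100 random samples at each
# size from 35 to 50 tokens, then fit the Malvern et al. curve
#   TTR(N) = (D/N) * (sqrt(1 + 2N/D) - 1)
# to recover D. Assumes the text is longer than 50 tokens.
import random
import numpy as np
from scipy.optimize import curve_fit

def ttr_curve(n, d):
    return (d / n) * (np.sqrt(1 + 2 * n / d) - 1)

def estimate_d(tokens, sizes=range(35, 51), trials=100):
    sizes = list(sizes)
    mean_ttrs = []
    for size in sizes:
        # sampling without replacement from throughout the text
        samples = (random.sample(tokens, size) for _ in range(trials))
        mean_ttrs.append(sum(len(set(s)) / size for s in samples) / trials)
    (d,), _ = curve_fit(ttr_curve, np.array(sizes, float), np.array(mean_ttrs),
                        p0=[50.0], bounds=(1.0, 1000.0))
    return d
```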

VOCD-D is still affected by text length, and its developers caution that outside of an ideal range of perhaps 100-500 words, the figure is less reliable. Typical best practices involve combining VOCD-D with other measures, such as the Maas Index and MTLD (Measure of Textual Lexical Diversity), in order to make research more robust. Still, VOCD-D has shown itself to be far more robust across differing text lengths than TTR, and since the introduction of widely-available software that can measure it, notably the CLAN application from Carnegie Mellon’s CHILDES project, it has become one of the most commonly used metrics to assess lexical diversity.

So what’s the issue with vocd?

In a series of articles, Phillip McCarthy of the University of Memphis’s Institute for Intelligent Systems and Scott Jarvis of Ohio University identified a couple of issues with the vocd algorithm. They argue that the algorithm produces a metric which is in fact a complex approximation of another measure that is a) less computationally demanding and b) less variable. McCarthy and Jarvis argued that vocd’s complex curve-fitting process actually approximates another value which can be statistically derived from a language sample based on hypergeometric sampling. Hypergeometric sampling is a kind of probability sampling that occurs “without replacement.” Imagine that you have a bag filled with black and white marbles. You know the number of marbles and the number of each color. You want to know the probability that you will withdraw a marble of a particular color each time you reach in, or what number of each color you can expect in a certain number of pulls, etc. If you are placing the marbles back in the bag after checking (with replacement), you use binomial sampling. If you don’t put the marbles back (without replacement), you use hypergeometric sampling. McCarthy and Jarvis argued, in my view persuasively, that the computational procedure involved in vocd simply approximated a more direct, less variable value based on calculating the odds of any individual Type (unique word) appearing in a sample of a given length, which could be accomplished with hypergeometric sampling. VOCD-D, according to Jarvis and McCarthy, ultimately approximates the sum of the probabilities of a given type appearing in a sample of a given length. The curve-fitting process and repeated random sampling merely introduce computational complexity and statistical noise. McCarthy and Jarvis developed an alternative algorithm and metric. Though statistically complex, the operation is simple for a computer, and this metric has the additional benefit of allowing for exhaustive sampling (checking every type in every text) rather than random sampling. McCarthy and Jarvis named their metric HD-D, or Hypergeometric Distribution of Diversity.

(If you are interested in a deeper consideration of the exact statistical procedures involved, email me and I’ll send you some stuff.)
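
For a sense of how direct the calculation is, here’s a minimal sketch of the HD-D idea, assuming scipy’s hypergeometric distribution. The 42-token sample size is the convention I’ve seen in the literature, so treat it as an assumption:

```python
# HD-D sketch: sum over all Types of the probability that each Type shows up
# at least once in a random sample drawn without replacement.
from collections import Counter
from scipy.stats import hypergeom

def hdd(tokens, sample_size=42):
    n_tokens = len(tokens)  # text must contain at least sample_size tokens
    total = 0.0
    for freq in Counter(tokens).values():
        # hypergeom.pmf(0, M, n, N): probability that a Type occurring freq
        # times in a text of n_tokens words appears zero times in a random
        # sample of sample_size words
        total += 1.0 - hypergeom.pmf(0, n_tokens, freq, sample_size)
    return total
```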

McCarthy and Jarvis found that HD-D functions similarly to VOCD-D, with less variability and requiring less computational effort. The latter isn’t really a big deal, as any modern laptop can easily churn through millions of words with vocd in a reasonable time frame. What’s more, McCarthy and Jarvis explicitly argued that research utilizing VOCD-D does not need to be thrown out, but rather that there is a simpler, less variable method to generate an equivalent value. But we should strive to use measures that are as direct and robust as possible, so they advocate for HD-D over VOCD-D, as well as calling for a concurrent approach utilizing other metrics.

McCarthy and Jarvis supported their theoretical claims of the comparability of VOCD-D and HD-D with a small empirical evaluation of the equivalence. They did a correlational study, demonstrating a very strong relationship between VOCD-D and HD-D, supporting their argument for the statistical comparability of the two measures. However, their data set was relatively small. In a 2012 article, Rie Koizumi and Yo In’nami argued that Jarvis and McCarthy’s data set suffered from several drawbacks:

(a) it used only spoken texts of one genre from L2 learners; (b) the number of original texts was limited (N = 38); and (c) only one segment was analyzed for 110-200 tokens, which prevented us from investigating correlations between LD measures in longer texts. Future studies should include spoken and written texts of multiple genres, employ more language samples, use longer original texts, and examine the effects of text lengths of more than 200 tokens and the relationships between LD measures of equal-sized texts of more than 100 tokens.

My article is an attempt to address each of these limitations. At heart, it is a replication study involving a vastly larger, more diverse data set.

What data and tools did you use?

I used the fantastic resource The International Corpus Network of Asian Learners of English, a very large, very focused corpus developed by Dr. Shin’ichiro Ishikawa of Kobe University. What makes the ICNALE a great resource is a) its size, b) its diversity, and c) its consistency in data collection. As the website says, “The ICNALE holds 1.3 M words of controlled essays written by 2,600 college students in 10 Asian countries and areas as well as 200 English Native Speakers.” Each writer in the ICNALE data set writes two essays, allowing for comparisons across prompts. And the standardization of the collection is almost unheard of, with each writer having the same prompts, the same time guidelines, and the same word processor. Many or most corpora have far less standardization of texts, making it much harder to draw valid inferences from the data. Significantly for lexical diversity research, the essays are spell checked, reducing the noise of misspelled words which can artificially inflate type counts.

For this research, I utilized the ICNALE’s Chinese, Korean, Japanese, and English-speaking writers, for a data set of 1,200 writers and 2,400 texts. This allowed me to compare results between first- and second-language writers, between writers of different language backgrounds, and between prompts. The texts contained a much larger range of word counts (token counts) than McCarthy and Jarvis’s original corpus.

I analyzed this data set with CLAN, in order to obtain VOCD-D values, and with McCarthy’s Gramulator software, in order to obtain HD-D values. I then used this data to generate Pearson product-moment correlation matrices comparing values for VOCD-D and HD-D across language backgrounds and prompts, utilizing the command-line statistical package SAS.
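
The comparison itself is statistically simple: once each text has a VOCD-D value and an HD-D value, everything comes down to Pearson correlations. A toy version in Python (the numbers below are invented for illustration; the real analysis ran in SAS):

```python
import numpy as np

# Hypothetical paired scores for five texts, purely for illustration
vocd_d = np.array([72.4, 85.1, 60.3, 91.7, 78.2])
hd_d   = np.array([33.0, 36.2, 29.9, 38.4, 34.1])

r = np.corrcoef(vocd_d, hd_d)[0, 1]
print(f"Pearson r = {r:.3f}")  # values above .90 would mirror the published finding
```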

What did you find? 

My research provided strong empirical support for McCarthy and Jarvis’s prior research. All of the correlations I obtained in my study were quite high, above .90, and they came very close to the measures obtained by McCarthy and Jarvis. Indeed, the extremely tight groupings of correlations across language backgrounds, prompts, and research projects strongly suggest that the observed comparability identified in McCarthy and Jarvis’s work is the result of the mathematical equivalence they have identified. My replication study delivered very similar results using similar tools and a greatly expanded sample size, arguably confirming the previous results.

Why is this a big deal?

Well it isn’t, really. It’s small-bore, iterative work that is important to a small group of researchers and practitioners. It’s also a replication study, which means that it’s confirming and extending what prior researchers have already found. But that’s what a lot of necessary research is — slowly chipping away at problems and gradually generating greater confidence about our understanding of the world. I am also among a large number of researchers who believe that we desperately need to do more replication studies in language and education research, in order to confirm prior findings. I’ve always wanted to have one of my first peer-reviewed research articles be a replication study. I also think that these techniques have relevance and importance to the development of future systems of natural language processing and corpus linguistics.

As for lexical diversity, well, I still think that we’re fundamentally failing to think through this issue. As well as VOCD-D and HD-D work in comparison to a measure like TTR, they are still text-length dependent. The fact that best practices require us to use a number of metrics to validate each other suggests that we still lack a best metric of lexical diversity. My hunch (or more than a hunch, at this point) is not that we haven’t devised a metric of sufficient complexity, but that we are fundamentally undertheorizing the notion of lexical diversity. I think that we’re simply failing to adequately think through what we’re looking for and thus how to measure it. But that’s the subject of another article, so you’ll just have to stay tuned.

tl;dr version: Some researchers said that two ways to use a computer to measure diversity in vocabulary are in fact the same way, really, and provided some evidence to support that claim. Freddie threw a huge data set with a lot of diversity at the question and said “right on.”