Hey gang, I am officially on vacation for the first time since I started my job last September. Posting this coming week will be light, though I expect to have at least a couple pieces up. Thank you for your continued support of the ANOVA. I’m having lots of fun and hope you all are too.
Like so. (Fake charts I made with fake data, btw.)
You already get it, right?
Typically, when we perform some sort of an experiment, we want to look at how a particular number responds to the treatment – how blood pressure reacts to a new drug, say, or how students improve on a reading test when they’re given a new kind of lesson. We want to make sure that the observed differences are really the product of the treatment and not some underlying difference in the observed groups. That’s what randomized controlled trials are for. So we randomly assign subjects to test and control groups, look at what the different averages are for the two different groups, note the size of the effect, and determine whether it is statistically significant.
But sometimes we have real-world conditions that dictate that subjects get sorted into one group or another non-randomly. If we then look at how different groups perform after some treatment, we know that we’re potentially facing severe selection effects thanks to that non-random assignment. But consider if we have assignment based purely on some quantitative metric, with a cutoff score that sorts people into one group or another. (Suppose, for example, that students become eligible for a gifted student program only if they score above a cut score on some test.) Here we have a non-random distribution that we can actually exploit for research purposes. A regression discontinuity design allows us to explore the impact of such a program because, so long as students aren’t able to influence their assignment beyond their score on that test, we can be confident that students just above and just below the cutoff score are very similar.
Regression analyses will be run on all of the data, with subjects below and above the cut score combined but flagged into different groups. Researchers will run statistical models to determine whether there is a difference between groups who receive the treatment and those who don’t. As you can see in the scatterplots above, a large effect will be readily apparent in how the data looks. In the above scenario, the X axis represents the score students received on the test, the cut score is 15, and the Y axis represents performance on some later educational metric. In the top scatterplot, there is no meaningful difference from the gifted student program, as the relationship between these two metrics is the same above and below the cut score. But in the bottom graph, there’s a significant jump at the cut score. Note that even after the intervention, the relationship is still linear – students who did better on the initial test do better on the later metric. But everyone’s scores have jumped right at the cut score.
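If you want to see the mechanics, here’s a minimal sketch of a regression discontinuity fit on simulated data. The numbers are invented to mirror the fake charts described above (cut score of 15, a made-up true jump of 5 points); this is a toy illustration, not the authors’ actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: X = initial test score, cutoff at 15, Y = later metric.
n = 500
score = rng.uniform(0, 30, n)
treated = (score >= 15).astype(float)   # assignment is purely score-based
jump = 5.0                              # invented true effect of the program
y = 2.0 + 0.8 * score + jump * treated + rng.normal(0, 2, n)

# Standard RDD regression: outcome on the score centered at the cutoff,
# a treatment dummy, and their interaction (allows a different slope on
# each side of the cut score).
centered = score - 15
X = np.column_stack([np.ones(n), centered, treated, centered * treated])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# beta[2] is the estimated discontinuity (the jump) right at the cutoff.
print(f"estimated jump at the cutoff: {beta[2]:.2f}")
```

Because the score is centered at the cut point, the treatment dummy’s coefficient is exactly the vertical gap between the two regression lines at the cutoff, which is the quantity of interest.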
There are, as you’d probably imagine, a number of potential pitfalls here, and assumption checks and quality controls are essential. Everyone tested has to be sorted into the gifted program solely on the basis of the test, the cutoff score has to be near the mean, and you need sufficient numbers on either side of the cut score to see the relationship, among other things. But if you have the right conditions, regression discontinuity design is a great way to get near-random experimental quality in situations where you can’t randomize for pragmatic or ethical reasons.
Matt Bruenig critiques the concept of the “Success Sequence” quite convincingly here. There are a lot of just-so stories in our culture about what it takes to be a success. Typically, these stories are confusing the lines of causation all over the place, failing to see that confounds and covariates are doing most of the explaining.
I sometimes get anxious emails from parents, wondering what they need to do to make sure their children are going to be OK academically. And because of networking effects and the nature of who reads this small-audience education blog, I can mostly tell them accurately that they don’t really have to do much of anything; they’ve already set up their children to succeed simply by virtue of having them. Here’s the real Academic Success Sequence:
- Be born to college-educated parents.1
- Be born to middle-class-or-above parents.
- Be born without a severe cognitive or developmental disability.
- Don’t be exposed to lead in infancy or early childhood.
- Don’t be born severely premature or at very low birth weight.
- Don’t be physically abused or neglected.
If you are one of those lucky enough to tick off these boxes, congratulations. You’ve got the vast majority of the accounted-for variance breaking in your favor. Is everything accounted for? No. We’ve got a lot of variance in cognitive and educational outcomes that never seems to be systematically explainable. I actually think that’s a good thing – perfect determinism is contrary to the fight for human meaning – but it’s important to say that this variance is not only not currently accounted for, it is likely never-to-be accounted for. This is what the behavioral geneticists call the “gloomy prospect”: the possibility that large portions of unaccounted-for variation in psychological traits like intelligence are the product of truly non-systematic events, like particular psychological traumas, getting a concussion, meeting the right person, having the right conversation at the right time….
Thus it’s the case that some people can “win” in all of the above categories and still suffer from real hardship in life, just as some can be on the wrong side in many or all of them and flourish. Still: if you’re an educated, employed parent raising a healthy child in a stable home environment, the odds are strongly in the favor of that child’s eventual academic success. Of course, none of this stuff is stuff that individuals can control, and much of it is not stuff that parents can control either – particularly given that the parents were once the children whose outcomes were similarly conditioned….
Now many people will say, well yeah, of course these things matter. But what do we do beyond that stuff? How do we set our kids up to succeed? I’m not going to say that nothing you do matters. But the quantitative indicators that people are, sadly, most fixated on are stubborn and hard to move. Some things appear to work – intensive one-on-one or small-group tutoring seems to me to have the most promising research literature – but we’re playing with small effect sizes here, particularly in comparison to the influence of the factors listed above. Of course you want to bend as much of the variance in a positive direction as you can. But the effects tend to be so small, and thus so subject to being offset by minor random fluctuations in uncontrolled variation, that it’s just not worth worrying about them. The best thing you can do for your kid is to be present and kind and supportive and then stop stressing out.
The great irony is that we’ve seen this growing culture of panic on the part of bourgie parents about their child rearing practices at the exact historical moment that we’ve learned conclusively that these practices just don’t mean very much.
In particular, the Baby Einstein stuff, trips to museums, violin lessons, edutainment software – my understanding is that there just is little to no rigorous research that shows that this stuff works to move the needle on SAT scores or GPA or similar, once you control for the kinds of confounds listed above. Does that mean that this stuff doesn’t matter, that you shouldn’t do them? Of course not. Children should all have the opportunity to lead intellectually enriched, challenging, and varied lives. I’m very grateful that I had that chance myself. But you need to appreciate them for their own sake and on their own terms, not as a means to goose test scores. And obsessing over getting your kid into the right preschool is pointless too, as is worrying over selective high schools. It may make you feel like the right kind of parent to fixate on this stuff; it may, more cynically, help you feel competitive with other parents. But extant evidence suggests it just doesn’t matter. What does matter is giving your child commitment, love, structure, and a moral education, because life is about so much more than where you go to college.
Of course, many people in our society are not lucky enough to have been born into the kind of advantaged position described above. Given that fact, you’d think that our system would be set up to minimize the impact of these unchosen factors. Instead we work to maximize their impact and call the resulting system “meritocracy.”
Louis Menand in The New Yorker:
The funny thing about the resistance all these writers put up to the idea that poems can change people’s lives is that every one of them had his life changed by a poem. I did, too. When I was fourteen or fifteen, I found a copy of “Immortal Poems of the English Language” in a book closet in my school. It was a mass-market paperback, and the editor, Oscar Williams, had judged several of his own poems sufficiently deathless to merit inclusion. But he was an excellent anthologist, and I wore that book out. It changed my life. It made me want to become a writer.
I had an almost identical experience, with an anthology put together by XJ Kennedy, a poet, essayist, translator, and all around man of letters. That’s my copy pictured here. In sophomore year of high school my old Latin teacher Mrs. Montgomery (gone, now, but never forgotten) had wanted to share a poem with me, and had dug around in her closet to find this old, little-loved and forgotten literature collection. It was divided into three sections: fiction, poetry, and drama. In time I would read the whole thing cover to cover, but at the time I obsessed over the poetry section. Growing up in an arts- and literature-obsessed home, I had gotten plenty of exposure to poetry, but this was the first time I really felt like I had the time and inclination to truly explore the form on my own. I got a real poetry education from that book, and learned not just Keats and Housman but Linda Pastan’s “Ethics” and Chesterton’s “The Donkey” and Amiri Baraka’s “Preface to a Twenty Volume Suicide Note.” I read it under my desk during algebra class and in the cafeteria and on the bus rides home from cross country meets, and today the cover is held on with masking tape, because I wore the damn thing out. When high school was over, I stole it.
I am, as you know, skeptical of the degree to which quantitative educational metrics like test scores can be changed by teachers and schools. But this carries with it the essential qualification: that test scores are not the measure of education’s value. Because I read and talk about quantitative research, and because I acknowledge that these tools are broadly predictive of all manner of eventual academic outcomes, I am often in agreement with those who view education in a reductive light. But my objections to that reductive thinking are as real and important as my objections to those who think that all individual students can be brought to the same levels of achievement on standardized tests. Indeed, precisely because differences in academic ability are real, we must take seriously all the things that education can do which are not expressible in a test score. I doubt that this book made the slightest difference to my SAT scores. Yet like Menand’s, my life was forever changed.
To the Muse
by XJ Kennedy
Give me leave, Muse, in plain view to array
Your shift and bodice by the light of day.
I would have brought an epic. Be not vexed
Instead to grace a niggling schoolroom text;
Let down your sanction, help me to oblige
Him who would leash fresh devots to your liege,
And at your altar, grant that in a flash
They, he, and I know incense from dead ash.
This past week, the Los Angeles Times was kind enough to run a revised version of an argument I had made here in the recent past – that Republican support of colleges and universities has collapsed, likely because of constant incidents on campus that create a widespread impression of anti-conservative bias, and that since our public universities are chartered and funded as non-partisan institutions, and because Republicans control enormous political power, our institutions are deeply threatened. I stand by that case.
I have gotten the usual grab bag of responses, most of them unmoored from specific principles about who should be able to say what on campus, and some of them directly contradictory with each other. As is typical, the number one rhetorical move has been to insist that student activists are only targeting the worst of the worst, Milo Yiannopoulos and Richard Spencer and the like. The idea is that people with mainstream views are entirely free to say whatever they want without issue because they don’t directly threaten marginalized people. That idea is factually incorrect, as anyone with the barest grasp on the facts should know.
- Student activists at Amherst College demanded that students who had criticized their protests be formally punished by the university and forced to attend sensitivity training.
- At Oberlin, students made a formal demand that specific professors and administrators be fired because the students did not like their politics.
- The Evergreen State College imbroglio involved students attempting to have a professor fired for criticizing one of their political actions.
- At Wesleyan, campus activists attempted to have the campus newspaper defunded for running a mainstream conservative editorial.
- A Dean at Claremont McKenna resigned following student backlash to an email she sent in response to complaints about the treatment of students of color.
- Students at Reed College attempted to shut down an appearance by Kimberly Peirce, the director of Boys Don’t Cry, removing posters advertising her talk and attempting to shout her down during her presentation.
- At Yale, students called for the resignation of Erika Christakis for an email she wrote about culturally insensitive Halloween costumes and for the resignation of her husband Nicholas Christakis for defending her.
- At the University of California Santa Barbara, the student government voted for mandatory trigger warnings, which would enable any student to skip class material that they decided was offensive.
- Laura Kipnis, a feminist professor at Northwestern, was the subject of a literal federal investigation because she published an essay students didn’t like.
- Mount Holyoke canceled the Vagina Monologues on campus under student pressure.
- American Sniper, a perfectly mainstream American blockbuster, was temporarily pushed off campus by student activists.
- Activists at Western Washington demanded the creation of a 15-person panel that would engage in surveillance of students, professors, and administrators in order to monitor everyone on campus for any expressions or actions that body deemed “racist, anti-black, transphobic, cissexist, misogynistic, ablest, homophobic, islamophobic, xenophobic, anti-semitism, and otherwise oppressive behavior.” That body would have the ability to discipline campus community members, including firing tenured faculty.
- A yoga class for disabled students at a Canadian university was canceled after students complained that yoga is a form of cultural appropriation.
There are more. You are free to support any or all of these student actions. But you are not free to pretend there is no trend here. Exactly how many of these incidents must pile up before people are willing to admit that many campus activists pursue censorship of ideas and expressions that they don’t like?
The obsession with Milo and Richard Spencer makes this conversation impossible in left circles. Those people are discussed endlessly because leftists believe that doing so makes it easy to argue – “what, you want Milo to be free to harass POC on campus?!?” But in fact because most conservatives on campus will simply be mainstream Republicans, this side conversation will be almost entirely pointless. What really matters is the way that perfectly mainstream positions are being run out of campus on a regular basis. And of course with a list like this we can be sure that there are many, many more cases that went unnoticed and unreported in the wider world.
You would think it would be easy for progressives and leftists simply to say “I support many actions that campus protesters take, but these censorship efforts are counterproductive and wrong.” But that almost never happens. That’s because in contemporary life, politics has almost nothing to do with principle, or even with political tactics. Instead it has to do with aligning yourself with the right broad social circles. To criticize specific actions of campus activists sounds to too many leftists like being “the wrong kind of person,” so they refuse to criticize students even when their actions are minimally helpful and maximally counterproductive. That in turn ensures that there’s no opportunity for the students to reflect, learn, and evolve.
There have, of course, been many leftist professors who have been the subject of censorship too. I have written about these cases and fought for those professors over and over again. Their censorship comes not from student pressure but from administrative fecklessness, which is to be expected, as the administrators who sometimes accede to student censorship demands and those who silence leftist professors are working under the same philosophy: a corporate desire to avoid controversy and to protect the campus as a neoliberal institution. That students so often petition these same administrators to silence speech on their behalf speaks to a failure to truly grapple with the nature of administrative power.
A while back I laid out my frustrations with this conversation. In particular, almost no one who defends campus activists’ attempts to censor has ever articulated a coherent policy about who is and is not allowed to say what.
Whatever else, defenders of activists attempting to censor opinions they don’t like have to stop claiming that these censorship efforts only target the most extreme cases. Because that is simply, factually false. Stop obsessing about the most extreme cases and grapple with the clear and growing attempts to censor mainstream views on campus. It’s an important conversation to have. Or you can keep shouting “Milo!” over and over again because that’s easy and doesn’t force you into any difficult choices or conversations. That will ensure that we have no coherent defense against bias claims while the Republican party sets out to dismantle our institutions, brick by brick.
Today’s Study of the Week, by Olivier Marie and Ulf Zölitz, considers the impact of access to legal marijuana on college performance. (Via Vox’s podcast The Weeds.) The researchers took advantage of an unusual legal circumstance to examine a natural experiment involving college students and marijuana. For years now, the Netherlands has been working to avoid some of the negative consequences of its famous legal marijuana industry. While most in the country still support decriminalization, many have felt frustrated by the influx of (undoubtedly annoying) tourists who show up to Dutch cities simply looking to get high. This has led to some policies designed to ameliorate the negative impacts of marijuana tourism without going backwards towards criminalization.
In the city of Maastricht, one such policy involved only selling marijuana to people who had citizenship identification from the Netherlands, Germany, and Belgium, and not from other nationalities. These specific countries seem to have been chosen as a matter of geography – look at Maastricht on a map and you’ll see it’s part of a small Dutch “peninsula” wedged between Germany and Belgium. Importantly for our purposes here, Maastricht features a large university, and like a lot of European schools it attracts students from all over the continent. That means that when the selective-enforcement policy went into effect in 2011, one group of students still had access to marijuana, while another lost it, at least legally. That provided an opportunity to study how decriminalization impacts academic outcomes.
This research thus does not amount to a true randomized experiment, although I suppose that’s one that you could really do, given the long-established relative safety of marijuana use. (“Dude, I’ll slip you $100 not to end up in the control group! No placebo!”) Instead, like a couple of our Studies of the Week in the past, this research utilizes a difference-in-difference design, comparing outcomes for the two different groups using panel data, with a lot of the standard quality checks and corrections to try and root out construct-irrelevant variance between the groups. Ultimately they looked at 4,323 students from the School of Business and Economics. Importantly for our purposes here, dropout rates were roughly equivalent between the two treatment groups; differential attrition can wreak havoc on this kind of analysis when the groups are not closely matched.
There are a couple of obvious issues here. First, not only are these groups not randomly selected, they are deliberately selected by nationality. This could potentially open up a lot of confounds and makes me nervous. Still, it’s hard to imagine that there is a distinct impact of smoking marijuana on brains of people from different European nationalities, and the authors are quite confident in the power of their models to wash out nonrandom group variance. Second, you might immediately object that of course even students who are not legally permitted to smoke marijuana will frequently do so, and that many who can won’t. How do we know there aren’t crossover effects? Well, this is actually potentially a feature of the research, not a bug. See, that condition would be true in any decriminalization scheme; there will inevitably be people who use under a period of illegality and who don’t when decriminalized. In other words, this research really is looking at the overall aggregate impact of policy, not the impact of marijuana smoking on individual students. Much like the reasoning behind intent-to-treat models, we want to capture noncompliance because noncompliance will be present in real-world scenarios.
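For readers unfamiliar with the design, here’s a minimal difference-in-differences sketch on simulated data. Everything here is invented for illustration (group labels, effect size, noise); it is not the paper’s actual model, just the core logic of comparing the change in one group to the change in the other:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy panel: group_a = 1 for the nationality that lost legal access,
# post = 1 for observations after the policy change.
n = 1000
group_a = rng.integers(0, 2, n)
post = rng.integers(0, 2, n)
true_effect = 0.09                      # invented: restriction raises grades
grade = (0.2 * group_a - 0.1 * post
         + true_effect * group_a * post
         + rng.normal(0, 1, n))

# The DiD estimate: (A after - A before) minus (B after - B before).
# Any fixed group difference and any common time trend cancel out.
means = {(g, p): grade[(group_a == g) & (post == p)].mean()
         for g in (0, 1) for p in (0, 1)}
did = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])
print(f"diff-in-diff estimate: {did:.3f}")
```

The subtraction of subtractions is the whole trick: the untreated group’s before/after change serves as the counterfactual for what would have happened to the treated group absent the policy.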
So what did they find? Effects of legal access to marijuana are negative, although to my mind quite modest. Their summary:
the temporary restriction of legal cannabis access increased performance by on average .093 standard deviations and raised the probability of passing a course by 5.4 percent
That effect size – not even a tenth of an SD – is interesting, as when I heard this study discussed casually, it sounded as if the effect was fairly powerful. Still, it’s not nothing, and the course-passing probability makes a difference, particularly given that we’re potentially multiplying these effects across thousands of students. The authors make the case for its practical significance like so:
Our reduced form estimates are roughly the same size as the effect as having a professor whose quality is one standard deviation above the mean (Carrell and West, 2010) or of the effect of being taught by a non-tenure track faculty member (Figlio, Shapiro and Soter, 2014). It is about twice as large as having a same gender instructor (Hoffmann and Oreopoulos, 2009) and of similar size as having a roommate with a one standard deviation higher GPA (Sacerdote, 2001). The effect of the cannabis prohibition we find is a bit smaller than the effect of starting school one hour later and therefore being less sleep-deprived (Carell, Maghakian & West, 2011).
This context strikes me as mostly being proof that most interventions into higher ed are low-impact, but still, the discussed effects are real, and given that marijuana use is associated with minor cognitive impairment, it’s an important finding. Interestingly, the negative effects were most concentrated among women, lower-performing students, and students in quantitative classes, suggesting that the average negative impact of legalization would be unequally distributed. One important note: these findings were consistent even when correcting for time spent studying, suggesting that it wasn’t merely that students who had access to marijuana were less inclined to work but that they actually performed less well on their tasks on a minute-per-minute basis.
What do we want to do with this information? Does this count as evidence supporting continued marijuana criminalization? No, not to me. Part of what makes achieving a sensible drug policy difficult lies in this shifting of the burden of proof: things that are already illegal are often treated as worthy of decriminalization only if they can be proven to be literally harmless. But any number of behaviors that are perfectly legal involve harms. Alcohol and tobacco use are obvious examples, but there are others, including eating junk food – which is not just legal but actively subsidized by our government, thanks to a raft of bad laws and regulations that provide perverse incentives for food production. Part of freedom means the freedom to make bad choices. The question is when those choices are so bad that society feels compelled to prevent individuals from making them. Even if you aren’t as attached to civil liberties as I am, I think you can believe that marijuana use simply doesn’t qualify.
As for myself, I actually mostly stopped smoking when I got to grad school. In part that’s because I didn’t enjoy it anymore the way I once did. But it was also because I knew I simply couldn’t read and write effectively after I had smoked, and graduate study required me to be reading and writing upwards of 12 hours a day. That’s by no means universal; some people I know find it helps them concentrate. Likewise, I am useless as a writer after more than one beer, though of course there are many writers who famously wrote best when soused. Still, it seems to me entirely intuitive that habitual marijuana use would have minor-but-real negative impacts on academic outcomes. Marijuana, as safe as it is, and as ridiculous as its continued federal illegality in the United States is, does tend to cause minor cognitive impairments, and it would be foolish to assume there’s no negative educational impacts associated with it.
I’d still rather have college kids getting stoned than binge drinking constantly. And ultimately this is a question of pluses and minuses that individual people should be able to weigh for themselves, just as they do when they decide on a cheeseburger or a salad. That’s what freedom is all about, and one part of college is giving young people a chance to make these kinds of adult decisions for themselves.
Let’s imagine a bit of research that we could easily perform, following standard procedures, and still get a misleading result.
Say I’m an administrator at Harvard, a truly selective institution. I want to verify the College Board’s confidence that the SAT effectively predicts freshman year academic performance. I grab the SAT data, grab freshmen GPAs, and run a simple Pearson correlation to find out the relationship between the two. To my surprise, I find that the correlation is quite low. I resolve to argue to colleagues that we should not be requiring students to submit SAT or similar scores for admissions, as those scores don’t tell us anything worthwhile anyway.
Ah. But what do we know about the SAT scores of Harvard freshmen? We know that they’re very tightly grouped because they are almost universally very high. Indeed, something like a quarter of all of your incoming freshmen got a perfect score on the (new-now-old) 2400 scale:
[Table: average, 25th-percentile, and 75th-percentile SAT scores by section]
The reason your correlation is so low (and note that this dynamic applies to typical linear regression procedures as well) is that there simply isn’t enough variation in one of your numbers to get a high metric of relationship. You’ve fallen victim to a restriction of range.
Think about it. When we calculate a correlation, we take pairs of numbers and see how one number changes compared to the other. So if I restrict myself to children and I look at age in months compared to height, I’m going to see consistent changes in the same direction – my observations of height at 6 months will be smaller than my observations at 12 months and those will in turn be smaller than at 24 months. This correlation will not be perfect, as different children are of different height and grow at different rates. The overall trend, however, will be clear and strong. But in simple mathematical terms, in order to get a high degree of relationship you have to have a certain range of scores in both numbers – if you only looked at children between 18 and 24 months you’d be necessarily restricting the size of the relationship. In the above example, if Harvard became so competitive that every incoming freshman had a perfect SAT score, the correlation between SAT scores and GPA (or any other number) would collapse entirely – with no variation at all in one variable, there is no relationship left to measure.
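You can watch restriction of range happen in a quick simulation. All the numbers here are invented for illustration (a true correlation of 0.5, SAT-like scores with mean 1500 and SD 300); the point is only how much the correlation shrinks when you look at a Harvard-like sliver of the distribution:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate SAT-like scores and a GPA-like outcome with a true
# population correlation of 0.5 (invented numbers).
n = 100_000
sat = rng.normal(1500, 300, n)
z = (sat - 1500) / 300
gpa = 0.5 * z + rng.normal(0, np.sqrt(1 - 0.5**2), n)

full_r = np.corrcoef(sat, gpa)[0, 1]

# Restrict to a highly selective sliver: only scores 2 SDs above the mean.
top = sat > 2100
restricted_r = np.corrcoef(sat[top], gpa[top])[0, 1]

print(f"full-range correlation: {full_r:.2f}")
print(f"restricted correlation: {restricted_r:.2f}")
```

The underlying relationship hasn’t changed at all between the two calculations; only the window we’re looking through has narrowed.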
Of course, most schools don’t have incoming populations similar to Harvard’s. Their average SAT scores, and the degree of variation in their SAT scores, would likely be different. Big public state schools, for example, tend to have a much wider achievement band of incoming students, who run the gamut from those who are competitive with those Ivy League students to those who are marginally prepared, and perhaps gained admission via special programs designed to expand opportunity. In a school like that, given adequate sample size and an adequate range of SAT scores, the correlation would be much less restricted – and it’s likely, given the consistent evidence that SAT scores are a good predictor of GPA, significantly higher.
Note that we could also expect a similar outcome in the opposite direction. In many graduate school contexts, it’s notoriously hard to get bad grades. (This is not, in my opinion, a problem in the same way that grade inflation is a potential problem for undergraduate programs, given that most grad school-requiring jobs don’t really look at grad school GPA as an important metric.) With so many GPAs clustering in such a narrow upper band, you’d expect raw GRE-GPA correlations to be fairly low – which is precisely what the research finds.
Here’s a really cool graphic demonstration of this in the form of two views on the same scatterplot. (I’m afraid I don’t know where this came from, otherwise I’d give credit.)
This really helps show restricted range in an intuitive way: when you’re looking in too close at a small range on one variable, you just don’t have the perspective to see the broader trends.
What can we do about this? Does this mean that we just can’t look for these relationships when we have a restricted range? No. There are a number of statistical adjustments that we can make to estimate a range-corrected value for metrics of relationship. The most common of these, Thorndike’s case 2, was (like a lot of stats formulas) patiently explained to me by a skilled instructor who guided me to an understanding of how it works which then escaped my brain in my sleep one night like air slowly getting let out of a balloon. But you can probably intuitively understand how such a correction would work in broad strokes – we have a certain data set restricted on X variable, the relationship is r strong along that restricted range, its spread is s in that range, so let’s use that to guide an estimate of the relationship further along X. As you can probably guess, we can do so with more confidence if we have a stronger relationship and lower spread in the data that we do have. And there is a certain degree of range we have to have in our real-world data to be able to calculate a responsible adjustment.
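The broad-strokes intuition above can be made concrete. Here’s the standard published formula for Thorndike’s case 2, where we know the predictor’s standard deviation in both the restricted sample and the unrestricted population; the example inputs are invented:

```python
import math

def thorndike_case2(r_restricted: float, sd_restricted: float,
                    sd_unrestricted: float) -> float:
    """Correct a correlation for direct range restriction on the predictor.

    Thorndike's case 2 assumes selection happened directly on the predictor
    and that we know its SD in both the restricted sample and the full
    population.
    """
    k = sd_unrestricted / sd_restricted   # how much wider the full range is
    rk = r_restricted * k
    return rk / math.sqrt(1 - r_restricted**2 + rk**2)

# Invented example: a restricted-sample r of .20, where the selected
# group's SD on the predictor is a third of the population's.
print(round(thorndike_case2(0.20, 100, 300), 2))
```

Note the sanity check built into the formula: if the sample isn’t actually restricted (the two SDs are equal, so k = 1), the correction returns the correlation unchanged, and the larger the restriction, the larger the upward adjustment.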
There have been several validation studies of Thorndike’s case 2 where researchers had access to both a range-restricted sample (because of some set cut point) and an unrestricted sample and were able to compare the corrected results on the restricted sample to the raw correlations on unrestricted samples. The results have provided strong validating evidence for the correction formula. Here’s a good study.
There are also imputation models that are used to correct for range restriction. Imputation is a process common in regression analysis when we have missing data and want to fill in the blanks, sometimes by making estimates based on the strength of observed relationships and spread, sometimes by using real values pulled from other data points. It gets very complicated and I don’t know much about it. As usual, if you really need to understand this stuff for research purposes – get ye to a statistician!
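As a toy illustration only (the models researchers actually use are far more sophisticated), here is the simplest form of regression-based imputation: fit a line on the complete cases, then fill each missing value with its fitted prediction.

```python
def impute_regression(pairs):
    """Fill in missing y-values using a simple linear fit on complete cases.

    pairs: list of (x, y) tuples where y may be None (missing).
    Returns a new list with each missing y replaced by the fitted value.
    """
    complete = [(x, y) for x, y in pairs if y is not None]
    n = len(complete)
    mean_x = sum(x for x, _ in complete) / n
    mean_y = sum(y for _, y in complete) / n
    # ordinary least squares slope and intercept on the complete cases
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [(x, y if y is not None else intercept + slope * x) for x, y in pairs]

# Toy data: y doubles x, with one missing observation at x = 3.
filled = impute_regression([(1, 2), (2, 4), (3, None), (4, 8)])  # missing y → 6.0
```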
For a large academic project I’m working on, I’ve been trying to do something that is rather rare: discuss cultural studies and its practices in the academy in a nuanced and evenhanded way. Unfortunately, cultural studies and related fields have become the Battle of Verdun in our culture war: critics typically dismiss any support as “SJW bullshit,” while supporters dismiss any criticism as “reactionary proto-fascism.”
This is unfortunate because like all fields cultural studies has its strengths and its weaknesses. Has cultural studies been stereotyped and caricatured by its critics, reduced to a set of entirely unfair associations and impressions, forced constantly to defend the worst excesses of individual members, and in general been equated with its most controversial work while its most powerful and generative work goes largely undiscussed? Absolutely yes. Is there also a powerful culture of groupthink and political conformity in the field, a social system of mutual surveillance where everyone constantly monitors each other for the slightest possible offense, and a set of publishing incentives that actively encourage obscurity and indigestible prose? I think the answer is also yes. But as long as the field is a battlefront in a much larger political-culture war, very few people will feel comfortable nuancing these distinctions, sorting the good from the bad.
What’s hard for people outside of academia (and some within academia) to understand is that cultural studies has a habit of, if you’ll forgive the term, colonizing other fields in the humanities and social sciences. As bad as the reputation of these assorted fields has gotten outside of the academy, and as tenuous as funding is, they have been remarkably successful at insinuating their views and language into other fields – sometimes in good ways, sometimes bad.
I self-identify as an applied linguist, specializing in educational assessment, and I spend most of my time these days reading, researching, and writing work that many would identify with the field of education. But I came up through programs in writing studies/rhetoric and composition, and I retain an interest in that field. I left it, spiritually if nothing else, because I am interested in quantitative empirical approaches to understanding writing, language learning, and assessment, and it had become clear that there was no room for empirical approaches as commonly defined in the field, at least beyond case studies of a handful of students or texts. I don’t think my own path is particularly interesting, but I think it is interesting and relevant how composition changed over time.
See, a lot of the origin story of rhet/comp/writing had to do with its methodological diversity. In the 1960s and 1970s, scholars who valued teaching writing and wanted to do it better were stymied in their English departments, where most faculty considered the study of literature preeminent and pedagogical work unimportant, especially when it came to writing. (This is an origin story, remember, so exaggeration and generalization are to be expected.) At an extreme, some professors who were interested in researching writing pedagogy were told not to bother to put pedagogical articles into their tenure files. These scholars, concentrated particularly in large land grant public universities in the Midwest, decided that they could never be taken seriously within literature-dominant programs and set out to create their own disciplinary and institutional structures.
Core to their new scholarly identity was methodological diversity. Their work was empirical, because investigating what works and what doesn’t when teaching students to write is a necessarily empirical practice. Their work was theoretical, because much of writing pedagogy involves considerations of how students think as well as write, and because the basic tool of humanistic inquiry is abstraction. Their work was also often literary, as many of these professors were trained in literature, retained interest in that field, and saw literature as a key lens through which to teach students to write. Their work was historical, as they often used the ancient study of rhetoric as a set of principles to guide the teaching of writing, supplying a time-tested array of habits and ideas to the somewhat nebulous subject-domain of writing. I could go on.
So you have someone like my grandfather, who predated the field but was something of a proto-member at the University of Illinois, whose large published corpus includes pragmatic pedagogical advice for how to teach students to read and write, essays on poetry that would appear comfortably in a literature journal, research articles where he hooked students up to polygraph machines to better understand how anxiety impacted their writing habits, and political treatises about why the humanities teach us to oppose war in all of its forms. The ability to do so much as a researcher, and get published doing all of it, always seemed very attractive to me.
I had always envisioned a field of writing studies that was as methodologically and philosophically diverse as its lingering reputation. There would be an empirical wing and a cultural studies wing and a practical pedagogy wing and a digital wing, etc…. There’s no reason these things would be mutually exclusive. But as I found as I moved through my graduate programs, in practice cultural studies pretty much ate the field, or so is the case that I’ll be making in this ongoing project I referenced earlier. That’s a big case to make and it requires a healthy portion of a book-length project to make it fairly. I can tell you though that if you pull a random article from a random journal in writing studies you will likely find very little about writing as traditionally understood and a great deal about hegemony, intersectionality, and the gendered violence of discourse. Empirical work as traditionally conceived is almost entirely absent. Today I talk to people in other wings of the humanities who tell me, straight out, that they can’t understand how composition/writing studies is distinct from cultural studies at all.
Why? Well, academia is faddish, particularly as pertains to the job market, and the strange forms of mentorship and patronage that are inherent to its training models mean that there are network effects and path dependence that dictate subfields. But more, I think, the moral claims of cultural studies make it uncomfortable to study anything else, because these critiques tend to make methodological differences not abstract matters of different legitimate points of academic view, but rather straightforwardly moralizing claims about the illegitimacy of given approaches to gathering and disseminating knowledge.
I want to preface this by saying that I know “cultural studies professors say it’s bigoted to do science” sounds like a conservative caricature of the humanities, but it is absolutely a position that is held straightforwardly and unapologetically by many real-world academics. I’m sorry if it seems to confirm ugly stereotypes about the humanities, but it is absolutely the case that there are prominent and influential arguments within the field that represent quantification as not just naive “scientism” but as part of a system of social control, a form of complicity with racism, sexism, and the like. I know this sounds like a story from some bad conservative novel, but it is not unheard of for rooms full of PhDs to applaud when someone says that, for example, witchcraft is just another way of knowledge and that disputing factual claims to its power is cultural hegemony.
The idea that conventional research and pedagogy are straightforwardly tools of power is abundant. Take Elizabeth Flynn:
…beliefs in the objectivity of the scientist and the neutrality of scientific investigation serve the interests of those in positions of authority and power, usually white males, and serve to exclude those in marginalized positions….
Feminist critiques of the sciences and the social sciences have also made evident the dangers inherent in identifications with fields that have traditionally been male-dominated and valorize epistemologies that endanger those in marginalized positions.
This might sound pretty anodyne, but in the context of academic writing, it’s extreme. In particular, the notion that empirical methodologies actually endanger marginalized people is a serious charge, and one that is now ubiquitous in fields that are social sciences-adjacent. There are those in academia who believe not just that empirical approaches to knowledge are naive or likely to serve the interests of power but actively, materially dangerous to marginalized people. And there are those who prosecute this case within our institutions and journals quite stridently and personally.
This results in some awkward tensions between pedagogical responsibility and political theory. Patricia Bizzell exemplified the perspective that the purpose of teaching is to inspire students to resist hegemony, rather than to learn, say, how to write a paper – and that professors have a vested interest in making sure they stay on that path:
…our dilemma is that we want to empower students to succeed in the dominant culture so that they can transform it from within; but we fear that if they do succeed, their thinking will be changed in such a way that they will no longer want to transform it.
This strange, self-contradictory attitude towards students – valorizing them as agents of political change who should rise up and resist authority while simultaneously condescending to them and assuming that it is the business of professors to dictate their political project – remains a common facet of the contemporary humanities.
The broad rejection of research as a process of learning more about a world outside our heads, and of pedagogy as an attempt to share what we’ve learned therein with students, is quite prevalent. Take the late James Berlin, offering up a critique of these supposedly-naive assumptions:
Certain structures of the material world, the mind, and language, and their correspondence with certain goals, problem-solving heuristics, and solutions in the economic, social, and political are regarded as inherent features of the universe, existing apart from human social intervention. The existent, the good, and the possible are inscribed in the very nature of things as indisputable scientific facts, rather than being seen as humanly devised social constructions always remaining open to discussion.
Well. I am a rather postmodern guy, actually, compared to many, but I confess that I do believe that certain structures of the material world are inherent features of the universe. Though I am always open to a good discussion.
There are many other critics of the pursuit of knowledge as commonly understood in the field’s history, such as Mary Lay, Nancy Blyler, and Carl Herndl, and some of them are quite adamant in their rejection of the inherent hegemonic impulses of conventional research. This post will already be quite long, so I don’t want to get off on a tangent about postmodernism and change. I’ll just quote this apt observation from Zygmunt Bauman:
behind the postmodern ethical paradox hides a genuine practical dilemma: acting on one’s moral convictions is naturally pregnant with a desire to win for such convictions an ever more universal acceptance; but every attempt to do so just smacks of the already discredited bid for domination.
In any event, by 2001 John Trimbur and Diana George would write “cultural studies has insinuated itself into the mainstream of composition.” By 2005, Richard Fulkerson would say plainly, “in point of fact, virtually no one in contemporary composition theory assumes any epistemology other than a vaguely interactionist constructivism. We have rejected quantification and any attempts to reach Truth about our business by scientific means.” And so a field that in my grandfather’s era enjoyed great epistemological and methodological diversity became a field that only told one kind of story.
I don’t mean to exaggerate the uniformity. There are of course critics of the “cultural turn,” whether from empiricists like Davida Charney and Richard Haswell or theorists like Richard Miller and Thomas Rickert. And there is a diversity of subjects in writing studies research. But to a remarkable degree, the epistemological assumptions of cultural studies rule the field, and indeed the way that diversity is achieved is through applying a cultural studies lens to different subjects – a library full of dissertations on the cultural studies approach to Dr. Who, the cultural studies approach to Overwatch, the cultural studies approach to the communicative practices of the EPA, the cultural studies approach to Andrew Pickering’s theoretical construct of “the mangle.” I don’t dismiss any of these projects as projects; I am generally committed to radical cosmopolitanism when it comes to other people’s research interests. But I maintain a belief that the field would be healthier and more capable of defending its disciplinary identity (and its funding) were it to include more straightforward pedagogical work, historical work, and empirical work. But there is genuine fear among graduate students and early-career academics over whether one can wander too far from the field’s contemporary obsessions.
In applied linguistics/second language studies, a fairly close sibling field, I have seen less of a field-wide colonizing and more of a split into two very different camps, which exist not so much in conflict as in mutual incomprehension. This may be a largely idiosyncratic reading, colored more by my personal perceptions than anything else. But I do know that there are people who share the SLS banner whose work cannot speak to each other’s in any meaningful way. In grad school there were a large number of students whose approach to second language research was entirely in keeping with the mania for critical pedagogy, with student after student writing papers about how second language students should be encouraged to resist the hegemony of first-language practices and recognize the equal value of their own English dialect. At an extreme, this leads you to the position of someone like Suresh Canagarajah, who has long argued that research that compares the linguistic habits of second language speakers to first language counterparts is inherently judgmental and thus inherently offensive.
Meanwhile, these grad students, whose work was almost entirely theoretical and political in its methods and typically eschewed quantification altogether, would attend seminars next to students in language testing, corpus linguistics, or phonology whose work was almost purely quantitative. One group would cite Freire and Foucault while the other would run regressions and hierarchical linear models. This never erupted into real interpersonal conflict; it just meant that you had people whose work was not compatible in any meaningful way. This was always my own frustration and fear in writing studies: when I spoke the language of effect sizes, ANOVAs, and p-values, I could make my work comprehensible to people from a large variety of fields. When I spoke to people outside the field about work I had read concerning, say, what Bourdieu could tell us about the rhetoric of play in Super Mario Bros, we both ended up at a loss. I know that sounds like a terribly cutting value judgment of that kind of work, but I don’t intend it to be. I mean simply that over time I became too frustrated by how incomprehensible the work I was reading for school appeared to anyone outside of a small handful of subfields. If I understand the field correctly as an outsider, this is a similar dynamic to that of anthropology, where evolutionary anthropologists engage in some of the “hardest” science possible while in the same departments many cultural anthropologists reject their work as inherently masculinist, naively positivist, and hegemonic.
From my completely anecdotal standpoint, the political and cultural side of second language studies is growing, the quantitative side shrinking or breaking off to join other broad disciplinary identities. I might be wrong about that. But either way, I am left to ponder whether these trends threaten the long-term existence of these fields – and by extension the humanities writ large, which are now dominated by a narrow set of political theories that insist on the inherent immorality of many conventional ways of looking at the world and thinking about it. As I have said, I value many things that have emerged from cultural studies. But those within the field sometimes seem eager to confirm every ugly stereotype the outside world has about hectoring, obscure, leftist academics, and there appears to me to be little in the way of professional or social incentives to compel professors to think and speak in a more pragmatically self-defensive way. One of my core beliefs about the academy is that how we talk about our research and teaching matters, that we can act as better or worse defenders of our fields and institutions if we pay attention to what the wider world values. But we are not currently doing a good job of that. At all.
Perhaps this is all just a long fad, and times will change for writing studies and the humanities writ large. There are trends like digital humanities which cut in the opposite direction, though they are ferociously contested in academic debates. I suppose time will tell. I worry that by the time some of these trends have worked themselves out, there will not be much left of the humanities to fight over.
Today’s Study of the Week, via SlateStarCodex, considers the impact of intervention programs to help ameliorate the impact of lead exposure on children. Exposure to lead, even at relatively low doses, has a long-established set of negative consequences, particularly pertaining to cognitive functioning and behavioral control. This dynamic has long been hypothesized as a source of a great deal of social problems, perhaps even explaining the dramatic rise and fall in crime rates in America in the 20th century, given the rise and fall of leaded gasoline. Those broader questions are persistently controversial and will take years to answer. In the meantime, we have interventions designed to ameliorate the negative impacts of lead exposure, but little in the way of large-scale responsible research to measure their impact. This study is a step in closing that gap.
In the study, written by Stephen B. Billings and Kevin T. Schnepel, a set of observational data is analyzed to see how children eligible for inclusion in a program of interventions for lead exposure compared to a control group that did not receive the intervention. The data, taken from North Carolina programs in the 1990s, is robust and full-featured, allowing the researchers to consider behavioral outcomes for children, later-in-life criminal behavior, educational outcomes, and some other metrics of overall quality of life.
For obvious reasons, the study is not a true experiment – you can’t expose children to lead as an experimental treatment and note the difference. But they are able to approximate an experimental design, first thanks to the number of statistical controls, and second thanks to a trick of the screening process. Lead testing is notoriously finicky, so children are usually tested twice in early childhood. If children were tested once and found to have lead levels higher than the threshold, they would then be tested again several months later. If they were found to have again exceeded that threshold, they would be assigned to the intervention protocol. This provided researchers with the opportunity to examine children who tested above the threshold the first time but not the second and compare them to those who tested above the threshold both times. Because only those who were above threshold twice were subject to interventions, these formed natural “control” and “test” groups, subject to quality and robustness checks. Because those in the intervention groups had higher lead exposure overall, their outcomes were statistically corrected for comparison to the control group.
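The sorting logic created by the two-stage screening amounts to a simple rule. In this sketch the `threshold` parameter is a placeholder supplied by the caller; the study’s actual clinical cutoff is not reproduced here.

```python
def assign_group(first_screen, second_screen, threshold):
    """Natural grouping created by two-stage lead screening.

    Children over the threshold on both screens received the intervention
    (treatment); those over it only on the first screen did not (control).
    The threshold value here is a hypothetical placeholder.
    """
    if first_screen < threshold:
        return "not in sample"   # never flagged, outside this comparison
    if second_screen >= threshold:
        return "treatment"       # elevated twice: assigned to the intervention
    return "control"             # flagged once, not confirmed: no intervention
```

The comparison is credible precisely because children just under and just over the threshold on the second screen are otherwise similar, though as noted, the treatment group’s higher average exposure still has to be statistically corrected for.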
As discussed in the last Study of the Week, this research uses an intent-to-treat model (“once randomized, always analyzed”) because it was not possible to tell what portion of test subjects actually completed the interventions, and because there will certainly be noncompliers in other real-world populations as well; this helps avoid overly strong estimates of intervention effects.
The interventions included education and awareness campaigns, general medical screenings for overall childhood health, nutritional interventions which are believed (but not proven) to be effective at mitigating the effects of lead exposure, educational interventions, and for higher levels of exposure, efforts to physically locate and remove the sources of contamination, usually lead paint. These efforts can be quite expensive, with an estimated average cost of intervention for in-study participants of $5,288. To my mind this is precisely the kind of thing a healthy society should ensure is paid for.
I want to note that this study strikes me as a monumental task. The sheer variety of data they pulled – birth records, housing records, educational data, criminal justice data, and others – must have taken great effort, and wrangling that amount of data from that many different sources is no mean feat. They even investigate which of their research subjects may have lived in the same house. And the sheer number of controls and quality tests employed here is remarkable. It’s admirable work, which will serve as a good model for replication going forward.
Unsurprisingly, lead exposure has a serious impact on educational outcomes:
This is consistent with a large body of research, as suggested previously. The behavioral outcomes are even more pronounced, which you can investigate in the paper. Bear in mind that in the raw numbers there are many confounds – poor people and people of color are disproportionately likely to live in lead-tainted environments, and they are also more likely to suffer from educational disadvantage in general, thanks to many social factors. But these trends are true within identified demographic groups as well.
Luckily, the intervention protocol does have an impact. To estimate it, the researchers combine this data (math and reading at 3rd and 8th grade and grade retention from grades 1-9) into an educational intervention index. They find an overall effect of .117 SD improvement relative to the control group in this index, though with a p-value significant only at the .10 level, which is not typically considered significant in many contexts. This is perhaps explained in part by the n of 301 and may be improved with larger-n replications. There is a great deal of difference in the metrics that make up the index, listed in Table 4, so I urge you to investigate their individual effect sizes and p-values.
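For readers unfamiliar with effects expressed in SD units: in its simplest form, a standardized effect like the .117 reported here is a difference in group means divided by a pooled standard deviation. The paper’s index construction is more involved than this; the numbers below are invented purely to show the arithmetic.

```python
import math

def standardized_effect(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Cohen's d: difference in means over the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

# Invented numbers: a treatment group scoring 0.117 index units above
# control, with both groups at unit variance, yields d = 0.117.
d = standardized_effect(0.117, 0.0, 1.0, 1.0, 150, 151)
```

Reporting in SD units is what lets the paper’s .117 be compared directly to benchmarks like the .4–.5 SD effects of high-intensity tutoring mentioned below.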
This overall effect size of .117 for the educational index is somewhat discouraging, even though they suggest the intervention does have a positive impact. The biggest positive educational interventions achievable by policy, such as high-intensity tutoring, tend to have a .4-.5 SD effect on quantitative outcomes; the black-white achievement gap in many metrics is around 1 SD. So we’re talking about modest gains that don’t close the educational disadvantages associated with lead exposure. This is perhaps intuitive given that these efforts are largely aimed at preventing more exposure, rather than counteracting the impact of past exposure. Still, in a world where we’re grasping for whatever positive impacts we can get, and given our clear moral responsibility to help children grow in lead-free environments regardless of the educational impacts, it’s an encouraging sign.
What’s more, the behavioral indices were more encouraging. The researchers assembled an antisocial behavior index including metrics related to school discipline and criminality. Here the effect size was .184, significant at an alpha of .05. These non-cognitive skills make a big impact on the quality of life of students, parents, teachers, and peers. The effect is still fairly modest, but more than worth the costs.
Seems pretty clear to me that we need robust efforts to clean up lead in our environment and to mitigate the damage done to people already exposed. This is an important study and I’m eager to see replication attempts.
I did a brief interview with someone who was writing a story about crowdfunded academic writing, which appears to have been killed by the prospective publication. In the interview the journalist asked me how I would define my basic philosophy on education, which I said was deeply out of fashion with most education writing. What I came up with off the cuff was “Mechanism Agnostic Low Plasticity Educational Realism,” which is I guess as good a gloss as any. This is my alternative to the Official Dogma of Education.
The basic idea is that both the overwhelming empirical evidence and common sense tells us that different people have different levels of academic ability, that they sort themselves into various achievement bands early in life, that this sorting is at scale and in general remarkably persistent over time and across a wide variety of educational contexts, and that our pedagogical and policy efforts will be most constructive and fruitful if we recognize this reality. This is not a claim that people can’t learn, or that they can’t be taught in better or worse ways. It is a claim that the portion of the variability in outcomes in any given educational metric that can be controlled by teachers or parents is dramatically lower than that which is commonly assumed.
I say low plasticity because the presumed degree to which any individual or group’s educational outcomes can be altered via schooling is usually assumed to be quite high – that is, the “no excuses” school of education philosophy, the “if you believe it you can achieve it” attitude that pervades our discourse, acts as though educational outcomes are highly plastic and subject to molding. And in contrast I suggest that the average level of plasticity in any given student’s outcomes is probably relatively low. Not zero, obviously – there are interventions that work better or worse, and we should work to maximize every student’s performance within ethical reason. And the degree of plasticity is probably variable as well; a child with a severe cognitive disability probably has more severe constraints on their outcomes than one without, just as a student who enjoys the benefits of extreme socioeconomic privilege and activist parents probably has a higher floor than the average. But across the system we should expect much less plasticity in outcomes than is commonly assumed.
I say mechanism agnostic because I am not entirely confident that we know why different people have consistently better academic outcomes than others, but we still know with great confidence that they do. Obviously, a lot of evidence suggests that differences in individual academic performance are genetic in origin. The degree and consistency of that genetic influence will need to continue to be investigated. But the details of how educational outcomes are shaped, while of immense importance, don’t change the remarkably consistent finding that different people have different levels of academic ability and that these tend not to change much over the course of life. In a policy context that has spawned efforts like No Child Left Behind, which assumes universal ability to hit arbitrary performance benchmarks, this is an essential insight.
The implied policy and philosophical changes for such a viewpoint are open to good-faith debate. As I have written in this space before, I think that recognizing that not all students have the same level of academic ability should agitate towards a) expanding the definition of what it means to be a good student and human being, b) not attempting to push students towards a particular idealized vision of achievement such as the mania for “every student should be prepared to code in Silicon Valley,” and c) a socialist economic system. Some people take this descriptive case and imagine that it implies a just-deserts, free market style of capitalism where differences in ability should be allowed to dictate differences in material wealth and security. I think it implies the opposite – a world of “natural,” unchosen inequalities in ability is a world with far more pressing need to achieve social and economic equality through communal action, as that which is uncontrolled by individuals cannot be morally used to justify their basic material conditions.
Before we get to those prescriptive conclusions, though, we need to get to the empirical observation – that the existence of a broad distribution of people into various tiers of academic ability, at certain predictable intervals and percentages, is not some error caused by the failure of modern schooling, but an inevitable facet of the nature of a world of variability. Until and unless we can have a frank discussion of the existence of persistent differences in academic ability within any identifiable subgroup of students, we can’t have real progress in our education policy.