genetic behaviorism supports the influence of chance on life outcomes


I’ve been trying, in this space, to rehabilitate for progressives the modern science of genetic influence on individual variation in academic outcomes. Many left-leaning people have perfectly reasonable fears about this line of inquiry: in the past, similar-sounding arguments were used to justify eugenics, and in the present, many racists make pseudoscientific arguments from similar evidence to justify their bigotry. Like others, I am interested in showing that there are progressive ways to understand genetic behaviorism that reject racism and support, rather than undermine, redistributive visions of social justice.

I can’t deny, though, that there are many regressive ways to make these arguments. That’s particularly true given that there’s a large overlap in the Venn diagram of IQ determinists and economic libertarians. I want to take a moment and demonstrate how conservatives misread and misuse genetic behaviorism to advance their ideological preferences for free market economics.

In this post, Ben Southwood of the conservative Adam Smith Institute uses evidence from genetic behaviorism and education research to argue that luck doesn’t really play much of a role in life outcomes. To prove this point, he cites many high-quality studies showing that random assignment (or lotteried-in/lotteried-out designs) to schools of supposedly differing quality has little impact on student academic outcomes. He argues that our understanding of genetic influence on intelligence should shape our perception of how much schools can really do to help struggling students. This is, in general, a line of thinking that fits with my own. But he then leaps to suggesting that what we call luck (let’s say the uncontrolled vicissitudes of chance and circumstance, those beyond the control of the individual) has little or nothing to do with life outcomes. He does so because this presumably lends credence to libertarian economics, which is based on a just-deserts model – the notion that the market economy basically rewards and punishes people in line with their own merit. This leap is totally unsupportable and is undermined by the very evidence he points to.

To begin with, Southwood ignores a particularly inconvenient fact for his brand of conservative determinism: the large portion of variation in IQ and academic outcomes that remains unaccounted for even after you account for genetics and the shared environment (code for the portion of a child’s environment controlled by parents and the family). There is famously (or notoriously) a portion of variation in measurable psychological outcomes that we can’t explain – a large portion, as much as half of the variation, maybe, depending on what study you’re looking at. And this portion seems unlikely to be explainable in systematic terms. Plomin and Daniels called this the “gloomy prospect,” writing:

One gloomy prospect is that the salient environment might be unsystematic, idiosyncratic, or serendipitous events such as accidents, illnesses, or other traumas . . . . Such capricious events are likely to prove a dead end for research.

Turkheimer wrote recently:

scientific study of the nonshared environment and molecular aspects of the genome have proven much harder than anyone anticipated. But I still feel bad about harping on it, as though I am spoiling the good vibes of hardworking scientists, who are naturally optimistic about the work they are conducting. But ever since I was in graduate school, I have felt that biogenetic science has always oversold their contribution, tried to convince everyone that the next new method is going to be the one that finally turns psychology into a real natural science, drags our understanding of ourselves out of the humanistic muck. But it never actually happens.

The gloomy prospect, in other words, represents exactly the influence of what we usually refer to as luck. Southwood claims that genetics explains perhaps 90 percent of the variation in adulthood, but that figure is an extreme upper-bound estimate; most of the literature suggests significantly more modest heritability. So we are left with this big uncontrolled portion, which, as Turkheimer says, has proven resistant to systematic understanding and which likely reflects truly idiosyncratic impacts on the lives of individuals. Unfortunately for progressives who want to dramatically improve educational outcomes by changing the home environment of children, quality studies consistently find that the impact of changes to that environment is minor. Unfortunately for Southwood, the unexplained portion of academic outcomes (and subsequent economic outcomes) looks precisely like chance – or at least, like that which is uncontrolled by either the individual or his or her parents. The last line of his post is thus totally unsupported by the evidence.

But there’s an even bigger issue for Southwood here: no one is in control of their own genotype. It’s bizarre when conservative-leaning people endorse genetic determinism as a justification for just-deserts economic theories. Genetic influence on human behavior stands directly in contrast to the notion that we control our own destinies. How, then, can Southwood advance a vision of free market economics as a system in which reward is parceled out fairly, given that the distribution of genetic material between individuals is entirely outside of their control? Which genetic code you happen to be born with is a lottery. I happen not to have gotten a scratch-off ticket that would have made me an NFL player or a research physicist. That’s not a tragedy, because I am still able to secure my basic material needs and comforts. But not everyone is so lucky, and for many the free market will result only in suffering and hopelessness.

It is immoral, and irrational, to build a society in which conditions you do not choose dictate whether you live rich and prosperous or poor and hopeless. That is true if this inequality is caused by inheriting money from your rich parents or by inheriting their genes or by being deeply influenced by the vagaries of chance. The best, most rational system in a world of uncontrolled variation in outcomes is a system that guarantees a standard of living even under the worst of luck – that is, socialism.

public services are not an ATM

Built into the rhetoric of school choice is a deeply misguided vision of how public investment works.

You sometimes hear people advocating for charters or voucher programs by saying that parents just want to take “their share” of public education funds and use it to get their child an education, whether by siphoning it from traditional public schools towards charters or by cutting checks to private schools. The money should “follow the child,” to use another euphemism. But this reflects a strange and deeply conservative vision of how public spending works. There is no “your share” of public funds. There is the money that we take via taxation from everyone, which represents the pooled resources of civic society, and there is what civic society decides to spend it on via the democratic process. You might use that democratic process to create a system where some of the money goes to charter schools or private school vouchers or all manner of things I don’t approve of. But it’s not your money, no matter how much you paid in taxes. And the distinction matters.

To begin with, the constantly repeated claim that charter schools don’t cost traditional public schools money has been proven wrong again and again. People lay out theoretical systems in which they don’t, as if you could subtract one student, along with all of the costs associated with that student, and simply shift the kid and the money to another school. But this reflects a basic failure to understand pooled costs and economies of scale. And when we go looking, that’s what we find: after years of promises that charters are not an effort to defund traditional public schools, our reality checks show they have exactly that effect. Take Chicago, where the charter school system has absolutely contributed to the fiscal crisis in the traditional public schools. Or Nashville. Or Los Angeles. I could go on.

But suppose we knew that we could extract exactly as much, dollar for dollar and student for student, from public education for each student who leaves. Would that be a wise thing to do? Not according to any conventional progressive philosophy towards government.

Do we let you take “your share” out of the public transportation system so that you can use it to defray the cost of buying your own car? Can you take “your share” out of the police budgets to hire your own private security? Can I extract my tax dollars from the public highway system I almost never use in order to build my own bike lanes? Of course not. In many cases this simply wouldn’t make sense; how can you extract your share from a building, or a bridge, or any other type of physical infrastructure? And besides: the basic progressive nature of public ownership means that we are pooling resources so that those who have the least ability to pay for their own services can benefit from the contributions of those with the most ability to pay. To advance the notion of people pulling “their” tax dollars out from public schools undermines the very conception of shared social spending. And governmental spending should require true democratic accountability; letting the Bill and Melinda Gates Foundation dictate public education policy, Mark Zuckerberg become the wholly unqualified education czar of Newark, or the Catholic church control public education dollars through voucher programs directly undermines that accountability.

So of course there’s a deep and widening split opening up within the school reform coalition, which has always been filled with self-styled progressives. There’s a major, existential disagreement at play about the basic concepts of social spending and the public good. These disagreements have been papered over for years by the missionary zeal of choice acolytes and their crisis narrative. But there was never a coherent progressive political philosophy underneath. The Donald Trump and Betsy DeVos education platform is a disaster in the making, but at least it has brought these basic conflicts into the light. These issues are not going away, nor should they, and the “progressive” ed reform movement is going to have to do a lot of soul searching.

Reporting Regression Results Responsibly

We’re in a Golden Age for access to data, which unfortunately also means we’re in a Golden Age for the potential to misinterpret data. Though the absurdity of gated academic journals persists, academic research is more accessible now than ever before. We’ve also seen a rapid growth in the use of arguments based on statistics in the popular media in the last several years. This is potentially a real boon to our ability to understand the world around us, but it carries with it all of the potential for misleading statistical arguments.

My request is pretty simple. All statistical techniques, particularly the basic parametric techniques that are most likely to show up in data journalism, require the satisfaction of assumptions and the checking of diagnostic measures to ensure that hidden bias isn’t misleading us. Many of these assumptions and diagnostics are ultimately judgment calls, relying on practitioners to make informed decisions about what degree of wiggle room is appropriate given the research scenario. There are, however, conventions and implied standards that people can use to guide their decisions. The most important and useful kind of check, though, is the eyes of other researchers. Given that hosting graphs, tables, and similar kinds of data online is simple and nearly free, I think that researchers and data journalists alike should provide links to their data and to the graphs and tables they use to check assumptions and diagnostic measures. In the digital era, it’s crazy that this is still a rare practice. I don’t expect to find these graphs and tables sitting square in the center of a blog post, and I expect that 90% of readers wouldn’t bother to look. But there’s nothing to lose in having them available, and transparency, accountability, and collaboration to gain.

That’s the simple part, and you can feel free to close the tab. For a little more:

What kind of assumptions and diagnostics am I talking about? Let’s consider the case of one of the most common parametric methods, linear regression. Whether we have a single predictor for simple linear regression or multiple predictors for multiple linear regression, regression is fundamentally a matter of assessing the relationship between quantitative (continuous) predictor variables and a quantitative (continuous) outcome variable. For example, we might ask how well SAT scores predict college GPA; we might ask how well age, weight, and height predict blood pressure. When someone talks about how one number predicts another, the strength of their relationship, and how we might attempt to change one by changing the other, they’re probably making an appeal to regression.
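To make that concrete, here’s a minimal sketch in Python with statsmodels – the data and the SAT/GPA relationship are simulated stand-ins I invented for illustration, not numbers from any real study:

```python
# A minimal regression sketch on simulated, invented data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"sat": rng.uniform(400, 1600, 200)})
df["gpa"] = 1.0 + 0.0015 * df["sat"] + rng.normal(0, 0.4, 200)  # fake relationship

results = smf.ols("gpa ~ sat", data=df).fit()  # simple linear regression
print(results.summary())                       # slope, r-squared, and more
```

The summary output is where the reporting starts; the assumptions and diagnostics below are what should stand behind it.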

The types of regression analysis, and the issues therein, are vast, and there are many technical issues at play that I’ll never understand. But I think it’s worthwhile to talk about some of the assumptions we need to check and some problems we have to look out for. Regression has come in for a fair amount of abuse lately from sticklers and skeptics, and not for no reason; it’s easy to use the techniques irresponsibly. But we’re inevitably going to ask basic questions of how X and Y predict Z, so I think we should expand public literacy about these things. I want to talk a little bit about these issues not because I think I’m qualified to teach statistics to others, or because regression is the only statistical process that we need to see assumptions and diagnostics for. Rather, I think regression is an illustrative example through which to explore why we need to check this stuff, to talk about both the power and pitfalls of public engagement with data.

There are four assumptions that need to be true to run a linear (least squares) regression: independence of observations, linearity, constancy of variance, and normality. (Some purists add a fifth, existence, which, whatever.)

Independence of Observations

This is the biggie, and it’s why doing good research can be so hard and expensive. It’s the necessary assumption that one observation does not affect another. This is the assumption that requires randomness. Remember that in statistics, error (necessary and expected variation) is inevitable, but bias (a systematic influence on observations) is lethal.

Suppose you want to see how eating ice cream affects blood sugar level. You gather 100 students into the gym and have them all eat ice cream. You then go one by one through the students and give them a blood test. You dutifully record everyone’s values. When you get back to the lab, you find that your data does not match that of much of the established research literature. Confused, you check your data again. You use your spreadsheet software to arrange the cells by blood sugar. You find a remarkably steady progression of results running higher to lower. Then it hits you: it took you several hours to test the 100 students. The highest readings are all from the students who were first to be tested, the lowest from those who were tested last. Your data was corrupted by an uncontrolled variable, time-after-eating-to-test. Your observations were not truly independent of each other – one observation influenced another because taking one delayed taking the other. This is an example that you’d hope most people would avoid, but the history of research is the history of people making oversights that were, in hindsight, quite obvious.

Independence is scary because threats to it so often lurk out of sight. And the presumption of independence often prohibits certain kind of analysis that we might find natural. For example, think of assigning control and test conditions to classes rather than individual students in educational research. This is often the only practical way to do it; you can’t fairly ask teachers to only teach half their students one technique and half another. You give one set of randomly-assigned classes a new pedagogical technique, while using the old standard with your control classes. You give a pre- and post-test to both and pop both sets of results in an ANOVA. You’ve just violated the assumption of independence. We know that there are clustering effects of children within classrooms; that is, their results are not entirely independent of each other. We can correct for this sort of thing using techniques like hierarchical modeling, but first we have to recognize that those dangers exist!
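Here’s a sketch of what recognizing and correcting for that clustering might look like, using a random-intercept mixed model in statsmodels; every name and number below is hypothetical:

```python
# Sketch: students clustered within classrooms, with a random intercept
# per classroom so the model respects the non-independence.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_classes, n_students = 20, 25
classroom = np.repeat(np.arange(n_classes), n_students)
class_effect = rng.normal(0, 5, n_classes)[classroom]  # clustering effect
condition = classroom % 2                              # treatment assigned by class
score = 70 + 3 * condition + class_effect + rng.normal(0, 10, len(classroom))

df = pd.DataFrame({"score": score, "condition": condition, "classroom": classroom})
m = smf.mixedlm("score ~ condition", data=df, groups=df["classroom"])
print(m.fit().summary())  # the random intercept absorbs classroom clustering
```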

Independence is the assumption that is least subject to statistical correction. It’s also the assumption that is the hardest to check just by looking at graphs. Confidence in independence stems mostly from rigorous and careful experimental design. You can check a graph of your observations (your actual data points) against your residuals (the distance between your observed values and the linear progression from your model), which can sometimes provide clues. But ultimately, you’ve just got to know your data was collected appropriately. On this one, we’re largely on our own. However, I think it’s a good idea for academic researchers to provide online access to a Residuals vs. Observations graph when they run a regression. This is very rare, currently.

Here’s a Residuals vs. Observations graph I pulled off of Google Images. This is what we want to see: snow. Clear nonrandom patterns in this plot are bad.
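If you want to produce one of these yourself, here’s a minimal sketch on simulated, well-behaved data, so the plot should come out as snow:

```python
# Sketch: residuals plotted against fitted values; no structure is good.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 300)
y = 2 * x + rng.normal(0, 1, 300)  # clean linear relationship

results = sm.OLS(y, sm.add_constant(x)).fit()
plt.scatter(results.fittedvalues, results.resid, s=10)
plt.axhline(0, color="gray")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()
```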

Linearity

The name of the technique is linear regression, which means that observed relationships should be roughly linear to be valid. In other words, you want your relationship to fall along a more or less linear path as you move across the x axis; the relationship can be weaker or stronger, but it should hold more or less steadily as you move along the line. This matters because curvilinear relationships can appear to regression analysis to be no relationship at all. Regression is all about interpolation: if I check my data and find a strong linear relationship, and my data has a range from A to B, I should be able to check any x value between A and B and have a pretty good prediction for y. (What “pretty good” means in practice is a matter of residuals and r-squared, the portion of the variance in y that’s explained by my xs.) If my relationship isn’t linear, my confidence in that prediction is unfounded.

Take a look at these scatter plots. Both show close to zero linear relationship according to Pearson’s product-moment coefficient:

And yet clearly, there’s something very different going on from one plot to the next. The first is true random variance; there is no consistent relationship between our x and y variables. The second is a very clear association; it’s just not a linear relationship. The degree and direction of y varying along x changes over different values for x. Failure to recognize that non-linear relationship could compel us to think that there is no relationship at all. If the violation of linearity is as clear and consistent as in this scatter plot, it can be cleaned up fairly easily by transforming the data.
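You can see the punchline in a few lines of simulated data: both of the ys below produce a Pearson r near zero, though only one of them is actually unrelated to x:

```python
# Sketch: pure noise and a clear U-shaped curve both yield r near 0.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 500)
y_noise = rng.normal(size=500)                    # genuinely no relationship
y_curve = x**2 + rng.normal(scale=0.5, size=500)  # strong, but not linear

print(pearsonr(x, y_noise)[0])  # near 0: nothing there
print(pearsonr(x, y_curve)[0])  # also near 0: linearity violated, not absent
```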

Regression is fairly robust to violations of linearity, and it’s worth noting that any relationship with a correlation appreciably below 1 will be non-linear in the strict sense. But clear, consistent curves in the data can invalidate our regression analyses.

Readers could check data for linearity if scatter plots were posted for simple linear regression. For multiple linear regression, it’s a bit messier; you could plot every individual predictor, but I would be satisfied if researchers just mentioned that they checked linearity.

Constancy of variance

Also known by one of my very favorite ten-cent words: homoscedasticity. Constancy of variance means that, along your range of x predictors, your y varies by about the same amount; it has as much spread, as much error, throughout. Remember, when I’m doing inferential statistics, I’m sampling, and sampling means sampling error – even if I’m getting quality results, I’m inevitably going to get differences in my data from one collection of samples to the next. But if our assumptions are true, we can trust that those samples will vary in predictable intervals relative to the true mean. That is, if an SAT score predicts freshman year GPA with a certain degree of consistency for students scoring 400, it should be about as consistent for students scoring 800, 1200, and 1600, even though we know that from one data set to the next we’re not going to get the exact same values even if we assume that all of the variables of interest are the same. We just need to know that the degree to which they vary for a given x is constant over our range.

Why is this important? Think again about interpolation. I run a regression because I want to understand a relationship between various quantitative variables, and often because I want to use my predictor variables to… predict. Regression is useful insofar as I can move along the axes of my x values and produce a meaningful, subject-to-error-but-still-useful value for y. Violating the assumption of constant variance means that you can’t predict y with equal confidence as you move around x(s); the relationship is stronger at some points than others, making you vulnerable to inaccurate predictions.

Here’s a residuals plot showing the dreaded megaphone effect: the error (size of residuals, difference between observations and results expected from the regression equation) increases as we move from low to high values of x. The relationship is strong at low values of x and much weaker at high values.

We could check homoscedasticity if residual plots were made available. Violations of constant variance can often be fixed via transformation, although it may be easier to use techniques that are inherently more robust to this violation, such as quantile regression.
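There are formal checks, too. Here’s a sketch of the Breusch-Pagan test in statsmodels, run on simulated megaphone data; a small p-value flags non-constant variance:

```python
# Sketch: Breusch-Pagan test on data whose noise grows with x.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 400)
y = 3 * x + rng.normal(0, x)  # error scale grows with x: heteroscedastic

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(results.resid, X)
print(f"Breusch-Pagan p-value: {lm_pval:.4f}")  # small: variance isn't constant
```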

Normality

The concept of the normal distribution is at once simple and counterintuitive, and I’ve spent a lot of my walks home trying to think of the best way to explain it. The “parametric” in parametric statistics refers to the assumption that there is a given underlying distribution for most observable data, and frequently this distribution is the normal distribution, or bell curve. Think of yourself walking down the street and noticing that someone is unusually tall or unusually short. The fact that you notice at all is a consequence of the normal distribution. When we think of someone as unusually tall or short, we are implicitly assuming that we will find fewer and fewer people as we move further out along the extremes of the height distribution. If you see a man in North America who is 5’10, he is above average height, but you wouldn’t bat an eye; if you see a man who is 6’3, you might think to yourself, that’s a tall guy; when you see someone who is 6’9, you say, wow, he is tall!; and when you see a 7-footer, you take out your cell phone. This is the central meaning of the normal distribution: the average is more likely to occur than the extremes, and the relationship between position on the distribution and probability of occurrence is predictable.

Not everything in life is normally distributed. Poll 1,000 people and ask how much money they received in car insurance payments last year and it won’t look normal. But a remarkable number of naturally occurring phenomena are normally distributed, simply thanks to the reality of numbers and extremes, and the central limit theorem teaches us that the means of sufficiently large samples are approximately normally distributed. (That is, if I take a 100-person sample of a population for a given quantitative trait, I will get a mean; if I take another 100-person sample, I will get a similar but not identical mean, and so on. If I plot those means, they will look normal even if the overall distribution does not.)
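A quick simulation makes the point; the payout numbers here are invented and heavily skewed, but the sample means still pile up into a bell:

```python
# Sketch: the central limit theorem on skewed, invented payout data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
payouts = rng.exponential(scale=1000, size=100_000)  # wildly non-normal

means = [rng.choice(payouts, size=100).mean() for _ in range(2000)]
plt.hist(means, bins=40)  # roughly bell-shaped despite the skewed source
plt.title("Means of 100-observation samples")
plt.show()
```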

The assumption of normality in regression requires the errors – in practice, the residuals – to be roughly normally distributed; in order to assess the relationship of y as it moves across x, we need to know the relative frequency of extreme observations compared to observations close to the mean. It’s a fairly robust assumption, and you’re never going to have perfectly normal data, but too strong a violation will invalidate your analysis. We check normality with what’s called a QQ plot. Here’s an almost-perfect one, again scraped from Google Images:

That strongly linear, nearly 45 degree angle is just what we want to see. Here’s a bad one, demonstrating the “fat tails” phenomenon – that is, too many observations clustered at the extremes relative to the mean:

Usually the rule is that unless you’ve got a really clear break from a straightish 45 degree angle, you’re probably alright. When the going gets tough, seek help from a statistician.
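If you want to generate a QQ plot of your own residuals, here’s a minimal sketch on simulated data; with normal errors the points should hug that 45 degree line:

```python
# Sketch: QQ plot of regression residuals against the normal distribution.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 300)
y = 2 * x + rng.normal(0, 1, 300)
results = sm.OLS(y, sm.add_constant(x)).fit()

sm.qqplot(results.resid, line="45", fit=True)  # points near the line: good
plt.show()
```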

Diagnostics

OK, so 2,000 words into this thing, we’ve checked our four assumptions. Are we good? Well, not so fast. We need to check a few diagnostic measures, or what my stats instructor used to call “the laundry list.” This is a matter of investigating influence. When we run an analysis like regression, we’re banking on the aggregate power of all of our observations to help us make responsible estimates and inferences. We never want to rely too heavily on individual observations or small numbers of them, because that increases the influence of error in our analysis. Diagnostic measures in regression typically involve using statistical procedures to look for influential observations that have too much sway over our analysis.

The first thing to say about outliers is that you want a systematic reason for eliminating them. There are entire books about the identification and elimination of outliers, and I’m not qualified to say what the best method is in any given situation. But you never want to toss an observation simply because it would help your analysis. When you’ve got that one data point that’s dragging your line out of significance, it’s tempting to get rid of it, but you want to analyze that observation for a methodology-internal justification for eliminating it. On the other hand, sometimes you have the opposite situation: your purported effect is really the product of a single or small number of influential outliers that have dragged the line in your favor (that is, to a p-value you like). Then, of course, the temptation is simply to not mention the outlier and publish anyway. Especially if a tenure review is in your future…

Some examples of influential-observation diagnostics in regression include leverage, which flags outliers in your predictors that have a great deal of influence on your overall model; Cook’s Distance, which tells you how different your model would be if you deleted a given observation; DFBETAS, which tell you how a given observation influences a particular parameter estimate; and more. Most modern statistical packages, like SAS or R, have commands for checking diagnostic measures like these. While offering numbers would be nice, I would mostly like it if researchers reassured readers that they had run diagnostic measures for regression and found acceptable results. Just let me know: I looked for outliers and influential observations and things came back fairly clean.
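For what it’s worth, pulling these numbers out of a fitted model is easy. A sketch, again on invented data, this time with one planted outlier:

```python
# Sketch: leverage, Cook's Distance, and DFBETAS from statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 * x + rng.normal(0, 1, 100)
y[0] += 25  # plant one gross outlier

results = sm.OLS(y, sm.add_constant(x)).fit()
influence = results.get_influence()

print("max leverage:", influence.hat_matrix_diag.max())
print("max Cook's D:", influence.cooks_distance[0].max())
print("max |DFBETA|:", np.abs(influence.dfbetas).max())
```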

*****

Regression is just one part of a large number of techniques and applications happening in data journalism right now. But essentially any statistical technique is going to involve checking assumptions and diagnostic measures. A typical ANOVA, for example – the categorical equivalent of regression – will involve checking some of the same assumptions. In the era of the internet, there is no reason not to provide a link to a brief, simple rundown of what quality controls were pursued in your analysis.

None of these things are foolproof. Sums of squares are spooky things; we get weird results as we add and remove predictors from our models. Individual predictors are strongly significant by themselves but not when added together; models are significant with no individual predictors significant; individual predictors are highly significant without model significance; the order you put your predictors in changes everything; and so on. It’s fascinating and complicated. We’re always at the mercy of how responsible and careful researchers are. But by sharing information, we raise the odds that what we’re looking at is a real effect.

This might all sound like an impossibly high bar to clear. There are so many ways things can go wrong. And it’s true that, in general, I worry that people today are too credulous towards statistical arguments, which are often advanced without sufficient qualification. There are some questions where statistics more often mislead than illuminate. But there is a lot we can and do know. We know that age is highly predictive of height in children but not in adults; we know that there is a relationship between SAT scores and freshman year GPA; we know that point differential is a better predictor of future win-loss record than past win-loss record. We can learn lots of things, but we always do it better together. So I think that academic researchers and data journalists should share their work to a greater degree than they do now. That requires a certain compromise. After all, it’s scary to have tons of strangers looking over your shoulder. So I propose that we get more skeptical and critical about statistical arguments as a media and readership, but more forgiving of individual researchers, who are, after all, only human. That strikes me as a good bargain.

And one I’m willing to make myself, so please email me to point out the mistakes I’ve inevitably made in this post.

diversifying the $5 reward tier

Hey gang, first I’m sorry content has been a bit light on the main site this week. Good things are coming in bunches soon. I have been releasing archival content to all subscribers on the Patreon page at a steady clip. I wanted to let you know that I’ve decided to diversify the $5 patron content a little. It’s not so much that I’m not keeping up with the book reading – it’s been a bit tough but not bad – but rather that I’m feeling a little constrained by the review format. So I’m going to alternate between book reviews and more general cultural writing, reading recommendations, considerations of contemporary criticism, etc. There will still not be any explicitly political content, which I host on Medium.

Book reviews return this weekend at last, though, and thanks for your patience. I’ve got a number of good ones coming up. Thank you for your continued support. If you aren’t yet a Patreon patron, please consider it. Also, thanks so much for the emails, and I apologize if I haven’t gotten back to you. I’ve taken some unexpected heat lately, and the support means more than I can say.

g-reliant skills seem most susceptible to automation

This post is 100% informed speculation.

As someone who is willing to acknowledge that IQ tests measure something real, measurable, and largely persistent, I take some flak from people who are skeptical of such metrics. As someone who does not think that IQ (or g, the general intelligence factor that IQ tests purport to measure) is the be-all, end-all of human worth, I take some flak from the internet’s many excitable champions of IQ. This is one of those things where I get accused of strawmanning – “nobody thinks IQ measures everything worthwhile!” – but please believe me that long experience shows that there are an awful lot of very vocal people online who are deeply insistent that IQ measures not just raw processing power but all manner of human value. Like so many other topics, IQ seems to be subject to a widespread binarism, with most people clustered at two extremes and very few with more nuanced positions. It’s kind of exhausting.

I want to make a point that, though necessarily speculative, seems highly intuitive to me. If we really are facing an era where superintelligent AI is capable of automating a great many jobs out from under human workers, it seems to me that g-reliant jobs are precisely the ones most likely to be automated away. If the g factor represents the ability to do raw intellectual processing, then that factor will likely become less economically valuable as such processing is offloaded to software. IQ-dominant tasks in specific domains like chess have already been conquered by task-specific AI. It doesn’t seem like a stretch to suggest that more obviously vocational skills will be colonized by new AI systems.

Meanwhile, contrast this with professions that are dependent on “soft” skills. Extreme IQ partisans are very dismissive of these things, often arguing that they aren’t real or that they’re just correlated with IQ anyway. But I believe that there are social, emotional, and therapeutic skills that are not validly measured by IQ tests, and these skills strike me as precisely those that AI will have the hardest time replicating. Human social interactions are incredibly complex and are barely understood by human observers who are steeped in them every day. And human beings need each other; we crave human contact and human interaction. It’s part of why people pay for human instructors in all sorts of tasks that they could learn from free online videos, why we pay three times as much for a drink at a bar as we would pay to mix it at home, why we have set up these odd edifices like coworking spaces that simply permit us to do solo tasks surrounded by other human beings. I don’t really know what’s going to happen with automation and the labor market; no one does. But that so many self-identified smart people are placing large intellectual bets on the persistent value of attributes that computers are best able to replicate seems very strange to me.

You could of course go too far with this. I don’t think that people at the very top of their games need to worry too much; research physicists, for example, probably combine high IQs with a creative/imaginative capacity we haven’t yet really captured in research. But the thing about these extremely high performers is that they’re so rare that they’re not really relevant from a big-picture perspective anyway. It’s the larger tiers below, the people whose jobs are g-dependent but who aren’t part of a truly small elite, that I think should worry – maybe not that group today, but its analog 50 or 100 years from now. I mean, despite all of the “teach a kid to code” rhetoric, computer science is probably a heavily IQ-screened field, and it’s silly to try to push everyone into it anyway. But even beyond that… someday it’s code that will write code.

Predictions are hard, especially about the future. I could be completely wrong. But this seems like an intuitively persuasive case to me, and yet I never hear it discussed much. That’s the problem with the popular conversation on IQ being dominated by those who consider themselves to have high IQs; they might have too much skin in the game to think clearly.

why universities can’t be the primary site of political organizing

This is not a political publication, but I am definitely interested in discussing campus issues in this space, and I would like to take a second and lay out some reasons why Amber A’Lee Frost is correct that the university can’t be the key site of left-wing (or any other) organizing. (If you think that idea’s a strawman, I invite you to read the Port Huron Statement.)

Please note that this is a series of empirical claims, not normative ones. I’m not saying it would be good or bad for campus to be the key site of a given movement’s organizing strategy. I’m saying that it’s not going to work, for good or bad.

There’s not a lot of people on campus. There are a lot of universities out there, and you could be forgiven for overestimating the size of the student population. But NCES says there are only about 20 million students, grad and undergrad, enrolled in degree-granting post-secondary institutions, plus about 4 million people who work in those institutions. Back of the envelope, that’s roughly 24 million people out of a US population of around 320 million – about 7.5% of Americans regularly on campus in one capacity or another, setting aside questions of online-only education. Is 7.5% nothing? Not at all. It’s a meaningful chunk of people. But even if all of them were capable of being politically organized – which of course is far from the truth – you’re still leaving out the vast majority of the adult population.

Campus activism is seasonal. You aren’t going to hear a lot about campus protests for a few months. Why? Because of summer break. Vacation is notoriously hard on student protest groups. Why did the “campus uprising” of a few years ago fizzle out? In large measure because of Christmas break – the spring semester wasn’t nearly as active as the fall – and then summer break. Activism requires momentum and continuity of practice, and the regularity of vacation makes that quite difficult. Organizations that are careful and have strong leadership in place can take steps to adjust for this seasonal nature, but there’s just always going to be major lulls in campus organizing according to the calendar. And politics happens year-round.

College students are an itinerant population. Speaking of continuity of practice, campus political groups constantly have to replace membership and leadership because students (we hope) will eventually graduate. Again, that problem can be ameliorated with hard work and forethought by these groups, but it’s very difficult to have consistent strength of numbers and a coherent political vision when you’re seeing 100% turnover in a 5-6 year span.

Town and gown conflicts can make local organizing difficult. Sadly, many university towns are sites of tension and mutual distrust between the campus community and the locals. The degree of these tensions varies widely from campus to campus, and they can be ameliorated. In fact, making attempts to heal those divides can be the best form of campus activism. But it’s the case that the complex conflicts between colleges and the towns in which they’re housed will often make it difficult to build meaningful solidarity across the campus borders, which often serve as an invisible wall of attention and community.

Students are too busy to devote too much time to organizing. 70% of college students work. A quarter have dependent children. These students must also do all of the necessary work of being students. We should be realistic and fair with their time and recognize that a majority of students will not be able to engage politically for many hours out of the week.

College students have a natural and justifiable first-order priority of getting employed. Everyone who works is of course at risk of professional repercussions for their political engagement, but college students perhaps have a unique set of worries about being publicly politically active, particularly in the era of the internet. Nowadays, we’re all constantly building an easily searchable, publicly accessible archive of the things we once thought and did. This is particularly troublesome for those who have not yet gotten their first jobs and have yet to build the kind of social capital necessary to feel secure in their ability to get work with a controversial political past. It’s my impression that a lot of college students are inclined to be political but feel that they simply can’t risk it, and that’s a fear we should respect given the modern job market.

College activism can either be a low-stakes place where students learn and grow safely, or an essential site of organizing – but it can’t be both. Oftentimes, when campus activists make mistakes (such as forcing a free yoga class for disabled students to be shut down because yoga is “cultural appropriation”), defenders will say, hey, they’re just college kids – they need a chance to screw up, to make mistakes, to be free to fail. And there’s some real truth to that. The problem is that this attitude cannot coexist with the idea that campus has to be a central site or the central site of left-wing political organizing. If what happens on campus is crucial to the broader left movement, it can’t then be called not worth worrying about; if campus organizing is a space that is largely free of consequences for young activists, then it can’t be a space where essential political work gets done. These ideas are not compatible.

Organize the campus’s workforce according to labor principles. None of this means that organizing shouldn’t take place on campus; it absolutely should. But like Frost, I think that the left is far too fixated on what happens in campus spaces, likely because these spaces are some of the only areas where the left appears to hold any meaningful power. Student activists should be encouraged to engage politically in order to learn and grow, but we should not imagine that they are the necessary vanguard of the young left, given that only a third of Americans ever get a college degree. Meanwhile, we absolutely must continue to organize the campus as a workplace. (For the record, Frost is a member of a campus union, as am I.) But that organization takes place according to labor principles, not according to any special dictates of academic culture. And this returns to Frost’s basic thesis: it is the organization of labor, not of students, that must be the primary focus and goal of the American left.

correlation: neither everything nor nothing


One thing that everyone on the internet knows, about statistics, is this: correlation does not imply causation. It’s a stock phrase, a bauble constantly polished and passed off in internet debate. And it’s not wrong, at least not on its face. But I worry that the denial of the importance of correlation is a bigger impediment to human knowledge and understanding than belief in specious relationships between correlation and causation.

First, you should read two pieces on the “correlation does not imply causation” phenomenon, which has gone from a somewhat arcane notion common to research methods classes to a full-fledged meme. This piece by Greg Laden is absolute required reading on correlation and causation and how to think about both. Second, this piece by Daniel Engber does good work talking about how “correlation does not imply causation” became an overused and unhelpful piece of internet lingo.

As Laden points out, the question is really this: what does “imply” mean? The people who employ “correlation does not imply causation” as a kind of argumentative trump card are typically using “imply” in a way that nobody actually means, which is as synonymous with “prove.” That’s pretty far from what we usually mean by “implies”! In fact, using the typical meaning of implication, correlation sometimes implies causation, in the sense that it provides evidence for a causal relationship. In careful, rigorously conducted research, a strong correlation can offer some evidence of causation, if that correlation is embedded in a theoretical argument for how that causative relationship works. If nothing else, correlation is often the first stage in identifying relationships of interest that we might then investigate in more rigorous ways, if we can.

A few things I’d like people to think about.

There are specific reasons that an assertion of causation from correlational data might be incorrect. There is a vast literature on research methodology across just about every field you can imagine, and correlation-causation fallacies have been investigated and understood for a long time. Among the potential dangers is the confounding variable, where an unseen third variable drives the change in two other variables, making them appear to influence one another. This gives us the famous drownings-and-ice-cream correlation – as drownings go up, so do ice cream sales. The confounding variable, of course, is temperature. There are all sorts of nasty little interpretation problems in the literature, and these dangers are real. But in order to have understanding, we have to actually investigate why a particular relationship is spurious. Just saying “correlation does not imply causation” doesn’t do anything to improve our understanding. Explore why, if you want to be useful. Use the phrase as the beginning of a conversation, not a talisman.
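You can watch a confounder do its work in simulation: temperature drives both variables below, the raw correlation is large, and it collapses once temperature enters the model. All of the numbers are made up:

```python
# Sketch: a confounder (temperature) manufactures a spurious correlation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
temp = rng.normal(70, 15, 1000)                   # daily temperature
ice_cream = 2.0 * temp + rng.normal(0, 10, 1000)  # sales driven by temp
drownings = 0.1 * temp + rng.normal(0, 1, 1000)   # drownings driven by temp

print(pearsonr(ice_cream, drownings)[0])  # strong raw correlation

df = pd.DataFrame({"d": drownings, "i": ice_cream, "t": temp})
fit = smf.ols("d ~ i + t", data=df).fit()
print(fit.params["i"])  # near zero once temperature is controlled for
```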

Correlation evidence can be essential when it is difficult or impossible to investigate a causative mechanism. Cigarette smoking causes cancer. We know that. We know it because many, many rigorous and careful studies have established that connection. It might surprise you to know that the large majority of our evidence demonstrating that relationship comes from correlation studies, rather than experiments. Why? Well, as my statistics instructor used to say – here, let’s prove cigarette smoking causes cancer. We’ll round up some infants, and we’ll divide them into experimental and control groups, and we’ll expose the experimental group to tobacco smoke, and in a few years, we’ll have proven a causal relationship. Sound like a good idea to you? Me neither. We knew that cigarettes were contributing to lung cancer long before we identified what was actually happening in the human body, and we have correlational studies to thank for that. Blinded randomized controlled experimental studies are the gold standard, but they are rare precisely because they are hard, sometimes impossible. To refuse to take anything else as meaningful evidence is nihilism, not skepticism.

Sometimes what we care about is association. Consider relationships which we believe to be strong but in which we are unlikely to ever identify a specific causal mechanism. I have on my desk a raft of research showing a strong correlation between parental income and student performance on various educational metrics. It’s a relationship we find in a variety of locations, across a variety of ages, and through a variety of different research contexts. This is important research, it has stakes; it helps us to understand the power of structural advantage and contributes to political critique of our supposedly meritocratic social systems.

Suppose I were prohibited from asserting that this correlation proved anything because I couldn’t prove causation. My question is this: how could I find a specific causal mechanism? The relationship is likely very complex, and in some cases, not subject to external observation by researchers at all. To refuse to consider this relationship in our knowledge making or our policy decisions because of an overly skeptical attitude towards correlational data would be profoundly misguided. Of course there are limitations and restrictions we need to keep in mind – the relationship is consistent but not universal, its effect is different for different parts of the income scale, it varies with a variety of factors. It’s not a complete or simple story. But I’m still perfectly willing to say that poverty is associated with poor educational performance. That’s the only reasonable conclusion from the data. That association matters, even if we can’t find a specific causal mechanism.

Correlation is a statistical relationship. Causation is a judgment call. I frequently find that people seem to believe that there is some sort of mathematical proof of causation that a high correlation does not merit, some number that can be spit out by statistical packages that says “here’s causation.” But causation is always a matter of the informed judgment of the research community. Controlled experiments are the gold standard in that regard, but there are controlled experiments that can’t prove causation and other research methods that have established causation to the satisfaction of most members of a discipline.

Human beings have the benefit of human reasoning. One of my frustrations with the “correlation does not imply causation” line is that it’s often deployed in instances where no one is asserting that we’ve adequately proved causation. I sometimes feel as though people are trying to protect us from mistakes of reasoning that no one would actually fall victim to. In an (overall excellent) piece for the Times, Gary Marcus and Ernest Davis write, “A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two.” That’s true – it is hard to imagine! So hard to imagine that I don’t think anyone would have that problem. I get the point that it’s a deliberately exaggerated example, and I also fully recognize that there are some correlation-causation assumptions that are tempting but wrong. But I think that, when people state the dangers of drawing specious relationships, they sometimes act as if we’re all dummies. No one will look at these correlations and think they’re describing real causal relationships because no one is that senseless. So why are we so afraid of that potential bad reasoning?

Those disagreeing with conclusions drawn from correlational data have a burden of proof too. This is the thing, for me, more than anything. It’s fine to dispute a suggestion of causation drawn from correlation data. Just recognize that you have to actually make the case. Different people can have responsible, reasonable disagreements about statistical inferences. Both sides have to present evidence and make a rational argument drawn from theory. “Correlation does not imply causation” is the beginning of discussion, not the end.

I consider myself on the skeptical side when it comes to Big Data, at least in certain applications. As someone who is frequently frustrated by hype and woowoo, I’m firmly in the camp that says we need skepticism ingrained in how we think and write about statistical inquiry. I personally do think that many of the claims about Big Data applications are overblown, and I also think that the notion that we’ll ever be post-theory or purely empirical is dangerously misguided. But there’s no need to throw the baby out with the bathwater. While we should maintain a healthy criticism of them, new ventures dedicated to researched, data-driven writing should be greeted as a welcome development. What we need, I think, is to contribute to a communal understanding of research methods and statistics, including healthy skepticism, and there’s reason for optimism in that regard. Reasonable skepticism, not unthinking rejection; a critical utilization, not a thoughtless embrace.



you learn by being taught

Forgive the relative quiet lately; I’ve been enjoying my birthday weekend and then catching up on a ton of work. There’s a bunch of good things coming this week, including the return of book reviews after a brief (and unplanned) break.

This morning I spoke to an entire public high school, where I was invited to discuss being a product of public schools, higher ed, and success. It was very funny for me to be asked, though flattering – as I told the kids today, I would never think of myself casually as a success. Who ever thinks that way, beyond the wealthy and the deluded? But it was flattering and fun. I told them that there was no great wisdom in life, just a series of decisions before you, and hopefully with time the perspective to be able to choose better from worse. And, because I think this is important, I told them that they needed to cultivate a sense of “good enough” in their lives. At that age, they are being told constantly that they should pursue their dreams. But very few of us get what we’ve dreamed of, and those who have often find it’s far less grand than they’d imagined. So I told them to learn and experience and enjoy and to figure out how to live in the essential disappointment of human life.

It wasn’t as much of a bummer as it sounds!

I have been reflecting on the value of teachers. I have been accused a lot, lately, of not believing that teachers matter. That’s the opposite of the truth, really. I just think that this notion of casting the value of teachers in purely quantitative terms is a mistake, and a very recent one. The entire history of the Western canon, from Socrates to Aquinas to Locke to Dewey to Baldwin, contains arguments against this reduction. But this fight, to define what I mean and what I don’t against the tide, is a fight I suspect I will always have to keep fighting, and I intend to.

Our culture celebrates autodidacts. It talks constantly of “disrupting” education. It insists always that we need to radically reshape how we teach and learn. It treats as heroic the rejection of teachers and traditional mentorship. The self-help aisle of the bookstore abounds with writers who insist that they truly learned by rejecting the typical method of education and became, instead, self-taught, self-made. It’s an unavoidable trope.

What amazes me about my own education is just how far that is from the truth for me personally. I’ve learned, over decades, how I learn. It’s pretty simple: teachers teach me. That was true in kindergarten and it’s true now that I have my doctorate. I can’t tell you how often I have found myself feeling lost and ignorant, only to have patient, kind teachers take me through the familiar processes of modeling and repetition that are the cornerstones of education. I think back to my graduate statistics classes, where I often felt like the slowest person in class but always ended up getting there, thanks to steady and reassuring teaching. When I didn’t get what I needed from class, I’d go to office hours, or to the statistics help room, where brilliant graduate students eagerly shared their knowledge and experience with me. None of this is fundamentally any different from when Mrs. Gebhardt taught me to cut shapes out of paper, or when Mr. Shearer taught me simple algebra, or when Mr. Tucci taught me to read poetry, or when Dr. Nunn taught me to write a real research paper. The process is always the same, and in every case I have succeeded not by rejecting the authority of teachers but by accepting their help, by recognizing their superior knowledge and letting them use it to enrich my life.

Is that a contradiction of what I’ve said about the limited ability of teachers to control the outcomes of their students? I don’t think so. The question is, do you want us to have a fuller and more humane vision of what it means to learn? I do.

They say that great men see farther than others by standing on the shoulders of giants. I think most of us are enabled to see as far as others because others have collectively reached their hands down and pulled us up.

another notch in the belt

It’s my birthday today. Wasn’t that long ago that I was part of a vanguard of young writer types. What the hell happened?

This project’s about three months old now, and I gotta tell you guys: I haven’t had this much fun writing in ages. It’s been better than I could have hoped. Thanks for coming along.

I woke up one day to find that my life had gotten pretty damn good. My job’s not perfect, but it’s still pretty great. I miss teaching, and I’d love to be in a position where I had some motivation to get peer-reviewed stuff published. But I’m working at a great college with a gorgeous campus in a system I admire immensely. It’s part of my job to stay on top of the research literature, so I’m reading books and articles at a good clip. Polanyi said that a scholar is someone who lives with the questions, and I do, and that’s enough. Very few people get that opportunity. It’s a privilege.

It’s also a privilege to live in this city. The other day I was walking home, cutting through Prospect Park right after dusk. I came to the Long Meadow, which a few hours before had been absolutely packed with people picnicking and jogging and flying kites and walking dogs. For a brief moment I found it utterly empty, not another soul in sight, alone in one of the most popular parks in the city. And I knew in that moment that it was all for me.

two economists ask teachers to behave as irrational actors

I was considering doing a front-to-back fisking of this interview of Raj Chetty, Professor of Economics at Stanford University, conducted by the libertarian economist Tyler Cowen. Despite Chetty’s obviously impressive credentials, he says several things in the interview that simply don’t hold up to scrutiny, particularly regarding the simultaneity problem and the impact of the shared environment. I’ve decided to just focus on one key point, though.

The standard neoliberal ed reform argument goes like this: the major entrenched socioeconomic and racial inequalities in this country are no excuse for poor quantitative outcomes for groups of students; teachers and schools, despite all of the evidence to the contrary, control most of the variation in educational outcomes; therefore our perceived education problems are the result of lazy, untalented teachers; introducing a market for schooling will force schools to get rid of those teachers, and metrics will improve. Now, this story has failed to play out again and again in places like Detroit and Washington, DC, but we’ll let that slide for now. If we accept this argument on its own terms, we need to get many talented people into teaching and replace the hundreds of thousands of “bad” teachers we’d be getting rid of.

Ed reform types are typically cagey about the scale of teacher dismissals – they hate to actually come out and say “I’d like to get hundreds of thousands of teachers fired” – but based on their own numbers, their own claims about the size and extent of the problem, that’s what needs to happen. You can’t simultaneously say that there’s a nationwide education crisis that needs to be solved by firing teachers and avoid the conclusion that huge numbers need to be fired. If reformers claim that even one out of every ten public school teachers needs to be let go (a low number in reform rhetoric), then with more than three million public school teachers in this country, we’re talking about more than 300,000 fired teachers.

I’ve argued before that the idea that market economics is an effective means of solving educational problems falls apart once you recognize that, unlike a factory building a widget, educators don’t control most of what contributes to a child’s learning outcomes. But suppose you do believe in the standard conservative economics take on school reform: how can Chetty’s ideas make sense, if we trust young workers in a labor market to act in their own rational best interest? Chetty believes that we need, at scale, to “either retrain or dismiss the teachers who are less effective, [to] substantially increase productivity without significantly increasing cost.” Without increasing cost – in other words, without raising teacher salaries. The median teacher in this country makes ~$57,000 a year; the 75th percentile makes ~$73k, and the 25th percentile, ~$45k. Compare that with median lawyer salaries well above $100,000 a year, median doctor salaries close to $200,000, and an average of $125,000+ for MBA graduates. So we’re not going to pay teachers more, and we’re going to have to sufficiently erode labor protections if we’re going to dismiss those less effective teachers. This doesn’t sound like a good deal already.

Of course, teachers don’t just suffer from low median wages compared to people with similar levels of schooling. They also suffer from far lower social status than teachers in other countries are typically afforded, as Dr. Chetty acknowledges:

Yeah, I think status seems incredibly important. My sense of the K–12 education system in the US is, unfortunately for many kids graduating from top colleges, teaching is not near the top of the list of professions that they’d consider. It’s partly because, in a sense, they can’t afford to be teachers because it entails such a pay cut. But also because they feel that it’s not the most prestigious career to pursue.

Why yes, Dr. Chetty, it’s true! Teachers don’t get a lot of prestige in this country! Maybe that’s because well-paid celebrity academics who make several times the median teacher salary – people like you – talk casually about firing them en masse and insist that they are the source of poor metrics. The ed reform movement has insulted the profession of public school teacher for years. Popular expressions of that philosophy, like the execrable documentary Waiting for “Superman,” have contributed to widespread assumptions that students are failing because their teachers are lazy and corrupt. How can a political movement that has relentlessly insulted the teaching profession not contribute to declining interest in joining that profession?

Here in New York, the numbers are clear: we’re already facing a serious teacher shortage.

What Chetty and Cowen are asking for makes no sense according to their own manner of thinking. Dr. Chetty, Dr. Cowen: there is no bullpen. Even if I thought that teachers controlled far more of the variance in quantitative education metrics than I do, and even if I didn’t have objections about fair labor practices against removing hundreds of thousands of teachers, we would be stuck with this simple fact. We do not have hundreds of thousands of talented young professionals, eager to forego the far greater rewards available in the private sector, ready to jump in and start teaching. And we certainly won’t have such a thing if we share Chetty’s resistance to paying teachers more and his commitment to making it easier to fire them.

So: no higher salaries for a relatively low-paying profession, eroding the job security that is the most treasured benefit of the job, continuing to degrade and insult the current workforce as lazy and undeserving, getting rid of hundreds of thousands of them, and yet somehow attracting hundreds of thousands of more talented, more committed young workers to become teachers.

According to what school of economics, exactly, is such a thing possible?