Like so. (Fake charts I made with fake data, btw.)

You already get it, right?

Typically, when we perform some sort of experiment, we want to look at how a particular number responds to the treatment – how blood pressure reacts to a new drug, say, or how students improve on a reading test when they’re given a new kind of lesson. We want to make sure that the observed differences are really the product of the treatment and not of some underlying difference between the groups. That’s what randomized controlled trials are for. So we randomly assign subjects to test and control groups, compare the averages for the two groups, note the size of the effect, and determine whether it is statistically significant.

But sometimes we have real-world conditions that dictate that subjects get sorted into one group or another non-randomly. If we then look at how different groups perform after some treatment, we know that we’re potentially facing severe selection effects thanks to that non-random assignment. But consider what happens if assignment is based purely on some quantitative metric, with a cutoff score that sorts people into one group or another. (Suppose, for example, students become eligible for a gifted student program only if they score above a cut score on some test.) Here we have a non-random assignment that we can actually exploit for research purposes. A regression discontinuity design allows us to explore the impact of such a program because, so long as students aren’t able to influence their assignment beyond their score on that test, we can be confident that students just above and just below the cutoff score are very similar.

Regression analyses will be run on all of the data, with subjects below and above the cut score combined but flagged into different groups. Researchers will run statistical models to determine whether there is a difference between groups who receive the treatment and those who don’t. As you can see in the scatterplots above, a large effect will be readily apparent in how the data looks. In the above scenario, the X axis represents the score students received on the test, the cut score is 15, and the Y axis represents performance on some later educational metric. In the top scatterplot, there is no meaningful difference from the gifted student program, as the relationship between these two metrics is the same above and below the cut score. But in the bottom graph, there’s a clear jump at the cut score. Note that even after the intervention, the relationship is still linear – students who did better on the initial test do better on the later metric. But the scores of everyone above the cut score have jumped.
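
Here's a minimal sketch of what that kind of model could look like in Python, using more fake data. Everything in it is an assumption for illustration: the names `simulate` and `estimate_jump`, the 0 to 30 score range, the slope, the noise level, and the jump of 10 points are all made up, and it assumes the same linear slope on both sides of the cut score, as in the scenario described above. It is one simple way to set this up, not the only one.

```python
import numpy as np

CUTOFF = 15  # cut score from the scenario above


def simulate(n, jump, seed=0):
    """Fake data: test score x and later outcome y, with a jump of size `jump` above the cutoff."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, 30, size=n)            # hypothetical test scores
    treated = (x > CUTOFF).astype(float)      # 1 if the student is in the program
    y = 20.0 + 2.0 * x + jump * treated + rng.normal(0, 3.0, size=n)
    return x, y


def estimate_jump(x, y, cutoff=CUTOFF):
    """Fit y = a + b * (x - cutoff) + c * treated by least squares and return c."""
    treated = (x > cutoff).astype(float)
    design = np.column_stack([np.ones_like(x), x - cutoff, treated])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]  # estimated size of the discontinuity at the cut score


x0, y0 = simulate(2000, jump=0.0)    # like the top chart: no program effect
x1, y1 = simulate(2000, jump=10.0)   # like the bottom chart: a jump at the cutoff

print("estimated jump, no effect:  ", round(estimate_jump(x0, y0), 2))
print("estimated jump, real effect:", round(estimate_jump(x1, y1), 2))
```

The coefficient on the `treated` flag is the estimated jump at the cut score. In the no-effect data it should land near zero, and in the second data set it should land near the simulated jump of 10.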

There are, as you’d probably imagine, a number of potential pitfalls here, and assumption checks and quality controls are essential. Assignment to the gifted program has to depend solely on the test score, the cutoff score should fall reasonably near the mean so that neither side is starved of data, and you need sufficient numbers to see the relationship on either side of the cut score, among other things. But if you have the right conditions, regression discontinuity design is a great way to get close to the quality of a randomized experiment in situations where you can’t run one for pragmatic or ethical reasons.
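
One of those checks, having enough subjects on either side of the cut score, can be roughed out in a few lines. This is only a sketch under my own assumptions: the helper name `counts_near_cutoff`, the bandwidth of 2 points, and the scores themselves are hypothetical, and a real analysis would use more formal diagnostics.

```python
import numpy as np


def counts_near_cutoff(scores, cutoff, bandwidth):
    """Count subjects within `bandwidth` points just below and just above the cutoff."""
    scores = np.asarray(scores, dtype=float)
    # At or below the cutoff: not eligible, since eligibility requires scoring above it
    below = np.sum((scores > cutoff - bandwidth) & (scores <= cutoff))
    # Strictly above the cutoff: eligible for the program
    above = np.sum((scores > cutoff) & (scores <= cutoff + bandwidth))
    return int(below), int(above)


# Made-up test scores, for illustration only
scores = [9, 12, 13, 14, 14.5, 15, 15.5, 16, 16, 17, 21]
n_below, n_above = counts_near_cutoff(scores, cutoff=15, bandwidth=2)
print("just below:", n_below, "just above:", n_above)
```

Very small counts on either side mean there is not much to estimate the jump from, and a lopsided pile-up just above the cut score can be a hint that people were able to influence which side they landed on.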