Instead, I used precision-recall on the minority class (i.e. closed schools) to evaluate model performance. In this context, precision and recall can be interpreted as follows:
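In standard terms, precision is the share of schools the model flags as closed that really did close, and recall is the share of schools that really closed that the model managed to flag. A minimal sketch of that calculation follows. It assumes labels are coded 1 for closed and 0 for open; the function name `precision_recall` and the toy labels are illustrative, not taken from the original analysis.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the `positive` label (assumed here: 1 = closed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # closed, flagged
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # open, flagged
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # closed, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 8 open schools, 2 closed schools.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)

# A model that predicts "open" for every school is 80% accurate on this toy
# data, yet it finds no closures at all.
always_open = precision_recall(y_true, [0] * len(y_true))
```

On the toy data, the always-open baseline scores zero on both measures despite its high accuracy, which is the reason to look at the minority class directly.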

The map below shows the proportion of closed schools by county. Interestingly, rural areas appear to be affected by closure more often than urban ones: with the exception of the District of Columbia and the greater Detroit metropolitan area, the areas hit hardest by closure are rural (e.g. Northern Maine, Arizona, and Western Nebraska). This was an unexpected observation, given that school closure anecdotes most often take place in large metropolitan areas.

The beginning of the solution is establishing a clear process for closing schools. The simplest and best way to put this into operation is the charter model. Each school, in conjunction with the state or district, would develop a five-year contract with performance measures. Consistent failure to meet goals in key areas would result in closure. Alternatively, the state could decide that districts only have one option—not five—for schools reaching NCLB-mandated restructuring: closure.

Finally, I evaluated performance for various downsampling and upsampling ratios for random forest (the best classifier performance-wise). For downsampling, I fit the model with a balanced number of open and closed schools, as well as with 2x, 5x, and 10x as many open schools as closed schools. When upsampling, I sampled (with replacement) 1x, 2x, 5x, and 10x the overall number of closed schools from both the closed and open schools in the training set.
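The ratio experiment can be sketched as index resampling over the training labels. This is only one possible reading of the procedure: it assumes labels are coded 1 for closed and 0 for open, and that upsampling draws ratio x (number of closed schools) rows with replacement from each class. The helper name `resample_indices` and the toy label counts are hypothetical.

```python
import random


def resample_indices(labels, ratio, mode, closed=1, seed=0):
    """Return training-row indices for one down- or upsampling ratio."""
    rng = random.Random(seed)
    closed_idx = [i for i, y in enumerate(labels) if y == closed]
    open_idx = [i for i, y in enumerate(labels) if y != closed]
    n_closed = len(closed_idx)

    if mode == "down":
        # Keep every closed school; draw ratio * n_closed open schools
        # without replacement (capped at the number available).
        n_open = min(len(open_idx), ratio * n_closed)
        return closed_idx + rng.sample(open_idx, n_open)
    if mode == "up":
        # Assumed reading: draw ratio * n_closed rows WITH replacement
        # from each class, so the classes stay balanced.
        n = ratio * n_closed
        return rng.choices(closed_idx, k=n) + rng.choices(open_idx, k=n)
    raise ValueError("mode must be 'down' or 'up'")


# Toy training labels: 200 open schools, 10 closed schools.
labels = [0] * 200 + [1] * 10
down_sizes = {r: len(resample_indices(labels, r, "down")) for r in (1, 2, 5, 10)}
up_sizes = {r: len(resample_indices(labels, r, "up")) for r in (1, 2, 5, 10)}
```

Each index list would then select the rows used to fit the classifier for that ratio, with evaluation still done on an untouched test set.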

We shouldn’t be surprised then that turnarounds in urban education have largely failed. The surprise and shame is that urban public education, unlike nearly every other industry, profession, and field, has never developed a sensible solution to its continuous failures. After undergoing improvement efforts, a struggling private firm that continues to lose money will close, get taken over, or go bankrupt. Unfit elected officials are voted out of office. The worst lawyers can be disbarred, and the most negligent doctors can lose their licenses. Urban school districts, at long last, need an equivalent.

Next, I tried downsampling the majority class (i.e. schools remaining open) when training the model. This is a common technique when dealing with imbalanced classes: because the model seeks to maximize accuracy, it will often under-predict the less common outcome (in this case, school closures). Downsampling counteracts this behavior by fitting the model to all closed schools and an equally large, randomly selected sample of open schools. After downsampling, I got a slight bump in random forest performance, but nothing to write home about.
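A minimal sketch of balanced downsampling, assuming labels are coded 0 for open (majority) and 1 for closed (minority). The function name `downsample_majority` and the toy counts are illustrative, not taken from the original analysis.

```python
import random


def downsample_majority(labels, majority=0, seed=0):
    """Keep all minority rows plus an equally large random sample of majority rows."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y != majority]
    majority_idx = [i for i, y in enumerate(labels) if y == majority]
    k = min(len(minority_idx), len(majority_idx))
    kept_majority = rng.sample(majority_idx, k)  # sampled without replacement
    return sorted(minority_idx + kept_majority)


# Toy training labels: 95 open schools, 5 closed schools.
train_labels = [0] * 95 + [1] * 5
balanced_idx = downsample_majority(train_labels)
n_closed_kept = sum(train_labels[i] for i in balanced_idx)
```

Only the training rows are resampled here; the test set should keep its original class balance so that precision and recall reflect real-world conditions.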

A small number of progressive leaders of major urban school systems are using school closure and replacement to transform their long-broken districts: Under Chancellor Joel Klein, New York City has closed nearly 100 traditional public schools and opened more than 300 new schools. In 2004, Chicago announced the Renaissance 2010 project, which is built around closing chronically failing schools and opening 100 new public schools by the end of the decade.