Murphy-Epstein’s Law

Predicting the value of a continuous real variable from historical observations, covariates and so on is a routine problem. It has never been easier to build sophisticated statistical models from data. Sadly, the predictions of the fancy model often turn out to be not much better than a simple mean of the historical observations.

The output of the best predictive models (as determined by cross-validation, for example) always shows less variance than the observations. This effect is called shrinkage.

Shrinkage can be understood from an identity known in weather forecasting as the Murphy-Epstein decomposition[*].

    \[\text{forecast skill} = \rho^2 - \left( \rho - {\sigma_f \over \sigma_o} \right)^2  \]

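Here \rho is the correlation between forecasts and observations, and \sigma_o and \sigma_f are the standard deviations of the observations and forecasts, respectively. The identity as written assumes unbiased forecasts; a mean bias would subtract a further term from the skill.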
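
A short derivation may help here. It assumes, as is common but not stated above, that skill means the mean squared error (MSE) skill score relative to always forecasting the observed mean, and that forecasts f and observations o have equal means:

    \[ \text{MSE} = \overline{(f - o)^2} = \sigma_f^2 + \sigma_o^2 - 2 \rho \sigma_f \sigma_o \]

    \[ \text{forecast skill} = 1 - {\text{MSE} \over \sigma_o^2} = 2 \rho {\sigma_f \over \sigma_o} - \left( {\sigma_f \over \sigma_o} \right)^2 = \rho^2 - \left( \rho - {\sigma_f \over \sigma_o} \right)^2 \]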

To maximise skill, the second term needs to be made as small as possible, which means matching \sigma_f / \sigma_o to \rho. For example, \rho \ll 1 requires \sigma_f \ll \sigma_o.

Having low variance compared to the observations may seem strange. It makes the predictive model look like a less realistic description of reality. Yet shrinkage is a feature of any imperfect (\rho < 1) but optimised predictive model.
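
The argument can be checked numerically. The sketch below is illustrative only: it uses synthetic Gaussian data, assumes skill is defined as 1 - MSE / var(observations), and the function names `mse_skill` and `murphy_epstein` are made up for this example. The bias term of the full decomposition is included so that the identity holds for any sample, not just unbiased ones.

```python
import numpy as np

def mse_skill(forecast, obs):
    """Skill relative to forecasting the observed mean: 1 - MSE / var(obs)."""
    return 1.0 - np.mean((forecast - obs) ** 2) / np.var(obs)

def murphy_epstein(forecast, obs):
    """Right-hand side of the decomposition, including the bias term."""
    rho = np.corrcoef(forecast, obs)[0, 1]
    ratio = np.std(forecast) / np.std(obs)
    bias = (np.mean(forecast) - np.mean(obs)) / np.std(obs)
    return rho**2 - (rho - ratio) ** 2 - bias**2

# Synthetic data (an assumption for illustration, not real forecasts)
rng = np.random.default_rng(0)
obs = rng.normal(size=10_000)
raw = obs + 2.0 * rng.normal(size=obs.size)  # noisy forecast, too much variance

rho = np.corrcoef(raw, obs)[0, 1]

# Shrink the forecast anomalies so that sigma_f = rho * sigma_o
shrunk = obs.mean() + (raw - raw.mean()) * rho * obs.std() / raw.std()

print("rho                  :", rho)
print("skill, raw forecast  :", mse_skill(raw, obs))
print("skill, shrunk        :", mse_skill(shrunk, obs))
print("identity (raw)       :", murphy_epstein(raw, obs))
```

The raw forecast has more variance than the observations and scores worse than the plain mean (negative skill), while the shrunk version, with the same correlation, reaches the ceiling of \rho^2.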

[*] Statistical Analysis in Climate Research, H. von Storch and F. W. Zwiers, Cambridge University Press, 2002


Antarctic winds

Antarctica has impressive surface winds. They are unusual because they are related in a simple way to topography. The map shows a 1979-2014 climatology of surface winds derived from ECMWF’s ERA-Interim. In the interior of the continent, the combination of a strong temperature inversion (radiative cooling of the ice cap under clear skies) and sloping terrain generates an “inversion wind”[1]. The cold bottom layer simply slips downhill. Large-scale motion is affected by the Earth’s rotation, which in the Southern Hemisphere deflects winds to the left. Near the coast, steeper gradients generate extreme “katabatic” winds, especially when channelled into straits or valleys.

The graph below shows the distribution of \cos \theta over the Antarctic continent, where \theta is the angle between the wind direction and the local surface slope vector, which points upslope (both vector fields are taken at the resolution of the ERA data, 0.75^\circ). As expected, winds run overwhelmingly downslope, but with a large deflection from 180^\circ due to the Coriolis effect.
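
A minimal sketch of how such a \cos \theta field could be computed is given here. It is not the code behind the graph: the function name `cos_wind_slope`, the array layout and the regular projected grid in metres are assumptions for illustration (on the native latitude-longitude grid the gradient would need metric factors), and the test case is a synthetic parabolic dome with winds turned an arbitrary 40 degrees to the left of downslope, not ERA data.

```python
import numpy as np

def cos_wind_slope(u, v, h, dx, dy):
    """Cosine of the angle between wind (u, v) and the upslope gradient of h.

    All arrays are indexed [y, x]; dx and dy are grid spacings in metres.
    Returns NaN where the wind or the slope vanishes.
    """
    dh_dy, dh_dx = np.gradient(h, dy, dx)  # axis 0 is y, axis 1 is x
    dot = u * dh_dx + v * dh_dy
    norm = np.hypot(u, v) * np.hypot(dh_dx, dh_dy)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(norm > 0, dot / norm, np.nan)

# Synthetic parabolic dome (illustrative values only)
x = np.linspace(-500e3, 500e3, 101)
y = np.linspace(-500e3, 500e3, 101)
X, Y = np.meshgrid(x, y)  # arrays indexed [y, x]
h = 3000.0 - 3000.0 * (X**2 + Y**2) / (700e3) ** 2

# Downslope unit vector points radially outward; undefined (NaN) at the summit
r = np.hypot(X, Y)
with np.errstate(invalid="ignore", divide="ignore"):
    down_x, down_y = X / r, Y / r

# Rotate downslope flow 40 degrees to the left (counter-clockwise)
alpha = np.deg2rad(40.0)
u = np.cos(alpha) * down_x - np.sin(alpha) * down_y
v = np.sin(alpha) * down_x + np.cos(alpha) * down_y

cos_theta = cos_wind_slope(u, v, h, dx=x[1] - x[0], dy=y[1] - y[0])

# Distribution of cos(theta) over the grid, ignoring undefined points
counts, edges = np.histogram(cos_theta[np.isfinite(cos_theta)],
                             bins=20, range=(-1.0, 1.0))
print(np.nanmean(cos_theta[1:-1, 1:-1]))  # interior points only
```

With the upslope convention, purely downslope flow gives \cos \theta = -1, and a leftward deflection of 40 degrees moves every interior value to -\cos 40^\circ. The edge rows and columns are left out of the mean because `np.gradient` uses less accurate one-sided differences there.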


[1] The Inversion Wind Pattern over West Antarctica, Parish & Bromwich, 1986