SPC and CUSUM Charts

Building an SPC Chart

First, let's start with some simple elm-vega to show change in a variable (notional reported crimes) over time:

Add some formatting to clean up the presentation:

We can provide some context for the variation by rescaling the y-axis to focus on the range in the data while showing lines at z-scores of 0, ±1, ±1.5 and ±3, that is, at the mean and at 1, 1.5 and 3 standard deviations either side of it. Each horizontal line is displayed in its own layer in elm-vega.
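The positions of those reference lines come from a simple calculation, sketched here in Python rather than elm-vega. The monthly counts are invented for illustration, and using the population standard deviation is an assumption; the original charts may use the sample standard deviation instead.

```python
from statistics import mean, pstdev

# Hypothetical monthly counts, invented purely for illustration.
counts = [22100, 21950, 22400, 21700, 21500, 22050, 21300, 21900]

mu = mean(counts)
sigma = pstdev(counts)  # assumption: population SD (statistics.stdev gives the sample SD)

# y-axis position of the reference line drawn at each z-score.
z_levels = [-3, -1.5, -1, 0, 1, 1.5, 3]
reference_lines = {z: mu + z * sigma for z in z_levels}

# z-score of each observation: its distance from the mean in SD units.
z_scores = [(x - mu) / sigma for x in counts]
```

Each entry in `reference_lines` would correspond to one horizontal rule layer in the chart.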

To make this a true SPC chart we can highlight parts of the time series that are consistently above or below some threshold. Here we distinguish runs of 7 or more consecutive points that are above or below the mean and additionally symbolise points with a z-score outside the ±3 range. The design intentionally mimics that of the original SPCs in the Dynamic Document Design page.
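The two rules described above can be sketched as a small Python function. This is an illustrative assumption about how the flags might be derived, not the code behind the charts: the name `spc_signals`, its parameters and the sample values are all hypothetical, and the input is assumed to be non-empty.

```python
from statistics import mean, pstdev

def spc_signals(values, run_length=7, z_limit=3.0):
    """Return (in_run, outliers): two lists of booleans, one entry per value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD
    # +1 if above the mean, -1 if below, 0 if exactly on it.
    sides = [(v > mu) - (v < mu) for v in values]

    in_run = [False] * len(values)
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the data or when the side changes.
        if i == len(values) or sides[i] != sides[start]:
            if sides[start] != 0 and i - start >= run_length:
                for j in range(start, i):
                    in_run[j] = True
            start = i

    # Points whose z-score lies outside the +/- z_limit band.
    outliers = [sigma > 0 and abs(v - mu) / sigma > z_limit for v in values]
    return in_run, outliers

# Hypothetical data: seven values above the mean, then six below it.
values = [12, 13, 12, 14, 13, 12, 13, 8, 9, 8, 7, 9, 8]
in_run, outliers = spc_signals(values)
```

With these made-up values only the first seven points are flagged as a run, because the final six fall one short of the run length.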

The chart looks a little cluttered, so let's change the z-score lines to regions and point symbols to line colours. In elm-vega this uses two layers for the lines: one for the thin grey line, the other for the highlighted SPC shifts.

CUSUM Charts

An alternative to the SPC chart is the CUSUM chart, which shows not the raw (crime) data values but the cumulative difference between each data value and some target. In the example below, we set the target to be the average number of reported crimes per month over the whole time period.
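The cumulative difference can be sketched in a few lines of Python. The function name `cusum` and the monthly counts are hypothetical, introduced only to illustrate the calculation.

```python
from itertools import accumulate
from statistics import mean

def cusum(values, target=None):
    """Running total of (value - target); target defaults to the mean."""
    if target is None:
        target = mean(values)
    return list(accumulate(v - target for v in values))

# Hypothetical monthly counts, invented purely for illustration.
counts = [22100, 21950, 22400, 21700, 21500, 22050, 21300, 21900]
line = cusum(counts)
```

When the target is the mean of the whole series, the deviations sum to zero, so the line always returns to the baseline at the final point.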

Consider some point in time along the line. If at that point the line is above 0, it indicates that the total number of crimes up to that point has exceeded what we would expect if instead the crime rate had remained at the target level. Note the three 'bumps' in the falling trend between 2013 and 2015. Separating the signal from the noise is easier than with an SPC chart.

One advantage of the CUSUM chart is that it smooths out minor fluctuations in the data, making trends more obvious. Another is that early trend detection is easier than with an SPC chart, as we don't have to wait for a run of 7 points before signalling a trend.

The orientation of the CUSUM line is in part determined by the target. Suppose instead of setting the target to the mean (21,840) we set a stricter target of 21,000 reported crimes:
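As a rough numerical sketch of this shift, the Python below compares the two targets. The five counts are invented and chosen only so that their mean equals 21,840. They are not the data behind the charts, and the `cusum` helper is hypothetical.

```python
from itertools import accumulate

def cusum(values, target):
    """Running total of (value - target)."""
    return list(accumulate(v - target for v in values))

# Hypothetical counts, invented so that their mean is exactly 21,840.
counts = [21900, 21750, 21800, 21950, 21800]

at_mean = cusum(counts, 21840)    # [60, -30, -70, 40, 0]
at_strict = cusum(counts, 21000)  # [900, 1650, 2450, 3400, 4200]

# The gap between the two lines grows by (21840 - 21000) = 840 at every step.
gap = [s - m for s, m in zip(at_strict, at_mean)]  # [840, 1680, 2520, 3360, 4200]
```

Lowering the target by a fixed amount adds that amount to the line at every step, so the line against the mean hovers around zero while the line against the stricter target climbs steadily.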

The form of the line can be very sensitive to the target when there are many data items, as even a small but systematic deviation from a target value soon leads to a trend away from the baseline. We can explore the effect of different targets interactively by allowing the baseline target crime rate to be changed with a slider: