How our election forecast model at the Polling Observatory works

The Polling Observatory forecasting model has been a long time in the making, and builds on our effort in 2010, where we fared relatively well both in absolute terms and compared to other forecasters. (Our forecast before the start of the official campaign proved even more successful.) Forecasting like this is inherently limited, as there will always be some factors which can affect elections but are difficult to quantify or model effectively.

We know that things like candidate quality or local campaign intensity can influence results, but these things are hard to measure, and next to impossible to estimate ahead of an election. Our forecast is therefore of the form 'given what we know about things we can measure, this is what we think is most likely to happen'. In this post, we explain the Polling Observatory forecasting model in full. We start with a recap of our method for estimating support for the parties and projecting forward from current polling to Election Day, before explaining how we generate our forecasts at constituency level.

Part 1: Noise and the polls

Since early in the parliament, the Polling Observatory team have been keeping track of movements in the polls on a regular basis (see our inaugural post here).

As we have warned repeatedly, the key to tracking public opinion is to be cautious about over-interpreting short-term fluctuations – since much of the movement turns out to be random noise or the result of the methodological choices of individual pollsters – and to focus on the underlying trends. Of course, there have been notable shifts in vote intentions too, such as the flight of voters away from the Liberal Democrats after their decision to go into coalition, the substantial drop in support for the Conservatives after the infamous 'omnishambles' budget in spring 2012, and the slow but steady decline in Labour support, and rise in UKIP support, since late 2012.

Our method takes this noisiness of the polls into account, producing estimated ranges of support for the parties that adjust for the confidence we can have that the polls lie within a particular range of values after taking into account the inherent uncertainty in each poll. This approach is preferable to a simple poll-of-polls, since our estimates do not vary as a function of the mix of polling firms that have been in the field more recently or more frequently. As of March 1, our estimates show the race as neck-and-neck, with Labour support on 32.2% and the Conservatives on 31.5% – but the confidence intervals are such that we cannot say with confidence that the Labour lead is greater than zero.

Part 2: Lessons from polling history

As we noted when we introduced our inaugural polling forecast last May (drawing on our previous research), the polls component of our forecast uses the historical relationship between polls and the election day vote from past elections to project forward from the current polling. A key idea here is 'regression to the mean': that if a party is polling above its historical equilibrium, it will tend to underperform the polls on election day. If a party is below the equilibrium, it will tend to do better. For this reason our initial forecast a year out suggested that both Conservatives and Labour would get around 36%.
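The idea of regression to the mean can be sketched in a few lines. This is an illustration only: the linear form, the function name project_vote_share, the persistence parameter and all numbers below are assumptions made for the example, not the relationship the Polling Observatory estimates from historical data.

```python
def project_vote_share(current_poll, equilibrium, persistence):
    """Pull a current poll share toward a historical equilibrium.

    persistence is a hypothetical weight between 0 and 1:
    1 means the poll carries through unchanged to election day,
    0 means the party reverts fully to its equilibrium.
    """
    if not 0.0 <= persistence <= 1.0:
        raise ValueError("persistence must lie between 0 and 1")
    return equilibrium + persistence * (current_poll - equilibrium)


# Hypothetical numbers: an equilibrium of 36% and a persistence of 0.5.
above = project_vote_share(40.0, 36.0, 0.5)  # party polling above equilibrium
below = project_vote_share(32.0, 36.0, 0.5)  # party polling below equilibrium
print(above, below)
```

In this toy version, a party polling above its equilibrium is projected to finish below its current polling, and a party polling below it is projected to recover. In practice the weight on current polling would be expected to grow as election day approaches.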

Of course, this election cycle has seen sustained support for UKIP, and for this reason the polls have not come into line with the original forecast. Support for the main parties has remained below the equilibrium that history suggests, as voters have turned instead to alternative options – UKIP, the SNP, and more recently the Greens. The longer this structure of preferences remains in place, the less we expect the shares of the Conservatives and Labour to increase. Our forecast of the vote share now stands at 33.7% for the Conservatives and 33.7% for Labour, with our estimates having edged down gradually in each new forecast. These estimates are shown in the graph below with confidence intervals to indicate our estimate of the degree of uncertainty that still remains with two months to go.

Part 3: From votes to seats

The Polling Observatory can now unveil the third part of its forecast, in which we translate our estimated vote shares on May 7 into probabilities of victory for each of the parties in each parliamentary constituency and a forecast of the total number of seats each party will receive. We produce these seat forecasts through many simulations of the constituency level electoral outcomes.

In each simulation, we calculate the forecasted vote for each of the Conservatives, Labour and the Liberal Democrats as equal to the constituency vote in 2010 plus a uniform Scottish swing (for constituencies in Scotland) or a uniform England and Wales swing (for constituencies in E&W), in addition to constituency specific deviations from uniform swing. The swing in each constituency thus reflects how the parties are faring in the national and Scottish polls, but incorporates more local information as well.
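As a rough sketch of this calculation, the forecast for one constituency in one simulation could be written as follows. The function name, the floor at zero and every number here are assumptions for illustration, not the model's actual code or data.

```python
def forecast_constituency(vote_2010, uniform_swing, deviation):
    """Forecast shares as 2010 vote + uniform swing + local deviation.

    All inputs are dicts keyed by party, in percentage points.
    Flooring at zero is an assumption made for this sketch.
    """
    forecast = {}
    for party, share in vote_2010.items():
        raw = share + uniform_swing.get(party, 0.0) + deviation.get(party, 0.0)
        forecast[party] = max(raw, 0.0)
    return forecast


# Hypothetical seat in England and Wales.
vote_2010 = {"CON": 40.0, "LAB": 35.0, "LD": 20.0}
ew_swing = {"CON": -3.0, "LAB": 2.0, "LD": -12.0}    # uniform E&W swing
local_dev = {"CON": 1.0, "LAB": -0.5, "LD": 0.0}     # constituency-specific deviation

result = forecast_constituency(vote_2010, ew_swing, local_dev)
winner = max(result, key=result.get)
print(result, winner)
```

For a Scottish seat, the same function would simply be called with the Scottish uniform swing in place of the England and Wales one.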

The uniform swings are calculated as follows. First, the forecasted 2010 to 2015 swing for Britain is calculated as the difference between the 2010 national vote result and the forecasted 2015 national vote result (based on Part 2 of our forecast). Second, the most recent polls of vote intention in Scotland and Britain are used to estimate the current vote in England and Wales. Here we simply take the vote shares from the latest Scottish polls, and subtract them from the polling for the corresponding period for Britain, adjusting for population size, to calculate the vote shares for England and Wales.

This means that our estimate of swing may lag the British polls a little – as we require Scottish polls for the same period. These estimates of current vote shares are used to calculate the current Britain, Scotland and England and Wales swings (swings from 2010 to the most recent polls). Third, we make the assumption that any change in the swing for Scotland and England and Wales between now and election day will be proportionately the same. For example, if the swing in Scotland increases by 1% (not percentage point), the swing in England and Wales will also increase by 1%. Using this assumption, we can use the forecasted 2010/2015 Britain swing and the current Britain, Scotland and England and Wales swings to calculate forecasted 2010/2015 Scotland and England and Wales swings.
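The two calculations described above can be sketched as follows. The weight used for Scotland's share of the British electorate, the function names and all numbers are assumptions for illustration; the article does not give the exact population adjustment.

```python
def england_wales_share(gb_share, scotland_share, scotland_weight):
    """Back out the England and Wales share from Britain-wide and Scottish polls.

    Assumes gb = w * scotland + (1 - w) * england_wales, where w is
    Scotland's (hypothetical) share of the British electorate.
    """
    return (gb_share - scotland_weight * scotland_share) / (1.0 - scotland_weight)


def forecast_regional_swings(current_gb_swing, forecast_gb_swing, current_regional):
    """Scale each current regional swing by the same proportion.

    If every regional swing changes by the same proportion, the Britain-wide
    swing changes by that proportion too, so the scale factor is the ratio
    of the forecast Britain swing to the current Britain swing.
    """
    if current_gb_swing == 0:
        raise ValueError("proportional scaling is undefined for a zero swing")
    scale = forecast_gb_swing / current_gb_swing
    return {region: swing * scale for region, swing in current_regional.items()}


# Hypothetical party: 30% in Britain, 20% in Scotland, Scotland weight 0.1.
ew = england_wales_share(30.0, 20.0, 0.1)

# Hypothetical swings since 2010, in percentage points.
swings = forecast_regional_swings(
    current_gb_swing=-4.0,
    forecast_gb_swing=-3.0,
    current_regional={"Scotland": -10.0, "England and Wales": -2.0},
)
print(round(ew, 2), swings)
```

Here the forecast Britain swing is three quarters of the current one, so both regional swings shrink to three quarters of their current size.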

For the constituency specific deviations from uniform swing, we have constituencies in which we have no additional information since 2010 and constituencies in which we have constituency specific polls (largely through the extensive polling activities of Lord Ashcroft). If we have no additional information, the constituency-specific deviation from uniform swing is a random draw from a distribution determined by the distribution of the deviations from uniform swing between 2005 and 2010. In other words, we use the range of actual swings which occurred in 2005 and 2010 to inform the range of possible swings in 2015.

Each simulation randomly draws one possible swing from this range for each seat. So the overall range of swings we project remains the same in each simulation, following the distribution seen in other elections, but the projected swing in each seat varies – each simulation picks a new swing for each seat from within the possible range.
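A minimal sketch of this step, assuming the deviations are resampled directly from a list of past values (the article says only that the draw comes from a distribution determined by the 2005 to 2010 deviations, so a fitted distribution is equally possible). The seat names, the deviation values and the seed are all hypothetical.

```python
import random


def draw_deviations(seats, historical_deviations, rng):
    """Give each seat one deviation from uniform swing, drawn at random."""
    return {seat: rng.choice(historical_deviations) for seat in seats}


# Hypothetical deviations from uniform swing, in percentage points.
historical = [-3.1, -1.4, -0.2, 0.0, 0.6, 1.9, 2.8]
seats = ["Seat A", "Seat B", "Seat C"]

rng = random.Random(2015)  # fixed seed so the sketch is repeatable
simulations = [draw_deviations(seats, historical, rng) for _ in range(1000)]
print(simulations[0])
```

Every simulation draws from the same pool of possible deviations, but each seat gets a fresh draw each time, which is what produces the spread of outcomes across simulations.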

If we have a constituency specific poll, we use that poll information to update our expectation for the deviation from uniform swing for that constituency. In a couple of cases, specifically Clacton and Rochester and Strood, we use the result of the by-election as equivalent to a constituency poll with a large sample, since these provide useful information on deviation from uniform swing (this is not least important because the sitting MPs in these two seats may benefit from incumbency advantages in a way that other UKIP candidates may not).
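The article does not spell out how a constituency poll updates the expected deviation. One common approach, assumed here purely for illustration, is a precision-weighted average of the prior expectation and the deviation implied by the poll, with the poll's weight growing with its sample size. The function names and all numbers are hypothetical.

```python
def poll_variance(share_pct, sample_size):
    """Approximate sampling variance of a poll share, in squared percentage points."""
    return share_pct * (100.0 - share_pct) / sample_size


def update_deviation(prior_mean, prior_var, observed, observed_var):
    """Precision-weighted (normal-normal) update of an expected deviation."""
    precision = 1.0 / prior_var + 1.0 / observed_var
    mean = (prior_mean / prior_var + observed / observed_var) / precision
    return mean, 1.0 / precision


# Hypothetical seat: prior deviation 0 with variance 4, and a constituency
# poll implying a deviation of +6 points for a party on about 50%.
small_poll = update_deviation(0.0, 4.0, 6.0, poll_variance(50.0, 1000))
large_poll = update_deviation(0.0, 4.0, 6.0, poll_variance(50.0, 10000))
print(small_poll, large_poll)
```

Under this assumption, a by-election result treated as a poll with a very large sample would move the expected deviation almost all the way to the observed value, while a small poll moves it only part of the way.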

One challenge for our forecasting approach based on the historical relationship between polls and the vote is that we do not have historical polling data for UKIP or the SNP (and so cannot generate our Part 2 forecast). For them, we follow the same procedure as we do for the Conservatives, Labour and the Liberal Democrats, except that we do not have a forecasted 2015 national vote. What we do have is a forecasted 2015 national vote for 'other' parties (i.e. parties other than the Conservatives, Labour and the Liberal Democrats). We therefore need to determine the proportion of the other uniform swing (both the other swing for England and Wales and for Scotland) that will go to UKIP and the SNP. We do this on a constituency by constituency basis. The default for the SNP is that 0.95 of the other uniform swing will go to SNP in Scotland constituencies and 0 in England & Wales. The default for UKIP is that 0.7 of the other uniform swing goes to UKIP in constituencies in England and Wales and zero in Scotland.
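The default allocation can be sketched as a simple lookup. The proportions 0.95 and 0.7 come from the text above; the function name, the "remaining others" label and the swing values are assumptions for illustration.

```python
# Default proportions of the 'other' uniform swing, as stated in the text.
DEFAULT_PROPORTIONS = {
    "Scotland": {"SNP": 0.95, "UKIP": 0.0},
    "England and Wales": {"SNP": 0.0, "UKIP": 0.7},
}


def split_other_swing(other_swing, region, proportions=DEFAULT_PROPORTIONS):
    """Split the 'other' uniform swing between the SNP, UKIP and the rest."""
    allocated = {party: other_swing * p for party, p in proportions[region].items()}
    allocated["remaining others"] = other_swing - sum(allocated.values())
    return allocated


# Hypothetical 'other' swings of 10 points (E&W) and 20 points (Scotland).
ew_split = split_other_swing(10.0, "England and Wales")
scot_split = split_other_swing(20.0, "Scotland")
print(ew_split, scot_split)
```

Because the allocation is made constituency by constituency, the default proportions can be overridden for individual seats by passing a different proportions table.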

To calculate the distribution for the deviations from uniform swing for UKIP we used the distribution of swings to UKIP in the 2014 European Parliament election. This is not a perfect proxy for the general election for several reasons, but we felt it would give a better idea as to the distribution of UKIP swings than the last general election – not least because the overall rise in UKIP support between 2009 and 2014 in European Parliament elections (10.6 percentage points) is fairly similar to the rise in UKIP support we are currently seeing in general election polling.

The current SNP surge also makes 2010 a poor choice for defining the distribution of SNP swings. Here we use the distribution of constituency votes in the 2011 Scottish Parliament election, when the SNP vote rose by 12.5 percentage points. This is a rather smaller swing than is currently being predicted in Scottish polling, but was nonetheless a major surge in support and should therefore prove a more useful yardstick than the 2010 general election, when SNP support barely rose. Note also that we are not using these data to estimate the SNP swing itself, but rather the distribution of the deviations from uniform swing.

As before, if we have a constituency poll (as we do for a total of 136 constituencies), we use that information to update our expectation for the deviation from uniform swing for that constituency. (Note that the deviation of constituency polls from uniform swing is calibrated against our daily poll estimate at the time the fieldwork was carried out.) Also note that for UKIP we use the sum of votes for radical right parties in 2010 as our estimate of the 2010 UKIP vote – as we expect that UKIP, as the dominant representative of the radical right, will win most of the support that previously went to the BNP, the National Front and the England Democrats. Even if this proves not to be the case, support for these smaller fringe right parties should still provide a useful proxy measure of UKIP potential in individual seats.

For the remaining 'other' parties, we follow the same procedure as for UKIP and the SNP, with these parties receiving the proportion of the Scotland and England and Wales uniform swings that was not allocated to UKIP and SNP. It was not necessary to include a deviation from uniform swing, as this is largely already accounted for by the way we specified a different proportion of the other uniform swing that goes to remaining 'others' in each constituency in which such parties are expected to be relevant.

One example is the set of constituencies in which we expect Plaid Cymru to receive a large proportion of the 'other' swing. These proportions were determined on the basis of the ratio between Plaid Cymru and UKIP support in the 2010 election. In Monmouth in 2010, for instance, Plaid won 2.7% of the vote and UKIP 2.4%, and so we allocate the 'other' swing to them in proportions of 0.53 and 0.47 respectively. In each case, the proportion of the other uniform swing not allocated to UKIP is specified as going to the other 'other' party – in this case Plaid Cymru. In England, we follow a similar procedure for UKIP and the Greens.
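The Monmouth arithmetic can be checked with a short calculation. The function name is hypothetical; the vote shares are the ones quoted above.

```python
def ratio_split(share_a, share_b):
    """Split a swing between two parties in proportion to their 2010 shares."""
    total = share_a + share_b
    return share_a / total, share_b / total


# Monmouth 2010, from the text: Plaid Cymru 2.7%, UKIP 2.4%.
plaid, ukip = ratio_split(2.7, 2.4)
print(round(plaid, 2), round(ukip, 2))
```

The two proportions sum to one by construction, so the whole of the 'other' swing in the seat is shared between the two parties.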

This procedure was repeated a large number of times (we ran 15,000 simulations in total), so that for each simulation we have an estimate of the number of seats won by each party. By aggregating over all these simulations, we get a range of possible outcomes. We are also able to calculate the percentage of simulations in which each party wins each constituency. We interpret these percentages as the probability that each party will win each constituency. That is the Polling Observatory method for forecasting the constituency vote, but what is the net result?
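The aggregation step can be sketched as below, using four toy simulations in place of 15,000. The data structure (one dict of seat winners per simulation), the function names and the seat names are assumptions for illustration.

```python
from collections import Counter
from statistics import median


def win_probabilities(simulations):
    """Share of simulations in which each party wins each seat."""
    counts = {}
    for sim in simulations:
        for seat, winner in sim.items():
            counts.setdefault(seat, Counter())[winner] += 1
    n = len(simulations)
    return {
        seat: {party: wins / n for party, wins in tally.items()}
        for seat, tally in counts.items()
    }


def seat_counts(simulations, party):
    """Number of seats the party wins in each simulation."""
    return [sum(1 for w in sim.values() if w == party) for sim in simulations]


# Four hypothetical simulations over two hypothetical seats.
sims = [
    {"Seat A": "LAB", "Seat B": "CON"},
    {"Seat A": "LAB", "Seat B": "LAB"},
    {"Seat A": "CON", "Seat B": "CON"},
    {"Seat A": "LAB", "Seat B": "CON"},
]
probs = win_probabilities(sims)
lab_counts = seat_counts(sims, "LAB")
lab_median = median(lab_counts)
print(probs, lab_counts, lab_median)
```

With many simulations, the spread of the per-simulation seat counts gives a range of plausible seat totals around the central figure.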

The Polling Observatory seat forecast

Our latest forecast draws on national, Scottish and constituency polling up to March 1. As noted, our current estimate of vote intention puts Labour on 32.2%, the Conservatives on 31.5%, UKIP on 14.8%, the Liberal Democrats on 8.4%, and the Greens on 6.4% – and from this, our forecast of the election day vote puts Labour and the Conservatives both on 33.7% in a dead heat, and the Liberal Democrats on 8.8%. Inputting these vote forecasts into our constituency level model, our forecast of seats won at the election in May (excluding Northern Ireland) is as follows:

CON 265 (235, 293)
LAB 285 (260, 313)
LD 24 (17, 33)
UKIP 3 (1, 5)
SNP 49 (34, 56)
OTH 6 (4, 9)
Northern Ireland (not forecast): 18 seats

While our seat forecast suggests that, as the polls stood on March 1, there is a greater probability that Labour will be the largest party (77%), the implied probabilities suggest that the likelihood of a Labour majority is tiny (less than 0.5%), and the likelihood of Labour and the Liberal Democrats being able to form a parliamentary majority is also quite small (put at 19%). The possibility of Labour and the SNP holding a majority of seats is much higher (78%).

The chances of the Conservatives being able to form a majority are currently very small, and their paths to a governing coalition even more winding on this arithmetic. Even with the backing of the Liberal Democrats (combined 289 seats, 34 short of the 323 needed), or of both the Liberal Democrats and the Northern Irish DUP (combined 297 seats, assuming the DUP once again win 8 seats), or even with UKIP added to that combination (300 seats in total), the Conservatives would fall short. It would be very hard, with this seat outcome, for the Conservatives to sustain a government without some form of co-operation from the Scottish Nationalists.

What these sums illustrate is just how difficult the likely parliamentary arithmetic will be in May, unless the polls move sharply in the next two months. A Labour-SNP pact of some form would be the easiest route to the 323 votes needed to sustain a government (assuming Sinn Fein once again return 3 MPs, and they once again refuse to take their seats), but is unlikely to be an easy or comfortable arrangement for either party. On the other hand, without the involvement of the SNP there would be no two party combination capable of commanding a majority.
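These sums are easy to reproduce from the central seat forecasts. The seat numbers, the 8 DUP seats and the threshold of 323 all come from the text; the function names are hypothetical.

```python
# Central seat forecasts from the table above, plus the assumed 8 DUP seats.
SEATS = {"CON": 265, "LAB": 285, "LD": 24, "UKIP": 3, "SNP": 49, "OTH": 6, "DUP": 8}
THRESHOLD = 323  # votes needed if Sinn Fein's MPs do not take their seats


def combined(parties, seats=SEATS):
    """Total seats held by a group of parties."""
    return sum(seats[party] for party in parties)


def shortfall(parties, seats=SEATS, threshold=THRESHOLD):
    """Seats still needed to reach the threshold (zero if already there)."""
    return max(threshold - combined(parties, seats), 0)


for bloc in (
    ["LAB", "SNP"],
    ["LAB", "LD"],
    ["CON", "LD"],
    ["CON", "LD", "DUP"],
    ["CON", "LD", "DUP", "UKIP"],
):
    print("+".join(bloc), combined(bloc), shortfall(bloc))
```

These are sums of central forecasts only, so they ignore the uncertainty around each party's seat total.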

Note that our results do not change much when calculated from a 'nowcast', in which we assume that the vote shares are equal to what we see in current polling, i.e., without any regression to the mean. The resulting seat nowcast is as follows:

CON 258 (229, 286)
LAB 282 (254, 311)
LD 25 (18, 33)
UKIP 5 (3, 12)
SNP 53 (42, 58)
OTH 8 (6, 11)

The main difference between our forecast and nowcast is that the bigger parties do a little bit worse in the latter. This only complicates the post-election arithmetic, as it lowers the odds of both an outright majority and a coalition majority. Thus, while our prediction in 2015 is, as in 2010, that no party will secure a majority, the differences in the detail are crucial. Currently, a different party leads on seats, but falls further short of the number needed to govern, while the number of smaller parties in parliament is larger, and the difficulty of building a majority coalition is likely to be much greater. If our forecast proves accurate, the 2015 hung parliament is likely to remain hung well beyond election day.

This article first appeared on the Huffington Post.
