Tuesday, 7 April 2009
Andy Cooke on the poll averaging debate
(This is a comment that Andy Cooke posted on the main PB site on the debate that often flares up here on poll averaging between Mike Smithson and Rod Crosby)
On poll averaging: Rod Crosby would be absolutely correct if the only variability in the polling were random error (imprecision). If systematic error (inaccuracy) creeps in, the averaging technique fails. We can, in fact, use Rod's correct statistical assertion to test the assumption that the variations between the polling companies' polls are purely random error. Pit averaging against a specified technique for selecting a single poll (which we don't have to come up with: Mike has long subscribed to the technique "Whichever one is worst for Labour"). If averaging tends to win, the assumption holds. If the specified technique tends to win, the assumption fails.
Quick test: take the polling data from the final 3 days of each of the last 4 elections. The information is readily available, and the timescale is chosen to give enough polls to actually average (between 4 and 7 at each of the 4 elections) and to include data from most polling companies at each election, while minimising the possibility of a late swing. For each election we can then take the average (Crosby's Rule, or "CR") and the worst-for-Labour poll (Smithson's Rule, or "SR").
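For anyone who wants to repeat the exercise on their own set of polls, the two rules can be sketched as below. This is only a minimal sketch: the function names and the (Con, Lab) tuple layout are assumptions for illustration, and "worst for Labour" is read here as the poll with the smallest Labour lead over the Conservatives.

```python
from statistics import mean

# Each poll is a (con, lab) pair of GB vote shares in percent.
# This layout is an assumption for illustration only.

def crosby_rule(polls):
    """CR: average the Con and Lab shares across all the polls."""
    con = mean(c for c, _ in polls)
    lab = mean(l for _, l in polls)
    return con, lab

def smithson_rule(polls):
    """SR: pick the single poll that is worst for Labour,
    read here as the smallest Labour lead (Lab minus Con)."""
    return min(polls, key=lambda p: p[1] - p[0])
```

Both functions return a (Con, Lab) pair, so the predicted lead can be compared directly with the actual result.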
1992: (Lead in GB; Con score-Lab score; error)
Actual: Con 7.6%; 42.8-35.2
CR: Lab 1.4%; 38.1-39.5; 9.0% to Lab
SR: Con 0.5%; 38.5-38.0; 7.1% to Lab
1997: (Lead in GB; Con score-Lab score; error)
Actual: Lab 13.0%; 31.4-44.4
CR: Lab 18.0%; 30.0-48.0; 5.0% to Lab
SR: Lab 10.0%; 33.0-43.0; 3.0% to Con
2001: (Lead in GB; Con score-Lab score; error)
Actual: Lab 9.3%; 32.7-42.0
CR: Lab 12.8%; 31.6-44.4; 3.5% to Lab
SR: Lab 10.0%; 33.0-43.0; 0.7% to Lab
2005: (Lead in GB; Con score-Lab score; error)
Actual: Lab 3.0%; 33.2-36.2
CR: Lab 4.9%; 32.4-37.3; 1.9% to Lab
SR: Lab 3.0%; 33.0-36.0; 0.0% to Lab/Con
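The error figures above can be re-derived from the Con and Lab scores. The sketch below copies the scores from the figures listed for each election and recomputes the error on the lead for each rule. The dictionary layout and function names are illustrative choices, and a positive error means the prediction was too favourable to Labour.

```python
# (Con, Lab) shares in GB, copied from the figures listed above.
RESULTS = {
    1992: {"actual": (42.8, 35.2), "CR": (38.1, 39.5), "SR": (38.5, 38.0)},
    1997: {"actual": (31.4, 44.4), "CR": (30.0, 48.0), "SR": (33.0, 43.0)},
    2001: {"actual": (32.7, 42.0), "CR": (31.6, 44.4), "SR": (33.0, 43.0)},
    2005: {"actual": (33.2, 36.2), "CR": (32.4, 37.3), "SR": (33.0, 36.0)},
}

def lab_lead(shares):
    """Labour lead over the Conservatives (negative means a Con lead)."""
    con, lab = shares
    return lab - con

def lead_error(year, rule):
    """Signed error on the lead; positive = too favourable to Labour."""
    row = RESULTS[year]
    # Round to 1 decimal place; adding 0.0 turns a float -0.0 into 0.0.
    return round(lab_lead(row[rule]) - lab_lead(row["actual"]), 1) + 0.0

def winner(year):
    """Which rule came closer to the actual lead in the given year."""
    cr = abs(lead_error(year, "CR"))
    sr = abs(lead_error(year, "SR"))
    if cr == sr:
        return "tie"
    return "CR" if cr < sr else "SR"

for year in RESULTS:
    print(year, lead_error(year, "CR"), lead_error(year, "SR"), winner(year))
```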
SR wins in 4 out of 4 cases: its error on the lead is smaller than CR's at every election. If the assumption that error is purely random were to hold, then in each case SR would be heavily odds against to win. A 4-horse accumulator with every leg at odds against of the order of 4/1 to 6/1 coming in would seem extremely unlikely (if all were 4/1, for example, we'd be looking at a 624/1 accumulator, which beats Mike's Obama bet totally hollow).
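The accumulator arithmetic is easy to check. The sketch below treats odds of x/1 against as a probability of 1/(x+1) and assumes the four elections are independent legs. The 4/1 to 6/1 range is the rough guess given above, not a measured probability, and the function name is made up for illustration.

```python
from fractions import Fraction

def accumulator_odds(legs):
    """Combined odds against (as x/1) for independent legs,
    each given as odds against x/1 (so 4 means 4/1)."""
    p = Fraction(1)
    for odds in legs:
        # Odds of x/1 against correspond to a win probability of 1/(x+1).
        p *= Fraction(1) / (1 + Fraction(odds))
    return 1 / p - 1

print(accumulator_odds([4, 4, 4, 4]))  # 624
print(accumulator_odds([6, 6, 6, 6]))  # 2400
```

So the accumulator sits somewhere between 624/1 and 2400/1 across that range of odds.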
Ergo the assumption that all polling error is purely random fails. Poll averaging (across companies, at least) is contraindicated.