Sunday 17 February 2008
Our prediction was based entirely on the polls. Since the polls failed to predict the outcome, our prediction did not succeed either.
Final results at http://www.ekloges.gov.cy
Saturday 9 February 2008
Our focus has always been the objective presentation of all available polls and news about the upcoming elections. To that end, we believe that objective and detailed analysis and presentation of election results is of paramount importance.
With that in mind, today we are releasing a prediction for the upcoming elections.
Our prediction (95% confidence) is that the final results of the election will be as follows (in parentheses, the range based on our calculated margin of error; see below for an explanation):
Tassos Papadopoulos 34.5% (33.5%-35.5%)
Dimitris Christofias 32.9% (31.9%-33.9%)
Ioannis Kasoulidis 31.2% (30.2%-32.2%)
Matsakis ~ 1%
Themistokleous ~ 0.5%
Our prediction is not a poll; it is instead the outcome of a scientific data analysis.
It is based on a meta-analysis of 23 polls published since 1 January 2008.
Our analysis of the 23 polls is based on the following methodology/assumptions:
- We believe in the objectivity of all 23 polls. A lot has been written in the Cyprus press about the subjectivity of some polls. We want to believe that all polling companies conducted their polls in a scientific way. In any case, given the large number of polls, we believe that small errors average out when all polls are taken together. This is one more reason why we believe that the meta-analysis presented here is stronger than a single poll.
- The samples of all 23 polls were added together, giving us a total of 26,500 ballots.
- We are treating all 23 polls as a unified single poll running from 7 January to 8 February with a sample size of 26,500 ballots. We believe that this is a short enough period of time to justify the application of this methodology.
- For such a large sample (26,500 ballots) the margin of error is quite small. The margin of error was calculated using an established methodology (used in US elections; see http://www.usaelectionpolls.com/polling/margin-of-error.html).
- We used 95% confidence to calculate the margin of error for this sample (margin of error at 95% confidence = 0.6%, where n = 26,500). Our calculated margin of error is therefore ±0.6%.
- Note that this is based on an analysis of polls published up to 10 February. Unfortunately, Cyprus law does not allow poll results to be released after that day. Given this, we have increased our margin of error to 1% to account for small changes that might happen during the last week of the election campaign. Given the trends in the polls over the last month (see graphs in next blog entry), we believe this is a realistic margin.
- We are happy to receive comments, criticism and challenges on this analysis. Use the comment feature of this blog to leave your message.
- Antonis points out that summing up 23 polls might be a bit risky, given that the people surveyed might have changed their minds. However, the 23 polls included in this analysis come from a short period of time (a month), and they more or less agree in their results (the fluctuation is small). To support this point and give Antonis some additional information, we carried out further analysis of these 23 polls. First, we provide below a table with the range of scores for each candidate across the 23 polls (the average % and the single polls that gave the highest and lowest %). As can be seen in the table, the range across all polls is quite small, indicating that the polls are largely in agreement.
- To further address Antonis' point, we analysed the 23 polls longitudinally. We divided the polls into three periods: (a) polls conducted between 1-15 January, (b) polls conducted between 16-31 January, (c) polls conducted between 1-10 February. This gives an additional indication of whether there were any big fluctuations in the candidates' shares across those three periods. Our findings are provided below:
[Table not preserved: average % of each candidate in the periods 1-15 Jan, 16-31 Jan and 1-10 Feb]
- As can be seen, there is very little fluctuation in the average % of each candidate across these three periods. In our view this shows that public opinion is quite stable and unlikely to change dramatically in the next week. This is further supported by the low standard deviation of the % of all 3 candidates when calculated across all 23 polls. The standard deviation for Papadopoulos is 0.6%, for Christofias 0.7% and for Kasoulidis 0.8%.
- As we said in our analysis, we gave a 1% margin of error to our prediction to account for any additional small fluctuations in the week to come. However, we actually believe that the 0.6% margin of error as originally calculated is more than adequate for small fluctuations of the kind shown in the tables above.
- So, to answer Antonis' questions: we believe the polls show that people have not changed their minds much in the last month. We also believe the sample size is large enough that the effect of any minimal double counting (people participating in more than one poll) is insignificant, if present at all.
- We also received some questions as to whether our prediction takes into account the fact that around 15,000-20,000 Cypriot voters from overseas will be voting in this election. The answer is yes. All polls are weighted by the distribution of age and gender in the official electoral register (i.e. their samples include the appropriate percentage of each age group and gender, a percentage that takes into account all registered voters irrespective of where they reside). Of course, the released polls only collected data from residents of Cyprus. In our view, the overseas vote could only have an effect if one were to assume that, on average, the 18-25 year olds who reside in Cyprus and took part in the 23 polls will vote significantly differently from the 18-25 year olds who are studying overseas and will be travelling to Cyprus to vote. We don't believe that such a significant difference between these two groups exists.
- aceras asks whether our data shows any distinct differences between ballot box based polls and telephone based polls. Unfortunately, the overwhelming majority of the polls released in this election are telephone based, so no meaningful comparison between these two types of polls is possible with the available data. The only ballot box based poll is the one by CyBC. It should be noted, though, that the CyBC poll's results (when excluding the undecided vote) are within our prediction's margin of error. More specifically, the CyBC poll showed Papadopoulos at 34.03%, Christofias at 33.48% and Kasoulidis at 30.19%.
- If anyone has access to 2006 parliamentary election polls please contact us.
- Thanks for your interest in this analysis. We will continue to respond and provide additional analysis as requested. Feel free to post additional requests in the comments. The whole goal here is to help all of us get a better understanding of the published polls.
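The margin-of-error figures quoted in the methodology can be reproduced with a short calculation. The sketch below is only an illustration: it assumes the standard worst-case formula for a simple random sample (p = 0.5, z = 1.96 for 95% confidence), which may differ in detail from the method on the page linked above, and the function name margin_of_error is our own.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Half-width of the confidence interval for a proportion.

    Assumption: a simple random sample of size n. p=0.5 is the worst
    case (largest margin) and z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


pooled_n = 26_500  # combined sample size quoted in the post
moe = margin_of_error(pooled_n)
print(f"Calculated margin at 95% confidence: {moe:.1%}")

# The post widens the margin to 1 percentage point to allow for
# movement in the final week; apply that to the headline figures.
widened_margin = 1.0
prediction = {"Papadopoulos": 34.5, "Christofias": 32.9, "Kasoulidis": 31.2}
ranges = {
    name: (round(share - widened_margin, 1), round(share + widened_margin, 1))
    for name, share in prediction.items()
}
for name, (low, high) in ranges.items():
    print(f"{name}: {prediction[name]}% ({low}%-{high}%)")
```

With n = 26,500 the calculated margin comes out at about 0.6%, in line with the figure quoted above. The formula treats the pooled sample as a single random sample, which is the same simplifying assumption the methodology makes.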
- Simerini (to be published Sunday 10 Feb)
- Politis 1000 people, 2-7 Feb
- ANT1 1356 people, 31 Jan - 8 Feb
- Phileleftheros 1112 people, 1-7 Feb
- SIGMA 1000 people, 21 Jan - 5 Feb
- Haravgi 2055 people, 24 Jan - 3 Feb
- TV Plus 2000 people, 31 Jan - 2 Feb
- Simerini 800 people, -1 Feb
- MEGA 1495 people, 28 Jan - 1 Feb
- ANT1 1661 people, 17-30 Jan
- Symmetron 1200 people, 30 Jan
- CyBC 1600 people, 19-29 Jan
- Simerini 800 people, -25 Jan
- Politis 800 people, 19-23 Jan
- PA College 1298 people, 15-21 Jan
- Simerini 800 people, -18 Jan
- Phileleftheros 1004 people, 10-17 Jan
- Symmetron 1200 people, 7-15 Jan
- SIGMA 1002 people, 7-14 Jan
- ANT1 2130 people, 17 Dec - 14 Jan
- Simerini 800 people, -13 Jan
- Politis 800 people, 7-11 Jan
- TV Plus 2000 people, 9-11 Jan
- Simerini was running a poll every 2 days. Because their samples overlapped between polls, we only took into account the polls they published each Sunday, to eliminate this problem.
- The date of a poll is the last day of its data collection. For example, if a poll ran from 10 to 15 June, then the date displayed on the graphs is 15 June.
- The undecided vote was distributed among the four candidates in proportion to their shares of the decided vote.
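The last note, on distributing the undecided vote, amounts to rescaling the decided shares so that they sum to 100%. The following is a minimal sketch of that idea, not the actual spreadsheet or script used for this analysis. The function name redistribute_undecided and all the numbers in the example are hypothetical.

```python
def redistribute_undecided(decided):
    """Scale decided shares (in % of all respondents) to sum to 100.

    Each candidate receives a part of the undecided vote proportional
    to their share of the decided vote.
    """
    total_decided = sum(decided.values())
    if total_decided <= 0:
        raise ValueError("no decided votes to scale")
    return {name: 100.0 * share / total_decided for name, share in decided.items()}


# Hypothetical raw poll in % of all respondents; the remaining 15%
# of respondents are undecided. These are NOT figures from any real poll.
raw_poll = {
    "Candidate A": 30.0,
    "Candidate B": 28.0,
    "Candidate C": 25.0,
    "Candidate D": 2.0,
}

adjusted = redistribute_undecided(raw_poll)
for name, share in adjusted.items():
    print(f"{name}: {share:.1f}%")
```

Proportional scaling keeps the ratios between candidates exactly as they were among decided voters, so it implicitly assumes that undecided voters will split the same way as those who have already decided.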