Did Big Data Predict the Kenya Elections? The correlation between online searches and offline behaviour

By William Dekker

The 2017 Kenya General Election provided an opportunity to demonstrate that technology, in the form of big data, can capture what people are thinking well enough to make real-life predictions.

By the simplest definition, an election is a mass decision-making process in which people are chosen, by voting, to hold public office or other positions. Looking deeper, a political decision is built from an individual’s convictions, fears, uncertainties, hopes and resolutions, which together lead to the formal judgment on the ballot. All of these can now be captured digitally, thanks to the continued growth of the internet and its technologies.

Google Search Trends, a publicly available tool that shows search patterns over a period of time, revealed the search interest in the candidates contesting different positions countrywide, as well as the top questions about these individuals or about the election itself.

A little more than two years ago, Google made the Trends data available in real time. The vast number of searches, trillions every year, makes Google Trends one of the world’s largest real-time datasets. Examining what people search for provides a unique perspective on what they are currently interested in and curious about. In this case, it is reasonable to infer that someone who searches for an individual is either interested in that individual or uncertain about something concerning them and looking for answers.

To put this into perspective: search trends for the period just preceding the polls reveal the interest of Kenyans in different candidates and in the political issues that dominated the election period. In the sample cases below, we collated search results and indexed them to 100 (percent). After the elections, we compared these search results with the election results (as announced by the IEBC) and found a striking correlation. The sample cases cover individuals and positions that received substantially large search volumes.
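For readers who want to see what "indexing to 100" can look like in practice, here is a minimal sketch in Python. It assumes the indexing means expressing each candidate's search interest as a share of the total for the candidates compared; the function name, the candidate labels and the numbers are all made up for illustration and are not the actual Trends figures.

```python
def index_to_100(search_interest):
    """Express each candidate's search interest as a share of 100.

    `search_interest` maps a candidate label to a raw interest value.
    This is an assumed reading of "indexed to 100", not Google's own method.
    """
    total = sum(search_interest.values())
    if total <= 0:
        raise ValueError("total search interest must be positive")
    return {name: 100 * value / total for name, value in search_interest.items()}


# Hypothetical raw interest values, for illustration only.
raw_interest = {"Candidate A": 620, "Candidate B": 310, "Candidate C": 70}

search_share = index_to_100(raw_interest)
for name, share in search_share.items():
    print(f"{name}: {share:.1f}")
```

The resulting shares always sum to 100, which is what makes them directly comparable with percentage vote shares.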

i. Nairobi County Women Rep Position

ii. Mombasa County Gubernatorial Position

iii. Kirinyaga County Gubernatorial Position

From the graphs above, the pre-election search results from Google were closely reflected in the final election results (as given by the Independent Electoral & Boundaries Commission (IEBC)). The small variances can be explained by other factors. Key among them is the fact that there were other candidates who were intentionally left out of the initial search queries. Overall, the search results in these sampled cases accurately ‘predicted’ the winners.
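One way to handle candidates who were left out of the search queries is to rescale the official vote shares over the sampled candidates only, and then compare like with like. The sketch below illustrates that idea; it is an assumption about how such a comparison could be done, not a description of the method used for the graphs, and every figure in it is hypothetical rather than an actual IEBC result or Trends value.

```python
def renormalise(shares, included):
    """Rescale the shares of the included candidates so they sum to 100."""
    subset = {name: shares[name] for name in included}
    total = sum(subset.values())
    if total <= 0:
        raise ValueError("included candidates must have a positive total share")
    return {name: 100 * value / total for name, value in subset.items()}


# Hypothetical figures, for illustration only.
official_vote_share = {"Candidate A": 54.0, "Candidate B": 36.0, "Candidate C": 10.0}
search_share = {"Candidate A": 62.0, "Candidate B": 38.0}  # Candidate C was not queried

# Vote shares restricted to the candidates that appear in the search data.
comparable_votes = renormalise(official_vote_share, search_share.keys())

# Positive gap: the candidate drew more search interest than votes.
gaps = {name: search_share[name] - comparable_votes[name] for name in search_share}

# Did search interest and the ballot agree on who came first?
same_winner = (
    max(search_share, key=search_share.get)
    == max(comparable_votes, key=comparable_votes.get)
)

print(comparable_votes, gaps, same_winner)
```

With these made-up numbers the gap is two points either way and both rankings pick the same leader, which is the kind of "small variance" described above.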

Trending Questions Vs Election Outcome

Apart from the search interest in the positions sampled above, the trending questions about the election can be used to draw conclusions about the presidential election.

The persistence of this question, which refers generally to the presidential election, points to an atmosphere of uncertainty. Up to a week before the polls, no one, including the opinion pollsters, could tell who would actually win the presidency. This suggests that both top candidates had a near-equal chance of clinching the top seat. By the same token, neither side was assured of an indisputable margin of victory. This may partly explain why the final announcement was disputed.

The fact that this question topped the search trends on Election Day signifies a gap in voter education. Up until polling day, many Kenyan voters were still uncertain about how voters would be identified, the very first step in the voting day procedures. This reasoning is supported by reports from observer groups suggesting that civic education might not have been sufficient. The Pre-Election Statement of the John Kerry-led Carter Center, issued on 27th July, reads in part:

“With less than two weeks until the election, Center observers have noted a lack of education on voting day procedures. The Center urges the IEBC, political parties, and civil society to use the available time before Election Day to increase voter education and outreach efforts.”

While there are several instances where the predictive trends data proved to be accurate, there are a few cases of variance between the trends results and the election outcome. A clear example is the Machakos County Gubernatorial position.

Machakos Gubernatorial Position

The variance can, however, be explained in more than one way. Wavinya Ndeti’s court battles and a subsequent mishap in reciting a popular Swahili proverb may have given her “internet prominence” in regions far away from her voting bloc. Furthermore, the search results showed near-equal prominence (51:49), a tie that could only be broken by undecided voters. She has since disputed the poll results.
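A split as narrow as 51:49 is better read as "too close to call" than as a prediction. The sketch below shows one simple way to encode that caution. The five-point threshold is an arbitrary assumption chosen for illustration, and the candidate labels are placeholders, since only the 51:49 split itself comes from the search data.

```python
def call_race(shares, min_margin=5.0):
    """Name a leader only when the lead over the runner-up is wide enough.

    `shares` maps candidate labels to indexed search shares (at least two).
    `min_margin` is an assumed cut-off in percentage points, not an
    established standard.
    """
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    leader, top = ranked[0]
    _, second = ranked[1]
    margin = top - second
    if margin < min_margin:
        return "too close to call", margin
    return leader, margin


# The 51:49 split reported for Machakos; the labels are placeholders.
machakos_search = {"Candidate 1": 51.0, "Candidate 2": 49.0}

verdict, margin = call_race(machakos_search)
print(verdict, margin)
```

Under this rule the Machakos search data would not have named a winner at all, which is consistent with treating it as a case of variance rather than a failed prediction.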

Authenticity of Google Trends data

The authenticity of the trends data is strengthened by the fact that it is an unbiased sample of Google search data. It is anonymized (no one is personally identified), categorized (the topic of each search query is determined) and aggregated (grouped together). This allows us to measure interest in a particular topic across search, from around the globe right down to city or town level. In practice, this means one can analyze what people are searching for in real time as events unfold. It is a unique and powerful dataset, which can complement others, such as demographic data from the census. It is also a great tool for storytelling, as it allows us to explore the magnitude of different moments and how people react to them.

The future

In an age where “big data” is no longer a buzzword, it is interesting to consider how far the data on Google Trends can go in predicting the present and the near future. The emergence of big data has meant that everything we do online leaves digital traces. Analysis of such data can help scientists, business executives, medical practitioners, advertisers and government agencies to find new correlations between variables, spot business trends, prevent diseases, combat crime and research new markets. However, it is important to appreciate that such data sets are so large or complex that traditional data processing applications are inadequate to deal with them.

The search data on Google Trends, by contrast, has already been processed and its volume indexed for easy use. In other parts of the world, Google Trends has been used to predict economic activity, not just election outcomes. It remains to be seen how Kenyans will take advantage of this free platform to explore the immense opportunities it offers.

~ The writer is a Digital Communication Strategist at Impact Africa Limited and an Informaticist with a special interest in Big Data. The views expressed in this article are solely those of the writer and not of Impact Africa or Google Inc.