Predicting US Elections with Social Media and Neural Networks


Chan, Yin Nang Ellison (2019) Predicting US Elections with Social Media and Neural Networks. Masters thesis, Concordia University.

Text (application/pdf)
Chan_MCompSc_S2019.pdf - Accepted Version


Politicians and political parties increasingly engage voters through social media. In the 2016 US presidential election, candidates from both major parties made heavy use of social media, particularly Twitter. It is therefore reasonable to look for a correlation between popularity on Twitter and the eventual popular vote. In this thesis, we use the ‘location’ field in the profiles of each candidate’s Twitter subscribers to estimate support in each state.
A major challenge is that the location field in a Twitter user profile is unconstrained free text, so machine learning techniques are needed to assign users to states.
In this thesis, we train a deep Convolutional Neural Network (CNN) to classify place names by state. We then apply the model to the ‘location’ field of the Twitter subscribers collected for each of the two candidates, Hillary Clinton (D) and Donald Trump (R). Finally, we compare the predicted popular vote in each state with the actual results of the 2016 presidential election.
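As an illustration of how a free-text location could be prepared as CNN input, the following is a minimal sketch that maps a string to a fixed-length sequence of character indices. The abstract does not state the input representation used in the thesis, so the alphabet, the maximum length, and the function name `encode_location` are all assumptions made for this example.

```python
# Hypothetical alphabet and input length; the abstract specifies neither.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 ,.-'"
MAX_LEN = 32
PAD = 0  # index reserved for padding and for characters outside ALPHABET

# Map each known character to a positive index (0 is kept for PAD).
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_location(text, max_len=MAX_LEN):
    """Lowercase, truncate, and map a free-text location to fixed-length indices."""
    cleaned = text.strip().lower()[:max_len]
    indices = [CHAR_TO_INDEX.get(ch, PAD) for ch in cleaned]
    # Right-pad so every encoded location has exactly max_len entries.
    return indices + [PAD] * (max_len - len(indices))


encoded = encode_location("Austin, TX")
```

A fixed-length index sequence like this is one common way to feed short strings to a convolutional model; the thesis itself may use a different encoding.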
The hypothesis is that a city’s name is strongly correlated with the people who founded and incorporated it. We further hypothesize that the original settlers were largely homogeneous in country of origin and shared a common language, so that place names tend to use the language of the settlers’ origin.
In addition to patterns related to state names, this linguistic signal may help a machine learning model learn to classify locations by state.
The results of our experiments are very promising. Using a dataset of 695,389 cities, each correctly labelled with its state, we partitioned the cities into a training set of 556,311 cities, a validation set of 111,262, and a test set of 27,816. When the trained model was applied to the test set, we achieved a correct prediction rate of 84.4365%, a false negative rate of 1.6106%, and a false positive rate of 1.0697%.
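The partition sizes reported above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract; the variable names are illustrative. The resulting shares of roughly 80%, 16%, and 4% are derived here and are not stated in the abstract.

```python
# Counts as reported in the abstract.
TOTAL = 695_389
SPLIT = {"training": 556_311, "validation": 111_262, "test": 27_816}

# The three partitions should account for every city in the dataset.
assert sum(SPLIT.values()) == TOTAL

# Share of each partition as a percentage of the full dataset.
shares = {name: round(100 * count / TOTAL, 2) for name, count in SPLIT.items()}
print(shares)  # {'training': 80.0, 'validation': 16.0, 'test': 4.0}
```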
Applied to the Twitter location data of the two candidates’ subscribers, the trained model correctly picked the winner by popular vote in 45 of the 50 states, an accuracy of 90%. With Canadian and US elections coming up in 2019 and 2020, respectively, it would be interesting to test the model on those as well.
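The state-level comparison can be sketched as follows, assuming each subscriber has already been assigned a state by the classifier and that the predicted winner in a state is simply the candidate with more subscribers classified there. That decision rule, the function names, and the toy subscriber lists are assumptions for illustration; the abstract does not describe how the thesis aggregates predictions.

```python
from collections import Counter


def predict_state_winners(clinton_states, trump_states):
    """Per state, pick the candidate with more subscribers classified into it."""
    clinton = Counter(clinton_states)
    trump = Counter(trump_states)
    winners = {}
    for state in sorted(set(clinton) | set(trump)):
        if clinton[state] > trump[state]:
            winners[state] = "D"
        elif trump[state] > clinton[state]:
            winners[state] = "R"
        else:
            winners[state] = None  # tie: no prediction
    return winners


def state_accuracy(predicted, actual):
    """Fraction of states whose predicted winner matches the actual winner."""
    correct = sum(1 for state, winner in actual.items()
                  if predicted.get(state) == winner)
    return correct / len(actual)


# Toy subscriber data, invented for this example only.
clinton_subscribers = ["CA", "CA", "NY", "TX", "FL"]
trump_subscribers = ["TX", "TX", "FL", "FL", "CA"]
actual_winners = {"CA": "D", "NY": "D", "TX": "R", "FL": "R"}

predicted = predict_state_winners(clinton_subscribers, trump_subscribers)
toy_accuracy = state_accuracy(predicted, actual_winners)

# The reported figure: 45 correct states out of 50.
reported_accuracy = 45 / 50  # 0.9
```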

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Chan, Yin Nang Ellison
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 1 March 2019
Thesis Supervisor(s): Krzyzak, Adam and Suen, Ching Y.
ID Code: 985259
Deposited By: Ellison Yin Nang Chan
Deposited On: 06 Feb 2020 03:21
Last Modified: 06 Feb 2020 03:21
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
