US Election Model

~Sharat Kumar & Vasanth

We had an opportunity to put the power of predictive modelling, and machine learning in particular, to the test by working on a prediction of the US Presidential Election a couple of weeks before it took place. We approached this using supervised learning. First, we considered two states: Texas (a reliably Republican, or Red, state) and California (a strongly Democratic, or Blue, state). We scraped around 10,000 tweets over 5 days from Twitter for both states. We tagged tweets as Red if they were Pro-Trump/Anti-Biden and as Blue if they were Pro-Biden/Anti-Trump.

We also selected the top 100 fans of Biden and Trump across the US and scraped tweets relevant to the US election from those handles for one day. We fed this tagged data into an AI tool to train a model. The tool uses 70% of the data for training and the remaining 30% for validation and testing. During the process, a confusion matrix is produced. A confusion matrix is a table describing the performance of a classification model (or “classifier”) on a set of test data for which the correct values are known.

This table shows how often the model classified each label correctly (in blue), and which labels were most often confused for that label (in grey). 
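To make the 70/30 split and the confusion matrix concrete, here is a minimal sketch in plain Python. It is not the tool we used: the function names `split_70_30` and `confusion_matrix` are hypothetical, and the split here is a simple shuffle with a fixed seed, which is an assumption.

```python
import random
from collections import Counter


def split_70_30(examples, seed=0):
    """Shuffle labelled examples and split them 70% / 30% (assumed scheme)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(true_labels, predicted_labels, labels=("Red", "Blue")):
    """Count how often each true label was predicted as each label.

    Returns a nested dict: matrix[true_label][predicted_label] -> count.
    """
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


if __name__ == "__main__":
    # Toy labels for illustration only, not our data.
    truth = ["Red", "Red", "Blue", "Blue"]
    guess = ["Red", "Blue", "Blue", "Blue"]
    for true_label, row in confusion_matrix(truth, guess).items():
        print(true_label, row)
```

The diagonal entries (Red predicted as Red, Blue predicted as Blue) are the correct classifications; the off-diagonal entries show which label was confused for which.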

We created two models from this data. The first takes an individual’s tweet history and identifies whether that person is a Trump fan or a Biden fan. The second takes the Twitter data of a particular state and predicts which candidate has more support in that state. We used the second model and fed it the Twitter data of 11 swing states in the US.
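One way to picture the state-level model is as an aggregation over per-tweet labels. The sketch below assumes each tweet from a state has already been classified as Red or Blue, and that the state's support is simply the share of each label. That aggregation rule and the function name `state_support` are assumptions for illustration, not a description of how the tool works internally.

```python
def state_support(tweet_labels):
    """Return (red %, blue %, predicted side) from per-tweet labels.

    Assumes every label is either "Red" or "Blue". A tie goes to Blue
    here, which is an arbitrary choice for this sketch.
    """
    total = len(tweet_labels)
    if total == 0:
        raise ValueError("need at least one classified tweet")
    red = sum(1 for label in tweet_labels if label == "Red")
    red_share = 100 * red / total
    blue_share = 100 - red_share
    winner = "Red" if red_share > blue_share else "Blue"
    return round(red_share, 2), round(blue_share, 2), winner


if __name__ == "__main__":
    # Toy input for illustration only.
    print(state_support(["Red", "Blue", "Blue", "Blue"]))
```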

State                      Red      Blue     Predicted Result   Actual Result
Concord (New Hampshire)    36.48%   63.52%   Blue               Blue
Florida                    47.29%   52.71%   Blue               Red
Iowa                       51.01%   48.99%   Red                Red
Michigan                   31.34%   68.66%   Blue               Blue
Minnesota                  43.79%   56.21%   Blue               Blue
Nevada                     37.22%   62.78%   Blue               Blue
Ohio                       42.32%   57.68%   Blue               Red
Pennsylvania               43.18%   56.82%   Blue               Blue
Raleigh (North Carolina)   46.27%   53.73%   Blue               Red
Virginia                   44.80%   55.20%   Blue               Blue
Wisconsin                  43.01%   56.99%   Blue               Blue

We predicted the right result in 8 of the 11 states, which gives an accuracy of 72.73%.
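The accuracy can be checked directly from the table by counting the rows where the predicted and actual results agree. This is a small sketch using the predicted and actual columns above; the variable names are ours.

```python
# (state, predicted result, actual result), copied from the table above.
results = [
    ("Concord (New Hampshire)", "Blue", "Blue"),
    ("Florida", "Blue", "Red"),
    ("Iowa", "Red", "Red"),
    ("Michigan", "Blue", "Blue"),
    ("Minnesota", "Blue", "Blue"),
    ("Nevada", "Blue", "Blue"),
    ("Ohio", "Blue", "Red"),
    ("Pennsylvania", "Blue", "Blue"),
    ("Raleigh (North Carolina)", "Blue", "Red"),
    ("Virginia", "Blue", "Blue"),
    ("Wisconsin", "Blue", "Blue"),
]

# A prediction is correct when it matches the actual result.
correct = sum(1 for _, predicted, actual in results if predicted == actual)
accuracy = 100 * correct / len(results)

# States where the prediction and the actual result differ.
misses = [state for state, predicted, actual in results if predicted != actual]

if __name__ == "__main__":
    print(f"{correct} of {len(results)} correct, accuracy {accuracy:.2f}%")
    print("Missed:", ", ".join(misses))
```

All three misses are states predicted Blue that went Red.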

This gave us first-hand experience of the power of AI and machine learning. While these tools can be used to predict outcomes and prescribe inputs for various business decisions, they can also be an effective platform for addressing complex societal problems. We are excited to engage with and learn more about the underlying capabilities these new-age platforms offer.