~Sharat Kumar & Vasanth
We had an opportunity to put the power of predictive modelling, and machine learning in particular, to the test by predicting the outcome of the US presidential election a couple of weeks before it took place. We approached this as a supervised learning problem. First, we considered two states: Texas (a solidly Republican, or Red, state) and California (a strongly Democratic, or Blue, state). We scraped around 10,000 tweets over 5 days from Twitter for both states. We tagged tweets as Red if they were pro-Trump/anti-Biden and as Blue if they were pro-Biden/anti-Trump.
We also selected the 100 top fans of Biden and Trump across the US and scraped tweets relevant to the US election from those handles for one day. We used this tagged data to train a model in an AI tool. The tool uses 70% of the data for training and the remaining 30% for validation and testing. During this process, it produces a confusion matrix: a table describing the performance of a classification model (or “classifier”) on a set of test data for which the correct values are known.
This table shows how often the model classified each label correctly (in blue), and which labels were most often confused for that label (in grey).
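As a rough illustration of the 70/30 split and the confusion matrix described above, here is a minimal Python sketch. It is not the tool we used; the function names (`split_70_30`, `confusion_matrix`) and the sample labels are made up for this example.

```python
import random
from collections import Counter

def split_70_30(examples, seed=0):
    """Shuffle a copy of the examples and split it 70% / 30%."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

def confusion_matrix(true_labels, predicted_labels, labels=("Red", "Blue")):
    """Count (true, predicted) pairs: outer key is the true label,
    inner key is the label the model predicted."""
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}

# Hypothetical held-out labels and model outputs, for illustration only.
y_true = ["Red", "Red", "Blue", "Blue", "Blue", "Red"]
y_pred = ["Red", "Blue", "Blue", "Blue", "Red", "Red"]
matrix = confusion_matrix(y_true, y_pred)

train, held_out = split_70_30(list(range(10)))
```

The diagonal entries (`matrix["Red"]["Red"]`, `matrix["Blue"]["Blue"]`) are the correctly classified tweets; the off-diagonal entries show which label was confused for which.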
We created two models from this data. The first takes an individual’s tweet history and identifies whether the person is a Trump or Biden fan. The second takes the Twitter data for a particular state and predicts which candidate has more support in that state. We used the second model and fed it Twitter data from 11 swing states in the US.
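One plausible way to turn per-tweet labels into a state-level prediction is to take the share of tweets classified as each colour and pick the larger one. The sketch below assumes that approach; the actual tool may aggregate differently, and `state_support` and the sample labels are hypothetical.

```python
from collections import Counter

def state_support(tweet_labels):
    """Return each label's percentage share and the majority label."""
    counts = Counter(tweet_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tweets to aggregate")
    shares = {label: 100.0 * n / total for label, n in counts.items()}
    leader = max(shares, key=shares.get)
    return shares, leader

# Hypothetical per-tweet classifier outputs for one state.
labels = ["Blue"] * 5 + ["Red"] * 3
shares, leader = state_support(labels)
```

Applied state by state, an aggregation like this yields Red and Blue percentages of the kind shown in the table below.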
State | Red | Blue | Predicted Result | Actual Result |
Concord (New Hampshire) | 36.48% | 63.52% | Blue | Blue |
Florida | 47.29% | 52.71% | Blue | Red |
Iowa | 51.01% | 48.99% | Red | Red |
Michigan | 31.34% | 68.66% | Blue | Blue |
Minnesota | 43.79% | 56.21% | Blue | Blue |
Nevada | 37.22% | 62.78% | Blue | Blue |
Ohio | 42.32% | 57.68% | Blue | Red |
Pennsylvania | 43.18% | 56.82% | Blue | Blue |
Raleigh (North Carolina) | 46.27% | 53.73% | Blue | Red |
Virginia | 44.80% | 55.20% | Blue | Blue |
Wisconsin | 43.01% | 56.99% | Blue | Blue |
The model predicted the correct result in 8 of the 11 states, an accuracy of about 72.7%.
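The 8-out-of-11 figure can be checked directly from the predicted and actual columns of the table. A short Python sketch (the variable names are ours, the rows are copied from the table):

```python
# (state, predicted, actual) rows copied from the results table.
results = [
    ("New Hampshire", "Blue", "Blue"),
    ("Florida", "Blue", "Red"),
    ("Iowa", "Red", "Red"),
    ("Michigan", "Blue", "Blue"),
    ("Minnesota", "Blue", "Blue"),
    ("Nevada", "Blue", "Blue"),
    ("Ohio", "Blue", "Red"),
    ("Pennsylvania", "Blue", "Blue"),
    ("North Carolina", "Blue", "Red"),
    ("Virginia", "Blue", "Blue"),
    ("Wisconsin", "Blue", "Blue"),
]

# A prediction counts as correct when it matches the actual result.
correct = sum(pred == actual for _, pred, actual in results)
accuracy = correct / len(results)
misses = [state for state, pred, actual in results if pred != actual]

print(f"{correct}/{len(results)} correct, accuracy = {accuracy:.2%}")
print("Missed:", ", ".join(misses))
```

All three misses (Florida, Ohio and North Carolina) were states the model called Blue that actually went Red.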
This gave us first-hand experience of the power of AI and machine learning. While these tools can be used to predict outcomes and inform various business decisions, they can also be an effective platform for addressing complex societal problems. We are excited to engage with and learn more about the underlying capabilities these new-age platforms offer.