This post: a very brief brush-up on some Natural Language Processing techniques and Tableau usage.

I took data scraped from Twitter about the 2020 US election, covering August 1st to October 30th. I then filtered for tweets with over 20 likes to home in on tweets with more activity. Here are some word clouds that represent the most common words in tweets that mention Trump, mention Biden, or rank in the top 10% by number of likes.
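As a rough illustration of the filtering and counting behind word clouds like these, here is a minimal sketch using only the Python standard library. The sample tweets, the `likes` field name, and the tiny stopword list are all made up for the example; the real dataset and its column names are not shown in this post.

```python
import re
from collections import Counter

# Hypothetical sample records standing in for the scraped tweets.
tweets = [
    {"text": "Trump rally draws a big crowd", "likes": 250},
    {"text": "Biden leads in the latest poll", "likes": 40},
    {"text": "Trump and Biden debate tonight", "likes": 35},
    {"text": "Just voted early!", "likes": 5},
    {"text": "Biden speaks about the poll numbers", "likes": 21},
]

# Tiny illustrative stopword list (an assumption, not the one used in the post).
STOPWORDS = {"a", "the", "in", "and", "about", "of", "to"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def word_counts(rows):
    """Count how often each word appears across a list of tweets."""
    counts = Counter()
    for row in rows:
        counts.update(tokenize(row["text"]))
    return counts


# Keep only tweets with more than 20 likes.
active = [t for t in tweets if t["likes"] > 20]

# Split by which candidate is mentioned (a tweet can land in both groups).
trump = [t for t in active if "trump" in t["text"].lower()]
biden = [t for t in active if "biden" in t["text"].lower()]

# Top 10% of the remaining tweets by likes (always at least one tweet).
n_top = max(1, len(active) // 10)
top_liked = sorted(active, key=lambda t: t["likes"], reverse=True)[:n_top]

trump_counts = word_counts(trump)
biden_counts = word_counts(biden)
top_counts = word_counts(top_liked)
print(biden_counts.most_common(3))
```

Each `Counter` maps a word to its frequency, which is the kind of input a word cloud tool uses to size the words.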

Next, I ran sentiment analysis on tweets that mention Trump or Biden, and plotted each tweet's sentiment against its subjectivity in a scatterplot.
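To make the two axes concrete, here is a toy lexicon-based scorer. It is only a sketch: the mini-lexicon and its scores are invented for illustration, and it is not the tool that produced the plot. A library such as TextBlob reports comparable values as `TextBlob(text).sentiment.polarity` and `.subjectivity`.

```python
import re

# Hypothetical mini-lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
LEXICON = {
    "great": (0.8, 0.75),
    "win": (0.8, 0.4),
    "terrible": (-1.0, 1.0),
    "lose": (-0.5, 0.5),
    "hope": (0.5, 0.6),
}


def sentiment(text):
    """Return (polarity, subjectivity), averaged over the lexicon words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        # No opinion words found: treat the text as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


# Made-up example tweets; each one becomes an (x, y) point for a scatterplot.
sample = [
    "Trump will win, great rally",
    "Biden had a terrible debate",
    "Polls open at 7am",
]
points = [sentiment(t) for t in sample]
for text, (pol, subj) in zip(sample, points):
    print(f"{pol:+.2f} {subj:.2f} {text}")
```

Tweets with no opinion words sit at the origin, while strongly worded ones move toward the edges of the plot.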

I performed Latent Dirichlet Allocation (LDA) topic modelling to group together words that are commonly mentioned together. Some topic groupings for tweets that mentioned Trump:

  • win, lose, think, vote, go, know, say, want, uk, hope
  • trump, vote, win, chance, attack, 2020, day, college, electoral, russian
  • say, president, win, american, poll, @realdonaldtrump, supreme, day, republican

And some topic groupings for tweets that mentioned Biden:

  • american, incumbent, dakota, announce, lt, south, in, normalize, relation, peace
  • relevance, north, week, agree, american, relation, south, peace, policy, president
  • broke, north, lt, dying, week, agree, president, in, achievement, relation
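For readers curious how groupings like the ones above come about, here is a small, self-contained sketch of LDA fitted with collapsed Gibbs sampling. It is a teaching sketch under stated assumptions, not the code used for this post: the example documents, the hyperparameters `alpha` and `beta`, and the function names are all hypothetical, and in practice a library such as gensim or scikit-learn would do this far faster.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return (vocab, topic-word counts)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # words in doc d assigned to topic k
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # times word v is assigned to topic k
    topic_total = [0] * n_topics                           # total words assigned to topic k
    assignments = []

    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample: weight is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic total + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, topic_word


def top_words(vocab, topic_word, n=3):
    """List the n most frequently assigned words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Hypothetical, already-tokenized tweets.
docs = [
    ["vote", "win", "poll", "vote"],
    ["poll", "win", "electoral", "college"],
    ["peace", "policy", "relation", "peace"],
    ["relation", "policy", "president", "peace"],
]
vocab, topic_word = lda_gibbs(docs, n_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(k, words)
```

Each topic is just a ranked list of words, which is why the output reads as loose word groupings rather than labelled themes.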

Finally, for some fun, I experimented with recurrent neural networks to generate some text. These models use deep learning to learn patterns in text and can be made to output sample texts.

Some outputs from textgenrnn:
campaign to the Republican stronghold over between so deeply entrenched that winning Pennsylvania won't be fired. India hossistened of a link - Who wonder if there’s an elected US senatorawe can endorsement.

"Defunding the point. Trump to Biden in US election angst of US election debate if Biden wins the US election is only two weaker will go for #BidenHarris2020Lands #Trump and Joe Biden is set to New York Post want to go for a no-deal

Moderately gibberish, with some ideas sprinkled in. I trained for 50 epochs (rounds of training); more sophisticated models run for much longer.
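For reference, a training run like this might look roughly as follows with textgenrnn. This is a sketch, not the exact code from my repository: the helper name `train_and_sample` and the one-tweet-per-line file it expects are assumptions, and textgenrnn needs TensorFlow installed to run.

```python
def train_and_sample(corpus_path, num_epochs=50, n_samples=3):
    """Train textgenrnn on a text file (one tweet per line) and return samples.

    `corpus_path` is a hypothetical path to a plain-text file of tweets.
    """
    # Imported lazily because textgenrnn pulls in TensorFlow, which is slow to load.
    from textgenrnn import textgenrnn

    textgen = textgenrnn()  # starts from the package's bundled pretrained weights
    textgen.train_from_file(corpus_path, num_epochs=num_epochs)
    # return_as_list=True returns the generated strings instead of printing them.
    return textgen.generate(n_samples, return_as_list=True)
```

Calling `train_and_sample("tweets.txt")` (a made-up filename) would train for 50 epochs and return three generated texts.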

Thank you for following along as I brush up on some NLP skills!

You can find my repository here