This video contains the results of a data science experiment I performed on tweets containing the word ‘Netflix’.
I used the Twitter Streaming API to collect around 272,300 tweets over a period of 24 hours starting July 14, 2020. I wrote a small piece of code using the tweepy Python library and stored the tweets in a SQLite database.
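A minimal sketch of such a collector is shown below, assuming tweepy 3.x's StreamListener interface (the version current in mid-2020). The credentials, table schema, and stored fields are placeholders, not the author's actual code.

```python
import sqlite3
import tweepy

# Placeholder credentials -- assumptions, not the author's actual keys.
CONSUMER_KEY = "..."
CONSUMER_SECRET = "..."
ACCESS_TOKEN = "..."
ACCESS_TOKEN_SECRET = "..."

conn = sqlite3.connect("tweets.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS tweets "
    "(id TEXT PRIMARY KEY, created_at TEXT, user_location TEXT, text TEXT)"
)

class NetflixListener(tweepy.StreamListener):
    def on_status(self, status):
        # Store the fields needed later for topic modelling and geocoding.
        conn.execute(
            "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)",
            (status.id_str, str(status.created_at), status.user.location, status.text),
        )
        conn.commit()

    def on_error(self, status_code):
        # Returning False disconnects the stream (e.g. on HTTP 420 rate limiting).
        return False

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
stream = tweepy.Stream(auth=auth, listener=NetflixListener())
stream.filter(track=["Netflix"])  # collect tweets mentioning 'Netflix'
```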
I fitted a Latent Dirichlet Allocation (LDA) model to extract 25 topics from a random subset of tweets. I manually reviewed the important keywords associated with each topic and shortlisted the topics related to TV shows. Then, I used the fitted LDA model to tag every tweet in the dataset with a topic name.
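A minimal sketch of this topic-modelling step with scikit-learn is shown below. The table and column names, subset size, vectorizer settings, and hyperparameters are assumptions rather than the exact values used in the project.

```python
import sqlite3
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Load the collected tweets (table/column names are assumptions).
conn = sqlite3.connect("tweets.db")
df = pd.read_sql("SELECT id, text FROM tweets", conn)

# Bag-of-words representation of a random subset, then fit LDA with 25 topics.
sample = df.sample(n=20_000, random_state=42)  # subset size is an assumption
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X_sample = vectorizer.fit_transform(sample["text"])

lda = LatentDirichletAllocation(n_components=25, random_state=42)
lda.fit(X_sample)

# Print the top keywords per topic for manual review and shortlisting.
terms = vectorizer.get_feature_names_out()  # scikit-learn >= 1.0
for topic_idx, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:10]]
    print(f"Topic {topic_idx}: {', '.join(top_terms)}")

# Tag every tweet in the dataset with its most probable topic index.
X_all = vectorizer.transform(df["text"])
df["topic"] = lda.transform(X_all).argmax(axis=1)
```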
The following Python libraries/tools were used for this project:
- geopy
- tweepy
- scikit-learn
- numpy
- pandas
- SQLite
- Plotly
- GCP’s Geocoding API
- Twitter’s Streaming API
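The list above also includes geopy and GCP’s Geocoding API, presumably for turning location strings (e.g. from tweet user profiles) into coordinates for plotting. The snippet below is a hypothetical sketch of that step using geopy’s GoogleV3 geocoder; the API key and the input string are placeholders, and the exact use of geocoding in the project is an assumption.

```python
from geopy.geocoders import GoogleV3

# Hypothetical: geocode a user-profile location string via GCP's Geocoding API.
geolocator = GoogleV3(api_key="YOUR_GCP_API_KEY")  # placeholder key

location = geolocator.geocode("Mumbai, India")
if location is not None:
    print(location.latitude, location.longitude)
```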