Movies per year

image

The most striking thing is how many movies I watched in 2008 and 2009: more than 2 per week on average! The uptick in 2016 and 2017 can be explained by my mum buying me an Odeon Unlimited card in September 2016. The slight rise in 2018 and 2019 compared to the mid-2010s is, I think, explained by moving into a house where watching movies was a common social activity for the housemates. And 2020 is significantly above average even though the year is not yet complete, which is explained by the COVID-19 pandemic.

Movies by source

image

Most of my movie watching is done on Netflix. To help see any other patterns, I tried grouping the sources together:

image

Not sure if there is anything noteworthy to say about this. Note that the category ‘internet’ may or may not refer to streaming from dodgy websites.

Movies by source and year

image

Stackplots are awesome! They are visually striking and provide an overall sense of how my movie-watching habits changed. Some patterns which are clear from this diagram:

Movies by month

image

Again, there are some patterns visible in this plot:

I recently contributed to Darts, an open source project for time series prediction, and I was curious to see what patterns it would find. The following is obtained via exponential smoothing:

image

The clearest pattern in the model’s prediction is that it predicts peaks in December. Given how small and error-filled the dataset is, I do not think there is much to read into the smaller peaks at other times of the year.
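As a rough illustration of the idea behind exponential smoothing, here is a minimal sketch in plain Python. It is not the Darts model used for the chart above: the function name, the `alpha` value and the monthly counts are all made up for the example. It implements only simple exponential smoothing, which has no seasonal term, so on its own it would give a flat forecast. Peaks in December would need a seasonal variant.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing (hypothetical helper, not the Darts API).

    Each smoothed value is a weighted average of the current observation
    and the previous smoothed value; alpha controls how fast old data fades.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]  # start from the first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Made-up monthly movie counts, for illustration only.
monthly_counts = [4, 2, 6, 8]
smoothed = exponential_smoothing(monthly_counts, alpha=0.5)

# With no trend or seasonal term, the forecast for every future month
# is simply the last smoothed value.
forecast = smoothed[-1]
```

A larger `alpha` makes the smoothed series track recent months more closely, while a smaller one averages over a longer history.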

Conclusions

Even with a dataset as noisy as this one, it is still possible to obtain some nice visuals and uncover some overall patterns. My favourite chart is the stackplot showing how the source used to watch movies has changed over the years.

Another nice thing about this project is that it included various firsts for me: first stackplot, first time series model (admittedly basic), first pivoting in pandas (to create the stackplot), and first time grappling with time formats to create precisely the plot I wanted (for the plot by month).
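For readers who want to try the pivot-then-stackplot step, here is a minimal sketch. The column names (`date`, `source`), the sample rows and the `plot_stack` helper are assumptions made for the example, not the actual dataset or code behind the charts above.

```python
import pandas as pd

# Hypothetical viewing log: one row per movie watched.
log = pd.DataFrame({
    "date": pd.to_datetime([
        "2016-10-01", "2016-11-12", "2017-01-05",
        "2017-03-20", "2017-06-02", "2018-02-14",
    ]),
    "source": ["cinema", "netflix", "cinema", "netflix", "netflix", "netflix"],
})
log["year"] = log["date"].dt.year

# Pivot to one row per year and one column per source; each cell is a
# movie count, with 0 where a source was not used in that year.
counts = log.pivot_table(
    index="year", columns="source", values="date",
    aggfunc="count", fill_value=0,
)


def plot_stack(counts):
    """Draw a stackplot of movies per year, one band per source."""
    # Imported here so the pivot above runs even without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # stackplot expects one row per band, hence the transpose.
    ax.stackplot(counts.index, counts.T.values, labels=list(counts.columns))
    ax.set_xlabel("year")
    ax.set_ylabel("movies watched")
    ax.legend(loc="upper left")
    return fig
```

Calling `plot_stack(counts)` returns a matplotlib figure. The pivot is what turns the long log into the wide year-by-source table that `stackplot` needs.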

Next time, I will see what patterns there are in the subset of the data that I could join with IMDb data.