Treehouse of Horror
The Halloween Specials, Visualized
Told ya I wasn’t done with The Simpsons just yet. After comparing them to other animated shows, there is still more to tell. Here’s a quick look at the Treehouse of Horror episodes.
Using an R script file, I can reference the rvest scraper I made earlier without copy/pasting.
library(tidyverse) source("../2020-08-28-simpsons/rvest_scraper.R") theme_set(theme_light()) plot_caption <- "zachbogart.com\nSource: IMDb"
After getting the data, we can do some plotting.
simpsons <- grab_imdb_ratings("tt0096697", c(1:31))
Treehouse of Horror Episodes Do Well
Here we use
str_detect to just grab the relevant episodes. With some grouping, can get the average ratings for non-Halloween-special episodes to compare against the October favorites. Turns out the Treehouse of Horror episodes tend to have higher ratings than the average rating for other episodes in the season. They certainly are some of the most memorable shows.
# ratings of ToH over time simpsons %>% mutate(toh_episode = str_detect(title, "Treehouse")) %>% group_by(season, toh_episode) %>% mutate(avg_rating = mean(rating), avg_votes = mean(votes)) %>% select(show:season, toh_episode, avg_rating, avg_votes) %>% distinct() %>% ggplot(aes(season, avg_rating, color = toh_episode)) + geom_smooth(aes(group = toh_episode), se=FALSE, color = "#eeeeee") + geom_line() + scale_x_continuous(breaks = seq(1,31,2)) + scale_color_manual(values = c("#009EDC", "#F14E28")) + labs( title = "Television! Teacher, Mother, Secret Lover.", subtitle = "IMDb Ratings for Regular and Treehouse of Horror Simpsons Episodes", color = "ToH Episode", x = "Season", y = "Average Episode Rating", caption = plot_caption )
- There was no Treehouse of Horror in the first season so the red line has no value there
- Most ToH episodes are rated higher than the average rating for the other episodes in the season. It doesn’t start dipping till season eighteen.
- That peak is Treehouse of Horror V, the third-highest-rated episode in the dataset.
Do They Always Fall Near Halloween?
I wrote a little earlier on that these episodes are “October favorites”. However, that often is not the case. Many Halloween specials aired at the start of November, after Halloween. Given that the show airs on Sundays and Halloween doesn’t always fall on that day, there can be some wiggle room. For the first two decades, the Halloween special always fell within a week of the actual holiday. However, in later years the special is a little more variable, sometimes airing at the start of the month, a full twenty-five days early.
# how close to Halloween? simpsons %>% filter(str_detect(title, "Treehouse")) %>% # convert dates to just month/date mutate(month_day = as.Date(paste("2000-", format(air_date, "%m-%d"))), halloween_diff = as.numeric(abs(month_day - as.Date("2000-10-31")))) %>% ggplot(aes(season, month_day, color = halloween_diff)) + geom_hline(yintercept = as.Date("2000-10-31"), color = "red", alpha = 0.4) + geom_hline(yintercept = as.Date("2000-10-24"), color = "red", alpha = 0.4, linetype = "dashed") + geom_hline(yintercept = as.Date("2000-11-07"), color = "red", alpha = 0.4, linetype = "dashed") + geom_point() + labs( title = "They're Showing a Halloween Episode...In November!", subtitle = "Treehouse of Horror Episodes by Date Aired", x = "Season", y = "Episode Air Date", color = "Days ± Oct 31st", caption = plot_caption )
- Factoring out the code from the work with
rvestinto a separate R script made it easier to work with. Calling it with
sourcewas a cinch.
ggplotto just plot the month and day as a pair was a little finicky. Did some date conversion and such to get it to work.
- Tried to use
geom_ribbonfor one week gap around Oct 31st, but ran into challenges. Was easier to just make two lines.