TidyTuesday #3: Song Genres

Rip the Knob Off

Working with TidyTuesday data again, this time some Spotify track info. I have always been curious about when lowercase song titles first showed up, and this dataset let me take a look.

library(tidyverse)
library(scales)

theme_set(theme_light())

plot_caption = "zachbogart.com\nSource: Spotify"

# reading in data
df <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')

Which playlist genres are present?

df %>% 
  ggplot(aes(playlist_genre, fill=playlist_genre)) +
    geom_bar(show.legend = FALSE) +
    
    labs(x = "Genre",
         y = "# Songs",
         title = "Pretty Even Split Between Genres",
         subtitle = "Spotify Playlist Genres by count",
         caption = plot_caption)

What is the distribution of tempos for each genre?

# Average tempo per genre; `row` is the row position, used below to offset the labels
avg_tempos = df %>% 
  group_by(playlist_genre) %>% 
  summarise(avg_tempo = mean(tempo)) %>% 
  mutate(row = row_number()) %>% 
  arrange(desc(avg_tempo))

df %>% 
  ggplot(aes(tempo)) +
    geom_freqpoly(aes(color = playlist_genre)) +
    geom_text(data = avg_tempos, aes(y = 2500 - (row * 100), 
                                     label = round(avg_tempo),
                                     color = playlist_genre), x = 245) +
    facet_grid(playlist_genre~.) +
    
    labs(title = "EDM Has a Preference",
         subtitle = "Tempos (bpm) for different music genres",
         x = "Tempo (bpm)",
         y = "# Songs",
         color = "Genre",
         caption = plot_caption)

When did song titles start being in lowercase?

spotify = df %>% 
  mutate(track_album_release_date = as.Date(track_album_release_date))
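
The filter on missing release dates further down matters because `as.Date()` quietly returns `NA` for strings it cannot parse. A minimal sketch of that behaviour, assuming the column might contain partial dates such as a bare year (the strings below are hypothetical, not pulled from the data):

```r
# Hypothetical date strings: one full date, one month-only, one year-only
dates <- c("2019-06-14", "2012-05", "1999")

# as.Date() picks its format from the first non-missing element,
# so strings that do not fit "%Y-%m-%d" come back as NA
parsed <- as.Date(dates)

# The year can still be recovered from the first four characters
year <- as.integer(substr(dates, 1, 4))

data.frame(dates, parsed, year)
```

If partial dates like these do occur, the affected tracks drop out of the plot along with the genuinely missing ones, and keeping the year would be one way to hold on to them.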

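# Drop the playlist columns and de-duplicate, so a track that sits on several
# playlists is counted once; keep rows with a name and a parsed date, then flag
# titles that start with a lowercase ASCII letter
spotify = spotify %>% 
  select(-starts_with("play")) %>% 
  distinct() %>% 
  filter(
      !is.na(track_name),
      !is.na(track_album_release_date)
    ) %>%
  mutate(lowercase = str_detect(track_name, '^[a-z]'))

upper = spotify %>% 
  filter(!lowercase)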

lower = spotify %>% 
  filter(lowercase)
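
To make the `'^[a-z]'` check concrete, here is a small sketch of how it behaves on a few sample strings (picked only to exercise the pattern, not sampled from the dataset). The `'^\\p{Ll}'` pattern is not part of the analysis above; it is shown as a possible alternative if accented first letters should also count as lowercase.

```r
library(stringr)

# Sample titles chosen only to exercise the pattern
titles <- c("bad guy", "Bad Guy", "7 rings", "\u00e9lan", NA)

# '^[a-z]' is TRUE only when the first character is an ASCII lowercase letter
ascii_lower <- str_detect(titles, "^[a-z]")

# '^\\p{Ll}' accepts any Unicode lowercase letter as the first character
unicode_lower <- str_detect(titles, "^\\p{Ll}")

data.frame(titles, ascii_lower, unicode_lower)
```

Titles that start with a digit or an accented letter land in the "not lowercase" group under the ASCII pattern, and a missing title gives `NA`, which is why the `NA` filter comes first.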

Some artists have put out several tracks (whole albums, in some cases) with lowercase track names.

lower %>% 
  group_by(track_artist) %>% 
  summarise(lowercase_songs = n()) %>% 
  arrange(desc(lowercase_songs))

## # A tibble: 181 x 2
##    track_artist    lowercase_songs
##    <chr>                     <int>
##  1 Billie Eilish                20
##  2 Ariana Grande                19
##  3 blackbear                     9
##  4 joan                          7
##  5 Musiq Soulchild               6
##  6 tha Supreme                   6
##  7 chillwagon                    5
##  8 iann dior                     5
##  9 schafter                      5
## 10 flor                          4
## # … with 171 more rows

# Draw the uppercase tracks first so the lowercase ones are plotted on top
ggplot(data = upper, aes(track_album_release_date, duration_ms, color = lowercase)) +
  geom_point(alpha = 0.8) +
  geom_point(data = lower, alpha = 0.8) +
  
  scale_color_manual(values = c("#191414", "#1DB954")) +
  scale_y_continuous(labels = label_number(scale = 1/1000)) +
  
  labs(title = "Lowercase Track Titles Are a New Thing",
       subtitle = "Song length, by whether the track title starts with a lowercase letter",
       color = "Lowercase?",
       x = "Release Date",
       y = "Song Length (seconds)",
       caption = plot_caption)

  • Using lowercase in a track title is a relatively new phenomenon, and it may be a way to stand out from the crowd.
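
One way to put a number on "relatively new" would be the share of lowercase titles per release year. A minimal sketch, using a toy table with invented rows in place of the real `spotify` data frame (the column names match the ones above; `lowercase_by_year` and `share_lowercase` are names made up for this example):

```r
library(dplyr)

# Toy rows, invented for illustration only; not taken from the Spotify data
toy <- tibble(
  track_album_release_date = as.Date(c("1975-05-01", "1998-03-10",
                                       "2018-08-17", "2019-03-29",
                                       "2019-11-01")),
  lowercase = c(FALSE, FALSE, TRUE, TRUE, FALSE)
)

# The mean of a logical column is the fraction of TRUE values,
# so this gives the share of lowercase titles in each release year
lowercase_by_year <- toy %>%
  mutate(year = as.integer(format(track_album_release_date, "%Y"))) %>%
  group_by(year) %>%
  summarise(songs = n(),
            share_lowercase = mean(lowercase))

lowercase_by_year
```

Swapping `toy` for `spotify` should give the same summary on the real tracks. Years with only a handful of songs will produce noisy shares, so the `songs` column is kept alongside.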

Image Credit

music notes by Zach Bogart from the Noun Project

Zach Bogart
Data Explorer

Science, Design, & Data. I’ll know it when I see it.