In this project, our group set out to obtain quantitative measures of the moral concerns expressed in rap lyrics in order to gain insight into the moral dimensions of rap music and hip-hop subcultures as a whole. This document lays out how we obtained these measures and what we learned from the results.

Background: How can we computationally analyze morality of rap lyrics?

How do we understand human morals in the first place?

The leading framework for understanding moral attitudes in humans is known as Moral Foundations Theory (MFT). It asserts that the moral concerns of individuals can be distilled into five basic categories, known as “moral foundations.” It is thought that each foundation played a key role in human evolution. Different individuals and cultures often identify with some of these categories more strongly than others, and the world’s diverse set of moral belief systems is built from these differing affinities with each of the moral foundations. Each foundation has a “virtue” and a “vice” category, which describe beliefs that are positively and negatively associated with the core foundation, respectively. The foundations are as follows (definitions taken from https://moralfoundations.org, in the format Virtue/Vice: Definition):

  1. Care/Harm: This foundation is related to our long evolution as mammals with attachment systems and an ability to feel (and dislike) the pain of others. It underlies virtues of kindness, gentleness, and nurturance.
  2. Fairness/Cheating: This foundation is related to the evolutionary process of reciprocal altruism. It generates ideas of justice, rights, and autonomy.
  3. Loyalty/Betrayal: This foundation is related to our long history as tribal creatures able to form shifting coalitions. It underlies virtues of patriotism and self-sacrifice for the group. It is active anytime people feel that it’s “one for all, and all for one.”
  4. Authority/Subversion: This foundation was shaped by our long primate history of hierarchical social interactions. It underlies virtues of leadership and followership, including deference to legitimate authority and respect for traditions.
  5. Sanctity/Degradation: This foundation was shaped by the psychology of disgust and contamination. It underlies religious notions of striving to live in an elevated, less carnal, more noble way. It underlies the widespread idea that the body is a temple which can be desecrated by immoral activities and contaminants (an idea not unique to religious traditions).

Research surrounding this framework is most often applied to American political subcultures - for example, there is strong evidence that liberals identify more strongly with concerns of Care/Harm, whereas conservatives identify more strongly with Sanctity/Degradation, among other differences. Furthermore, there is some evidence that persuasion is more likely to succeed when it uses rhetoric that appeals to the moral concerns of the individual or group being persuaded. For example, conservatives tend to be more receptive to arguments for environmental regulation if the argument is framed in terms of sanctity rather than care.

How do we identify these moral concerns in language?

Computationally understanding the true moral message of a piece of text is a difficult task, and it is still an active area of research. We use a dictionary-based method for our analysis. The dictionary is the second version of the “Moral Foundations Dictionary” (MFD2) (Frimer et al., 2019). It contains a list of 2104 words or phrases, categorized into the vice and virtue dimensions of each moral foundation.

Table 1: sample list of words for each category in MFD2

Our strategy is to count the dictionary words in each category within each song, album, or artist in our sample of rap music, then divide each count by the total number of lines. The result is an average score of “moral language use” for each category: the average number of words in that category per line of rap music. For example, if Kanye West’s album JESUS IS KING has a score of 0.3 in the sanctity category, then that album averages 0.3 sanctity words per line. From here we can compare the scores for each foundation within different subsets of our whole sample, hopefully gaining a better understanding of which moral foundations are important to rap artists.
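
To make the arithmetic concrete, here is a minimal sketch of the words-per-line score on made-up data. The tiny dictionary, its category labels, and the four lines are all hypothetical stand-ins (not taken from MFD2 or from our lyrics dataset), and it uses only base R so it runs on its own.

```r
# Toy stand-in for MFD2: a handful of words with hypothetical category labels
toy_dictionary <- data.frame(
  word = c("holy", "pray", "dirty", "love"),
  category = c("sanctity_virtue", "sanctity_virtue", "sanctity_vice", "care_virtue"),
  stringsAsFactors = FALSE
)

# Four made-up lines standing in for one song
toy_lines <- c("pray for me", "holy holy water", "dirty money", "we ride all night")

# Split each line into lowercase words
tokens <- strsplit(tolower(toy_lines), "[^a-z']+")

# For one category: count matching words across all lines, then divide by the number of lines
words_per_line <- function(category) {
  category_words <- toy_dictionary$word[toy_dictionary$category == category]
  matches_per_line <- vapply(tokens, function(line_words) sum(line_words %in% category_words), integer(1))
  sum(matches_per_line) / length(toy_lines)
}

sanctity_score <- words_per_line("sanctity_virtue")    # 3 matching words over 4 lines
degradation_score <- words_per_line("sanctity_vice")   # 1 matching word over 4 lines
```

Note that a repeated word counts every time it appears ("holy holy" adds 2), which is the same behavior as the scoring used in the rest of this analysis.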

Now that we’ve described what we’re actually doing in this analysis, let’s start off our first bit of code, which is to load in the MFD2. (You can download MFD2 here)

#first things first, include all the relevant libraries for the analysis
#plyr is loaded before dplyr so that plyr does not mask dplyr's verbs
library(plyr)
library(data.table)
library(dplyr)
library(tidytext)
library(tidyr)
library(ggplot2)
library(textdata)
library(ggrepel)
library(forcats)
library(sjmisc)
library(stringr)
#declare a useful variable we'll need later
foundations <- c("Care", "Harm", "Fairness", "Cheating",
                 "Loyalty", "Betrayal", "Authority", "Subversion",
                 "Sanctity", "Degradation")

#read in the dictionary. Note that the file path will be different on your system
mfd2 <- fread("../CA21/mfd2.0.dic", sep="\t", header = FALSE)
colnames(mfd2) <- c("word", "value")
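
The numeric codes in the value column are easier to read once they are translated into category names. The sketch below assumes the codes 1 to 10 follow the same order that our scoring code further down uses (care virtue = 1 through sanctity vice = 10). The three example rows are made up for illustration and are not copied from the dictionary file.

```r
# Category names for codes 1-10, in the order assumed by the scoring code in this document
mfd2_categories <- c("care_virtue", "care_vice", "fairness_virtue", "fairness_vice",
                     "loyalty_virtue", "loyalty_vice", "authority_virtue", "authority_vice",
                     "sanctity_virtue", "sanctity_vice")

# Toy rows in the same two-column shape as mfd2 (hypothetical words and codes)
toy_mfd2 <- data.frame(word = c("kindness", "cheat", "obey"), value = c(1, 4, 7),
                       stringsAsFactors = FALSE)

# Translate each numeric code into a readable category name
toy_mfd2$category <- mfd2_categories[toy_mfd2$value]
```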

Why rap music?

Hip-hop music influences many young listeners, in part because of the glamorous lifestyle that rappers portray, which makes these artists an important part of society today. Now that social media is so prominent and gives rappers a platform to display their lifestyle, it is even easier for listeners to get a view into their lives, at least through images.

Rap music is important to analyze because listeners should have the opportunity to understand why some of their favorite artists speak the way they do and use such harsh lyrics, and what propels them to do so, given how easily their celebrity status lets them influence others. Researching an artist’s personal life may provide the answer many listeners are looking for, although it may not. Still, studies show that certain artists have been through a number of traumatic events, and looking at the lyrics they wrote after those events can give a sense of why those lyrics show so much aggression, betrayal, degradation, etc.

The moral concerns of rappers are not well studied, and plenty of moral panics in society have been created surrounding rap music. Moral Foundations Theory provides a good framework for assessing the true moral concerns of rap artists, as it doesn’t frame things as simply “moral” or “immoral,” offering a more nuanced view of morality than many might expect. Nzinga and Medin (2018) provide a good summary of previous attempts to study the morality of rap music, in addition to a survey of the moral concerns of rap listeners. However, the moral concerns of rappers as a population, and how these concerns can be identified in their lyrics, have not been thoroughly investigated. Understanding the moral concerns of both rappers and their listeners may provide comprehensive insight into how hip-hop music influences the morals of young listeners of the genre.

Another reason to computationally investigate the moral language in rap lyrics is that there is no published research indicating that current methods of moral text analysis have been applied to a similar set of text. Most research in moral text analysis focuses on social media posts - particularly those of a political nature. Applying these methods to a subset of text we know well could signal to us whether or not these methods hold up when tested on a different population.

Before doing our analysis, we hypothesized that the moral “vices” of harm, degradation, and subversion would show up frequently in rap lyrics, and that the moral “virtues” of ingroup loyalty and fairness would also be prominent.

Methods of Data Collection

We selected a sample of 254 contemporary rap albums that we felt represented the genre well - including albums with widespread recognition, albums with critical acclaim, and albums that are particularly relevant to Hip-Hop Heads as a subculture. In order to stay relatively consistent in our collection methods, as well as control for any significant changes in language use throughout the years, we chose to only include albums published in 2015 or later (with the exception of Acid Rap by Chance the Rapper and Faces by Mac Miller. Why? Because I said so).

Figure 1: Breakdown of rap albums included in sample by year (not including 2013 or 2014)

The lyrics of the sample were scraped from http://genius.com. The R packages for scraping Genius lyrics are currently not working, so we had to complete this step using the Python package LyricsGenius. As a consequence of this, we cannot include the code for getting the lyrics. However, you can download the dataset collected from the lyrics here.

The dataset is a table of lines from individual rap songs, with information on the album, artist, and release year of each line. We also obtained the number of clicks the Genius page for each song received, as a measure of the relative popularity of each song. From there, we could use the previously mentioned methods to obtain statistics of average words per line in each category for each artist, album, or song.

#let's load in our lyrics dataset and assign scores associated with each moral foundation to them
all_lyrics_table <- fread("all_lyrics_table_revised.txt")
album_lines <- data.frame(all_lyrics_table, "reference" = seq_len(nrow(all_lyrics_table)))

#count words in the dataset
album_lines_counted <- album_lines %>%
  unnest_tokens(word, Lyric) %>%
  dplyr::count(reference, word, sort = TRUE)%>%
  dplyr::rename(per_line = n)

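#assign scores from MFD2 to each word in the dataset
#(a left join keeps every lyric word and leaves out dictionary entries that never appear in the lyrics)
albums_with_scores <- left_join(album_lines_counted, mfd2, by="word")

#words without a value get a score of 0
albums_with_scores$value[is.na(albums_with_scores$value)] <- 0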

#create columns for each of the categories in MFD2, in the order of the category codes (1-10)
mfd2_columns <- c("care_virtue", "care_vice", "fairness_virtue", "fairness_vice",
                  "loyalty_virtue", "loyalty_vice", "authority_virtue", "authority_vice",
                  "sanctity_virtue", "sanctity_vice")

#assign values of words to their proper category: a word's per-line count goes in the
#column matching its category code, and every other column gets 0
for(i in seq_along(mfd2_columns)){
  albums_with_scores[[mfd2_columns[i]]] <- ifelse(albums_with_scores$value == i,
                                                  albums_with_scores$per_line, 0)
}

#Now we can count up the moral words in each line for each category
album_per_line<-albums_with_scores%>%
  group_by(reference)%>%
  dplyr::summarize(care_virtue_value=sum(care_virtue), care_virtue_var=sd(care_virtue),
            care_vice_value=sum(care_vice), care_vice_var=sd(care_vice),
            fairness_virtue_value=sum(fairness_virtue), fairness_virtue_var=sd(fairness_virtue),
            fairness_vice_value=sum(fairness_vice), fairness_vice_var=sd(fairness_vice),
            loyalty_virtue_value=sum(loyalty_virtue), loyalty_virtue_var=sd(loyalty_virtue),
            loyalty_vice_value=sum(loyalty_vice), loyalty_vice_var=sd(loyalty_vice),
            authority_virtue_value=sum(authority_virtue), authority_virtue_var=sd(authority_virtue),
            authority_vice_value=sum(authority_vice), authority_vice_var=sd(authority_vice),
            sanctity_virtue_value=sum(sanctity_virtue), sanctity_virtue_var=sd(sanctity_virtue),
            sanctity_vice_value=sum(sanctity_vice), sanctity_vice_var=sd(sanctity_vice))

#get dataframe with full information on the lines
album_morals <- inner_join(album_lines, album_per_line, by="reference")

#Now we can use the following code to get information of average moral language use per track, album, and artist
normalized_albums <- album_morals %>% 
  group_by(Album) %>% 
  dplyr::summarize(normalized_care_virtue=sum(care_virtue_value) / n(), normalized_care_vice = sum(care_vice_value) / n(),
            normalized_fairness_virtue=sum(fairness_virtue_value) / n(), normalized_fairness_vice = sum(fairness_vice_value) / n(),
            normalized_loyalty_virtue=sum(loyalty_virtue_value) / n(), normalized_loyalty_vice=sum(loyalty_vice_value) / n(),
            normalized_authority_virtue=sum(authority_virtue_value) / n(), normalized_authority_vice=sum(authority_vice_value) / n(),
            normalized_sanctity_virtue=sum(sanctity_virtue_value) / n(), normalized_sanctity_vice=sum(sanctity_vice_value) / n(), Artist=Artist,
            Year=Year, num_tracks=n_distinct(Track), #count distinct tracks; n() would count lines instead
            avg_page_views = sum(Page_views) / n()) %>% ungroup()
## `summarise()` has grouped output by 'Album'. You can override using the `.groups` argument.
normalized_albums <- distinct(normalized_albums)

normalized_tracks <- album_morals %>% 
  group_by(Track) %>% 
  dplyr::summarize(normalized_care_virtue=sum(care_virtue_value) / n(), normalized_care_vice = sum(care_vice_value) / n(),
            normalized_fairness_virtue=sum(fairness_virtue_value) / n(), normalized_fairness_vice = sum(fairness_vice_value) / n(),
            normalized_loyalty_virtue=sum(loyalty_virtue_value) / n(), normalized_loyalty_vice=sum(loyalty_vice_value) / n(),
            normalized_authority_virtue=sum(authority_virtue_value) / n(), normalized_authority_vice=sum(authority_vice_value) / n(),
            normalized_sanctity_virtue=sum(sanctity_virtue_value) / n(), normalized_sanctity_vice=sum(sanctity_vice_value) / n(), Album=Album,
            Year=Year, Has_features = Has_features, Featured_artists=Featured_artists, Page_views = Page_views,
            Artist = Artist) %>% 
  ungroup()
## `summarise()` has grouped output by 'Track'. You can override using the `.groups` argument.
normalized_tracks <- distinct(normalized_tracks)

normalized_artists <- album_morals %>% 
  group_by(Artist) %>% 
  dplyr::summarize(normalized_care_virtue=sum(care_virtue_value) / n(), normalized_care_vice = sum(care_vice_value) / n(),
            normalized_fairness_virtue=sum(fairness_virtue_value) / n(), normalized_fairness_vice = sum(fairness_vice_value) / n(),
            normalized_loyalty_virtue=sum(loyalty_virtue_value) / n(), normalized_loyalty_vice=sum(loyalty_vice_value) / n(),
            normalized_authority_virtue=sum(authority_virtue_value) / n(), normalized_authority_vice=sum(authority_vice_value) / n(),
            normalized_sanctity_virtue=sum(sanctity_virtue_value) / n(), normalized_sanctity_vice=sum(sanctity_vice_value) / n()) %>% 
  ungroup()

normalized_artists <- distinct(normalized_artists)
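
The three blocks above repeat the same sum-divided-by-n() pattern for each grouping. As a sketch of a more compact alternative, the helper below (normalize_by is a hypothetical name, not part of our pipeline) averages every column ending in _value for whichever grouping column it is given. It assumes dplyr 1.0 or later for across() and rename_with(), and unlike the code above it does not carry along extra columns such as Artist or Year. The toy table stands in for album_morals.

```r
library(dplyr)

# Toy stand-in for album_morals: one row per line, with per-line counts for two categories
toy_morals <- data.frame(
  Album = c("A", "A", "A", "B", "B"),
  care_virtue_value = c(1, 0, 2, 0, 0),
  care_vice_value = c(0, 1, 0, 0, 4)
)

# Hypothetical helper: mean of every *_value column per group (same as sum / n()),
# with the columns renamed from <category>_value to normalized_<category>
normalize_by <- function(data, group_column) {
  data %>%
    group_by(across(all_of(group_column))) %>%
    dplyr::summarize(across(ends_with("_value"), mean), .groups = "drop") %>%
    rename_with(~ paste0("normalized_", sub("_value$", "", .x)), ends_with("_value"))
}

toy_normalized_albums <- normalize_by(toy_morals, "Album")
```

The same call with "Track" or "Artist" as the second argument would give the per-track and per-artist versions.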

Results

Now we can use the data we collected to finally get a good look at the moral language use of the rappers in our sample. Let’s start with a bird’s eye view plot of scores for each moral language category for each album:

longer_normalized_albums <- normalized_albums %>% 
  pivot_longer(cols=starts_with("normalized"), names_to="Foundation", values_to="Words_per_line")

longer_normalized_albums %>% 
  mutate(Foundation=fct_relevel(Foundation, "normalized_care_virtue", "normalized_care_vice", "normalized_fairness_virtue", "normalized_fairness_vice", "normalized_loyalty_virtue",
                                "normalized_loyalty_vice", "normalized_authority_virtue", "normalized_authority_vice",
                                "normalized_sanctity_virtue", "normalized_sanctity_vice")) %>% 
  ggplot(aes(Foundation, Words_per_line, colour=Foundation))+
  geom_jitter(show.legend = FALSE)+
  labs(x="Foundation", y="Words per line",
       title="Language use associated with moral foundations in rap albums")+
  theme(plot.title = element_text(size=20, hjust=0.5), axis.title = element_text(size=20), axis.text.x = element_text(size=10), strip.text.x = element_text(size=15))+
  scale_x_discrete(labels=foundations)

We can see from the plot that the most frequently expressed moral concern in the sampled rap albums is degradation, followed by sanctity, care, and harm. When we break down the averages year-by-year, here’s what we see:

#get dataframe for average values by year
summary_by_year <- normalized_albums %>% 
  group_by(Year) %>% 
  dplyr::summarize(avg_care_virtue=sum(normalized_care_virtue) / n(), avg_care_vice = sum(normalized_care_vice) / n(),
            avg_fairness_virtue=sum(normalized_fairness_virtue) / n(), avg_fairness_vice = sum(normalized_fairness_vice) / n(),
            avg_loyalty_virtue=sum(normalized_loyalty_virtue) / n(), avg_loyalty_vice=sum(normalized_loyalty_vice) / n(),
            avg_authority_virtue=sum(normalized_authority_virtue) / n(), avg_authority_vice=sum(normalized_authority_vice) / n(),
            avg_sanctity_virtue=sum(normalized_sanctity_virtue) / n(), avg_sanctity_vice=sum(normalized_sanctity_vice) / n()) %>% 
  ungroup()


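#remove 2013 and 2014 because there is only one sample from each year
#(filtering on Year instead of row position keeps this correct if the row order changes)
summary_by_year_subset <- summary_by_year %>% dplyr::filter(Year >= 2015)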

#pivot the bad boy
longer_by_year <- summary_by_year_subset %>% 
  pivot_longer(cols=starts_with("avg"), names_to="Foundation", values_to="Average_score")

#plot the bad boy
longer_by_year %>% 
  mutate(Foundation=fct_relevel(Foundation, "avg_care_virtue", "avg_care_vice",
                                "avg_fairness_virtue", "avg_fairness_vice", "avg_loyalty_virtue",
                                "avg_loyalty_vice", "avg_authority_virtue", "avg_authority_vice",
                                "avg_sanctity_virtue", "avg_sanctity_vice")) %>% 
  ggplot(aes(as.factor(Year), Average_score, colour=Foundation, group=Foundation))+
  geom_line(size=2)+
  labs(x="Year", y="Average Score",
       title="Language use associated with moral foundations in rap albums by year")+
  scale_colour_discrete(labels=foundations)

From the time series, we see that there is virtually no variation in the overall hierarchy of moral language use throughout the years. The degradation concern changes in its value a bit, but the way the concerns are ordered is consistent - degradation is highest, followed (after a sizeable leap) by sanctity, care, and harm. The rest of the concerns are mildly to barely expressed. Why do we see this pattern? Let’s zoom in on some of these foundations…

A closer look at the most prominent foundations

Degradation

Degradation is the most commonly appearing form of moral language. This isn’t surprising - words relating to sex and drug use appear in this category, as well as expletives, which are common in rap music. The highest scoring song in this category is “Dirty” by Shoreline Mafia - the repetition of the word “dirty” is probably the biggest influence on the score. Here is a snippet of the lyrics:

Chorus of “Dirty” by Shoreline Mafia

The lyrics of this song are strongly representative of a broader set of activities commonly portrayed in rap music, which would imply that living in a sanctified way is not important to the rappers. The song seems to be performed in a way that implies the artists take pride in a lifestyle that runs counter to the moral foundation of sanctity.

An example of a song that scores high in this category that might suggest a breakdown in the strength of the model is “Keep the devil Off” by Big K.R.I.T.

Lyrics to the chorus of “Keep the devil Off” by Big K.R.I.T.

In a 2017 NPR interview about the album this song comes from, Big K.R.I.T. stated the song is “just warning you to keep the negativity away.” This provides insight into the fact that the results of this dictionary method can’t always be trusted - it’s possible an artist uses language from one category of moral language without strongly identifying with that dimension of the moral foundation, even within the message of the song.

Sanctity

Sanctity is the moral foundation that is strongly (although not necessarily) associated with religion, which is a common theme in rap music. Most of the time, if a song scores high in the sanctity category, it has to do with Christianity. Take “Jesus Is Lord” by Kanye West - the highest scoring song in this category:

Lyrics to “Jesus Is Lord” by Kanye West

Part of the high score in this song can be attributed to its length - this is the only verse, and it isn’t very lyrically substantive, so the few mentions of Jesus have a huge impact on the overall score. However, both the high score and the lyrics accurately represent Kanye’s greater attitude during the time this was recorded - his relationship with Christianity, although always a theme in his music, was becoming a more prominent part of his public image. He seemed committed to a lifestyle of “pure” activities, including prohibiting people working on his album, JESUS IS KING, from having pre-marital sex. JESUS IS KING is also free of swear words.

While there are no direct contradictions to the assumption that sanctity is important to the rappers who make songs that score highly in the category, a song with a more nuanced meaning that scores highly in this category is “Holy Ghost” by A$AP Rocky:

Snippet of the first verse of “Holy Ghost” by A$AP Rocky

This song is a criticism of organized religion. Rocky comments on the hypocrisy of the Church, drawing parallels between religious imagery and imagery of a contradictory lifestyle. A close lyrical analysis of the song would lead to the conclusion that sanctity is important to A$AP Rocky (note that MFT draws an important distinction between the foundation of sanctity and organized religion). Even so, it’s easy to see from this example how a song could use religious imagery to criticize the idea of sanctity in general and still be scored highly in the sanctity category by the MFD2.

The high scores in both sanctity and degradation across our entire sample make the question of how important sanctity is to rap artists a confusing one to answer. It’s certainly possible to be religious without actually caring too much about sanctity - that’s part of what A$AP Rocky is calling out in “Holy Ghost.” This makes us think that the sanctity category of the MFD2 works more as a “religion detector” (one that is highly biased towards western religion, at that) than an indicator of the overarching concern of sanctity.

Care

The foundation of care shows up most commonly in songs about love. Take the highest-scoring song in this category - “Swizz Beatz” by Young Thug:

Lyrics to the chorus of “Swizz Beatz” by Young Thug

However, the verses muddy the central theme of this song, and it doesn’t seem to reveal much about how much Thugger values things like kindness or gentleness:

Lyrics to the second verse of “Swizz Beatz” by Young Thug

The next highest-scoring song in this category is “My Thoughts On Neogaf Dying” by JPEGMAFIA, which easily contradicts its own score:

Chorus to “My Thoughts On Neogaf Dying” by JPEGMAFIA

Both of these songs raise an important question - how much does repetition of the same word or phrase matter in terms of detecting an artist’s moral concerns? This is something that probably isn’t an issue when these methods are applied to things like social media posts, but repetition is a key part of music. Does a phrase that’s repeated 8 times in the same context really say 8 times more about the moral concerns of the artist than a phrase that is said once in its own context? It is undoubtedly more important, but is its importance directly proportional to the number of times it’s repeated?
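
One way to explore this question would be to compare the raw score against scores that discount repetition. The sketch below is not something we ran on our dataset. It uses a made-up song and a one-word stand-in for the care word list, and shows two possible alternatives: scoring each distinct line only once, and dampening the total count with log(1 + count). Both alternatives are our own illustrative choices, not established methods for this dictionary.

```r
# Made-up song: the same line repeated eight times, plus one different line
toy_song <- c(rep("love love love", 8), "i love my city")
care_words <- c("love")  # hypothetical one-word stand-in for the MFD2 care list

# Count care words in a single line
count_care <- function(line) {
  line_words <- strsplit(tolower(line), "[^a-z']+")[[1]]
  sum(line_words %in% care_words)
}

# Scheme used in this analysis: every repeated line counts in full
total_care_words <- sum(vapply(toy_song, count_care, integer(1)))
raw_score <- total_care_words / length(toy_song)

# Alternative 1: score each distinct line only once
distinct_lines <- unique(toy_song)
distinct_score <- sum(vapply(distinct_lines, count_care, integer(1))) / length(distinct_lines)

# Alternative 2: keep repeats but dampen the total with log(1 + count)
dampened_score <- log1p(total_care_words) / length(toy_song)
```

Comparing how song rankings change under each scheme would show how much of a high score comes from a repeated hook rather than from varied moral language.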

Harm

Violence is a very common theme in rap music - so it’s not surprising to see that this category appears commonly. However, looking at the MFD2, it seems that analysis of this category could be vastly improved by including words that are more relevant to the population of interest - references to violence in rap music often include references to specific types of weaponry that aren’t included in the MFD2, for example. Here’s an example of a song that scores high in this category - “A Report to the Shareholders / Kill Your Masters” by Run The Jewels (again, this is given a high score because of its repetition):

Lyrics to the chorus of “Kill Your Masters” by Run The Jewels

It should be noted that this song is strongly anti-authority, which leads into our next section…

Where is the disrespect for authority?

Far and away the most surprising result of this analysis is how low the scores in the subversion category are. Subversion is undoubtedly an important aspect of hip-hop culture, with songs like “Fight the Power” by Public Enemy and “Fuck Tha Police” by N.W.A. being popular songs that influenced the message of political hip-hop songs for years to come. A more recent example of a subversive hip-hop song is “FDT” by YG. Furthermore, a study that investigated moral language use in tweets following the killing of George Floyd found that subversion was the most commonly used category (although they used the extended Moral Foundations Dictionary instead, which includes weighted values for each of the words) (Priniski et al., 2021). So why is it, then, that subversion is the least commonly expressed category of moral language in rap music?

In order to gain more insight into why this might be, we chose a sample of protest songs that have a message of subversion to authority to see if MFD2 picks up their messages any differently from our original sample. This sample primarily consists of rap music inspired by the Black Lives Matter movement, but some older cuts such as “Fuck Tha Police” are included as well. We then calculated the average authority/subversion values for the protest songs and compared them to the average of all the songs in our original sample. The protest dataset can be found here. Some steps will be skipped here, as they are identical to the steps followed to process the original dataset of albums.

#load in processed dataset
protest_tracks_normalized <- read.csv("protest_tracks_normalized.csv")

#create dataframe of important values to plot - we're working with few numbers so it's easier to do this manually
avg_protest_authority <- mean(protest_tracks_normalized$normalized_authority_virtue)
avg_protest_subversion <- mean(protest_tracks_normalized$normalized_authority_vice)

avg_album_authority <- mean(normalized_albums$normalized_authority_virtue)
avg_album_subversion <- mean(normalized_albums$normalized_authority_vice)

groups <- c("Protest Songs", "Protest Songs", "Albums", "Albums")
authority_foundations <- c("Authority", "Subversion", "Authority", "Subversion")

authority_values <- c(avg_protest_authority, avg_protest_subversion, avg_album_authority, avg_album_subversion)

protest_df <- data.frame(authority_values, groups, authority_foundations)

ggplot(protest_df, aes(groups, authority_values, fill=authority_foundations))+
  geom_col(position="dodge")+
  labs(x="Sample", y="Mean Score", fill="Dimension",
       title="Language related to Authority/Subversion concerns in the original sample vs. protest songs")+
  theme(plot.title=element_text(size=15, hjust=0.5), axis.title=element_text(size=15))

We can see from this plot a large increase in authority-category scores for songs that are specifically anti-authority. A big reason for this is that the words “police” and “slave” are both included in the authority category of MFD2. It makes sense, then, that “Fuck Tha Police” by N.W.A. and “FTP” by YG have the highest values in the authority category out of all the protest songs. Remember “Kill Your Masters” from the last section? That song also scores very high in the authority category, but not so much in subversion.

Overall, it seems that the MFD2 struggles to pick up the nuances of the Authority/Subversion category, at least in rap music. One way to improve it for this population would be to include derogatory terms for authority figures, such as the word “pigs,” which is commonly used to refer to police.
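
As a rough sketch of what such an extension could look like, the code below appends a custom term to a toy table shaped like our mfd2 table (a word column and a numeric value column). It assumes code 8 is the subversion category, matching the order used in our scoring code. The toy rows are illustrative only, and any real addition would need checking against the lyrics, since a word like “pigs” is not always about police.

```r
# Toy stand-in for the two-column mfd2 table (word, numeric category code)
toy_mfd2 <- data.frame(word = c("police", "obey"), value = c(7, 7),
                       stringsAsFactors = FALSE)

# Hypothetical addition: slang for authority figures, coded 8 (subversion in our scoring order)
custom_terms <- data.frame(word = c("pigs"), value = c(8),
                           stringsAsFactors = FALSE)

# Append the custom terms, then drop exact duplicates in case a term was already listed
extended_mfd2 <- rbind(toy_mfd2, custom_terms)
extended_mfd2 <- extended_mfd2[!duplicated(extended_mfd2), ]
```

The extended table could then be joined to the tokenized lyrics in place of the original dictionary.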

Do artists use moral language differentially when including features on their song (compared to their solo work)?

One potentially insightful piece of information in our dataset is whether each song has a feature. This raises the question of whether moral language use differs when artists include features in their songs compared to their solo work. Our hypothesis is that there could be some interesting changes, since more introspective rap cuts tend to be performed solo.

To investigate this question, we can use our dataframe grouped by artists and calculate the difference in average score between solo and featured songs for each moral category.
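
Before the full version, here is the idea on a toy table with a single moral category, written in base R so it runs on its own. The data are made up. The column names Artist, Has_features, and care_virtue_value mirror our dataset, while the names featured, solo, and difference are just for this illustration.

```r
# Toy stand-in for album_morals: one row per line, one moral category
toy <- data.frame(
  Artist = c("X", "X", "X", "X", "Y", "Y", "Y"),
  Has_features = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
  care_virtue_value = c(1, 3, 0, 1, 0, 2, 2),
  stringsAsFactors = FALSE
)

# Mean words per line for each artist, separately for featured and solo songs
featured_means <- tapply(toy$care_virtue_value[toy$Has_features],
                         toy$Artist[toy$Has_features], mean)
solo_means <- tapply(toy$care_virtue_value[!toy$Has_features],
                     toy$Artist[!toy$Has_features], mean)

# Line the two up by artist and take the difference (featured minus solo);
# an artist with no songs in one group would get NA for that group
artists <- sort(unique(toy$Artist))
comparison <- data.frame(Artist = artists,
                         featured = as.numeric(featured_means[artists]),
                         solo = as.numeric(solo_means[artists]))
comparison$difference <- comparison$featured - comparison$solo
```

A positive difference means the artist uses more language from that category on songs with features, and a negative one means more on solo songs.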

#Let's make a dataset that calculates averages for featured/unfeatured songs by artist
normalized_artist_featured_songs <- album_morals %>% 
  group_by(Artist) %>% 
  dplyr::summarize(avg_feature_care_virtue=sum(care_virtue_value[Has_features == TRUE]) / sum(Has_features == TRUE), avg_feature_care_vice=sum(care_vice_value[Has_features == TRUE]) / sum(Has_features == TRUE),
            avg_feature_fairness_virtue=sum(fairness_virtue_value[Has_features == TRUE]) / sum(Has_features == TRUE), avg_feature_fairness_vice=sum(fairness_vice_value[Has_features == TRUE]) / sum(Has_features == TRUE),
            avg_feature_loyalty_virtue=sum(loyalty_virtue_value[Has_features == TRUE]) / sum(Has_features == TRUE), avg_feature_loyalty_vice=sum(loyalty_vice_value[Has_features == TRUE]) / sum(Has_features == TRUE),
            avg_feature_authority_virtue=sum(authority_virtue_value[Has_features == TRUE]) / sum(Has_features == TRUE),avg_feature_authority_vice=sum(authority_vice_value[Has_features == TRUE]) / sum(Has_features == TRUE),
            avg_feature_sanctity_virtue=sum(sanctity_virtue_value[Has_features == TRUE]) / sum(Has_features == TRUE), avg_feature_sanctity_vice=sum(sanctity_vice_value[Has_features == TRUE]) / sum(Has_features == TRUE),
            avg_solo_care_virtue=sum(care_virtue_value[Has_features == FALSE]) / sum(Has_features == FALSE), avg_solo_care_vice=sum(care_vice_value[Has_features == FALSE]) / sum(Has_features == FALSE),
            avg_solo_fairness_virtue=sum(fairness_virtue_value[Has_features == FALSE]) / sum(Has_features == FALSE), avg_solo_fairness_vice=sum(fairness_vice_value[Has_features == FALSE]) / sum(Has_features == FALSE),
            avg_solo_loyalty_virtue=sum(loyalty_virtue_value[Has_features == FALSE]) / sum(Has_features == FALSE), avg_solo_loyalty_vice=sum(loyalty_vice_value[Has_features == FALSE]) / sum(Has_features == FALSE),
            avg_solo_authority_virtue=sum(authority_virtue_value[Has_features == FALSE]) / sum(Has_features == FALSE),avg_solo_authority_vice=sum(authority_vice_value[Has_features == FALSE]) / sum(Has_features == FALSE),
            avg_solo_sanctity_virtue=sum(sanctity_virtue_value[Has_features == FALSE]) / sum(Has_features == FALSE), avg_solo_sanctity_vice=sum(sanctity_vice_value[Has_features == FALSE]) / sum(Has_features == FALSE),
            average_page_views = sum(Page_views) / n()) %>% 
  ungroup()

#lengthen the bad boy
longer_features <- normalized_artist_featured_songs %>% 
  pivot_longer(cols=starts_with("avg"), names_to = "Foundation", values_to = "Average_score")

#collapse foundations back to the core 10 categories and let feature/solo be its own column
graph_features <- longer_features %>% 
  dplyr::mutate(has_feature = str_detect(Foundation, "feature"),
         #strip the "avg_feature_" / "avg_solo_" prefix, leaving e.g. "care_virtue"
         overall_foundation = str_remove(Foundation, "^avg_(feature|solo)_"))

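#calculate differences (solo minus feature): within each group the feature row
#comes before the solo row, so diff() returns solo - feature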
feature_differences <- graph_features %>% 
  group_by(Artist, overall_foundation) %>% 
  dplyr::summarize(difference=diff(Average_score), average_page_views=average_page_views) %>% 
  ungroup()
## `summarise()` has grouped output by 'Artist', 'overall_foundation'. You can override using the `.groups` argument.
feature_differences <- distinct(feature_differences)

feature_differences$overall_foundation <- as.factor(feature_differences$overall_foundation)

#PLOT the bad boy!
feature_differences %>% 
  mutate(overall_foundation=fct_relevel(overall_foundation, "care_virtue", "care_vice", "fairness_virtue",
                                        "fairness_vice", "loyalty_virtue", "loyalty_vice", "authority_virtue",
                                        "authority_vice", "sanctity_virtue", "sanctity_vice")) %>% 
  ggplot(aes(overall_foundation, difference, colour=overall_foundation, size=average_page_views))+
  geom_jitter()+
  guides(colour="none")+
  labs(x="Foundation", y="Difference in moral language use",
       title="Difference in rap artists' moral language use performing solo vs with features",
       size="Average Page Views")+
  scale_x_discrete(labels=foundations)+
  theme(plot.title = element_text(size=20, hjust=0.5), axis.title = element_text(size=20),
        axis.text.x = element_text(size=10, angle=45), strip.text.x = element_text(size=15))
## Warning: Removed 130 rows containing missing values (geom_point).

In the graph output, positive values represent artists that use more moral language of a given category in their solo work rather than their songs with features. We can see that there’s no convincing general pattern either way, and the outliers in the graph that look interesting are usually due to small sample sizes (i.e. an artist that has very few songs with features is more likely to have an extreme value in that category because there aren’t enough observations to calibrate it towards some central tendency). Regardless, this is still useful information. It’s a bit surprising to learn that there isn’t much of a pattern here, and it tells us that the language of artists doesn’t change too much when they have features on their songs.
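One way to reduce the influence of the small-sample outliers described above would be to keep only artists with a minimum number of songs in both groups before computing differences. The sketch below illustrates the idea on toy data. The `Artist` and `Has_features` column names match our dataset, but `toy_songs`, the `min_songs` threshold, and the artist names are hypothetical. Artists with zero songs in one group produce a 0/0 average (`NaN`), which is likely the source of the removed-rows warning above, and a filter like this would drop them as well.

```r
library(dplyr)

# Toy stand-in for the per-song data (hypothetical artists)
toy_songs <- tibble(
  Artist = c(rep("Artist A", 6), rep("Artist B", 4)),
  Has_features = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE,
                   TRUE, FALSE, FALSE, FALSE)
)

min_songs <- 3  # hypothetical threshold; the right value is a judgment call

# Count featured and solo songs per artist
song_counts <- toy_songs %>%
  group_by(Artist) %>%
  dplyr::summarize(n_feature = sum(Has_features),
                   n_solo = sum(!Has_features),
                   .groups = "drop")

# Keep only artists with enough songs in BOTH groups
reliable_artists <- song_counts %>%
  filter(n_feature >= min_songs, n_solo >= min_songs) %>%
  pull(Artist)
```

The resulting vector could then be used to filter the differences before plotting, for example with `filter(Artist %in% reliable_artists)`.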

Zooming in on individual artists - where do we see interesting deviations from our general pattern?

One thing we can do with these data is graph moral language use for each of a given artist’s albums to see whether it changed significantly between albums. There could be many reasons for such shifts. The most interesting are major life events that cause rappers to rethink their morality, and general stylistic shifts in album themes that change moral language use without necessarily reflecting a change in the artist’s morals. Let’s show an example of how to “zoom in” with Kanye West, whose lifestyle and morality shifts have been mentioned before in this document (note that we won’t show the code for every rapper, as the same template can be followed):

kanye_west <- filter(longer_normalized_albums, str_detect(Artist, "Kanye West"))
kanye_west$Album <- as.factor(kanye_west$Album)

kanye_west %>% 
  mutate(Foundation=fct_relevel(Foundation, "normalized_care_virtue", "normalized_care_vice",
                                "normalized_fairness_virtue", "normalized_fairness_vice", "normalized_loyalty_virtue",
                                "normalized_loyalty_vice", "normalized_authority_virtue", "normalized_authority_vice",
                                "normalized_sanctity_virtue", "normalized_sanctity_vice"),
         Album=fct_relevel(Album, "The Life of Pablo", "Ye", "JESUS IS KING", "Donda")) %>% 
  ggplot(aes(Foundation, Words_per_line, fill=Foundation, width=avg_page_views/max(avg_page_views)))+
  geom_col(position="dodge")+
  theme(plot.title = element_text(size=20, hjust=0.5), axis.title = element_text(size=20), axis.text.x = element_text(angle=45, size=10), strip.text.x = element_text(size=15))+
  facet_grid(cols=vars(Album))+
  labs(x="Foundation", y="Words per line",
       title="Language use associated with moral foundations in Kanye West's albums")+
  scale_x_discrete(labels=foundations)+
  scale_fill_discrete(labels=foundations)

In this plot, each sub-plot represents a different album, with the bars representing averages for each foundation. The width of the bars represents the popularity of the album, measured by average number of page views for each song on Genius. This is to see if listeners are more receptive to certain categories of moral language use, although this statistic is heavily influenced by the total amount of time an album has been out.

Kanye West’s usage of moral language shifts quite intensely across his latest albums. Specifically, “Jesus is King” and “Donda” touch on love, grief and religion far more than his previous works, and it clearly shows. The increase in sanctity and the decrease in degradation are major indicators of the main topics of these albums. The decrease in harm when comparing “Jesus is King” and “Donda” to “Ye” is also noticeable, and can be attributed to less use of violent language. However, the lack of significant change in care is a bit unexpected, as we would expect care to shift as well given the albums’ subjects.

Drake

One of the most distinctive cases in our research was Drake’s. Across his last five albums, Drake’s usage of moral language has increased in almost every category. The most noticeable are care, degradation and harm, which increase consistently; this is probably due to the emotionally charged language used by Drake, who often uses his songs to engage with themes like love, relationships and gangster activities. It could, however, also reflect an overall increase in repetition by Drake, in line with the pattern we have been seeing.

Young Thug

As we saw earlier, in the song “Swizz Beatz,” Young Thug repeats the word “love” dozens of times, single-handedly producing the biggest shift in this entire graph, which you can see in the care section for the album JEFFERY. Also noticeable is the overall increase in degradation throughout his last few albums. This also seems to be driven by repetition, given the positive nature of those albums compared to Slime Season 3, which had the highest degree of degradation.

Do rappers change their moral language use after having a baby?

I apologize, often womanize / Took for my child to be born / To see through a woman’s eyes / Took for these natural twins to believe in miracle / Took me too long for this song / I don’t deserve you.

-JAY-Z, 4:44

Having a child can be a life-changing event, and it’s one that rappers often touch on in their music. But do we see any major changes in artists’ moral language use after having one? We identified the rappers in our sample that became new parents at some point between two of their albums to see if we notice any changes.

NLE Choppa had a daughter after his Cottonwood album and before his Top Shotta album

One of our prime examples of positive changes after having a baby is NLE Choppa. After having his first child, Choppa’s care and sanctity scores increased while his harm and degradation scores decreased. This pattern was not nearly as common as we had imagined: we saw it only in Choppa and Polo G.

J. Cole had a baby after his 4 Your Eyez Only album and before his KOD album

J. Cole, on the other hand, showed an increase in harm and degradation as well as a decrease in sanctity, all of which were unexpected to say the least. Usually seen as a “woke” rapper who engages with social problems, J. Cole moved in the opposite direction on these measures after his child was born.

Table summarizing Care/Harm and Sanctity/Degradation changes for artists who became parents

One pattern that does seem to appear after rappers have a child is an overall increase in moral language, perhaps due to the strong emotions that come with taking care of a newborn.

Networks of moral language use

One of the things we did with our dataset was create a network (using Gephi), with nodes being rappers in our sample and edges being drawn between rappers who collaborated on the same song. Edge weights correspond to the number of collaborations between two artists. From here, we can map average values of moral language use in each category onto the network.

Network with sizes of names corresponding to eigenvector centrality

Before looking at any of the most important artists for any of the moral categories, let’s look at who the most central artists are. The names in the above graph are sized according to eigenvector centrality, roughly measuring who is the most important based on number of collaborations. We can see that Kanye West is the most central, with Young Thug and America’s #1 boat-themed rapper not too far off.
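For readers who want to see what eigenvector centrality is doing, here is a minimal sketch in base R on a toy collaboration list. The artist labels and collaboration counts are made up, and this is not how Gephi computes the statistic internally (its exact settings are not recorded here). It only illustrates the underlying idea: an artist is central if they collaborate often with other central artists.

```r
# Toy collaboration list: one row per shared song (hypothetical artists)
collabs <- data.frame(
  from = c("A", "A", "A", "B", "A"),
  to   = c("B", "C", "D", "C", "B"),
  stringsAsFactors = FALSE
)

# Build a symmetric weighted adjacency matrix; weight = number of collaborations
artists <- sort(unique(c(collabs$from, collabs$to)))
adj <- matrix(0, length(artists), length(artists),
              dimnames = list(artists, artists))
for (i in seq_len(nrow(collabs))) {
  a <- collabs$from[i]
  b <- collabs$to[i]
  adj[a, b] <- adj[a, b] + 1
  adj[b, a] <- adj[b, a] + 1
}

# Eigenvector centrality is the leading eigenvector of the adjacency matrix,
# rescaled here so the most central artist has a score of 1
leading <- abs(eigen(adj, symmetric = TRUE)$vectors[, 1])
centrality <- setNames(leading / max(leading), artists)
```

In this toy network, artist A collaborates with everyone and ends up with the highest score, while D, whose only link is to A, ends up with the lowest.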

Network with sizes of names corresponding to average care score

Network with sizes of names corresponding to average harm score

Network with sizes of names corresponding to average authority score

Network with sizes of names corresponding to average subversion scores

Sizing the names in our networks by moral language data shows that different artists stand out in each category. Some artists do reappear across categories. For example, Xxxtentacion appears to be the largest name in both care and harm, and he is also a prominent name in the subversion category. Beyond a few artists like this, there is little consistency across categories in which names appear or in how much of each kind of language the same artist uses.

If you dive a little deeper, you see a few names that show up consistently and bring up the moral language use in these networks. This does not, however, tend to raise the moral language use of the artists they work with, which is interesting given that Kanye West works with many artists, as you can see from the eigenvector centrality network graph. The care and authority graphs are among the more surprising ones, because many more names are visible on them than on the others. This network data gives us some insight into how artists relate to one another in their language, which can be useful for lyrical analysis and for understanding how artists use their lyrics to connect with their audience.

Conclusion

Unfortunately, most of the results we got from this analysis say more about the methods we used to analyze the rap music than the rap music itself. For instance, our hypotheses about which moral foundations would be most prevalent differed from our output, but we don’t quite buy that our hypotheses were disproven. This isn’t entirely bad: moral sentiment analysis of rap music is uncharted territory, and this preliminary analysis highlights important things to consider when attempting to use a sample like this in the future. The most obvious issue is repetition in lyrics. Almost every song that scores highly in any of the categories contains a dictionary word that is repeated many times, and this repetition might obscure songs that contain more nuanced moral statements. The categories that were picked up frequently by the dictionary in this analysis are also in line with the categories that Kennedy et al. (2021) found were most easily traceable in social media posts in their evaluation of moral sentiment analysis methods, which might suggest that the concerns we see show up more are simply what the dictionary is better at picking up.
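To illustrate how much repetition can inflate a score, here is a small sketch comparing three ways of counting dictionary hits in a song. None of these alternatives were used in our analysis. The `care_words` vector, the toy lyrics, and the `tokenize()` helper are all hypothetical stand-ins.

```r
# Toy stand-in for one dictionary category (hypothetical word list)
care_words <- c("love", "care")

# Toy song: a hook repeated twice plus one more substantive line
song_lines <- c("love love love love",
                "love love love love",
                "i care about my family")

# Hypothetical helper: lowercase and split lines into word tokens
tokenize <- function(lines) {
  tokens <- unlist(strsplit(tolower(lines), "[^a-z']+"))
  tokens[tokens != ""]
}

# 1. Raw count: every occurrence counts (what a plain dictionary count does)
raw_count <- sum(tokenize(song_lines) %in% care_words)

# 2. Repeated lines counted once: dampens hooks and choruses
dedup_lines_count <- sum(tokenize(unique(song_lines)) %in% care_words)

# 3. Each distinct dictionary word counted once per song
unique_word_count <- sum(unique(tokenize(song_lines)) %in% care_words)
```

Here the raw count is driven almost entirely by the repeated hook, while the two deduplicated counts give the single substantive line much more relative weight. Which variant is appropriate depends on whether repetition is treated as emphasis or as noise.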

Another prominent issue is that this dictionary isn’t well suited to pick up on the lexicon of rap music - there are plenty of categories (namely harm and subversion) where more words could have been included that would be more easily identifiable indicators of moral concerns of rap artists, leading to more accurate results.

Regardless of the limitations of our methods, the fact that we got something out of our analysis is exciting. There were plenty of examples, such as Kanye West’s discography, where the results of the model aligned with our own ideas of the moral concerns of rap artists. Given this, we have no doubt that computational analysis of moral language is headed in a positive direction, and we would be very curious to see how the results of this analysis would differ if done using different methods. It might also be of interest to look at the sonic elements of music (Spotify valence, danceability, etc.) and see if any of those are well correlated with moral language use, as well as to use a better measure of song popularity for a more robust analysis of how receptive listeners are to differences in moral language use.

Works Cited

Frimer, J. A., Boghrati, R., Haidt, J., Graham, J., & Dehghani, M. (2019). Moral Foundations Dictionary for Linguistic Analyses 2.0. Unpublished manuscript.

Nzinga, K.L.K., & Medin, D.L. (2018). The Moral Priorities of Rap Listeners. Journal of Cognition and Culture, 312-342.

Priniski, J. H., et al. (2021). Mapping Moral Valence of Tweets Following the Killing of George Floyd.

Kennedy, B., et al. (2021). Moral concerns are differentially observable in language. Cognition.