webapp and finalized notebooks
kelseyfglenn committed Aug 21, 2020
1 parent ead281e commit 25157a3
Showing 27 changed files with 3,449 additions and 744 deletions.
1,277 changes: 1,277 additions & 0 deletions .~Metis_Proj4_Viz__24311.twbr

Large diffs are not rendered by default.

Binary file added Metis - Project 4 - Clusta Rhymes.pdf
Binary file not shown.
39 changes: 39 additions & 0 deletions README.md
@@ -0,0 +1,39 @@
Objective:

* Categorize hip-hop songs and artists by lyrical content and prosodic style.

Notebook Order:

* data_collection -> preprocessing -> topic_modeling -> clustering
* The webapp folder contains the files for Flask deployment, including the embedded Tableau dashboard

Data Sources:

* **Artist List:** “List of hip hop musicians”, Wikipedia https://en.wikipedia.org/wiki/List_of_hip_hop_musicians

* **Artist and Song Metadata:** Genius API https://api.genius.com/

* **Song Lyrics:** Genius.com web scrape (API doesn’t support lyric requests)
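
As a rough sketch of the metadata step, the snippet below builds (but does not send) an authenticated request for an artist's most popular songs. The `/artists/:id/songs` path and the `sort` and `per_page` parameters reflect the public Genius API documentation as understood here. The helper name `build_top_songs_request`, the artist ID, and the token value are placeholders for illustration and are not taken from the notebooks.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "https://api.genius.com"


def build_top_songs_request(artist_id, token, n=10):
    """Build a GET request for an artist's n most popular songs (hypothetical helper)."""
    query = urlencode({"sort": "popularity", "per_page": n})
    url = f"{API_ROOT}/artists/{artist_id}/songs?{query}"
    # The API expects an OAuth bearer token in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder artist ID and token; nothing is sent over the network here.
req = build_top_songs_request(artist_id=123, token="YOUR_ACCESS_TOKEN", n=5)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call. The response carries song metadata only, which is why the lyrics themselves are scraped from Genius.com.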

Methodology:

* **Data Collection**
    * Scrape Wikipedia for artist names
    * Request the top N songs by each artist from the Genius API
    * Scrape Genius.com for the lyrics to each song
* **Preprocessing**
    * Clean and tokenize text
    * Generate TF-IDF matrix
    * Calculate unique word proportions and syllable rates
* **Analysis**
    * Topic Modeling
    * NMF Topic Modeling to create semantic categories
    * Combine with unique word and syllabic information and apply KMeans clustering
    * Aggregate artists’ song categorizations to characterize their style
* **Deployment**
    * Recommender Flask application
    * Tableau visualization
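
The preprocessing and analysis steps above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the notebooks' actual code. The four-song corpus is made up. "Syllable rate" is interpreted here as syllables per word and estimated with a crude vowel-group heuristic. The topic and cluster counts are arbitrary, and the helper names are hypothetical.

```python
import re
from collections import Counter, defaultdict

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

# Toy stand-in corpus of (artist, lyrics) pairs; placeholder data only.
songs = [
    ("artist_a", "money cash money gold chain cash flow money"),
    ("artist_a", "cash gold money chain stack money cash"),
    ("artist_b", "love heart night dream love tender heart"),
    ("artist_b", "dream love night heart forever love dream"),
]
artists = [artist for artist, _ in songs]
lyrics = [text for _, text in songs]


def unique_word_proportion(text):
    """Share of distinct tokens among all tokens in a song."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def syllables_per_word(text):
    """Assumed definition of syllable rate: vowel groups per word (heuristic)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = [max(1, len(re.findall(r"[aeiouy]+", tok))) for tok in tokens]
    return sum(counts) / len(counts) if counts else 0.0


# 1. TF-IDF matrix over the song lyrics.
tfidf = TfidfVectorizer().fit_transform(lyrics)

# 2. NMF topic model: each row holds one song's weights over the topics.
nmf = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
topics = nmf.fit_transform(tfidf)

# 3. Append the prosodic features and scale all columns to a common range.
prosody = np.array(
    [[unique_word_proportion(t), syllables_per_word(t)] for t in lyrics]
)
features = StandardScaler().fit_transform(np.hstack([topics, prosody]))

# 4. KMeans assigns each song to a cluster.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# 5. Count each artist's songs per cluster to characterize their style.
artist_profile = defaultdict(Counter)
for artist, label in zip(artists, labels):
    artist_profile[artist][int(label)] += 1

print(dict(artist_profile))
```

Scaling before clustering is a design choice made for this sketch: without it, the syllable feature (values above 1) would outweigh the NMF topic weights in the KMeans distance computation.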

Link to Tableau Public Workbook:

https://public.tableau.com/views/Metis_Proj4_Viz/Dashboard1?:language=en&:display_count=y&publish=yes&:origin=viz_share_link