Python based webscraper for CJSW radio show podcasts

mikeroh/cjsw

Overview

This project is a tool that retrieves CJSW podcasts from the station's website, names and organizes them, and sets their ID3 tags.

Setup

The script requires a few Python packages. Run the following commands to install them.

pip install lxml
pip install requests
pip install pathlib

On Debian, the following packages must be installed before installing lxml.

apt-get install libxml2-dev libxslt1-dev python-dev

Use

In the directory that this project is cloned to, run the command:

python scrape.py <target directory>

Here <target directory> is the top-level directory to save the podcasts into. The script automatically organizes the podcasts into subdirectories within <target directory> based on genre and program.
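As an illustration of the layout described above, here is a minimal sketch of how a save path could be built. It assumes Python 3 and a <target directory>/<genre>/<program>/ nesting; the function name episode_path, the nesting order, and the example genre, program, and file names are hypothetical and are not taken from scrape.py.

```python
from pathlib import Path


def episode_path(target_dir, genre, program, filename):
    """Return the assumed save location for one episode.

    Assumed layout (not confirmed by scrape.py):
    <target directory>/<genre>/<program>/<filename>
    """
    return Path(target_dir) / genre / program / filename


if __name__ == "__main__":
    # Hypothetical genre, program, and file names, for illustration only.
    path = episode_path("podcasts", "Jazz", "Example Show", "episode.mp3")
    # Create the genre/program subdirectories if they do not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    print(path)
```

Building the path with pathlib keeps the separators correct on each platform, and creating the parent directories with parents=True means the genre and program folders appear on first use.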

As a cron job

python scrape.py <target directory> <genre>

For use in a script that requires no user interaction, such as a cron job, a specific genre can be passed to the scraper. The latest episodes of all programs in the genre specified by <genre> will be downloaded.
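The two invocation forms above, with and without <genre>, could be handled roughly as follows. This is only a sketch using the standard argparse module; the function name parse_args and the argument names are assumptions, and scrape.py may read its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse '<target directory> [<genre>]' (hypothetical argument names)."""
    parser = argparse.ArgumentParser(prog="scrape.py")
    parser.add_argument(
        "target_directory",
        help="top-level directory to save the podcasts into",
    )
    parser.add_argument(
        "genre",
        nargs="?",      # optional second positional argument
        default=None,   # None means no genre was given
        help="genre whose programs should be downloaded without prompting",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    if args.genre is None:
        print("No genre given: interactive use")
    else:
        print("Unattended run for genre:", args.genre)
```

Making the genre an optional positional argument lets the same script serve both the interactive form and the cron form without extra flags.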
