Unofficial Twitter frontend API for scraping tweets of German federal delegates
Python 3
Twint (pip3 install twint)
Execute python3 twint_scraper.py
in the cloned git repository.
Modify the date range in the searchUserTweets()
function in twint_scraper.py,
or leave it commented out to get all tweets. It takes approximately 10 minutes for 7 days of tweets and about 8 hours to fetch all existing tweets.
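The exact body of searchUserTweets() may differ in this repository; the following is a minimal sketch of how a twint search is typically restricted to a date range (the handle and dates are placeholders):

    import twint

    def searchUserTweets(username):
        c = twint.Config()
        c.Username = username           # Twitter handle to scrape
        c.Since = "2021-01-01"          # start of date range (comment out for all tweets)
        c.Until = "2021-01-08"          # end of date range (comment out for all tweets)
        c.Store_json = True             # write results as JSON
        c.Output = username + ".json"   # one output file per account
        twint.run.Search(c)

Commenting out the Since/Until lines removes the date filter, which is why the full scrape takes much longer.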
To add or remove Twitter accounts, edit the abgeordnete_usernamen_twitter.json
file.
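The internal structure of abgeordnete_usernamen_twitter.json is not documented here; assuming it is a flat JSON list of handles, the scraper loop would roughly look like this (the structure is an assumption and may differ in the real file):

    import json

    # Assumed structure: a JSON list of Twitter handles, e.g. ["handle_a", "handle_b"]
    with open("abgeordnete_usernamen_twitter.json", "r", encoding="utf-8") as f:
        usernames = json.load(f)

    for username in usernames:
        searchUserTweets(username)  # scrape each account in turn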
NOTE: Only the last 3,200 tweets of an account can be scraped. This is not a limitation of twint but of Twitter. Check out the twint project page to learn more: https://github.com/twintproject/twint
The file Bundestag_Namen_Username_Fraktion.csv contains the names of the German representatives,
which are used as keywords in the method searchPoliticalTweets().
The resulting tweets are stored separately according to the party of the German representatives.
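A rough sketch of how such a keyword search per representative could look with twint is shown below; the column names (Name, Username, Fraktion) and the per-party output scheme are assumptions, not necessarily the exact layout of the CSV or the real method:

    import csv
    import twint

    def searchPoliticalTweets():
        # Assumed columns: Name, Username, Fraktion (party)
        with open("Bundestag_Namen_Username_Fraktion.csv", newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                c = twint.Config()
                c.Search = row["Name"]                # representative's name as search keyword
                # c.Since = "2021-01-01"              # optional date range, as in the first scraper
                # c.Until = "2021-01-08"
                c.Store_json = True
                c.Output = row["Fraktion"] + ".json"  # one output file per party
                twint.run.Search(c)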
As in the first scraper, you can modify the date range.
With JSON_utils.py
you can clean the data by removing all tweets written by the German representatives themselves.
In addition, the party is added to each JSON object.
Caution: You may have to adjust the file path in JSON_utils.py.
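A minimal sketch of this cleaning step is shown below. The file names, the "party" field name, and the assumption that the representatives file is a flat list of handles are all hypothetical; the real JSON_utils.py may work differently. Twint's JSON output is written as one JSON object per line, which the sketch relies on.

    import json

    # Hypothetical paths; adjust to where your scraped files actually live.
    REPRESENTATIVES_FILE = "abgeordnete_usernamen_twitter.json"
    TWEETS_FILE = "spd.json"
    PARTY = "SPD"

    with open(REPRESENTATIVES_FILE, encoding="utf-8") as f:
        representative_handles = set(json.load(f))

    cleaned = []
    with open(TWEETS_FILE, encoding="utf-8") as f:
        for line in f:                  # twint writes one JSON object per line
            tweet = json.loads(line)
            # Drop tweets written by the representatives themselves ...
            if tweet.get("username") in representative_handles:
                continue
            # ... and attach the party of the representative the tweet refers to.
            tweet["party"] = PARTY
            cleaned.append(tweet)

    with open("cleaned_" + TWEETS_FILE, "w", encoding="utf-8") as f:
        json.dump(cleaned, f, ensure_ascii=False)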