Gathering, Assessing, Cleaning, and Reporting on the tweet archive of the WeRateDogs Twitter account using Python, to find the most-loved dog stage and to identify which algorithm best predicts each dog.
- (CSV) Download the initial twitter_enhance file.
- (URL) Use the requests library to download the image_prediction table.
- (API) After the app was approved by Twitter, use Tweepy to gather the archive content.
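The URL and API gathering steps above can be sketched as two small helpers. This is a minimal sketch, not the project's exact code: the URL constant and file names are placeholders, and the Tweepy `api` object is assumed to be already authenticated elsewhere.

```python
import requests


def download_file(url, path):
    """URL step: download a file over HTTP and save it to disk."""
    response = requests.get(url)
    response.raise_for_status()  # fail loudly on a bad status code
    with open(path, "wb") as f:
        f.write(response.content)


def fetch_tweet_json(api, tweet_id):
    """API step: return the raw JSON for one tweet via an
    authenticated Tweepy API object (credentials assumed set up)."""
    status = api.get_status(tweet_id, tweet_mode="extended")
    return status._json
```

Usage would look like `download_file("https://example.com/image-predictions.tsv", "image_predictions.tsv")` (placeholder URL), then looping `fetch_tweet_json` over the archive's tweet IDs and writing each JSON object to a text file.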
- Assess visually first to get a general idea of each column and its values; note each column's datatype and keep wrong datatypes in mind for cleaning.
- Assess programmatically, inspecting each column of each table: `describe()` for summary statistics, `value_counts()` to see the distinct values and their frequencies, searching for missing values in each column, `duplicated()` to find duplicate rows, and filtering or `query()` for targeted checks. Record every tidiness and quality issue to be cleaned later.
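The programmatic assessment calls above can be illustrated on a toy stand-in for the archive table (the column names and values here are illustrative, not the real data):

```python
import pandas as pd

# Toy stand-in for the tweet archive table.
archive = pd.DataFrame({
    "tweet_id": [1, 2, 2, 3],
    "rating_numerator": [13, 12, 12, None],
    "source": ["iphone", "web", "web", "iphone"],
})

stats = archive.describe()                  # summary statistics for numeric columns
counts = archive["source"].value_counts()   # distinct values and their frequencies
missing = archive.isnull().sum()            # missing values per column
dupes = archive.duplicated().sum()          # fully duplicated rows
high = archive.query("rating_numerator > 12")  # targeted filtering

print(missing["rating_numerator"])  # 1 missing rating
print(dupes)                        # 1 duplicated row
```

Each finding from calls like these becomes an entry in the quality or tidiness issue list.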
- Gather both the quality and tidiness issues to clean.
- Start by copying the original sources, then deal with missing content.
- Continue by cleaning the tidiness issues and finish with the quality ones.
- Each quality and tidiness issue has its own Code and Test step.
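A common tidiness issue in this archive is the dog stage being spread across several columns. Assuming the four stage columns `doggo`, `floofer`, `pupper`, and `puppo` (an assumption for illustration), one Code/Test pair could look like this sketch, which also shows cleaning a copy rather than the original:

```python
import pandas as pd

# Toy archive with the stage variable spread across four columns
# (a tidiness issue: one variable stored as four columns).
archive = pd.DataFrame({
    "tweet_id": [1, 2, 3],
    "doggo":   ["doggo", "None", "None"],
    "floofer": ["None", "None", "None"],
    "pupper":  ["None", "pupper", "None"],
    "puppo":   ["None", "None", "None"],
})

# Code: always clean a copy, never the original source.
clean = archive.copy()
stage_cols = ["doggo", "floofer", "pupper", "puppo"]
clean["stage"] = (
    clean[stage_cols]
    .replace("None", "")      # blank out the placeholder strings
    .agg("".join, axis=1)     # concatenate; at most one stage remains
    .replace("", "none")      # tweets with no stage at all
)
clean = clean.drop(columns=stage_cols)

# Test: the four columns are gone and each tweet has one stage value.
assert "doggo" not in clean.columns
assert list(clean["stage"]) == ["doggo", "pupper", "none"]
```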
- Derive insights about each dog stage, identifying the stage people love most based on retweet and favorite counts.
- Visualize each algorithm to see which best predicts the kind of animal/dog.
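The "most loved stage" insight above reduces to a group-by over engagement counts. A minimal sketch on toy data (stage labels and counts here are illustrative, not the project's results):

```python
import pandas as pd

# Toy merged table: dog stage plus engagement counts per tweet.
df = pd.DataFrame({
    "stage": ["pupper", "pupper", "doggo", "puppo"],
    "retweet_count": [100, 300, 250, 900],
    "favorite_count": [500, 700, 800, 3000],
})

# Mean engagement per stage; the stage with the highest mean
# favorite count is taken as the "most loved".
popularity = df.groupby("stage")[["retweet_count", "favorite_count"]].mean()
most_loved = popularity["favorite_count"].idxmax()
print(most_loved)  # puppo (on this toy data)
```

The same `popularity` table can then be plotted (e.g. with `popularity.plot.bar()`) to visualize the comparison across stages.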