Gathering data from 3 different sources (URL, API via Tweepy, CSV), then assessing, cleaning, and reporting using Python.

Anas-Rabea/Wrangle-and-Analyze-Data

Wrangle-and-Analyze-Data

Gathering, assessing, cleaning, and reporting on the tweet archive of the WeRateDogs account, using Python, to find the most-loved kind/stage of dog. It also compares the image-prediction algorithms to see which best predicts each animal/dog.

Gathering

  • (CSV) Downloading the initial twitter_enhance file.
  • (URL) Using the requests library to download the image_prediction table.
  • (API) After getting the app approved by Twitter, using Tweepy to get the archive content.
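
The URL and API steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `download_file`, `fetch_tweet_json`, and `extract_counts` are hypothetical, and `api` is assumed to be an already-authenticated `tweepy.API` object. The CSV step needs no code beyond reading the downloaded file (for example with `pandas.read_csv`).

```python
def download_file(url, path):
    """Download `url` with requests and save the raw bytes to `path`."""
    import requests  # imported lazily: only the URL step needs it

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    with open(path, "wb") as f:
        f.write(response.content)


def fetch_tweet_json(api, tweet_ids):
    """Fetch each tweet's JSON through an authenticated tweepy.API object.

    Returns (records, failed_ids). Tweets that can no longer be fetched
    (for example deleted ones) are collected instead of stopping the loop.
    """
    records, failed_ids = [], []
    for tweet_id in tweet_ids:
        try:
            status = api.get_status(tweet_id, tweet_mode="extended")
        except Exception:
            # Broad catch on purpose: the name of Tweepy's exception
            # class differs between library versions.
            failed_ids.append(tweet_id)
            continue
        records.append(status._json)
    return records, failed_ids


def extract_counts(records):
    """Keep only the fields used later for the popularity analysis."""
    return [
        {
            "tweet_id": r["id"],
            "retweet_count": r["retweet_count"],
            "favorite_count": r["favorite_count"],
        }
        for r in records
    ]
```

Collecting the failed IDs separately makes it easy to report how many tweets from the archive were no longer available at gathering time.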

Assessing

Visual and programmatic

  • Visual first, to get a general idea of each column and its values, noting the datatype of each column and keeping in mind which wrong datatypes need cleaning.

Programmatic

I used the available methods to inspect each column in each table:

  • describe() for descriptive statistics.
  • value_counts() for value counts and to see the distinct values.
  • Searching for missing values in each column.
  • duplicated() to search for duplicates, including duplicated column names across tables.
  • Filtering or query() for specific, targeted searches.

I wrote down each tidiness and quality issue that needs to be cleaned later.
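
A minimal sketch of these programmatic checks with pandas is shown below. The two small tables and their column names (`tweet_id`, `rating_numerator`, `name`, `jpg_url`) are made up for illustration and are not taken from the project's data.

```python
import pandas as pd

# Hypothetical stand-ins for two of the gathered tables.
archive = pd.DataFrame({
    "tweet_id": [1, 2, 2, 4],
    "rating_numerator": [12, 13, 13, 1776],
    "name": ["Rex", "Bella", "Bella", None],
})
images = pd.DataFrame({
    "tweet_id": [1, 2, 4],
    "jpg_url": ["a.jpg", "b.jpg", "c.jpg"],
})

# Descriptive statistics for a numeric column.
stats = archive["rating_numerator"].describe()

# Distinct values and how often each occurs (missing values included).
name_counts = archive["name"].value_counts(dropna=False)

# Number of missing values per column.
missing_per_column = archive.isnull().sum()

# Fully duplicated rows.
duplicate_rows = archive[archive.duplicated()]

# Column names that appear in more than one table.
all_columns = pd.Series(list(archive.columns) + list(images.columns))
shared_columns = all_columns[all_columns.duplicated()].tolist()

# Targeted search with query(), here for suspiciously large ratings.
suspicious = archive.query("rating_numerator > 20")
```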

Cleaning

  • Listing both the quality and tidiness issues to clean.
  • Starting by copying the original sources and dealing with missing content.
  • Continuing with the tidiness issues and ending with the quality ones.
  • Each quality and tidiness issue has its own Code and Test step.
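
The Code and Test pattern described above might look like the sketch below, assuming pandas. The tables, the column names, and the two issues chosen (counts kept in a separate table, timestamps stored as text) are hypothetical examples, not the project's actual issue list.

```python
import pandas as pd

# Hypothetical stand-ins for the gathered tables.
archive = pd.DataFrame({
    "tweet_id": [1, 2, 3],
    "timestamp": ["2017-08-01 16:23:56", "2017-08-01 00:17:27",
                  "2017-07-31 00:18:03"],
})
counts = pd.DataFrame({
    "tweet_id": [1, 2, 3],
    "retweet_count": [10, 20, 30],
})

# Work on copies so the original sources stay untouched.
archive_clean = archive.copy()
counts_clean = counts.copy()

# Tidiness issue: the counts describe the same tweets but live in
# a separate table.
# Code
archive_clean = archive_clean.merge(counts_clean, on="tweet_id", how="left")
# Test
assert "retweet_count" in archive_clean.columns

# Quality issue: timestamp is stored as text instead of a datetime.
# Code
archive_clean["timestamp"] = pd.to_datetime(archive_clean["timestamp"])
# Test
assert pd.api.types.is_datetime64_any_dtype(archive_clean["timestamp"])
```

Keeping a Test right after each Code step means a failed fix is caught at the issue that caused it rather than later in the analysis.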

Visualization & insights

  • Getting insights into each dog stage and finding the stage people love most, based on retweet and favorite counts.
  • Visualizing each algorithm to see which kind of animal it predicts best.
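
As a rough sketch of the first insight, the average retweet and favorite counts per stage can be computed with a pandas `groupby`. The data, the stage labels, and the `plot_summary` helper are invented for illustration; plotting assumes matplotlib is installed.

```python
import pandas as pd

# Hypothetical cleaned data: one row per tweet with its dog stage.
tweets = pd.DataFrame({
    "stage": ["doggo", "pupper", "doggo", "pupper"],
    "retweet_count": [100, 50, 300, 150],
    "favorite_count": [400, 200, 600, 400],
})

# Average engagement per stage, most-favorited stage first.
summary = (
    tweets.groupby("stage")[["retweet_count", "favorite_count"]]
    .mean()
    .sort_values("favorite_count", ascending=False)
)


def plot_summary(summary, path="stage_engagement.png"):
    """Save a bar chart of the per-stage averages (needs matplotlib)."""
    ax = summary.plot(kind="bar")
    ax.set_ylabel("average count per tweet")
    ax.figure.savefig(path, bbox_inches="tight")
```

Calling `plot_summary(summary)` writes the chart to a PNG file; the stage at the top of `summary` is the one with the highest average favorite count.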
