These notes were developed for the course Foundation of Data Science Using Programming Language at the Center for Enterprise and Technology Advancement (CETA) at the University of Management & Technology. The goal is to provide an overview of fundamental concepts in Python from first principles.
I am deeply thankful to Dr. Shahid Mahmood Awan and Dr. Bilal Wajid for their inspiration and guidance throughout my studies at the Department of Information System, School of Business and Economics, University of Management & Technology. Over the past two years, their invaluable support and direction have always helped me improve my professional and academic skills. It has been an honor to be their student. I am also very grateful to all my class fellows for their feedback and useful suggestions.
-
- 1.1 What is Data Science?
- 1.1.1 Data Frames
- 1.1.2 Characteristics of a Data Frame
- 1.1.3 Library Highlights
- 1.1.4 Pandas Documentation
- 1.1.5 Data Wrangling
- 1.2 Pandas Version
- 1.2.1 Pandas Dependencies
-
- 2.1 Create Data Frame
- 2.1.1 Using Dictionary
- 2.1.2 Using Tuple
- 2.1.3 Using ndarray
- 2.1.4 Using List
- 2.1.5 Dictionary of Series
- 2.1.6 Using Random Number
- 2.1.7 Using Random Uniform Number
- 2.1.8 Using Random randint Number
- 2.1.9 empty Data Frame
-
- 4 CSV & text files
- 4.1 read_table
- 4.2 Define Header Row
- 4.2.1 Skip Row
- 4.2.2 Header Row
- 4.2.3 Header set to None
- 4.2.4 Columns names
- 4.3 rename columns
- 4.4 replace in column name
- 4.5 add_prefix in Header
- 4.6 add_suffix in Header
- 4.7 prefix
- 4.8 Import Limited Number of Rows
- 4.9 Import Limited Number of Columns
- 4.10 skipfooter
- 4.11 engine
- 4.12 Ignore Comment Lines
- 4.13 Sequence of the Columns
- 4.14 mangle_dupe_cols
- 4.15 skipinitialspace
- 4.16 verbose
- 4.17 converters
- 4.18 keep_default_na
- 4.19 true_values, false_values
- 4.20 index_col
- 4.21 thousands
- 4.22 decimal
- 4.23 squeeze
- 4.24 dtypes
- 4.25 parse_dates
- 4.26 keep_date_col
- 4.27 na_values
- 4.28 lineterminator
-
- 6.1 Import MS Excel File
- 6.1.1 sheet_name
- 6.1.2 Get all sheet names
- 6.1.3 Import multiple Excel Sheets
- 6.1.4 Display all sheets
- 6.1.5 converters
-
- 8.1 Install Tabula
- 8.2 Read PDF FILE
-
- 9.1 Export Data Frame in CSV File
- 9.1.1 Export with Index Label
- 9.1.2 Export Without Index
- 9.1.3 Export Specific Columns
- 9.1.4 Export Without Header
- 9.1.5 Export in .txt Format
- 9.1.6 Export with specific sep
- 9.1.7 line_terminator
- 9.2 Export Data Frame in Excel File
- 9.2.1 Sheet Name
- 9.2.2 index_label
- 9.2.3 Export without Index
- 9.2.4 startrow
- 9.2.5 startcol
- 9.2.6 Export multiple worksheets in the same workbook
-
- 10.1 to_records
- 10.1.1 Array without Index
- 10.2 values
- 10.3 to_numpy
- 10.4 array
- 10.5 to_list
- 10.6 tolist
- 10.7 explode
-
- 11.1 to_dict
- 11.1.1 series
- 11.1.2 split
- 11.1.3 index
- 11.1.4 OrderedDict
-
- 12.1 Head
- 12.2 Tail
- 12.3 Sample
- 12.4 Shape of the Data Frame
- 12.5 Dimensions of the Data Frame
- 12.6 Size of the Data Frame
- 12.7 Get Variables Name of the Data Frame
- 12.8 Index of the Data Frame
- 12.9 axes of the Data Frame
- 12.10 Set Index to the specific Column
- 12.11 Re-Set Index
- 12.12 set_axis
- 12.13 Why do some pandas commands end with parentheses (and others don't)?
-
- 13.1 dtypes
- 13.1.1 Check dtype of specific column
- 13.1.2 value_counts
- 13.1.3 infer_objects
- 13.2 Convert strings to numbers
- 13.3 to_datetime
- 13.4 memory_usage
- 13.4.1 memory_usage without Index
- 13.4.2 nbytes