Introduction to the Course: The Key Concepts and Software Tools
  • Welcome to the Course
  • Data & Script for the Course
  • Python Data Science Environment
  • For Mac Users
  • Introduction to IPython/Jupyter
  • IPython in the Browser
Read in Data From Different Sources With Pandas
  • What is Pandas?
  • Read CSV Data
  • Read Excel Data
  • Read in HTML Data
Data Cleaning
  • Remove NA Values
  • Missing Values in a Real Dataset
  • Data Imputation
  • Imputing Qualitative Values
  • Use k-NN for Data Imputation
Basic Data Wrangling
  • Basic Principles
  • Preliminary Data Explorations
  • Basic Data Handling With Conditional Statements
  • Drop Column/Row
  • Change Column Name
  • Change the Column Type
  • Explore Date Related Data
  • Simple Date Related Computations
More Data Wrangling
  • Data Grouping
  • Data Subsetting and Indexing
  • More Data Subsetting
  • Extract Information From Strings
  • (Fuzzy) String Matching
  • Ranking & Sorting
  • Concatenate
  • Merging and Joining
Feature Selection and Transformation
  • Correlation Analysis
  • Using Correlation to Decide Which Features to Retain
  • Univariate Feature Selection
  • Recursive Feature Elimination (RFE)
  • Theory Behind PCA
  • Implement PCA
  • Data Standardisation
  • Create a New Feature
Theory Behind Data Visualisation
  • What is Data Visualisation?
  • Some Theoretical Principles Behind Data Visualisation
Most Common Data Visualisations
  • Histograms: Visualise the Distribution of Continuous Numerical Variables
  • Boxplots: Visualise the Distribution of Continuous Numerical Variables
  • Scatter Plot: Relationship Between Two Numerical Variables
  • Barplot
  • Pie Chart
  • Line Charts
  • More Line Charts
  • Some More Plot Types
  • And Some More
Miscellaneous Information
  • Using Colab as an Online Jupyter Notebook
  • GitHub