Data Science with Python: Machine Learning

offered by NYC Data Science Academy

This 20-hour course covers the basic machine learning methods and the Python modules (especially Scikit-Learn) for implementing them. The five sessions cover: simple and multiple linear regression; classification methods including logistic regression, discriminant analysis, naive Bayes, support vector machines (SVMs), and tree-based methods; cross-validation and feature selection; regularization; and principal component analysis (PCA) and clustering algorithms. After successfully completing this course, you will be able to explain the principles of machine learning algorithms and implement these methods to analyze complex datasets and make predictions.


Unit 1: Introduction and Regression
What Is Machine Learning?
Simple Linear Regression
Multiple Linear Regression
NumPy/Scikit-Learn Lab

Unit 2: Classification I
Logistic Regression
Discriminant Analysis
Naive Bayes
Supervised Learning Lab

Unit 3: Resampling and Model Selection
Feature Selection
Model Selection and Regularization Lab

Unit 4: Classification II
Support Vector Machines
Decision Trees
Bagging and Random Forests
Decision Tree and SVM Lab

Unit 5: Unsupervised Learning
Principal Component Analysis
K-means and Hierarchical Clustering
PCA and Clustering Lab
Final Project

After the 20 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged afterwards.