# Data Science with R: Machine Learning

### offered by NYC Data Science Academy

Overview

This 35-hour course introduces both the theoretical foundations of machine learning algorithms and their practical application in R. It covers data mining, performance measures, dimension reduction, linear and generalized linear regression models, kNN and Naive Bayes models, tree models, SVMs, and association rule analysis. After successfully completing this course, you will be able to break down the mathematics behind the major machine learning algorithms, explain the principles on which they rest, and implement them to solve real-world problems.

Syllabus

Unit 1: Foundations of Statistics and Simple Linear Regression

Understand your data

Statistical inference

Introduction to machine learning

Simple linear regression

Diagnostics and transformations

The coefficient of determination

Unit 2: Multiple Linear Regression and Generalized Linear Models

Multiple linear regression

Assumptions and diagnostics

Extending model flexibility

Generalized linear models

Logistic regression

Maximum likelihood estimation

Model interpretation

Assessing model fit

Unit 3: kNN and Naive Bayes, the Curse of Dimensionality

The K-Nearest Neighbors Algorithm

The choice of K and distance measure

Conditional probability: Bayes’ Theorem

The Naive Bayes Algorithm

The Laplace estimator

Dimension reduction

The PCA procedure

Ridge and Lasso regression

Cross-validation

Unit 4: Tree Models and SVMs

Decision trees

Bagging

Random forests

Boosting

Variable importance

Hyperplanes and maximal margin classifier

Soft margin and support vector classifier

Kernels and support vector machines

Unit 5: Cluster Analysis and Neural Networks

Cluster analysis

K-means clustering

Hierarchical clustering

Neural networks and perceptrons

Sigmoid neurons

Network topology and hidden features

Back propagation learning with gradient descent

Final Project

After the 35 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation session will be arranged afterwards.