Tentative Syllabus
Disclaimer: This is an ambitious list of topics that I aim to cover in this course. I will adjust the pace based on students' progress and feedback in class, so it is possible that only a subset of these topics will be covered. HW and exams will be adjusted accordingly.
Date | Lectures and Readings | Out / Due
1/16, 1/18
Lecture 1: Intro to ML
- What is ML?
- ML applications
- Machine learning paradigms
- Supervised learning (classification, regression, feature selection)
- Unsupervised learning (density estimation, clustering, dimensionality reduction)
- Basic data types
- (Mixed) attribute data, text, time series, sequence, network data
- The problem solving process:
- Business/project understanding, data understanding through EDA, data preparation, modeling, evaluation, deployment
Readings:
- Witten & Frank Chapter 1.1-1.3
- Provost & Fawcett Chapter 2
PART I: PRELIMINARY ANALYSIS AND DATA PREPARATION
1/18, 1/23
Lecture 2: Exploratory Data Analysis
- Getting to know your data
- Data types
- Attribute types
- Data quality issues
- Data visualization
- Histogram, Kernel Density Estimation
- Charts, plots, infographics
- Correlation analysis
Readings:
- Witten & Frank Chapter 2
- Bishop Chapter 2.5.1
1/25, 1/30
Lecture 3: Data Preparation
- Feature creation
- Data cleaning
- Missing, inaccurate, duplicate values
- Data transformation
- Feature type conversion
- Discretization
- Normalization / Standardization
- Data reduction
- Feature and record selection
- Principal Component Analysis
- Multidimensional scaling
- Manifold learning (Isomap, LLE)
Readings:
HW1 out
PART II: SUPERVISED LEARNING
2/1
Lecture 4: Learning Distributions
- Point estimation
- Maximum Likelihood Estimation (MLE)
- Bayesian learning
- Maximum A Posteriori (MAP) Estimation
- MLE vs. MAP
- Gaussians
- What is ML revisited
Readings:
2/6, 2/8
Lecture 5: Linear Models
- Linear Regression
- Robust Regression
- Sparse Linear Models
- Feature subset selection: revisited
- Shrinkage methods: ridge regression and Lasso
- Principal components regression, Partial least squares
Readings:
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 3.1, 3.2, 3.3, 3.4
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 6.1, 6.2.1, 6.2.2, 6.3.1, 6.3.2
Other readings:
- Hastie Chapter 3.1-3.4, 4.4
- Shalizi Chapter 2, 11
- Murphy Chapters 1.4, 7.1-7.5, 13.3-13.5
- Provost & Fawcett Chapter 4
- Witten & Frank Chapter 7.5
2/13
Lecture 6: Naive Bayes
- Bayes Optimal Classifier
- Conditional Independence
- Naive Bayes
- Gaussian Naive Bayes
Readings:
2/15, 2/20
Lecture 7: Logistic Regression and Generalized Models
- Logistic Regression decision rule and boundary
- Logistic Regression loss function
- Gradient descent
- Non-linear basis expansions
Readings:
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 4.1, 4.2, 4.3
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 7.1, 7.2, 7.3, 7.4, 7.6, 7.7
Other readings:
- Hastie Chapter 9.1, 9.3, 9.6
- Shalizi Chapter 12
2/20, 2/22
Lecture 8: Model Selection
- What is a good model?
- Overfitting
- Decomposition of error
- Bias-Variance tradeoff
- Cross Validation
- Regularization
- Information Criteria (AIC, BIC, MDL)
Readings:
- Hastie Chapter 7.1-7.10
- Provost & Fawcett Chapter 5
HW1 due; HW2 out
Project proposal due
2/27
Lecture 9: Model Evaluation
- Performance measures for Machine Learning
- Creating baseline methods for comparison
- Visualizing model performance
Readings:
- Witten & Frank Chapter 5
- Provost & Fawcett Chapter 7, 8, 11
- Shalizi Chapter 3, 10
3/1, 3/6
Lecture 10: Tree-based Methods
- Classification trees
- From trees to rules
- Missing values and pruning
- Regression trees
Readings:
- Hastie Chapter 9.2
- Witten & Frank Chapter 4.3-4.4, 6.1-6.2
- Provost & Fawcett Chapter 3
- Shalizi Chapter 13
- Murphy Chapter 16.2
3/8
Midterm Exam (in class)
3/12-16
Spring Break; No Classes
HW2 due; HW3 out
3/20, 3/22
Lecture 11: Support Vector Machines
- SVM intuition, formulation, and the dual
- Slack variables, Hinge loss
- The Kernel trick
- Kernel SVM
- Kernel Logistic Regression
- Kernel PCA
Readings:
3/22, 3/27
Lecture 12: Instance-based Learning
- Kernel Density Estimation
- k-Nearest Neighbor Classifier
- Kernel Regression
- Locally-Weighted Linear Regression
Readings:
- Hastie Chapter 6.1-6.3, 6.6.1-6.6.2
- Murphy Chapter 1.4.1-1.4.3, 14.7
- Shalizi Chapter 7.1, 7.5
3/29
Lecture 13: Ensemble Learning
- Combining multiple models
- Bagging
- Random Forests
- Boosting
Readings:
- Witten & Frank Chapter 8
- Hastie Chapter 10.1, 15, 16
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 8.2
PART III: UNSUPERVISED AND SEMI-SUPERVISED LEARNING
4/3, 4/5, 4/10
Lecture 14: Clustering
- Distance functions
- Hierarchical clustering
- k-means clustering
- Kernel k-means clustering
- k-medians clustering
- Mixture models
- The EM algorithm
- Spectral clustering
Readings:
- Witten & Frank Chapter 6.8
- ISLR (James, Witten, Hastie, Tibshirani) Chapter 10.3
- Provost & Fawcett Chapter 6, 12 (part)
- Spectral Clustering tutorial by Ulrike von Luxburg
HW3 due; Project midway report due; HW4 out
4/12
Lecture 15: Semi-supervised Learning
- Assumptions (smoothness, cluster, manifold)
- Semi-supervised learning
- Self-training
- Generative methods
- Graph-based methods
- Co-training
Readings:
PART IV: LEARNING WITH COMPLEX DATA
4/17, 4/24
Lecture 16: Unstructured Data: ML for Text
- Representing text
- Topic modeling, Applications
- Latent Dirichlet Allocation (LDA)
- Inference: Gibbs sampling
- Collapsed Gibbs sampling for LDA
Readings:
- Witten & Frank Chapter 9.5, 9.6
- Provost & Fawcett Chapter 10
4/24, 4/26
Lecture 17: Dependent Data: ML for Networks
- Transductive learning
- Learning in networks with and without attributes
- Probabilistic relational network classifier
- Iterative classification
- Loopy belief propagation
- Applications to auction, accounting, opinion fraud
Readings:
5/1, 5/3
Project Presentations I (today's presenters turn in their final report on 5/3)
Project Presentations II (today's presenters turn in their final report on 5/1)
HW4 due
Last modified by Leman Akoglu, Dec 2017