95-801 Data Mining Techniques
Fall 2017

Syllabus

You can find the list of topics by lecture below. Readings for each lecture will be posted here. Note that the syllabus is tentative and will be adjusted, if needed, as the semester proceeds.

Date

Lectures and Readings

Out / Due

  10/24

  10/26

Lecture 1: Introduction and Fast Similarity Search

  • kd-trees and Locality Sensitive Hashing

Reading:

   

  10/31

  11/2

Lecture 2: Frequent Itemsets and Association Rules

  • Market-basket analysis and the Apriori algorithm
  • Handling large datasets with limited-RAM and limited-pass algorithms

Reading:

   

  11/7

  11/9

Lecture 3: Data Decomposition

  • Singular Value Decomposition (SVD)
  • SVD applications, case studies
  • CUR for sparse decomposition

Reading:

   

  11/14

  11/16

Lecture 4: Clustering

  • Distance measures
  • Hierarchical clustering, k-means, BFR, CURE algorithms

Reading:

   

  11/16

  11/21

Lecture 5: Outlier Mining

  • Extreme-value analysis
  • Density-based outlier detection
  • Ensemble methods

Reading:

   

  11/23

No Class: Thanksgiving

   



  11/28
  11/30

Lecture 6: Graphs: Link Analysis

  • Ranking nodes in a graph
  • Random walks (with restart), PageRank, Topic-sensitive PageRank, HITS

Reading:

   


  11/30
  12/5

Lecture 7: Text Mining

  • Topic modeling with LDA and visualization

Reading:

   

  12/7

Lecture 8: Data Streams

  • Uniform sampling: reservoir sampling
  • Filtering: the Bloom filter
  • Counting distinct elements: Flajolet-Martin algorithm
  • Counting frequencies: Count-min sketch

Reading: