Date
|
Lectures and Readings
|
Out / Due
|
|
Review
Before the course begins, please take this Python mini-quiz. If you need to learn Python or refresh your knowledge, take this Python mini-course as well.
|
|
10/24
|
Lecture 1: Introduction
- Big Data applications
- Technologies for handling big data
- Apache Hadoop and Spark overview
|
|
|
10/26
10/31
|
Lecture 2: Hadoop Fundamentals
- Hadoop architecture
- HDFS and the MapReduce paradigm
- Hadoop ecosystem: Mahout, Pig, Hive, HBase, Spark
|
HW0 out |
10/31
11/2
|
Lecture 3: Introduction to Apache Spark
- Big data and hardware trends
- History of Apache Spark
- Spark's Resilient Distributed Datasets (RDDs)
- Transformations and actions
|
HW1 out |
11/7
|
Lecture 4: Machine Learning Overview
- Basic machine learning concepts
- Steps of typical supervised learning pipelines
- Linear algebra review
- Computational complexity / Big O notation review
|
HW1 due; HW2 out |
11/9
11/14
|
Lecture 5: Linear Regression and Distributed ML Principles
- Linear regression
  - formulation and closed-form solution
  - gradient descent
  - grid search
- Distributed machine learning principles
  - computation, storage, and communication
|
|
11/16
11/21
|
Lecture 6: Logistic Regression and Click-through Rate Prediction
- Online advertising
- Linear classification
- Logistic regression
  - working with probabilistic predictions
  - categorical data and one-hot encoding
  - feature hashing for dimensionality reduction
|
HW2 due; HW3 out
|
11/23
|
No class: Thanksgiving
|
HW3 due; HW4 out |
11/21
11/28
|
Lecture 7: Principal Component Analysis and Neuroimaging
- Exploratory data analysis
- Principal Component Analysis (PCA)
- Formulations and solution
- Distributed PCA
|
|
11/30
|
Lecture 8: Big Data ML with MLlib
- k-means Clustering
- Decision Trees and Random Forests
- Recommenders
|
HW4 due (Dec 3); HW5 out (Dec 3) |
12/5
|
Lecture 9: Introduction to SparkSQL
- Working with tables in Spark
- Higher-level declarative programming
|
|
12/7
|
Lecture 10: Analyzing Networks with GraphX
- Understanding network structure
- Computing graph statistics
|
HW5 due (Dec 10) |
12/12, 6:00 PM
|
Final Exam
|