Week
|
Lectures and Readings
|
Out / Due
|
|
Review
Before the course begins, please take this Python mini-quiz. If you need to learn Python or refresh your knowledge, work through this Python mini-course.
|
|
Week 1
|
Lecture 1: Introduction
- Big Data applications
- Technologies for handling big data
- Apache Hadoop and Spark overview
|
|
|
Week 2
|
Lecture 2: Hadoop Fundamentals
- Hadoop architecture
- HDFS and the MapReduce paradigm
- Hadoop ecosystem: Mahout, Pig, Hive, HBase, Spark
|
HW0 out |
|
Lecture 3: Introduction to Apache Spark
- Big data and hardware trends
- History of Apache Spark
- Spark's Resilient Distributed Datasets (RDDs)
- Transformations and actions
|
HW1 out |
Week 3
|
Lecture 4: Machine Learning Overview
- Basic machine learning concepts
- Steps of typical supervised learning pipelines
- Linear algebra review
- Computational complexity / Big O notation review
|
Week 4
|
Lecture 5: Linear Regression and Distributed ML Principles
- Linear regression
  - formulation and closed-form solution
  - gradient descent
  - grid search
- Distributed machine learning principles
  - computation, storage, and communication
|
HW1 due, HW2 out
|
|
Week 5
|
Lecture 6: Logistic Regression and Click-through Rate Prediction
- Online advertising
- Linear classification
- Logistic regression
  - working with probabilistic predictions
  - categorical data and one-hot encoding
  - feature hashing for dimensionality reduction
|
HW2 due, HW3 out
|
Week 6
|
Lecture 7: Principal Component Analysis and Neuroimaging
- Exploratory data analysis
- Principal Component Analysis (PCA)
- Formulations and solution
- Distributed PCA
|
HW3 due, HW4 out
|
|
Week 7
|
Lecture 8: Big Data ML with MLlib
- k-means Clustering
- Decision Trees and Random Forests
- Recommenders
|
HW4 due, HW5 out |
|
Lecture 9: Introduction to Spark SQL
- Working with tables in Spark
- Higher-level declarative programming
|
|
Bonus Lecture
|
Lecture 10: Analyzing Networks with GraphX
- Understanding network structure
- Computing graph statistics
|
HW5 due |
See here
|
Final Exam
|