Carnegie Mellon University
95-828 Machine Learning for Problem Solving
Spring 2022

Coursework

Coursework consists of the following (grading weight in parentheses):

  • 5 Homework assignments (9% each)
  • 1 Midterm exam (15%)
  • 1 Final exam (25%)
  • 1 Case Study (15%)
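
As a rough illustration of how these weights combine, here is a minimal sketch in Python. The weights come from the list above; the function name, the example scores, and the assumption that each component is scored on a 0-100 scale (with the homework score being the average of the five assignments) are hypothetical and not part of the official grading procedure.

```python
# Weights taken from the coursework list above.
WEIGHTS = {
    "homework": 5 * 0.09,   # five assignments at 9% each
    "midterm": 0.15,
    "final": 0.25,
    "case_study": 0.15,
}


def course_score(scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "midterm": 80.0, "final": 85.0, "case_study": 95.0}
total = course_score(example)
print(f"Weighted course score: {total:.2f}")
```

The weights sum to 100%, so the result stays on the same 0-100 scale as the inputs. Slip days and any other adjustments are not modeled here.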

HOMEWORK:

Homework will be posted on Canvas. Each homework will consist of two parts: (1) a set of conceptual questions, and (2) programming. For the programming part, we will provide a code template (and sometimes partial code as well) in a Jupyter notebook. You will have two weeks to complete each homework assignment.

Getting help: You can visit the instructor and the TAs during office hours as well as post questions on Piazza to get help on the assignments. Regarding help from fellow students, see the note on collaboration below.

Collaboration: Collaboration and study groups are allowed and encouraged. All assignments are to be done in study groups of 2 students each. The Case Study can be done in groups of up to 4 members. Each group uploads a single submission on Gradescope. Please see the collaboration policy for details.

Submitting: We ask that you submit two files per homework: (1) a pdf file with your answers to the conceptual questions, and (2) the Jupyter notebook template we provide, filled in with all of your code. Both files (.pdf and .ipynb) are to be uploaded electronically on Gradescope only (no hard-copy printouts).

Each homework assignment is due at the beginning of class on its due date. You can upload your files multiple times, but note that we will use the date of your latest upload as the submission date, which may count against your slip days. Please see the late submission policy for details.

IMPORTANT DATES:

Assignment     Note                                        Out                        Due      Weight
Homework 0     Setting up Python and Jupyter               Jan 18                     n/a      0%
Homework 1     EDA, LR, Model selection                    Feb 1                      Feb 15   9%
Homework 2     LogR, Model eval., Non-parametric, DT       Feb 15                     Mar 1    9%
Midterm Exam   (in class)                                  Mar 3                      --       15%
Homework 3     Ensemble models, NB, SVM                    Mar 15                     Mar 29   9%
Homework 4     Kernels, Neural nets, Density estimation    Mar 29                     Apr 12   9%
Homework 5     Clustering, EM, Dimensionality reduction    Apr 12                     Apr 26   9%
Case Study     Mini 4                                      Mar 17                     Apr 29   15%
Final Exam                                                 Check out univ. calendar   --       25%

EXAMS:

There will be a midterm exam (in class) and a final exam (to be scheduled by the University).

Note: For the midterm, you are allowed to bring 2 A4-size sheets containing your own notes (hand-written or typed). You can use both sides of each sheet. For the final, you are allowed to bring up to 5 A4-size sheets (double-sided), again containing your own notes. Use of computers or other electronic devices during the exams is not allowed. The dates posted above are tentative; the finalized dates will be announced during the semester.

CASE STUDY:

At the start of the second half of the course (after Spring break), we will provide you with information describing a large dataset and a list of potential questions to address based on this dataset. We will also release the dataset after Spring break. You will have the second half of the semester to complete your analysis and modeling on the data. In particular, you will be expected to choose carefully which of the techniques and tools you have learned throughout the course to apply to the problems of interest.

The Case Study will consist of 3 phases. In Phase I, you will think about the data and the problem at hand and brainstorm. In Phase II, you will do hands-on data cleaning, preparation, and exploratory analysis to understand the data. In the final phase, Phase III, you will build predictive models using the machine learning tools you have learned throughout the course.

Evaluation: We will assess your case study outcomes in terms of your analytical approach to the problems, and not only based on the quality of your results. That is, the emphasis will be on how methodical you were in your analysis: the tools you chose to apply, the way you drew conclusions from your own results, and the sequence of steps you took based on your analyses and intermediate results. We will also assess whether you followed best practices in building your solutions, including proper model selection, model comparisons to appropriate baselines, choice of evaluation metrics, and so on.

Teams: The Case Study can be done in groups of up to 4 students. We recommend forming groups of 4, but groups of 2-3 students are also fine. We do not recommend single-member teams given the workload. You can use Piazza to find team members.

Submitting: You are asked to submit a single Jupyter notebook containing all your code and results, along with a pdf file with answers to specific questions. All submissions will be made on Canvas.