Assignments
ASSIGNMENTS ARE DUE AT THE BEGINNING OF LECTURE ON THE DUE DATE
COURSEWORK:
Coursework consists of the following (weight toward the course grade in parentheses):
- Homework (40%)
- Midterm exam (15%)
- Final exam (20%)
- Project (25%)
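As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the course grade is a simple weighted average; the page does not state the exact formula, and the function name and example scores are hypothetical.

```python
# Component weights as listed above (percent of the course grade).
WEIGHTS = {"homework": 40, "midterm": 15, "final": 20, "project": 25}


def course_grade(scores):
    """Combine per-component scores (each on a 0-100 scale) into a weighted total.

    Assumption: a plain weighted average; the actual grading formula may differ.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"homework": 90, "midterm": 80, "final": 85, "project": 95}
print(course_grade(example))  # prints 88.75
```

The weights sum to 100, so the result stays on the same 0-100 scale as the inputs.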
NOTE: All assignments (except projects) are to be done individually. Please see the Collaboration policy.
| Assignment | Note | Out | Due | Weight |
|---|---|---|---|---|
| Homework 1 | Exploratory Data Analysis, Linear Models | Jan 26 | Feb 21 | 10% |
| Homework 2 | Decision Trees, SVM, and Kernels | Feb 21 | Mar 13 | 10% |
| Homework 3 | Ensemble learning, Instance-based learning, Clustering | Mar 14 | Apr 11 | 10% |
| Homework 4 | Semi-supervised learning, Text, Time series, Networks | Apr 11 | May 4 | 10% |
| Midterm Exam | (in class) | Mar 9 | -- | 15% |
| Final Exam | | Thursday, May 11 9:30am - 12:30pm HBH A301 | -- | 20% |
| Project proposal | List of [datasets] [more project ideas] | -- | Feb 23 | 1% |
| Midway report | | -- | Apr 11 | 7% |
| Project presentation | (in class) | -- | May 2 & 4 | 7% |
| Project final writeup | | -- | May 2 & 4 | 10% |
HOMEWORK:
Homework should be turned in at the beginning of the class on the day it is due.
If you are taking late day(s), please send your homework to the TA by email and also submit a hard copy
at the next class. Note the number of late days you used at the top of the first page of your homework.
Please submit all code used to complete the assignment electronically only (no printouts) via Blackboard.
EXAMS:
There will be a midterm and a final exam.
Note: Both the midterm and the final will be open book: you may bring books, notes, papers, etc., but you may not use a computer.
The dates posted above are tentative; the finalized dates will be announced during the semester.
PROJECTS:
Your class project is an opportunity for you to explore an interesting machine learning problem of your choice in
the context of a real-world data set. Some project ideas will be posted below during the semester.
Your class project must be about new things you have done this semester; you cannot use results you have
developed in previous semesters.
Projects can be done by you as an individual, or in teams of two students.
The course TA will consult with you on your ideas, but of course the final responsibility to define and execute
an interesting piece of work is yours.
Your project will be worth 25% of your final class grade, broken into
four main deliverables:
- Project proposal (1% of the course grade)
- Project milestone report
(7% of the course grade) (** 4 pages maximum **, including references)
describing the results of your first experiments by the milestone due
date (see above). Note that, as with any conference, the page limits
are strict. Papers over the limit will not be considered.
- Final project writeup (10% of the course grade) preferably in ACM format
(** 8 pages maximum, 4 pages minimum **, including references; page limit is strict)
- Final project presentation (in class) (7% of the course grade)
Project Proposal:
You must turn in a brief project proposal (** 1 page maximum **) on the
due date (see above), in class.
A list of suggested projects and data sets is posted below.
Read the list carefully. You are encouraged to use one of the suggested
data sets, because we know that they have been successfully
used for machine learning in the past. If you prefer to use a different
data set, we will consider your proposal, but you must already have access to
the data and present a clear proposal for what you would do with it.
Project proposal format: Proposals should be 1 page maximum. Include the following information:
- Your name and Andrew ID on top of the page
- Project title
- Data set
- Project idea. This should be approximately two paragraphs.
- Papers to read. Include 1-3 relevant papers. You will probably want
to read at least one of them before submitting your proposal.
- Teammate: Will you have a teammate? If so, whom? Maximum team size is two students.
- What will you complete by the project milestone due date? Experimental results of some kind are expected here.
Project Writeups:
Your write-ups should include the information detailed below, in approximately the order given. Your write-up need not have corresponding sections or bullet points, but course staff should be able to find the information without searching too hard. Be as precise/specific as you can.
Note: The mid-way report will be a relatively incomplete version of the final write up. It should include similar sections and address similar questions, but need not contain all the details. Think of the mid-way report as a preliminary version of the final draft.
It is more of a status report, including preliminary results, issues that you are facing in developing your project, and how you plan to modify your approach to tackle some of those issues moving forward.
- Introduction/Motivation/Problem Definition (15%)
- What is it that you are trying to solve/achieve? Who cares and why does it matter?
- Identify, define, and motivate the problem that you are addressing.
- How (precisely) will a machine learning solution address the problem?
- Data Understanding and Preparation (15%)
- Identify and describe the data (and data sources) that will support machine learning to address the problem.
- Include various aspects of the data, such as its size (GB, TB, etc.), type(s), and format.
- Specify how these data are integrated to produce the format required for machine learning.
- Methodology (30%)
This is where you give a detailed description of your primary contributions.
It is especially important that this part be clear and well written so that
we can fully understand what you did.
- How did you approach the problem? What challenges did you face? In what (unique) ways did you handle those challenges?
- Specify the type of model(s) built and/or information/knowledge extracted.
- Discuss choices for machine learning algorithm: what are other alternatives, and what are their pros and cons (in the context of the problem and as compared to your proposed solution)?
- Discuss why and how this model should "solve" the problem (i.e., improve along some dimension of interest).
- Evaluation and Results (30%)
We are interested in seeing a clear and conclusive set of experiments that evaluate
your solution to the problem you set out to solve.
Make sure to interpret the results and discuss what we can conclude and learn from your approach.
- How do you evaluate your machine learning solution to the specific
question(s) you have addressed?
- What do these evaluation methods tell you about your solution?
How well your method performs matters less than (a) how thorough and careful your evaluation is
and (b) how interesting and clever your results and findings are.
- Style and writing (10%)
Overall writing, grammar, organization, figures and illustrations.
We suggest using the ACM format for your project reports (8 pages maximum, 4 pages minimum, including references; this page limit is strict).
Project Presentations:
- Think of this as an oral version of your final project writeup.
- Present your work in a meaningful and interesting flow (e.g., motivation, problem
definition, data description, challenges, proposed methods, results and their interpretation).
- Make sure to include enough details and background of your methodology (similar to a conference talk).
- See here and here for some advice on how to give a good (or bad) talk.
- Be prepared to ask (tough) questions of other project groups.
- We will spend the last 2 lectures on project talks. Depending on the number of
project groups, each group will be given 5-8 minutes, including questions.
Datasets for Project:
We provide a long list of potential data sources for your project right here.
The project is open-ended and you are expected to come up with your own project description and problem definition.
In addition to your technical approach, we will evaluate your creativity in formulating an interesting and important problem for the project.
Last modified by Leman Akoglu, Mar 2017