Homework Assignment 2
Dynamic Programming
15-211 Summer 2009
Due Tuesday 14th, 2009 at 11:59:59pm
Overview
This assignment is designed to ensure that you know how to use dynamic programming to solve recursive problems. You will solve the Edit Distance problem, which has wide applications in computing (e.g. computational linguistics, spell checking).
Objectives
- Understand the principle of dynamic programming
- Gain experience programming data structures for dynamic programming
- Gain an understanding of the complexity of algorithms using dynamic programming
Theory Questions: 20 points
In the file theory.pdf there are some short theory questions. Please answer the questions in this file and turn it in class.
Starter Code
Download the starter code for assignment (zip file). Once you finish implementation, FTP java files to /afs/andrew.cmu.edu/course/15/211-kesden/handin
What to Submit
You FTP the following java files
- EditDistance.java
- LDS.java
Edit Distance: 45 points
The edit distance between two strings is defined as the minimum number of edits required to transform one string into another. An edit is one of three operations: delete (a character from the original string), insert, and replace. There is a fourth operation, match, which does not count as an edit. Here is an example. The two input strings are "activate" and "caveat". Here is one possible transformation. For the purposes of this assignment, a transformation is represented by a string consisting of the characters D, M, R and I.
D |
M |
D |
R |
M |
I |
M |
M |
D |
a |
c |
t |
i |
v |
a |
t |
e |
|
c |
a |
v |
e |
a |
t |
There are two other possible transformations, DMRDMIMMD and DMRRRMMD. The edit cost between these two strings is 5, because each transformation requires 5 operations (matches are not counted).
You will write all your code inside the class EditDistance
in the file EditDistance.java
.
Part 1 (20 pts): Getting Edit Cost
Write the following method:
public static int getEditCost(String s1, String s2);
It should return the minimum number of edits required to transform s1 into s2 (or vice versa; the number is the same either way). Note that the input strings may be empty, and your code should handle this case. You can trust that the input strings are both non-null, so don't worry about handling that case.
Part 2 (25 pts): Getting the Transformation
Write the following method:
public static String getTransformation(String s1, String s2);
It should return any minimum-cost transformation (represented by a String containing only the characters D, M, R and I that transforms s1 into s2 (here, order is important). Once again, the input strings may be empty, but they will be non-null. Note that although there may be multiple transformations from s1 to s2, your code can return any of them and it will be considered correct.
Extra Credit Part(10 pts): Getting All Transformations
Write the following method:
public static Set getAllTransformations(String s1, String s2);
It should return a Set containing Strings that represent all possible minimum-cost transformations that transform s1 into s2 (order is important). The input strings may be empty but will be non-null.
Longest Decreasing Subsequence (LDS): 25 points
11, 0, 9, 15, 15, 8, 6, 9, 2, 14
The longest decreasing subsequence is shown in bold; in this example, it has length 5. Note that the subsequence does not have to be contiguous. It is also strictly decreasing; i.e. for some element in the LDS, the next element in the LDS must be strictly less than it, as opposed to less than or equal to.
You will write all your code in the class LDS in the
LDS.java
file.
Write the following method:
public static int lds(int[] seq);
It returns the length of the LDS in the sequence represented by seq, an array of ints. Note that seq may be empty, but it will be non-null.
Coding Style: 10 points
Your assignment will be graded first by compiling and testing it. After that, we will read your code to determine whether appropriate data structures and algorithms were used, as well as judge the overall style of your code. Programs will be graded both on correctness as well as their ability to deal with so-called edge cases, like empty lists, lists with only one element, and so on. Points will be deducted for poor design decisions and un-commented, or un-readable code as determined by your TA.
General Advice
import
anything other than what is already
imported for you!
For any dynamic programming algorithm, think before you code! The nature of DP requires you to know exactly how the table is initialized and constructed before you can write any code. Plan out each algorithm on paper.
Edit distance:
- Your DP table should have two axes, one representing s1 and the other representing s2.
- Think carefully about how your table handles empty strings.
- The same table can be used for all three methods.
- It's perfectly acceptable to implement
getAllTransformations
and then implementgetTransformation
just by callinggetAllTransformations
and choosing a random element of the returned Set. - In
getTransformation
, after you build the table, you have to trace backward through it to recover the transformation. Think about how each entry is calculated from those around it.
LDS:
- This problem looks similar at first to the Longest Common Subsequence problem. Don't be fooled; the tables look completely different.
- Don't try to solve this by turning the problem into LCS. That approach will get no credit.
- Though you are not required to, you might try thinking about how you could reconstruct the actual LDS from the table.
- The code for this assignment is very short - our implementation is 11 lines long, not counting blank lines and those with only "}". You should certainly be able to do it in less than 20 lines.
We realize that testing your code may be difficult with this assignment, since you have no way to verify the correctness of your results other than by inspection. This is why it is extremely important to plan your algorithm - if you prove that it is mathematically correct (which is easy to do with DP) then as long as you implement it properly, there should be no problem. Trial and error does not work when writing a DP algorithm.