Java Programming – Longest Common Subsequence

Java Programming – Longest Common Subsequence – Dynamic Programming – LCS problem has optimal substructure property as main problem can be solved .

We have discussed Overlapping Subproblems and Optimal Substructure properties in Set 1 and Set 2 respectively. We also discussed one example problem in Set 3. Let us discuss Longest Common Subsequence (LCS) problem as one more example problem that can be solved using Dynamic Programming.

LCS Problem Statement: Given two sequences, find the length of longest subsequence present in both of them. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. For example, “abc”, “abg”, “bdf”, “aeg”, ‘”acefg”, .. etc are subsequences of “abcdefg”. So a string of length n has 2^n different possible subsequences.

It is a classic computer science problem, the basis of diff (a file comparison program that outputs the differences between two files), and has applications in bioinformatics.

Examples:
LCS for input Sequences “ABCDGH” and “AEDFHR” is “ADH” of length 3.
LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4.

The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. This solution is exponential in term of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming (DP) Problem.

• Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively. And let L(X[0..m-1], Y[0..n-1]) be the length of LCS of the two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).

If last characters of both sequences match (or X[m-1] == Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])

If last characters of both sequences do not match (or X[m-1] != Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2])

Examples:

• Consider the input strings “AGGTAB” and “GXTXAYB”. Last characters match for the strings. So length of LCS can be written as:
L(“AGGTAB”, “GXTXAYB”) = 1 + L(“AGGTA”, “GXTXAY”)

• Consider the input strings “ABCDGH” and “AEDFHR. Last characters do not match for the strings. So length of LCS can be written as:
L(“ABCDGH”, “AEDFHR”) = MAX ( L(“ABCDG”, “AEDFHR”), L(“ABCDGH”, “AEDFH”) )

So the LCS problem has optimal substructure property as the main problem can be solved using solutions to subproblems.

• Overlapping Subproblems:

Following is simple recursive implementation of the LCS problem. The implementation simply follows the recursive structure mentioned above.

Java Programming

[pastacode lang=”java” manual=”%2F*%20A%20Naive%20recursive%20implementation%20of%20LCS%20problem%20in%20java*%2F%0Apublic%20class%20LongestCommonSubsequence%0A%7B%0A%20%0A%20%20%2F*%20Returns%20length%20of%20LCS%20for%20X%5B0..m-1%5D%2C%20Y%5B0..n-1%5D%20*%2F%0A%20%20int%20lcs(%20char%5B%5D%20X%2C%20char%5B%5D%20Y%2C%20int%20m%2C%20int%20n%20)%0A%20%20%7B%0A%20%20%20%20if%20(m%20%3D%3D%200%20%7C%7C%20n%20%3D%3D%200)%0A%20%20%20%20%20%20return%200%3B%0A%20%20%20%20if%20(X%5Bm-1%5D%20%3D%3D%20Y%5Bn-1%5D)%0A%20%20%20%20%20%20return%201%20%2B%20lcs(X%2C%20Y%2C%20m-1%2C%20n-1)%3B%0A%20%20%20%20else%0A%20%20%20%20%20%20return%20max(lcs(X%2C%20Y%2C%20m%2C%20n-1)%2C%20lcs(X%2C%20Y%2C%20m-1%2C%20n))%3B%0A%20%20%7D%0A%20%0A%20%20%2F*%20Utility%20function%20to%20get%20max%20of%202%20integers%20*%2F%0A%20%20int%20max(int%20a%2C%20int%20b)%0A%20%20%7B%0A%20%20%20%20return%20(a%20%3E%20b)%3F%20a%20%3A%20b%3B%0A%20%20%7D%0A%20%0A%20%20public%20static%20void%20main(String%5B%5D%20args)%0A%20%20%7B%0A%20%20%20%20LongestCommonSubsequence%20lcs%20%3D%20new%20LongestCommonSubsequence()%3B%0A%20%20%20%20String%20s1%20%3D%20%22AGGTAB%22%3B%0A%20%20%20%20String%20s2%20%3D%20%22GXTXAYB%22%3B%0A%20%0A%20%20%20%20char%5B%5D%20X%3Ds1.toCharArray()%3B%0A%20%20%20%20char%5B%5D%20Y%3Ds2.toCharArray()%3B%0A%20%20%20%20int%20m%20%3D%20X.length%3B%0A%20%20%20%20int%20n%20%3D%20Y.length%3B%0A%20%0A%20%20%20%20System.out.println(%22Length%20of%20LCS%20is%22%20%2B%20%22%20%22%20%2B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20lcs.lcs(%20X%2C%20Y%2C%20m%2C%20n%20)%20)%3B%0A%20%20%7D%0A%20%0A%7D” message=”Java” highlight=”” provider=”manual”/]

Output :

`Length of LCS is 4`

Time complexity of the above naive recursive approach is O(2^n) in worst case and worst case happens when all characters of X and Y mismatch i.e., length of LCS is 0.
Considering the above implementation, following is a partial recursion tree for input strings “AXYT” and “AYZX”

```                         lcs("AXYT", "AYZX")
/                 \
lcs("AXY", "AYZX")            lcs("AXYT", "AYZ")
/            \                  /               \
lcs("AX", "AYZX") lcs("AXY", "AYZ")   lcs("AXY", "AYZ") lcs("AXYT", "AY")```

In the above partial recursion tree, lcs(“AXY”, “AYZ”) is being solved twice. If we draw the complete recursion tree, then we can see that there are many subproblems which are solved again and again. So this problem has Overlapping Substructure property and recomputation of same subproblems can be avoided by either using Memoization or Tabulation. Following is a tabulated implementation for the LCS problem.

Java Programming

[pastacode lang=”java” manual=”%2F*%20Dynamic%20Programming%20Java%20implementation%20of%20LCS%20problem%20*%2F%0Apublic%20class%20LongestCommonSubsequence%0A%7B%0A%20%0A%20%20%2F*%20Returns%20length%20of%20LCS%20for%20X%5B0..m-1%5D%2C%20Y%5B0..n-1%5D%20*%2F%0A%20%20int%20lcs(%20char%5B%5D%20X%2C%20char%5B%5D%20Y%2C%20int%20m%2C%20int%20n%20)%0A%20%20%7B%0A%20%20%20%20int%20L%5B%5D%5B%5D%20%3D%20new%20int%5Bm%2B1%5D%5Bn%2B1%5D%3B%0A%20%0A%20%20%20%20%2F*%20Following%20steps%20build%20L%5Bm%2B1%5D%5Bn%2B1%5D%20in%20bottom%20up%20fashion.%20Note%0A%20%20%20%20%20%20%20%20%20that%20L%5Bi%5D%5Bj%5D%20contains%20length%20of%20LCS%20of%20X%5B0..i-1%5D%20and%20Y%5B0..j-1%5D%20*%2F%0A%20%20%20%20for%20(int%20i%3D0%3B%20i%3C%3Dm%3B%20i%2B%2B)%0A%20%20%20%20%7B%0A%20%20%20%20%20%20for%20(int%20j%3D0%3B%20j%3C%3Dn%3B%20j%2B%2B)%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20if%20(i%20%3D%3D%200%20%7C%7C%20j%20%3D%3D%200)%0A%20%20%20%20%20%20%20%20%20%20%20%20L%5Bi%5D%5Bj%5D%20%3D%200%3B%0A%20%20%20%20%20%20%20%20else%20if%20(X%5Bi-1%5D%20%3D%3D%20Y%5Bj-1%5D)%0A%20%20%20%20%20%20%20%20%20%20%20%20L%5Bi%5D%5Bj%5D%20%3D%20L%5Bi-1%5D%5Bj-1%5D%20%2B%201%3B%0A%20%20%20%20%20%20%20%20else%0A%20%20%20%20%20%20%20%20%20%20%20%20L%5Bi%5D%5Bj%5D%20%3D%20max(L%5Bi-1%5D%5Bj%5D%2C%20L%5Bi%5D%5Bj-1%5D)%3B%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20return%20L%5Bm%5D%5Bn%5D%3B%0A%20%20%7D%0A%20%0A%20%20%2F*%20Utility%20function%20to%20get%20max%20of%202%20integers%20*%2F%0A%20%20int%20max(int%20a%2C%20int%20b)%0A%20%20%7B%0A%20%20%20%20return%20(a%20%3E%20b)%3F%20a%20%3A%20b%3B%0A%20%20%7D%0A%20%0A%20%20public%20static%20void%20main(String%5B%5D%20args)%0A%20%20%7B%0A%20%20%20%20LongestCommonSubsequence%20lcs%20%3D%20new%20LongestCommonSubsequence()%3B%0A%20%20%20%20String%20s1%20%3D%20%22AGGTAB%22%3B%0A%20%20%20%20String%20s2%20%3D%20%22GXTXAYB%22%3B%0A%20%0A%20%20%20%20char%5B%5D%20X%3Ds1.toCharArray()%3B%0A%20%20%20%20char%5B%5D%20Y%3Ds2.toCharArray()%3B%0A%20%20%20%20int%20m%20%3D%20X.length%3B%0A%20%20%20%20int%20n%20%3D%20Y.length%3B%0A%20%0A%20%20%20%20System.out.println(%22Length%20of%20LCS%20is%22%20%2B%20%22%20%22%20%2B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20lcs.lcs(%20X%2C%20Y%2C%20m%2C%20n%20)%20)%3B%0A%20%20%7D%0A%20%0A%7D” message=”Java” highlight=”” provider=”manual”/]

Time Complexity of the above implementation is O(mn) which is much better than the worst case time complexity of Naive Recursive implementation.

The above algorithm/code returns only length of LCS.

[ Solution 1 Answers ] Class Is Public, Should Be Declared In A File Named .Java

PROBLEM: When the class name and the filename of a given Java program doesn’t match then we will get this error. Considering an example where the file is named as…

Web browser project in c, c program to open a website/url

Web browser project in c, c program to open a website/url – c programming – It will launch Firefox web browser to open a website so it should be installed.

Tips to Get Your Car Repaired After An Accident

Car accidents are an event that happens all over the country with regularity. It seems like the wheels of fate was unfavorable to you … and your car was involved…

C Programming – Subset Sum Problem

C Programming – Subset Sum Problem – Dynamic Programming Given a set of non-negative integers, and a value sum, determine if there is a subset

Top 10 Grey Hat Hackers 2018

Grey hat hackers that do hacking bit the ethical and non-ethical. These are the persons who knows extraordinary hacking traps

Java Programming – 0-1 Knapsack Problem

Java Programming – 0-1 Knapsack Problem – Dynamic Programming simple solution is to consider all subsets of items and calculate the total weight and value