Introduction
The Longest Common Subsequence (LCS) is a classic problem in computer science, often used to illustrate dynamic programming techniques. Given two sequences, the task is to find the length of their longest subsequence that appears in both sequences in the same order, though not necessarily consecutively. This reading material will explore the LCS problem using various approaches: Recursion, Memoization, and Tabulation, along with their time and space complexities.
Problem Description
Given two sequences, X and Y, the LCS problem seeks to determine the longest sequence that can be derived from both X and Y by deleting some or no elements without changing the order of the remaining elements.
Example:
Consider two sequences:
The longest common subsequence is BCAB, which appears in both sequences in the same order.
Recursive Approach
Concept
The recursive solution to the LCS problem is based on the idea of solving smaller subproblems. If the characters at the current indices of both sequences match, they are part of the LCS. Otherwise, the solution is the maximum LCS obtainable by either skipping the current character of the first sequence or the second sequence.
Recursive Pseudo Code
function LCS(X, Y, i, j):
if i == 0 or j == 0:
return 0
if X[i-1] == Y[j-1]:
return 1 + LCS(X, Y, i-1, j-1)
else:
return max(LCS(X, Y, i-1, j), LCS(X, Y, i, j-1))
Step-by-Step Process
Base Case: If either sequence is exhausted (i == 0 or j == 0), the LCS is zero. Match Case: If the current characters of X and Y match (X[i-1] == Y[j-1]), the LCS length is 1 plus the LCS of the remaining sequences. Mismatch Case: If the characters do not match, the solution is the maximum LCS obtained by either excluding the current character of X or Y. Time and Space Complexity
Time Complexity: , where M and N are the lengths of X and Y. This exponential complexity arises from the recursive calls. Space Complexity: O(M + N) due to the call stack.
Memoization Approach
Concept
Memoization stores the results of subproblems to avoid redundant computations. By maintaining a 2D array (memo table), the results of subproblems are saved and reused.
Memoization Pseudo Code
function LCS(X, Y, i, j, memo):
if i == 0 or j == 0:
return 0
if memo[i][j] != -1:
return memo[i][j]
if X[i-1] == Y[j-1]:
memo[i][j] = 1 + LCS(X, Y, i-1, j-1, memo)
else:
memo[i][j] = max(LCS(X, Y, i-1, j), LCS(X, Y, i, j-1))
return memo[i][j]
Step-by-Step Process
Initialize Memo Table: Create a 2D array memo initialized with -1. Base Case: If either sequence is exhausted, return 0. Check Memo Table: If the subproblem has been solved, return the stored result. Match Case: If characters match, store the result as 1 plus the LCS of the remaining sequences. Mismatch Case: If characters do not match, store the result as the maximum LCS by excluding characters. Time and Space Complexity
Time Complexity: O(M * N), since each subproblem is solved only once. Space Complexity: O(M * N) for the memo table, plus O(M + N) for the call stack.
Tabulation Approach
Concept
Tabulation involves filling a table iteratively to solve the problem. This approach eliminates the recursive stack space by using a bottom-up strategy.
Tabulation Pseudo Code
function LCS(X, Y):
m = length(X)
n = length(Y)
dp = array of size (m+1, n+1)
for i from 0 to m:
for j from 0 to n:
if i == 0 or j == 0:
dp[i][j] = 0
else if X[i-1] == Y[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
return dp[m][n]
Step-by-Step Process
Initialize DP Table: Create a 2D array dp of size (m+1, n+1) initialized to zero. If i == 0 or j == 0, set dp[i][j] to 0. If X[i-1] == Y[j-1], set dp[i][j] to 1 plus dp[i-1][j-1]. If characters do not match, set dp[i][j] to the maximum of dp[i-1][j] and dp[i][j-1]. Result: The value at dp[m][n] gives the length of the LCS. Time and Space Complexity
Time Complexity: O(M* N), as each cell in the table is filled once. Space Complexity: O(M * N) for the table.
Example
Consider X = "ABCBDAB" and Y = "BDCABC". The length of the LCS is 4, and the LCS could be "BCAB" or "BDAB".
Recursive Trace
Recursive calls break down the problem until base cases are reached. Combine results using the match and mismatch cases. Memoization Table
Initialize memo with dimensions 8 x 7 filled with -1. Fill memo during recursive calls. Tabulation Table
Conclusion
The LCS problem can be efficiently solved using dynamic programming. Starting with a recursive solution helps understand the problem, which can be optimized using memoization and further improved with tabulation to eliminate recursive stack space. These approaches provide a solid foundation for solving many other problems in dynamic programming.