Explore

Longest Common Subsequence (LCS)

Introduction

The Longest Common Subsequence (LCS) is a classic problem in computer science, often used to illustrate dynamic programming techniques. Given two sequences, the task is to find the length of their longest subsequence that appears in both sequences in the same order, though not necessarily consecutively. This reading material will explore the LCS problem using various approaches: Recursion, Memoization, and Tabulation, along with their time and space complexities.

Problem Description

Given two sequences, X and Y, the LCS problem seeks to determine the longest sequence that can be derived from both X and Y by deleting some or no elements without changing the order of the remaining elements.

Example:

Consider two sequences:

X = ABCBDAB

Y = BDCABC

The longest common subsequence is BCAB, which appears in both sequences in the same order.

Recursive Approach

Concept

The recursive solution to the LCS problem is based on the idea of solving smaller subproblems. If the characters at the current indices of both sequences match, they are part of the LCS. Otherwise, the solution is the maximum LCS obtainable by either skipping the current character of the first sequence or the second sequence.

Recursive Pseudo Code

function LCS(X, Y, i, j):

if i == 0 or j == 0:

return 0

if X[i-1] == Y[j-1]:

return 1 + LCS(X, Y, i-1, j-1)

else:

return max(LCS(X, Y, i-1, j), LCS(X, Y, i, j-1))

Step-by-Step Process

Base Case: If either sequence is exhausted (i == 0 or j == 0), the LCS is zero.

Match Case: If the current characters of X and Y match (X[i-1] == Y[j-1]), the LCS length is 1 plus the LCS of the remaining sequences.

Mismatch Case: If the characters do not match, the solution is the maximum LCS obtained by either excluding the current character of X or Y.

Time and Space Complexity

Time Complexity: , where M and N are the lengths of X and Y. This exponential complexity arises from the recursive calls.

Space Complexity: O(M + N) due to the call stack.

Memoization Approach

Concept

Memoization stores the results of subproblems to avoid redundant computations. By maintaining a 2D array (memo table), the results of subproblems are saved and reused.

Memoization Pseudo Code

function LCS(X, Y, i, j, memo):

if i == 0 or j == 0:

return 0

if memo[i][j] != -1:

return memo[i][j]

if X[i-1] == Y[j-1]:

memo[i][j] = 1 + LCS(X, Y, i-1, j-1, memo)

else:

memo[i][j] = max(LCS(X, Y, i-1, j), LCS(X, Y, i, j-1))

return memo[i][j]

Step-by-Step Process

Initialize Memo Table: Create a 2D array memo initialized with -1.

Base Case: If either sequence is exhausted, return 0.

Check Memo Table: If the subproblem has been solved, return the stored result.

Match Case: If characters match, store the result as 1 plus the LCS of the remaining sequences.

Mismatch Case: If characters do not match, store the result as the maximum LCS by excluding characters.

Time and Space Complexity

Time Complexity: O(M * N), since each subproblem is solved only once.

Space Complexity: O(M * N) for the memo table, plus O(M + N) for the call stack.

Tabulation Approach

Concept

Tabulation involves filling a table iteratively to solve the problem. This approach eliminates the recursive stack space by using a bottom-up strategy.

Tabulation Pseudo Code

function LCS(X, Y):

m = length(X)

n = length(Y)

dp = array of size (m+1, n+1)

for i from 0 to m:

for j from 0 to n:

if i == 0 or j == 0:

dp[i][j] = 0

else if X[i-1] == Y[j-1]:

dp[i][j] = 1 + dp[i-1][j-1]

else:

dp[i][j] = max(dp[i-1][j], dp[i][j-1])

return dp[m][n]

Step-by-Step Process

Initialize DP Table: Create a 2D array dp of size (m+1, n+1) initialized to zero.

Fill DP Table:

If i == 0 or j == 0, set dp[i][j] to 0.

If X[i-1] == Y[j-1], set dp[i][j] to 1 plus dp[i-1][j-1].

If characters do not match, set dp[i][j] to the maximum of dp[i-1][j] and dp[i][j-1].

Result: The value at dp[m][n] gives the length of the LCS.

Time and Space Complexity

Time Complexity: O(M* N), as each cell in the table is filled once.

Space Complexity: O(M * N) for the table.

Example

Consider X = "ABCBDAB" and Y = "BDCABC". The length of the LCS is 4, and the LCS could be "BCAB" or "BDAB".

Recursive Trace

Call LCS(X, Y, 7, 6)

Recursive calls break down the problem until base cases are reached.

Combine results using the match and mismatch cases.

Memoization Table

Initialize memo with dimensions 8 x 7 filled with -1.

Fill memo during recursive calls.

Tabulation Table

Table 3

Table 3

B 5

C 6

Column 8

There are no rows in this table

⁠

Conclusion

The LCS problem can be efficiently solved using dynamic programming. Starting with a recursive solution helps understand the problem, which can be optimized using memoization and further improved with tabulation to eliminate recursive stack space. These approaches provide a solid foundation for solving many other problems in dynamic programming.

Want to print your doc?
This is not the way.

Try clicking the ··· in the right corner or using a keyboard shortcut (

CtrlP

) instead.