DP works on an idea similar to proving algorithm correctness with an invariant: we state a property and show that it holds over the course of the execution.

In DP, we have an invariant that shows us how to calculate the solution from the solutions of smaller problems. This allows us to then write recursive or bottom-up algorithms that find solutions to our overall problem.

Memoization and Bottom-Up

Naive implementation of Fibonacci:

def fib(n):
	if n <= 2: return 1 # recursion stop
	
	return fib(n-1) + fib(n-2)

This implementation is correct, but it doesn't run very fast. We have seen in the exercises that it has exponential runtime: the number of recursive calls grows like $\Theta(\varphi^n)$, where $\varphi \approx 1.618$ is the golden ratio.

There are two approaches that allow us to do better!

Memoization

Memoization uses the storage of intermediate results to prevent us from having to re-calculate them at every turn.

We store an array memo[1..n] which contains the results of the computation for each index, initialised to -1 for "not yet computed". This can then be used to do the following:

memo = [-1] * (n + 1) # -1 marks entries not yet computed
memo[1] = 1
memo[2] = 1

def fib(n):
	if memo[n] != -1: return memo[n] # already computed, just look it up
	memo[n] = fib(n-1) + fib(n-2)    # compute once and store
	return memo[n]

This will massively speed up computation, to $O(n)$, as each value now only has to be computed once. Accessing array elements here is $O(1)$.

Bottom-Up Calculation

We can also go the other way around, computing the values in increasing order.

F = [0] * (n + 1) # F[i] will hold the i-th Fibonacci number
F[1] = 1
F[2] = 1
for i in range(3, n + 1):
	F[i] = F[i-1] + F[i-2]

Bottom-Up vs. Memoization

Recursion:

  • Memoization is often easier to implement using recursion
  • Recursion is easier to read
  • We don't need to explicitly think about the order in which we compute the values

Bottom-Up:

  • more efficient, as there is no recursion overhead (and no stack limits to hit)
  • memory optimisations are possible, e.g. by keeping only one row of a DP-table (see the sketch below)
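
As an illustration of such a memory optimisation, here is a minimal bottom-up Fibonacci sketch that keeps only the last two values instead of the whole table (the function name is mine):

def fib_iter(n):
	a, b = 1, 1 # invariant: a = F[i-1], b = F[i]
	for _ in range(n - 2):
		a, b = b, a + b # shift the two-element window forward
	return b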

Core of DP

We need to find a fitting subproblem that is easier to solve. This subproblem will often call upon earlier results recursively. We can then solve these problems and construct the solution from them.

We need to explicitly write a recursive formulation of the problem, which requires us to think about base cases!

We then need to think about how to compute the solutions to the subproblems (in which order, and whether recursively or bottom-up), and define how to extract the final solution.

Finally, we can calculate the runtime, for which the size of the DP-table is very useful: usually each entry can be calculated in constant time, which means the total runtime is proportional to the size of the table.

Maximum Subarray Sum

Subarray vs. Subsequence vs. Subset

There is an important distinction to be made between the different sub-… problems:

  • A subarray is a contiguous slice of the original input array.
  • A subsequence is a (not necessarily contiguous) selection of elements of the original input array that preserves their order.
  • A subset is any selection of elements of the original array, with no order constraint.

Essentials

  • Runtime: $O(n)$
  • DP-Table DP[1..n]

Recursion: $R[j] = \max(A[j],\, R[j-1] + A[j])$. Base case: $R[1] = A[1]$. Algorithm (0-based in the code):

def MSS(A): # A[0..n-1]
	R = A[0]  # maximum subarray sum ending at the current index
	RM = A[0] # maximum subarray sum seen so far
	for j in range(1, len(A)):
		R = max(A[j], R + A[j])
		RM = max(RM, R)
	return RM # if the empty subarray is allowed, return max(RM, 0)

Description

We want to find the subarray that maximises the sum in an array of integers. Formally, we want to find indices $i$ and $j$ with $1 \le i \le j \le n$ such that $\sum_{k=i}^{j} A[k]$ is maximal, where the empty subarray (with sum 0) is also valid.

Subproblem: We define the Randmax $R[j]$ as the maximum subarray sum ending at index $j$. As this subarray either contains only $A[j]$ or extends a maximum subarray ending at $j-1$, we can define $R[j] = \max(A[j],\, R[j-1] + A[j])$. The base case is $R[1] = A[1]$.

To extract the solution, we take the maximum entry of our DP-table, or 0 if all entries are negative (the empty subarray).

As calculating each cell of the DP-table takes $O(1)$ time (it's only a single comparison and array access), the final runtime is $O(n)$.
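
The formal statement asks for the indices $i$ and $j$ as well; here is a minimal sketch extending the algorithm to track them (the extra variables and function name are mine):

def MSS_indices(A):
	R, start = A[0], 0      # best sum ending at j, and where that run starts
	RM, best = A[0], (0, 0) # best sum overall, and its (i, j) index pair
	for j in range(1, len(A)):
		if A[j] > R + A[j]: # starting fresh at j beats extending the run
			R, start = A[j], j
		else:
			R = R + A[j]
		if R > RM:
			RM, best = R, (start, j)
	return RM, best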

Jump Game

Essentials

  • Runtime: $O(n)$ for the optimised version ($O(n^2)$ for the naive approach).
  • DP-Table DP[0..k] or DP[1..n].

Recursion: $M[k] = \max\{\, i + A[i] : M[k-2] < i \le M[k-1] \,\}$. Base cases: $M[0] = 0$, $M[1] = A[0]$ (0-based positions, as in the code). Algorithm:

def MinJumps(A): # A[0..n-1]; start at position 0, target position n-1
	n = len(A)
	dp = [0] * (n + 1) # dp[k] = furthest position reachable with k jumps
	dp[1] = A[0]

	k = 1
	while dp[k] < n - 1:
		k = k + 1
		dp[k] = -float('inf')
		# positions up to dp[k-2] are already covered with k-1 jumps,
		# so only i in (dp[k-2], dp[k-1]] can improve the k-jump reach
		for i in range(dp[k-2] + 1, dp[k-1] + 1):
			dp[k] = max(dp[k], i + A[i])
	return k

Description

We need to find the minimal number of jumps needed to get from the beginning of the array to the end. From position $i$ we can jump to any of the positions $i+1, \dots, i+A[i]$. We start at position $0$ and want to reach position $n-1$. All numbers in the array are natural numbers $\ge 1$.

We can therefore always jump at least one position forward, thus we need at most $n-1$ jumps.

Two different approaches

We define our subproblem as $DP[i]$ = the minimal number of jumps needed to reach position $i$. The final solution is then simply $DP[n-1]$. The recursive equation is $DP[i] = 1 + \min\{\, DP[j] : j < i \text{ and } j + A[j] \ge i \,\}$: we find the cell in the array that leads to our current position with the fewest jumps.
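
A minimal sketch of this first, quadratic approach (0-based positions; the function name is mine):

def MinJumpsNaive(A):
	n = len(A)
	INF = float('inf')
	dp = [INF] * n # dp[i] = minimal number of jumps to reach position i
	dp[0] = 0
	for i in range(1, n):
		for j in range(i):
			if j + A[j] >= i: # a jump from j can reach i
				dp[i] = min(dp[i], dp[j] + 1)
	return dp[n-1]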

We can also use a different approach by switching our variables. This trick often works in DP problems (it appears in Knapsack as well).

Instead of thinking in array cells, we think in the cells we can reach with $k$ jumps. We switch index and value, making our DP-table $M[k]$ = the furthest position reachable with $k$ jumps. The solution is then the smallest $k$ for which $M[k] \ge n-1$.

We get the following recursive equation: $M[k] = \max\{\, i + A[i] : 0 \le i \le M[k-1] \,\}$, with $M[0] = 0$. The positions reachable with $k-1$ jumps are $0, \dots, M[k-1]$, thus we search for the maximum position reachable from those by looking for the highest $i + A[i]$.

We can further optimise this by seeing that we only need to look at $i > M[k-2]$, as we can already reach all positions up to $M[k-2]$ with $k-1$ jumps. Thus our recursion becomes $M[k] = \max\{\, i + A[i] : M[k-2] < i \le M[k-1] \,\}$, with base cases $M[0] = 0$ and $M[1] = A[0]$. As each position $i$ now only contributes to a single maximum, we have a runtime of $O(n)$: once we have passed a position, later steps will never look below that threshold again.

Lower bound: We can see that $\Theta(n)$ is optimal, as we have to look at each element at least once. Otherwise, an adversary could place a huge value exactly at the one element we did not look at, and we would not find the optimal number of jumps.

Longest Common Subsequence (Längste Gemeinsame Teilfolge)

Essentials

  • Runtime: $O(n \cdot m)$ (improvements are possible in special cases)
    • For example, if all characters are distinct, the problem reduces to a longest ascending subsequence and $O(n \log n)$ is reachable
  • DP-Table: DP[0..m][0..n] for strings of lengths $m$ and $n$

Recursion: $L[i,j] = L[i-1,j-1] + 1$ if $X[i] = Y[j]$, otherwise $L[i,j] = \max(L[i-1,j],\, L[i,j-1])$. Base cases: $L[i,0] = L[0,j] = 0$.

Algorithm

def LCS(X, Y): # X of length m, Y of length n (0-based in Python)
	m, n = len(X), len(Y)
	# L[i][j] = length of an LCS of the prefixes X[1..i] and Y[1..j]
	L = [[0] * (n + 1) for _ in range(m + 1)]
	for i in range(1, m + 1):
		for j in range(1, n + 1):
			if X[i-1] == Y[j-1]:
				L[i][j] = L[i-1][j-1] + 1
			else:
				L[i][j] = max(L[i-1][j], L[i][j-1])
	return L[m][n]

Description

We want to find the longest common subsequence (LGT, from the German Längste Gemeinsame Teilfolge) that two strings share. For example, TIGER and ZIEGE share IGE as an LGT.

We define $L[i,j]$ as the length of an LGT of the prefixes $X[1..i]$ and $Y[1..j]$. The length of the LGT of the full strings is then simply $L[m,n]$.

We compute the LGT by distinguishing three cases:

  • $i = 0$ or $j = 0$: $L[i,j] = 0$, as for an empty prefix the LGT is empty.
  • $i, j \ge 1$ and $X[i] = Y[j]$:
    • In this case we can take $X[i] = Y[j]$ as the last element of our LGT. The rest of the LGT must then be an LGT of $X[1..i-1]$ and $Y[1..j-1]$, so $L[i,j] = L[i-1,j-1] + 1$.
    • This choice of last element is as good as any other: it leaves the most room for the LGT of the previous elements.
  • $i, j \ge 1$ and $X[i] \ne Y[j]$:
    • In this case $X[i]$ and $Y[j]$ cannot both be the last element of the LGT, thus it is either an LGT of $X[1..i-1]$ and $Y[1..j]$ OR of $X[1..i]$ and $Y[1..j-1]$: $L[i,j] = \max(L[i-1,j],\, L[i,j-1])$.

This gives us the recursion above. We can calculate the values bottom-up, going row-wise from left to right (or column-wise from top to bottom); then we never need an entry that has not yet been calculated.

Backtracking

We can find the actual LGT itself by using backtracking on the DP-table.

If we took a diagonal step, it was because the characters were equal, so that character is part of the LGT. If we went horizontally or vertically, the corresponding character is not part of it.
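
A minimal sketch of this backtracking, assuming LCS above is changed to return the whole table L instead of just L[m][n] (the function name is mine):

def LCS_backtrack(X, Y, L): # L: the filled DP-table
	i, j = len(X), len(Y)
	result = []
	while i > 0 and j > 0:
		if X[i-1] == Y[j-1]:         # diagonal step: character is in the LGT
			result.append(X[i-1])
			i, j = i - 1, j - 1
		elif L[i-1][j] >= L[i][j-1]: # vertical step
			i -= 1
		else:                        # horizontal step
			j -= 1
	return ''.join(reversed(result))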

Editing Distance

Essentials

  • Runtime: $O(n \cdot m)$
  • DP-Table: D[0..m][0..n]

Recursion: $D[i,j] = \min\big(D[i-1,j] + 1,\; D[i,j-1] + 1,\; D[i-1,j-1] + \text{cost}\big)$ with cost 0 if $X[i] = Y[j]$ and 1 otherwise. Base cases: $D[i,0] = i$, $D[0,j] = j$. Algorithm:

def EditDistance(X, Y): # X of length m, Y of length n (0-based in Python)
	m, n = len(X), len(Y)
	# D[i][j] = edit distance between the prefixes X[1..i] and Y[1..j]
	D = [[0] * (n + 1) for _ in range(m + 1)]
	for i in range(m + 1): D[i][0] = i # delete all i characters
	for j in range(n + 1): D[0][j] = j # insert all j characters
	for i in range(1, m + 1):
		for j in range(1, n + 1):
			cost = 0 if X[i-1] == Y[j-1] else 1
			D[i][j] = min(
				D[i-1][j] + 1,      # delete
				D[i][j-1] + 1,      # insert
				D[i-1][j-1] + cost  # replace (or keep)
			)
	return D[m][n]

Description

The editing distance of two strings is the minimum number of edits (insert, delete, replace) we need to perform in order to transform one into the other. We can use the same DP-table approach as for the LGT to calculate the editing distance.

The editing distance from TIGER to ZIEGE is 3: we replace T by Z, then remove R and insert an E. There are of course other ways to do this, but none can be shorter.

We write $A = A'a$ and $B = B'b$ (so $a$ and $b$ are the last characters) and track $a$ through the process of finding the $ED(A, B)$ of the two strings:

  • $a$ is deleted at some point, thus $ED(A, B) = 1 + ED(A', B)$, i.e. we search for the ED between the shortened string and the same $B$.
    • A crucial insight is that if a character is deleted, it doesn't matter at which point of the process this is done.
  • $a$ is not deleted and ends up somewhere inside $B'$.
    • In this case no character of $A$ can end up behind $a$ (it would cost an extra operation to delete and re-insert it), so the final character $b$ must be inserted behind $a$ at some point, thus we have $ED(A, B) = 1 + ED(A, B')$.
  • $a$ is not deleted and ends up at the last position of $B$.
    • In this case we can't insert any other character behind $a$, so $a$ must be turned into $b$: $ED(A, B) = ED(A', B')$ if $a = b$, otherwise $ED(A', B') + 1$.

We can calculate each entry of the DP-Table in constant time $O(1)$, thus the total runtime is $O(n \cdot m)$.

Backtracking

We can again use the DP-Table to find the edits performed, walking back from $D[m,n]$: a diagonal step is a replace (or a free match if the cost was 0), a vertical step a delete, and a horizontal step an insert.
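
A minimal sketch of this reconstruction, assuming EditDistance above returns the whole table D (the function name and op strings are my own formatting):

def EditDistance_backtrack(X, Y, D): # D: the filled DP-table
	i, j = len(X), len(Y)
	ops = []
	while i > 0 or j > 0:
		if i > 0 and j > 0 and D[i][j] == D[i-1][j-1] + (X[i-1] != Y[j-1]):
			if X[i-1] != Y[j-1]:
				ops.append("replace " + X[i-1] + " -> " + Y[j-1])
			i, j = i - 1, j - 1 # diagonal: match or replace
		elif i > 0 and D[i][j] == D[i-1][j] + 1:
			ops.append("delete " + X[i-1])
			i -= 1 # vertical: delete
		else:
			ops.append("insert " + Y[j-1])
			j -= 1 # horizontal: insert
	return list(reversed(ops))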

Subset Sum (Teilsummenproblem)

Essentials

  • Runtime: $O(n \cdot b)$
  • DP-Table: DP[0..n][0..b]

Recursion: $DP[i][s] = DP[i-1][s] \lor DP[i-1][s - a_i]$ (the second option only if $s \ge a_i$). Base cases: $DP[i][0] = \text{true}$, $DP[0][s] = \text{false}$ for $s \ge 1$.

Description

Given numbers $A = \{a_1, \dots, a_n\}$ and a target $b$, we want to find a subset $S \subseteq A$ such that $\sum_{a \in S} a = b$. Such a subset need not exist for every $b$, obviously.

There is a special version of this problem called the partition problem (Partitionsproblem), which asks whether we can divide the numbers of $A$ into two subsets with the same sum.

Our sub-problem is $DP[i][s]$ = true iff some subset of $\{a_1, \dots, a_i\}$ sums to $s$. To recursively calculate this value, we observe that either $a_i$ is not in the subset and we look for a subset of the first $i-1$ elements summing to $s$, or $a_i$ is in it and the remaining elements must sum to $s - a_i$: $DP[i][s] = DP[i-1][s] \lor DP[i-1][s - a_i]$. $DP[i][0] = \text{true}$ for all $i$, as we can use the empty set, and $DP[0][s] = \text{false}$ for all $s \ge 1$, as we can't use any elements. We calculate in ascending order of $i$.
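
This section has no algorithm block, so here is a minimal bottom-up sketch under the definitions above (0-based Python list a; the function name is mine):

def SubsetSum(a, b): # a: positive integers, b: target sum
	n = len(a)
	# DP[i][s] = True iff some subset of the first i elements sums to s
	DP = [[False] * (b + 1) for _ in range(n + 1)]
	for i in range(n + 1):
		DP[i][0] = True # the empty set sums to 0
	for i in range(1, n + 1):
		for s in range(1, b + 1):
			DP[i][s] = DP[i-1][s] # don't use a[i-1]
			if s >= a[i-1]:       # guard: stay inside the table
				DP[i][s] = DP[i][s] or DP[i-1][s - a[i-1]]
	return DP[n][b]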

To find the subset, we include every $a_i$ where the backtracking takes a diagonal step in the DP-table (i.e. $DP[i][s]$ is true because of $DP[i-1][s - a_i]$), as that is the case where we take $a_i$.

This problem runs in pseudo-polynomial runtime.

Pseudo-Polynomial Runtime

We have a runtime of $O(n \cdot b)$. But while $n$ is the length of the array, $b$ is the value of a user input. This means that $b$ can be extremely large while looking like a polynomial factor in our model. If we chose $b = 2^n$ for example, our runtime would be $O(n \cdot 2^n)$, which is exponential. On the other hand, if $b$ is polynomial in $n$, e.g. $b \in O(n^2)$, then our total runtime is also polynomial.

Knapsack Problem (Rucksackproblem)

Essentials

  • Runtime: $O(n \cdot W)$, or $O(n \cdot P)$ with $P = \sum_i v_i$ for the alternative variant below.

Recursion: $dp[i][c] = \max\big(dp[i-1][c],\; dp[i-1][c - w_i] + v_i\big)$ if $w_i \le c$, otherwise $dp[i][c] = dp[i-1][c]$. Base cases: $dp[0][c] = 0$ and $dp[i][0] = 0$. Algorithm:

def Knapsack(v, w, W): # v[i], w[i]: profit and weight of item i (0-based); W: weight limit
	n = len(v)
	# dp[i][cap] = max profit using the first i items with capacity cap
	dp = [[0] * (W + 1) for _ in range(n + 1)]
	for i in range(1, n + 1):
		for cap in range(1, W + 1):
			if w[i-1] <= cap:
				dp[i][cap] = max(
					dp[i-1][cap],                   # don't take item i
					dp[i-1][cap - w[i-1]] + v[i-1]  # take item i
				)
			else:
				dp[i][cap] = dp[i-1][cap]
	return dp[n][W]

Description

The knapsack problem asks which items, each with a weight $w_i$ and a profit $v_i$, we should take under a weight limit $W$ to maximise the total profit. This is a subset problem.

A greedy algorithm (one that chooses a local optimum in the hope that it corresponds to the global optimum) which always takes the most profitable item (or the lightest one, or the one with the best profit/weight ratio) will fail, as we can always construct an unfavourable input. For example, with $W = 4$, one item $(w, v) = (3, 5)$ and two items $(2, 3)$: the ratio-greedy takes the first item for profit 5, while the optimum takes the other two for profit 6.

Our subproblem is: $dp[i][c]$ = the maximum profit achievable using only the first $i$ items under weight limit $c$. So we either can't use an item because it busts our limit, or we can use it and take the bigger profit between using it and not.

Backtracking works as usual: if we go diagonally, we took the item, otherwise not.

This algorithm is again pseudo-polynomial as our input is not the length of an array but a user entry.

Alternative: There is also an alternative DP solution obtained by variable switching: we look for the subset with minimal weight among those that give us profit exactly $p$. The solution is then the largest $p$ whose minimal weight is at most $W$, with runtime $O(n \cdot P)$ for $P = \sum_i v_i$. A sketch follows below.
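
A minimal sketch of this variable-switched variant (function and variable names are mine; the one-row table with descending p ensures each item is used at most once):

def KnapsackMinWeight(v, w, W):
	n, P = len(v), sum(v)
	INF = float('inf')
	g = [INF] * (P + 1) # g[p] = minimal weight of a subset with profit exactly p
	g[0] = 0
	for i in range(n):
		for p in range(P, v[i] - 1, -1): # descending: item i used at most once
			if g[p - v[i]] + w[i] < g[p]:
				g[p] = g[p - v[i]] + w[i]
	return max(p for p in range(P + 1) if g[p] <= W)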

Approximation for the Knapsack Problem

As we don’t expect to find an algorithm to solve the Knapsack problem in polynomial time, we can instead look into finding an algorithm that approximates the solution.

We want to find an algorithm that has polynomial runtime and returns a value close to the actual solution.

To do this, we round the profits down to multiples of $K$, i.e. $\tilde{v}_i = \lfloor v_i / K \rfloor \cdot K$, and solve the Knapsack problem for these rounded profits, where $K$ is the multiple that we round to. As we didn't change the weights and only the profits, our approximated solution is still a valid subset for the original problem.

Key Properties:

  • The rounded profits satisfy: $v_i - K < \tilde{v}_i \le v_i$
  • Since we only change profits (not weights), any valid solution for the rounded problem is also valid for the original problem
  • We can exclude items with $w_i > W$ beforehand (they won't fit anyway)

Performance Analysis

Let:

  • $OPT$ = optimal solution for the original problem
  • $OPT'$ = optimal solution for the rounded problem (what our algorithm computes)
  • $v_{\max}$ = maximum profit of any item

Key Inequalities:

  1. $v(OPT') \ge \tilde{v}(OPT') \ge \tilde{v}(OPT)$ (our solution is optimal for the rounded profits, so its rounded profit is at least that of $OPT$)
  2. $\tilde{v}(OPT) > v(OPT) - n \cdot K$ (each rounded profit loses less than $K$, and $OPT$ has at most $n$ items)

Approximation Quality: Through this chain of inequalities, we can show $v(OPT') > v(OPT) - n \cdot K$, where $v(\cdot)$ denotes the total profit of a solution. To make $n \cdot K$ small relative to $v(OPT)$, we introduce a parameter $\varepsilon > 0$ and set $K = \frac{\varepsilon \cdot v_{\max}}{n}$. This gives us $n \cdot K = \varepsilon \cdot v_{\max}$, and since $v(OPT) \ge v_{\max}$ (after excluding items that don't fit, the single best item is a valid solution), we get $n \cdot K \le \varepsilon \cdot v(OPT)$. Therefore $v(OPT') \ge (1 - \varepsilon) \cdot v(OPT)$. Result: The algorithm achieves a $(1 - \varepsilon)$-approximation: the profit of our solution is at least $(1 - \varepsilon)$ times the optimal profit.

Our new DP table now only has length $P / K$ instead of $P$, as after rounding we effectively only have to fill every $K$-th column of the profit table. We can choose an arbitrarily big $K$, trading off accuracy.

With $K = \varepsilon \cdot v_{\max} / n$:

  • Runtime: $O(n \cdot P / K)$
  • Since $P \le n \cdot v_{\max}$, this simplifies to: $O(n^2 \cdot v_{\max} / K) = O(n^3 / \varepsilon)$

Examples:

  • For $\varepsilon = 0.1$: We get a 90% approximation in time $O(10 \cdot n^3)$
  • For $\varepsilon = 1/n$: We get a $(1 - 1/n)$-approximation in time $O(n^4)$
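
Putting the pieces together, a minimal sketch of the whole scheme, reusing the hypothetical KnapsackMinWeight from above on scaled profits (it returns a scaled profit, so the result is a lower bound on the profit actually achieved):

def KnapsackFPTAS(v, w, W, eps):
	# exclude items that don't fit at all (needed for v(OPT) >= v_max)
	items = [(vi, wi) for vi, wi in zip(v, w) if wi <= W]
	if not items: return 0
	v2 = [vi for vi, wi in items]
	w2 = [wi for vi, wi in items]
	K = max(1, int(eps * max(v2) / len(items))) # rounding granularity
	v_scaled = [vi // K for vi in v2]           # rounded-down profits
	p = KnapsackMinWeight(v_scaled, w2, W)      # best achievable scaled profit
	return p * K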

Longest Ascending Subsequence (Längste Aufsteigende Teilfolge)

Essentials

  • Runtime: $O(n \log n)$

Base Case: tails = [a_1]; we initialise the array with our first element (it's the smallest possible ending of a length-1 subsequence at that point).

Algorithm:

from bisect import bisect_left

def LAS(A):
	tails = [] # tails[l] = smallest ending element of an ascending subsequence of length l+1
	for x in A:
		pos = bisect_left(tails, x) # first position with tails[pos] >= x
		if pos == len(tails):
			tails.append(x) # x extends the longest subsequence found so far
		else:
			tails[pos] = x  # x is a smaller ending for length pos+1
	return len(tails)

Instead of a growing list, we can also use a fixed array by initialising it to length $n$ and setting all values to $\infty$.

Description

We are given an array $A$ of $n$ distinct whole numbers. We want to find the longest ascending subsequence of $A$ (i.e. indices $i_1 < \dots < i_l$ for which $A[i_1] < \dots < A[i_l]$, with $l$ maximal).

We define $DP[j][l]$ = true iff there is an ascending subsequence of length $l$ ending at index $j$. To calculate these values recursively we distinguish:

  • $l = 1$: $DP[j][1] = \text{true}$, as an ascending subsequence of length 1 always exists
  • $l > 1$: $DP[j][l] = \text{true}$ iff there is an $i < j$ such that $A[i] < A[j]$ and $DP[i][l-1] = \text{true}$. We can then extend this subsequence by $A[j]$.
  • $DP[j][l] = \text{false}$ otherwise

This is not the most efficient solution however: with $O(n^2)$ table entries, each of which needs to go through all previous possibilities, we end up at $O(n^3)$.

More efficient

We define a table $T[j][l]$ as the smallest possible ending of an ascending subsequence of length $l$ in the subarray $A[1..j]$. If there is none, we set $T[j][l] = \infty$ (no element exists that ends such a sequence).

The recursion is defined by the following cases:

  • $j = 1$: then $T[1][1] = A[1]$ and $T[1][l] = \infty$ otherwise
  • $j > 1$:
    • We do not use $A[j]$; in this case $T[j][l] = T[j-1][l]$
    • We use $A[j]$. This is only possible if $A[j] > T[j-1][l-1]$, as only then can we actually append $A[j]$ to a sequence of length $l-1$. The ending is then $A[j]$.
    • We therefore need to take the smaller of these two options:
      • if $T[j-1][l-1] < A[j]$ and $A[j] < T[j-1][l]$, then $T[j][l] = A[j]$, otherwise $T[j][l] = T[j-1][l]$

As every entry in our table can be computed in constant time, we have a runtime of $O(n^2)$.
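
A minimal sketch of this 2d recursion (the $\pm\infty$ sentinels and names are mine):

def LAS_quadratic(A):
	n = len(A)
	INF = float('inf')
	# T[j][l] = smallest ending of an ascending subsequence of length l in A[1..j]
	T = [[INF] * (n + 1) for _ in range(n + 1)]
	for j in range(n + 1):
		T[j][0] = -INF # sentinel: every element extends the empty sequence
	for j in range(1, n + 1):
		x = A[j-1]
		for l in range(1, n + 1):
			if T[j-1][l-1] < x < T[j-1][l]:
				T[j][l] = x
			else:
				T[j][l] = T[j-1][l]
	return max(l for l in range(n + 1) if T[n][l] < INF)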

Even more efficient

We can still improve the runtime here by recognising that $T[j][l-1] < T[j][l]$ whenever both are finite, as we can take any subsequence of length $l$ and shorten it by one element to get one of length $l-1$ with a smaller ending. This means that the rows of our DP table are sorted: $T[j][1] < T[j][2] < \dots$

By only keeping the latest row, i.e. the currently smallest ending element for each length $l$, we can get by with a 1d DP table.

Because each row is sorted, we can use binary search to find the first place where $T[l] \ge A[j]$, i.e. the single entry we need to update.

We return the biggest $l$ for which the entry is finite ($T[l] < \infty$); in the list version above, this is simply len(tails).

P = NP?

This question asks whether the class of problems solvable in polynomial time (P) is equal to the class of problems whose solutions can be verified in polynomial time (NP). Said differently: if we can check a solution in polynomial time, can we also find one in polynomial time?

If we could solve problems like subset sum or the knapsack problem in truly polynomial time (not just pseudo-polynomial), we would have shown that $P = NP$.

Tips for the Exam

DP Table Recursion

A common mistake while establishing the recursion for the DP-table is to write DP[...][b - A[i]] where b - A[i] can become negative and go out of bounds. If we don't explicitly define such entries as false (or guard the access with a condition like b >= A[i]), this is a mistake.

DP Table Solution Runtime

  • When specifying the runtime of the solution, always state how long it takes to extract the final solution from the table. It's often $O(1)$ (reading a single cell).