Finding sub array totaling the given sum (Knapsack variation)

Problem

Given the array of integers, A, the target sum, S, and the limit on number of integers to use, K,
write the program to return the K integers, whose sum equals, S.

Example:

A = { 1,2 ,4, 5, 9}, with K = 2, S = 6,  output  = {4, 5}


This is the classic N sum problem.

Brute force approarch

Without deep thought, we can try to find the list of numbers totaling the given S through the DFS search.

Basically, we continue to search the array, A looking for the K integers making up of S. For each recursive call, we need to remember the previously selected set of integers as a parameter.
Overall running time will be $$O(N \times N!)$$, where N is the length of array, A

This exponential algorithm might not be the ideal one that the interview expect.
I will leave the coding of this approach to the readers.

Dynamic programing approach

We want something faster than exponential running, like polynomial running time. This is where the dynamical programming is coming to your aid.

1. Identify the sub problem to solve the smaller problem

Let's imagine that there is optimal solution, X and we are looking from the end of the array A to decide whether n th element, A[n] is part of X. There can be two cases:

Let's say, n is the position of element in the array A, k is the number of selected integers, and s is the sum of selected integers.

Case 1: A[n] is part of S. If so, the following should hold:  $$X_{n, k, s} = X_{n-1,k-1, s-A[n]} + A[n]$$

Case 2: A[n] is not part of S, we just inherit from the pervious solution, which is $X_{n-1, k, s}$.
$$X_{n, k, s} = X_{n-1,k, s}$$

2. Solve the bigger problem using the solution from the sub problem

Now, we have the sub problem that is simple enough to understand. We can know solve the bigger problem. Let's find out the possible ranges for k, n, s.

$$s = {0..S}$$ $$k = {0..K}$$ $$n = {0.. N}$$

Now, we can iterate over s, k, n by checking the sub problem we identified. It will be triple loop.
We need 3 dimensional array, X with size of S, K, N.   $$X[K][S][N]$$
where S is the target sum, K is the target number of integers to use, N is the length of the input array, A.

 3. Establish the base cases

It is very important to establish the correct base cases. As seen in Case 1 of the sub problem.
We might need to have base cases for the following two groups:

1) When K = 0, there is no number selected so we need to fill all X[0][n][s] to 0.

2) When K = 1, we set the X[1][n][s] to s if s == A[n], $-\infty$.   --> turned out we don't need it.

3) When S = 0. we set the X[k][n][0]  = 0 --> newly added 

4) When N = 0, we set the X[k][0][s] = 0 --> newly added

Basically, logic will be triple for-loop, which results in $$O(N \times K \times N)$$.

Until now, we have the algorithm that fills up the array, X with the final result at X[K][N][S].
However, the algorithm would not give us the answers, the sub array totaling S.
We need to back-trace the X to find out all the elements chosen in the course of getting the final result.

4. Back tracing to reconstruct the solution

Finding the numbers that made up of the final result at X[K][N][S] can be achieved by reconstructing the answers from X[K][N][S] following algorithm

1) Starting from X[K][N][S], evaluation the two cases:

 Case 1: If S >= A[n] and X[K][N][S] - A[N] = X[K-1][N-1][S-A[N]],
     A[N] is the part of the solution. Add A[N] to the output list.
     decrement K, N, S as below:
     K--, N--, S -= A[N]

Case 2: Otherwise, decrement K, N, S as below:
      N -= 1;
2) Perform the step 1 while K > 0 and N>0 and S>0

3) Return the output result.

Here is the complete code.



Practice statistics

2hr: to write up the code and fix the logical flaws
        - Got the base cases wrong several time.
        - Mix up the index variables. Should stick to the variables used in the algorithm.


Update (2019-05-25):

Solve the same problem again and realized that previous algorithm and C++ solution had flaws.

Write in python. Due to my ignorance on what Knapsack problem was and rusty algorithm skill, it took a week to complete the solution

Realized that it is very important to have correct base cases.

Here is python version of solution:







Comments

Popular posts from this blog

Planting flowers with no adjacent flower plots

Find the maximum number of bomb that can be detonated