Picking K matches from N matchbox that makes the sum the minimum multiple of K

Problem

Rahul is playing a very interesting game. He has some N different types of matchboxes. All matchboxes may have a different number of matchsticks (S1, S2, S3... Sn).

Rahul chooses two random numbers F and K. K should be less than N. The game is that Rahul wants to select any K matchboxes out of N matchboxes such that the total number of matchsticks in these K-selected matchboxes should be multiple of F.

You can only take one patch from each box. No duplicate number is allowed

At the same time, Rahul wants the sum of matchsticks of all the selected match boxes should be the minimum possible.

Input Specifications:

1) Array S = {S1,S2,S3,...Sn} of size N corresponding to the number of match sticks in N matchboxes (0<=N<=1000}

2) F-Value (as explained above)

3) K-Value ( as explained above)

Output:

1 2 3 4 5 Here 3 is the number of matchsticks in matchboxes I II III IV V minimum possible total number of matchsticks such that the condition explained in the problem statement is satisfied. Output -1 if it is not possible or invalid input.

For example, there are 5 match boxes i.e., N = 5

Let's say K is 3 (Rahul has to choose any 3 matchboxes)

Let's say F is 5(the sum of matchsticks in 3 selected matchboxes should be multiple of 5).

Rahul can choose II, III, and V matchboxes which would give the total sum of 10 which is a multiple of F i.e., 5. And 10 is the minimum number of possible matchsticks possible in the above case.

So you have to answer the minimum possible matchstick(sum of the matchsticks in the selected matchboxes) but the conditions given above should be satisfied.

This is one of Google's interview questions. I tried to find an efficient solution. However, I could not find an efficient solution in the first place.

Due to the complexity, which might not the case for some audiences, allow me to write a lengthy post.

Brute force approach using DFS

The current solution runs in $O(M * N^K)$, where $$M = \frac{\sum_{i={N-K}}^N i} {F}$$ and N is the number of matchboxes, and K is the number of the selected stick.

One brute force approach is to use think of this problem as a graph and perform the depth-first search up to K depth to find the sum equal to M*K.
This algorithm stops if the $M*K$ becomes larger than the sum of the largest K numbers.

I will post part 2 after I find a more efficient solution.

Here is the brute force solution.

For N = 1000, and K = 13, F = 9, it runs forever. Running time is: $$O(1000^{13})$$

This algorithm may be correct but, it is not an efficient algorithm.

More efficient approach using Dynamic programming

Given the fact that our previous attempt was too slow, we need to think of a better approach.
However, what we produced earlier was not all useless. It reveals one important insight into the better solution.

In lines 39 - 50, we have a unique sub-problem, which is checking if there are K numbers making up i*F given N numbers.

while (found == -1 && (last_sum >= i*F))
{
bool * visited = new bool[N];
if (dfs(visited, F*i, N, K)) // can be replaced with nSum function.
{
found = F*i;
}
i++;
delete []visited;
}
return found;

It is a typical N-Sum problem. This problem can be solved efficiently through Dynamic programming.

To solve the problem using dynamic programming, we first need to identify the sub-problems.

1. Identify the sub-problems.

Think of going backward from N, to check if it is part of the ideal solution we might find.

Let's say $S_{n, k, s}$ an optimal solution, where $n \in N $ and k is the number of n can be selected, and s is the sum of selected numbers.

There are two cases we need to consider:

Case 1) If n, $n \in N $ is part of the optimal solution, $S_{n, k, s}$ , $$S_{n, k, s} = S_{n-1, k-1, s - n} + n$$

Case 2) if n, $n \in N $ is not part of the optimal solution, $S_{n, k, s}$, $$S_{n, k, s} = S_{n-1, k, s}$$
In case 2, we just inherit the result of $S_{n-1, k, s}$.

2. Now, we identified the sub-problem. We can solve the bigger sub-problems given the solutions to the smaller sub-problems.

There are 3 variables we need to keep track of, $$ n \in N, 0 \gt n \le N$$ $$ k, 0 \gt k \le K$$ $$ s , 0 \gt s \le F$$

Therefore, we will allocate the 3-dimensional array, S to keep track of the results in different n, k, s.

As a base case, we set S[n][0] to INT_MIN since the sum can't be valid when k = 0, meaning no number was selected.

We also set S[n][k][0] to INT_MIN for all $ 0 \gt k \le K$ since if the target SUM is 0, no valid sum is possible.

Now we need to fill in the array S, in a bottom-up matter, and check if any n falls into Case 1 or Case 2.

It will be 3 nested for-loop, which runs in $O( NKF )$, which is close to $O ( N^3)$.

This algorithm is way faster than the previous exponential algorithm.

Here is the complete code in C++.

UPDATE(2019-07-26): solved the question again. I knew I could use the nSum method for subroutine but couldn't come up with a complete algorithm. I did not trust myself so gave up writing the code accordingly. I should trust myself more.

Here is a solution in python.

Practice statistics

10:00: to write the code using nSum
10:00: to review the code
10:00: to fix the bug (but in backtrace code. used the wrong variable.)

UPDATE(2023-02-27): couldn't solve the problem from the first place. Took several days to write the correct code even with the right algorithm.

Practice statistics

3 days: to write the code using nSum

Search This Blog

Peter's CodeCrushing