Find four elements that sum to a given value [reviewed]

Problem

You're given an array of integers(eg [3,4,7,1,2,9,8]) Find the index of values that satisfy A+B = C + D, where A,B,C and D are integers values in the array.

Eg: Given [3,4,7,1,2,9,8] array

3+7 = 1+ 9 satisfies A+B=C+D

so print (0,2,3,5)

1. Two sum problem

First all all, we can break this problem into simpler problem. Finding the two numbers that sum to a given number is typical "two sum" problem. Using the hashtable, which takes

$O(1)$ for search operation, we can solve this problem in

$O(N)$ .

Here is the complete code running in

$O( N^2 )$ , base on the assumption that the Hashtable provides

$O(1)$ time for search. However, I used std::map, which provides

$O(\log{N}$ ) for search, so the overall running time is

$O(N^2 \log {N})$

Give the S, the sum as an input, algorithm is as below:

1. put integers in the array into Hashtable,

$H$ . (

$O(N)$ )
2. Iterate over the integers in the array, for each integer, x in the array, check if

$S-x$ exists in the hashtable, H. (running time

$O (N) * O(1) = O(N)$ )

2. Extending to the bigger problem

Now, we know how to find the two numbers in

$O(N)$ time, given the array,

$N$ . How can we apply this idea to 4 integers satisfying the following?

$A+B = C+D, where A \in N, B \in N, C \in N,D \in N$

What if we can convert above equation into the following?

$A+C = S, where S = C+D, C\in N, D \in N$

If we can do that, this problem become the two sum problem again.
We can preprocess the integers in array

$N$ , produce the list of sum of two unique integers. This process will take

$O(N^2)$ time.

Let's call this list,

$S_{c,d}$ .

All we need to do is to find two integers other than

$C, D$ making up

$S \in S_{c,d}$ using two sum algorithm. Let calculate the overall running time.

1) iterating over

$S_{c,d}$ takes

$O(N)$ .
2) For each

$S \in S_{c,d}$ , running a two sum algorithm takes

$O(N)$

This will take

$O (N^2)$ time.

The implementation blow used the std::map instead of HashTable. This gives

$O( \log {N})$ search instead of

$O(N)$ , thereby resulting in

$O(N^2 \log {N})$ running time.

	#include<vector>
	#include<map>
	#include<iostream>

	using namespace std;

	struct node {
	int x;
	int y;
	int sum;
	node(int a, int b, int s):x(a), y(b), sum(s){}
	};

	map<int, int> build_hashtable(vector<int>& input)
	{
	map<int, int> hashtable;
	for (int i = 0; i< input.size(); i++)
	{
	hashtable[input[i]] = i;
	}
	return hashtable;
	}

	vector<node*> build_sum_list(vector<int>& input)
	{
	vector<node*> sums;
	for (int i = 0; i<input.size(); i++)
	{
	for(int j = i+1; j < input.size(); j++)
	{
	sums.push_back(new node(i,j, input[i]+input[j]));
	}
	}
	return sums;
	}
	vector<int> find_double_sum(vector<int>& input)
	{
	vector<int> result;
	map<int, int> hashtable = build_hashtable(input);
	vector<node*> sums = build_sum_list(input);

	for ( int i = 0; i< sums.size(); i++)
	{
	int sum = sums[i]->sum;
	for (int j = 0; j < input.size(); j++)
	{
	map<int, int>::iterator found;
	if (j != sums[i]->x && j != sums[i]->y)
	{
	if ((found = hashtable.find(sum-input[j]))!= hashtable.end())
	{
	result.push_back(sums[i]->x);
	result.push_back(sums[i]->y);
	result.push_back(j);
	result.push_back(found->second);
	return result;
	}

	}
	}
	}
	return result;
	}

	int main()
	{
	vector<int>input;
	input.push_back(3);
	input.push_back(4);
	input.push_back(7);
	input.push_back(1);
	input.push_back(2);
	input.push_back(9);
	input.push_back(8);

	vector<int>result = find_double_sum(input);
	for (int i = 0; i < result.size(); i++)
	cout<<" "<<result[i]<<", ";
	cout<<endl;
	return 1;
	}

view raw find_double_twosum.cpp hosted with ❤ by GitHub

Practice statistics

25:52: to write up the code in a paper
5:18 : to verify the code (made minor typo)

UPDATE(2019-07-26): solved the problem again. Came up with the right algorithm but code had a bug of adding the solution twice.

Here is a solution in python

	class FindEqualSum:

	def __init__(self):
	self.matrix = None
	self.result = None

	def find_numbers(self, list):
	self.matrix = [0 for i in list]
	self.result = []
	self.dfs(list, 0, [])
	return self.result

	def dfs(self, list, next, chosen_set):
	if len(chosen_set) == 2:
	sum = list[chosen_set[0]] + list[chosen_set[1]]
	found = self.twoSum(list, sum)
	if found == True:
	self.result.append(chosen_set[:])
	return found


	for i in range(next, len(list) -1):
	chosen_set.append(i)
	self.matrix[i] = 1
	if self.dfs(list, i+1, chosen_set) == True:
	return True
	#backtracking
	chosen_set.pop()
	self.matrix[i] = 0

	return False


	def twoSum(self, list, sum):
	map = {}
	for i, v in enumerate(list):
	if self.matrix[i] == 0:
	map[v] = i

	#search the second value.
	for i , v in enumerate(list):
	if sum -v in map:
	#store position of found two values
	self.result.append([i, map[sum-v]])
	return True

	return False

	f = FindEqualSum()

	input = [3,4,7,1,2,9,8]
	print("input = {} output = {}".format(input, f.find_numbers(input)))

view raw facebook_double_two_sum_dfs_hashtable.py hosted with ❤ by GitHub

Practice statistics:

7:00: come up with algorithm
15:00: write up the code
5:00: debugging and fix the errors.
6:00: had to fix the problem of recursion.

Search This Blog

Peter's CodeCrushing