This is a function version of the partition problem, so it is FNP-complete (and thus at least as hard as NP-complete). You can also phrase the question as a function version of Subset-Sum. Luckily, it has a pseudopolynomial solution and often solvable quickly in practice.
It's more useful to consider an equivalent form of your problem: Given a multiset of integers S, whose sum is T, find a subset of S whose sum is at most T/2 and as close to T/2 as possible. This subset is just the subset with smaller sum from your problem, so the other subset is the rest of S.
Given an algorithm (such as the one you link in the other post) that merely finds the optimal-sum-at-most-half (or, equivalently, the minimum absolute difference), there is usually a straightforward way to modify it to get the actual subset. When generating a list of subset-sums, also store the index of an element that generated that subset-sum. Then, at the end, we can backtrack from our final found sum to recover the elements.
For a simple example, if our array is [1, 5, 8], we might generate all subset sums by adding elements one at a time:
Sums-Dict: Hashmap from subset-sums to last added element for that sum.
Initial Sums: {0: null}
Add element 1:
Sums-Dict = {0: null, 1: 1}
Add element 5:
Sums-Dict = {0: null, 1: 1, 5: 5, 6: 5}
Add element 8:
Sums-Dict = {0: null, 1: 1, 5: 5, 6: 5, 8: 8, 9: 8, 13: 8, 14: 8}
Then to backtrack, we use a process similar to backtracking in the knapsack-problem to output a solution:
Find a closest sum to (1+5+8)/2, for example '6'.
Backtracking:
Used-elements = [], Current Sum = 6.
Sums-Dict[6] = 5: add 5 to used-elements, subtract 5 from current sum
Used-elements = [5], Current Sum = 1.
Sums-Dict[1] = 1: add 1 to used-elements, subtract 1 from current sum
Used-elements = [5, 1], Current Sum = 0.
Sums-Dict[0] = null: stop. We have found the subset that summed to '6'.
The classic dynamic-programming solution can be trivially modified to store that extra information for each sum; if the numbers are small, this is your best bet. I've included a Python implementation for subset-sum with a core algorithm and notation based on the Schroeppel-Shamir paper. This is a meet-in-the-middle version of the naive inclusion/exclusion solution to subset-sum. It's more complex than the brute-force approach, but runs in O(2^(n/2) * n/4) time and takes O(2^(n/4)) space, so is a practical solution for larger inputs.
from typing import Dict, List, Tuple
import collections
import math
import heapq
class SubsetSumSolver:
"""Solves the subset-sum and partition optimization problems.
Useful when values or goal sum are too large for dynamic programming"""
def __init__(self, nums: List[int]):
# Strip all zeros. Not necessary, but a useful optimization for speed
self.orig_zeros = nums.count(0)
self.nums = sorted(x for x in nums if x != 0)
self.n = len(self.nums)
def all_subset_sums(self, left_bound: int, right_bound: int) -> Dict[int, int]:
"""Return a subset-sum dictionary, mapping subset-sums of
nums[left_bound:right_bound] to any index of an element in that subset-sum."""
all_sums = {0: self.n + 1}
for i in range(left_bound, right_bound):
# Want old sums to remain/take priority
new_sums = {self.nums[i] + elem: i for elem in all_sums}
new_sums.update(all_sums)
all_sums = new_sums
return all_sums
def recover_sum_members(self, sum_dict: Dict[int, int], found_sum: int) -> List[int]:
"""Given a subset-sum dictionary and a sum, return a set of elements
from nums that formed that sum."""
answer = []
curr_sum = found_sum
while curr_sum != 0:
next_elem_index = sum_dict[curr_sum]
next_elem = self.nums[next_elem_index]
answer.append(next_elem)
curr_sum -= next_elem
assert len(answer) <= self.n
return answer
def min_absolute_difference(self, goal: float) -> List[int]:
"""Implement Schroeppel and Shamir alg. for subset sum
Runs in O(2^(n/2) * n/4) time, takes O(2^(n/4)) space
Returns a subset of self.nums whose sum is as close to goal as possible.
"""
if self.n < 8:
# Direct solution when n < 8; not worth splitting into 4 groups.
all_sums_dict = self.all_subset_sums(0, self.n)
best_diff_seen, best_sum = min(((abs(x - goal), x) for x in all_sums_dict),
key=lambda pair: pair[0])
return self.recover_sum_members(all_sums_dict, best_sum)
# Split nums into 4 equal-length parts (or as close as possible)
half = self.n // 2
bounds = [0, half // 2, half, half + (self.n - half) // 2, self.n]
# first_arr = nums[bounds[0]:bounds[1]]
# sec_arr = nums[bounds[1]:bounds[2]]
# third_arr = nums[bounds[2]:bounds[3]]
# fourth_arr = nums[bounds[3]:bounds[4]]
first_table_dict = self.all_subset_sums(bounds[0], bounds[1])
first_table = list(first_table_dict.keys())
sec_table_dict = self.all_subset_sums(bounds[1], bounds[2])
sec_table = sorted(sec_table_dict.keys())
third_table_dict = self.all_subset_sums(bounds[2], bounds[3])
third_table = list(third_table_dict.keys())
fourth_table_dict = self.all_subset_sums(bounds[3], bounds[4])
fourth_table = sorted(fourth_table_dict.keys(), reverse=True)
# min_heap stores pairs of problems from T1 and T2, and
# makes the pair with smallest sum in front of the heap
# Format: (sum, T1-index, T2-index) triplets
min_heap = [(x + sec_table[0], i, 0) for i, x in enumerate(first_table)]
# max_heap stores pairs of problems from T3 and T4, and
# makes the pair with largest sum in front of the heap
# Format: (-sum, T3-index, T4-index) triplets
max_heap = [(-(x + fourth_table[0]), i, 0) for i, x in enumerate(third_table)]
heapq.heapify(min_heap)
heapq.heapify(max_heap)
best_diff_seen = math.inf
best_diff_indices = []
while len(min_heap) != 0 and len(max_heap) != 0:
smallest, p1, p2 = min_heap[0]
largest, p3, p4 = max_heap[0]
largest = -largest
ans_here = smallest + largest
if abs(goal - ans_here) < best_diff_seen:
best_diff_seen = abs(goal - ans_here)
best_diff_indices = (p1, p2, p3, p4)
if ans_here <= goal: # Want sum to increase, so increase p2
heapq.heappop(min_heap)
if p2 + 1 != len(sec_table):
heapq.heappush(min_heap,
(first_table[p1] + sec_table[p2 + 1], p1, p2 + 1))
else: # Want sum to decrease. so increase p4
heapq.heappop(max_heap)
if p4 + 1 != len(fourth_table):
heapq.heappush(max_heap,
(-(third_table[p3] + fourth_table[p4 + 1]), p3, p4 + 1))
sum_ans = []
p1, p2, p3, p4 = best_diff_indices
sum_ans.extend(self.recover_sum_members(first_table_dict, first_table[p1]))
sum_ans.extend(self.recover_sum_members(sec_table_dict, sec_table[p2]))
sum_ans.extend(self.recover_sum_members(third_table_dict, third_table[p3]))
sum_ans.extend(self.recover_sum_members(fourth_table_dict, fourth_table[p4]))
return sum_ans
def solve_partition(self) -> Tuple[List[int], List[int]]:
"""Return a partition of nums into (smaller_sum_set, larger_sum_set)
Finds a partition whose sum-difference is minimum.
"""
total_sum = sum(self.nums)
frequency_counts = collections.Counter(self.nums)
first_subset = self.min_absolute_difference(goal=total_sum / 2.0)
if self.orig_zeros != 0:
first_subset.extend([0] * self.orig_zeros)
remaining_subset = frequency_counts - collections.Counter(first_subset)
remaining_subset = list(remaining_subset.elements())
if sum(first_subset) <= sum(remaining_subset):
return (first_subset, remaining_subset)
else:
return (remaining_subset, first_subset)
You can call it like so, on any array of integers (practical up to n=100 elements):
Solver = SubsetSumSolver([1, 5, 5, 6, 7, 10, 20])
print(Solver.solve_partition())
>>> ([10, 6, 5, 5, 1], [7, 20])