Ways of speeding up the itertools.product in python

Question

I'm trying to create a numpy array consisting of all possible asset allocations using itertools.product. The conditions are that allocations for each asset can be in range of zero to 100% and can rise by (100% / number of assets) increments. The allocations total sum should be 100%.

The calculations take very long time when assets number grows (10 seconds for 7 assets, 210 seconds for 8 assets and so on). Is there a way to speed up the code somehow? Maybe i should try using it.takewhile or multiprocessing?

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if np.sum(i, axis = 0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);
print matrix

The product with 8 repetitions will have 16 million elements. Calling multiple numpy functions 16 million times is going to take a long time. Are you sure you really need an 8-dimensional product? — Barmar, Mar 30 '20 at 22:43
Take a look at how many entries you are going create in your matrix (list?). This is a non-trivial number, and answering this question will not solve the problems that will come afterwards with your trillions of entries, e.g. "I cannot sort 10^60th elements" =) — Cireo, Mar 30 '20 at 22:44
While your implementation could be improved a *lot* (primarily through a [stars and bars](https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)) approach instead of filtering a cartesian product, and not trying to append rows to an array one at a time), your array size still grows rapidly, and it'll grow way too huge to build, store, or use pretty quickly. Your best option is to use a different approach to whatever problem you were trying to use the array to solve. — user2357112, Mar 30 '20 at 22:46

Michael Ruth · Answer 1 · 2022-12-29T05:58:14.847

Use the stdlib sum instead of numpy.sum. This code spends most of its time computing that sum, according to cProfile.

Profiling Code

import cProfile, pstats, StringIO
import itertools as it
import numpy as np


def CreateMatrix(Increments):
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0
    for i in inputs:
        if np.sum(i, axis=0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x += 1
    return matrix

pr = cProfile.Profile()
pr.enable()
Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);
print matrix
pr.disable()
s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print s.getvalue()

Truncated Output

         301565912 function calls (301565864 primitive calls) in 294.255 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   26.294   26.294  294.254  294.254 product.py:7(CreateMatrix)
 43046721   41.948    0.000  267.762    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:1966(sum)
 43046723   60.071    0.000  217.863    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:69(_wrapreduction)
 43046723  124.341    0.000  124.341    0.000 {method 'reduce' of 'numpy.ufunc' objects}
 43046723   14.630    0.000   14.630    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:70(<dictcomp>)
 43046721   12.629    0.000   12.629    0.000 {getattr}
 43098200    7.958    0.000    7.958    0.000 {isinstance}
 43046724    6.191    0.000    6.191    0.000 {method 'items' of 'dict' objects}
     6434    0.047    0.000    0.199    0.000 Library/Python/2.7/lib/python/site-packages/numpy/lib/index_tricks.py:316(__getitem__)

Timing Experiments

numpy.sum

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if np.sum(i, axis = 0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);

$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 738 696 697
3 loops, best of 3: 232 sec per loop

Stdlib sum

import itertools as it
import numpy as np

def CreateMatrix(Increments):

    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat = int(1/Increments));
    matrix = np.ndarray((1, int(1/Increments)));
    x = 0;
    for i in inputs:
        if sum(i) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1

    return matrix

Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments);

$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 90.5 84.3 85.3
3 loops, best of 3: 28.1 sec per loop

There are many more ways to get your solution faster, as other folks have said in their comments. Take a look at How do I "multi-process" the itertools product module? for an idea of how to use multiprocessing to speed this up. No matter what you do: clever algorithm, concurrency or both, replace the sum function; it's a lot of speed up for very little effort.

Update

I was able to further improve run time after realizing that matrix is merely the rows from the Cartesian product which sum to 1. Rather than reallocating matrix within a loop, compute the sums of rows in the Cartesian product and construct matrix from them. This is also much easier to understand.

def CreateMatrix(Increments):
    inputs = it.product(
        np.arange(0, 1 + Increments, Increments),
        repeat=int(1/Increments)
    )
    matrix = np.array(
        [row for row in inputs if sum(row) == 1]
    )
    return matrix

raw times: 79.4 78.5 78
3 loops, best of 3: 26 sec per loop

Future Work

The number of rows to search (rows in product set) increases much more rapidly than the number of rows in the output, matrix.

It's critical to develop a method for computing the rows which sum to 1, in the order produced by itertools.product, in order to construct matrix for larger problem sizes.

Ways of speeding up the itertools.product in python

1 Answers1

Linked