for col in df:
    df[col] = df[col].sum()
Or, a slower solution that avoids an explicit loop:
df = pd.DataFrame([df.sum()] * len(df))
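To make the two approaches above concrete, here is a minimal sketch on a tiny, made-up frame (the column names `a` and `b` and the variable names `looped` and `rebuilt` are only for illustration). Note that the second approach builds a new frame with a default integer index, so a non-default index on the original would not carry over.

```python
import pandas as pd

# Small illustrative frame; the data is made up for this example.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Approach 1: overwrite each column with its own sum (done on a copy here).
looped = df.copy()
for col in looped:
    looped[col] = looped[col].sum()

# Approach 2: build a new frame from len(df) copies of the row of sums.
rebuilt = pd.DataFrame([df.sum()] * len(df))

print(looped)
print(rebuilt)
```

Both frames should hold the column totals (6 for `a`, 60 for `b`) repeated in every row.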
Timings
@jezrael Thanks for the timings. The ones below repeat them on a larger dataframe and include the for loop as well. Most of the time is spent creating the new dataframe rather than calculating the sums, so the most efficient method appears to be the one from @ayhan, which assigns the sums directly to the underlying values:
from string import ascii_letters
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10000, 52), columns=list(ascii_letters))
# Baseline: the time needed just to compute the sum of each column.
%timeit df.sum()
1000 loops, best of 3: 1.47 ms per loop
# Solution 1 from @Alexander
%%timeit
for col in df:
    df[col] = df[col].sum()
100 loops, best of 3: 21.3 ms per loop
# Solution 2 from @Alexander (without `for loop`, but much slower)
%timeit df2 = pd.DataFrame([df.sum()] * len(df))
1 loops, best of 3: 270 ms per loop
# Solution from @PiRSquared
%timeit df.stack().groupby(level=1).transform('sum').unstack()
10 loops, best of 3: 159 ms per loop
# Solution 1 from @Jezrael
%timeit (pd.DataFrame(np.tile(df.sum().values, (len(df.index),1)), columns=df.columns, index=df.index))
100 loops, best of 3: 2.32 ms per loop
# Solution 2 from @Jezrael
%%timeit
df2 = pd.DataFrame(df.sum().values[np.newaxis, :].repeat(len(df.index), axis=0),
                   columns=df.columns,
                   index=df.index)
100 loops, best of 3: 2.3 ms per loop
# Solution from @ayhan (modifies df in place; timed once with %time, so only roughly comparable to the %timeit figures)
%time df.values[:] = df.values.sum(0)
CPU times: user 1.54 ms, sys: 485 µs, total: 2.02 ms
Wall time: 1.36 ms # <<<< FASTEST
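As a sanity check that the faster NumPy-based variants give the same result as the loop, here is a small sketch on a made-up 5x3 frame (the names `tiled`, `repeated`, and `looped` are only for illustration). It deliberately leaves out the in-place `df.values[:] = ...` assignment: as far as I know, that relies on `.values` returning a writable view of the frame's data, which may not hold for mixed-dtype frames or with Copy-on-Write enabled in newer pandas versions, so treat that one as version-dependent.

```python
import numpy as np
import pandas as pd

# Small random frame; shape and seed are arbitrary choices for the example.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)), columns=list("abc"))

# Column sums as a plain 1-D NumPy array.
sums = df.sum().to_numpy()

# Variant 1: tile the row of sums len(df) times.
tiled = pd.DataFrame(np.tile(sums, (len(df.index), 1)),
                     columns=df.columns, index=df.index)

# Variant 2: add a leading axis, then repeat along it.
repeated = pd.DataFrame(sums[np.newaxis, :].repeat(len(df.index), axis=0),
                        columns=df.columns, index=df.index)

# Reference: the simple column loop, run on a copy.
looped = df.copy()
for col in looped:
    looped[col] = looped[col].sum()

print(tiled.equals(repeated))
print(np.allclose(tiled.to_numpy(), looped.to_numpy()))
```

Both variants keep the original index and columns, which the `pd.DataFrame([df.sum()] * len(df))` approach does not do for a non-default index.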