I have a pandas dataframe:
df = pd.DataFrame(data={'Name':['John', 'John', 'John', 'Jill', 'Jill', 'Jill', 'Jill'],'Score':[1,1, 1,2,2, 2, 2]})
df
Out[168]:
Name Score
0 John 1
1 John 1
2 John 1
3 Jill 2
4 Jill 2
5 Jill 2
6 Jill 2
I want to add a column that contains the rolling sum of the previous 2 values in a column (column score here) like so:
Out[171]:
Name Score Total
0 John 1 1
1 John 1 2
2 John 1 2
3 Jill 2 2
4 Jill 2 4
5 Jill 2 4
6 Jill 2 4
I have tried groupby and rolling which works but the row order is not the same as the original dataframe. Notice how Jill is now first, probably due to some alphabetical sorting:
df.groupby('Name')['Score'].rolling(2, min_periods=1).sum()
Out[173]:
Name
Jill 3 2.0
4 4.0
5 4.0
6 4.0
John 0 1.0
1 2.0
2 2.0
Name: Score, dtype: float64
How does this work and how can I achieve the goal without doing a leftjoin / merge afterwards?