I am trying to sum the number of categories for each month and cumulate the previous months. The group by needs to be by month,year, and category. I tried different ways of using cumsum but I just can't get it.
This is the df:
sample=[ {'name':11,'category':'A','year':2017,'month':1}, {'name':22,'category':'A','year':2017,'month':2}, {'name':33,'category':'B','year':2015,'month':1}, {'name':33,'category':'C','year':2017,'month':3}, {'name':33,'category':'B','year':2017,'month':8}, {'name':44,'category':'B','year':2016,'month':4}, {'name':44,'category':'A','year':2017,'month':6}, {'name':55,'category':'C','year':2016,'month':9}, {'name':55,'category':'C','year':2017,'month':5}, {'name':55,'category':'B','year':2017,'month':11}] sample_df=pd.DataFrame(sample)
I grouped by month,year, and category and aggregated by month
sample_counts = sample_df.groupby(['month','year','category']).agg({ 'category': 'count', }).rename(columns={'category':'category_count'}).reset_index()
The output of sample_counts is this:
month year category category_count 1 2015 B 1 1 2017 A 1 2 2017 A 1 3 2017 C 1 4 2016 B 1 5 2017 C 1 6 2017 A 1 8 2017 B 1 9 2016 C 1 11 2017 B 1
So I want the output to add all the B's (for example) as the months progress
So by 11/2017 the 'B' sum should be 2.
I tried different variations of
cumul_df=sample_counts.groupby(['month','year','category']).sum().groupby(level=0).cumsum().reset_index()
Would appreciate any help! Thank you!!
https://stackoverflow.com/questions/66501966/cumsum-with-groupby-for-date-accumulation March 06, 2021 at 11:05AM
没有评论:
发表评论