2021年3月5日星期五

Cumsum with groupby for date accumulation

I am trying to sum the number of categories for each month and cumulate the previous months. The group by needs to be by month,year, and category. I tried different ways of using cumsum but I just can't get it.

This is the df:

sample=[  {'name':11,'category':'A','year':2017,'month':1},  {'name':22,'category':'A','year':2017,'month':2},  {'name':33,'category':'B','year':2015,'month':1},  {'name':33,'category':'C','year':2017,'month':3},  {'name':33,'category':'B','year':2017,'month':8},  {'name':44,'category':'B','year':2016,'month':4},  {'name':44,'category':'A','year':2017,'month':6},  {'name':55,'category':'C','year':2016,'month':9},  {'name':55,'category':'C','year':2017,'month':5},  {'name':55,'category':'B','year':2017,'month':11}]    sample_df=pd.DataFrame(sample)  

I grouped by month,year, and category and aggregated by month

sample_counts = sample_df.groupby(['month','year','category']).agg({                                      'category': 'count',                                  }).rename(columns={'category':'category_count'}).reset_index()  

The output of sample_counts is this:

month   year  category  category_count  1       2015     B         1  1       2017     A         1  2       2017     A         1  3       2017     C         1  4       2016     B         1  5       2017     C         1  6       2017     A         1  8       2017     B         1  9       2016     C         1  11      2017     B         1  

So I want the output to add all the B's (for example) as the months progress

So by 11/2017 the 'B' sum should be 2.

I tried different variations of

cumul_df=sample_counts.groupby(['month','year','category']).sum().groupby(level=0).cumsum().reset_index()  

Would appreciate any help! Thank you!!

https://stackoverflow.com/questions/66501966/cumsum-with-groupby-for-date-accumulation March 06, 2021 at 11:05AM

没有评论:

发表评论