Friday, January 22, 2021

Calculate time corresponding to rows with same name from Pandas dataframe, in Python

Hi, I have an Excel sheet with player names and dates. For example:

Column A = [1000 1000 1001 1001 1001 1002 1002 1002 1002]
Column B = [03/12/2009 03/12/2009 04/01/2011 05/01/2010 08/02/2011 10/03/2012 05/12/2010 07/02/2011 09/03/2012 14/02/2013]

For each player, I want to calculate the length of time between that player's first and final date. I tried to do this with a pandas DataFrame and then by building a dictionary, but it does not seem to work. There must be an easier way to do this, but I can't find it. This is what I have tried so far:

import pandas as pd
from datetime import datetime
from itertools import count
from collections import defaultdict

# Read the player ID and date columns (B and C) from the workbook
Player_Dates = pd.read_excel(r'C:\Users\PycharmProjects\Project1\Data.xlsx', sheet_name='Sheet 1', header=0, na_values=['NA'], usecols="B:C")

# Skip the first five rows and name the two columns
Player_Dates_new = Player_Dates.iloc[5:len(Player_Dates)]
Player_Dates_new.columns = ['Player_ID', 'Dates']

# Build a nested dict: {player_id: {0: [date], 1: [date], ...}}
counts = {k: count(0) for k in Player_Dates_new.Player_ID.unique()}
d = defaultdict(dict)

for k, *v in Player_Dates_new.values.tolist():
    d[k][next(counts[k])] = v

dict(d)

print(d, Player_Dates_new)

https://stackoverflow.com/questions/65846843/calculate-time-corresponding-to-rows-with-same-name-from-pandas-dataframe-in-py January 22, 2021 at 10:16PM
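
One possible way to get the span per player without building a dictionary is to group the rows by player and subtract each group's earliest date from its latest. The sketch below is only an illustration under stated assumptions: the small DataFrame stands in for the Excel sheet, the pairing of IDs and dates is hypothetical (the sample columns in the question have different lengths), and the dates are assumed to be day-first (DD/MM/YYYY).

```python
import pandas as pd

# Hypothetical sample standing in for the Excel data; the ID/date pairing is assumed.
df = pd.DataFrame({
    "Player_ID": [1000, 1000, 1001, 1001, 1001],
    "Dates": ["03/12/2009", "03/12/2009", "04/01/2011", "05/01/2010", "08/02/2011"],
})

# Parse the strings as day-first dates (assumption: DD/MM/YYYY).
df["Dates"] = pd.to_datetime(df["Dates"], format="%d/%m/%Y")

# Group by player, then take latest minus earliest date within each group.
grouped = df.groupby("Player_ID")["Dates"]
span = grouped.max() - grouped.min()

# span is a Series of Timedelta values indexed by Player_ID;
# span.dt.days gives the same spans as whole days.
print(span)
print(span.dt.days.to_dict())
```

If a plain dictionary is still wanted, `span.dt.days.to_dict()` maps each player ID to the number of days between that player's first and last date.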
