Hi, I have an Excel sheet with player names and dates. For example:
Column A = [1000 1000 1001 1001 1001 1002 1002 1002 1002]
Column B = [03/12/2009 03/12/2009 04/01/2011 05/01/2010 08/02/2011 10/03/2012 05/12/2010 07/02/2011 09/03/2012 14/02/2013]
For each player, I want to calculate the maximum length of time covered by that player's dates, that is, the time between the first and the final date. I tried to do this by loading the sheet into a pandas DataFrame and then building a dictionary, but it does not seem to work. There must be an easier way to do this, but I can't find it. This is what I have tried so far:
import pandas as pd
from datetime import datetime
from itertools import count
from collections import defaultdict

# Read columns B:C of the sheet (the closing quote after 'Sheet 1' was missing)
Player_Dates = pd.read_excel(r'C:\Users\PycharmProjects\Project1\Data.xlsx', sheet_name='Sheet 1', header=0, na_values=['NA'], usecols="B:C")
# Skip the first five rows and rename the two columns
Player_Dates_new = Player_Dates.iloc[5:len(Player_Dates)]
Player_Dates_new.columns = ['Player_ID', 'Dates']

# Build {player: {0: [date], 1: [date], ...}} with one running counter per player
counts = {k: count(0) for k in Player_Dates_new.Player_ID.unique()}
d = defaultdict(dict)
for k, *v in Player_Dates_new.values.tolist():
    d[k][next(counts[k])] = v
dict(d)
print(d, Player_Dates_new)
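One possibly simpler route is to skip the dictionary and let pandas group the rows by player, then subtract each player's earliest date from the latest one. The sketch below is only an illustration under a few assumptions: the dates are day-first (DD/MM/YYYY), the columns are named Player_ID and Dates as in the attempt above, and the sample data uses only the first nine dates from the example, because the example lists nine IDs but ten dates. If read_excel has already parsed the column as datetimes, the to_datetime step may not be needed.

```python
import pandas as pd

# Sample data shaped like the example above. Assumption: only the first
# nine dates are used, so that each date pairs with one of the nine IDs.
df = pd.DataFrame({
    "Player_ID": [1000, 1000, 1001, 1001, 1001, 1002, 1002, 1002, 1002],
    "Dates": ["03/12/2009", "03/12/2009", "04/01/2011", "05/01/2010",
              "08/02/2011", "10/03/2012", "05/12/2010", "07/02/2011",
              "09/03/2012"],
})

# Assumption: the strings are day-first (DD/MM/YYYY).
df["Dates"] = pd.to_datetime(df["Dates"], format="%d/%m/%Y")

# Latest date minus earliest date for each player.
grouped = df.groupby("Player_ID")["Dates"]
spans = grouped.max() - grouped.min()

# The same spans expressed as whole days.
span_days = spans.dt.days
print(span_days)
```

Here spans is a Series of timedeltas indexed by Player_ID, so the rows do not need to be sorted by date beforehand.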
https://stackoverflow.com/questions/65846843/calculate-time-corresponding-to-rows-with-same-name-from-pandas-dataframe-in-py January 22, 2021 at 10:16PM