2 dataframe as below, I want to return, how many Success (Yes) in a year (of the specific person) 1 year before his/her specific date, i.e. each entry in "to check" to define the range in "history".
For example, in "to_check", Mike 20200602, I want to know how many Success (Yes) in Mike's history (1 year before, until 20200602).
By using the "to_check" as a list, I came up with a clumsy way:
import pandas as pd import datetime import numpy as np from io import StringIO import time from datetime import datetime, date, time, timedelta csvfile = StringIO(""" Name Check Mike 20200602 David 20210415 Kate 20201109""") csvfile_1 = StringIO(""" Name History Success David 20180312 Yes David 20180811 Yes David 20191223 Yes David 20210311 Yes Kate 20180906 Yes Kate 20180912 Yes Kate 20191204 Yes Kate 20200505 Yes Mike 20180912 Yes Mike 20190312 Yes Mike 20190806 Yes Mike 20191204 Yes""") df_check = pd.read_csv(csvfile, sep = ' ', engine='python') df_history = pd.read_csv(csvfile_1, sep = ' ', engine='python') df_history['Date'] = pd.to_datetime(df_history['History'], format='%Y%m%d') to_check = [Mike 20200602"",""David 20210415"",""Kate 20201109""] for t in to_check: name, d = t.split("" "") date_obj = datetime.strptime(d, '%Y%m%d') delta = timedelta(days = 365) day_before = date_obj - delta m1 = df_history['Name'] == name m2 = df_history['Date'] >= day_before df_history['OP'] = np.where(m1 & m2, ""fit"", '') how_many = df_history['OP'].value_counts().tolist()[1] print (t, how_many) Output:
Mike 20200602 2 David 20210415 1 Kate 20201109 2 What's the better and smarter way to achieve it? Thank you.
https://stackoverflow.com/questions/67223048/from-specific-row-in-a-df-a-to-count-its-occrueances-in-past-a-year-in-df-b April 23, 2021 at 10:05AM
没有评论:
发表评论