有些事如何做: Calculating new column value in dataframe based on next row's column value

My lack of experience in working with python in the past year has made me rather rusty, and I'm getting back into coding again.

I have a dataframe of events that has a 'start_time' column. What I need to do is create an 'end_time' column that has a time value that is 1 second less than the next row's start_time. This is an ask for doing event time calculations.

The desired output:

start_time  end_time  0   00:00:00  07:59:59  1   08:00:00  08:20:04  2   08:20:05  08:29:19  3   08:29:20  08:29:20  4   08:29:21  08:35:14  5   08:35:15  08:55:21  6   08:55:22  08:57:20  7   08:57:21  09:02:23  8   09:02:24  09:14:07  9   09:14:08  09:15:03

I currently have code that will accomplish this, but from anything I've read here, and from what I remember, I really shouldn't be iterating through a dataframe in a for loop.

for ndx, row in df.iterrows():      if ndx != df[atnp_df.columns[0]].count() - 1:          df.iloc[ndx, 9] = pd.to_datetime(df.iloc[ndx+1, 8]) - timedelta(seconds=1)

(Hey, it works, but it's slow...)

How do I do this pythonically? I know the solution should be something like this:

df['end_time'] = pd.to_datetime(df['start_time']) - timedelta(seconds=1)

But, this subtracts 1 second from the start_time in the same row. I'm not quite sure how to access the next row's start time in this way.

Any and all help is greatly appreciated!

https://stackoverflow.com/questions/66683405/calculating-new-column-value-in-dataframe-based-on-next-rows-column-value March 18, 2021 at 09:06AM

有些事如何做

2021年3月17日星期三

Calculating new column value in dataframe based on next row's column value

没有评论:

发表评论