I need to add df1 columns's ['inicio_vigencia'] and ['fim_vigencia'] to df2, considering the following conditions:
- df1['id_vei'] == df2['id_vei'] and
- df1['inicio_vigencia'] <= df2['Hora_do_evento'] and
- df1['Hora_do_evento'] >= df2['Hora_do_evento']
Dataframes size: - df1 has 25.196.167 lines (aprox. 2gb)
- df2 has 113.787 lines
I prepared the following script, but it takes lots of time.
df2['inicio_vigencia'] = 0 df2['fim_vigencia'] = 0 for i in range(len(df2)): id_vei = (df2.iloc[i, 1]) Hora_do_evento = (df2.iloc[i, 3]) df_vigencia = df1.loc[(df1['inicio_vigencia'] <= Hora_do_evento) & (df1['fim_vigencia'] >= Hora_do_evento) & (df1['id_vei'] == id_vei)] if (len(df_vigencia) > 0): index = int(df_vigencia[df_vigencia['id_vei'] == id_vei].index.values) #len_df_vigencia.append(len(df_vigencia)) inicio_vigencia = df_vigencia['inicio_vigencia'][index] fim_vigencia = df_vigencia['fim_vigencia'][index] df2.loc[i, 'inicio_vigencia'] = inicio_vigencia df2.loc[i, 'fim_vigencia'] = fim_vigencia How could I perform this task efficiently?
https://stackoverflow.com/questions/66881493/how-to-join-two-big-dataframes-based-on-multiple-conditions-in-an-effective-quic March 31, 2021 at 12:04PM
没有评论:
发表评论