Tuesday, January 19, 2021

all vs. all comparisons between two dataframes using apply

I am using R and have two dataframes with the same two columns (ID and timeStamp) but different numbers of rows.

ID  timeStamp
a   2018-04-17 10:47:45
a   2018-04-17 10:47:48
a   2018-04-17 10:47:48
a   2018-04-17 10:47:48
a   2018-04-17 10:49:23
a   2018-04-17 10:50:02
a   2018-04-17 10:51:34
a   2018-04-17 10:51:36
a   2018-04-17 10:51:38

ID  timeStamp
b   2018-04-17 10:32:17
b   2018-04-17 10:46:18
b   2018-04-17 10:47:18
b   2018-04-17 10:49:20
b   2018-04-17 10:52:22
b   2018-04-17 10:55:25
b   2018-04-17 10:57:29

I would like to compare every timestamp in one dataframe with every timestamp in the other and compute points based on how many pairs of observations from dataframes A and B fall within a specific time range. For example, if two observations are within 5 minutes of each other, I want to assign 10 points. If the values are exactly the same, the pair gets 5 points. Otherwise, no points are added.

I tried to implement this with nested for loops, but it takes very long when I compare a large number of rows.

m <- 0
n <- 0
for (i in 1:nrow(A)) {
  for (j in 1:nrow(B)) {
    # Signed difference in seconds: positive when A's timestamp is later than B's
    d <- as.numeric(difftime(A[i, "tStamp"], B[j, "tStamp"], units = "secs"))
    if (d > 0 && d < 300) {
      m <- m + 10
    } else if (d == 0) {
      m <- m + 5
    } else if (d < 0 && d > -300) {
      n <- n + 10
    } else if (d == 0) {
      # Never reached: the d == 0 case is already handled above
      n <- n + 5
    }
  }
}

Is there a good way to do this using an apply function? I believe it would be more efficient and faster than a for loop. Any help would be appreciated.
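For illustration, here is a minimal sketch of one vectorized alternative. It uses outer() rather than an apply-family function, which is a swapped-in technique and not necessarily what the asker had in mind. It assumes both dataframes store their timestamps as POSIXct in a column named tStamp (the name used in the loop), and it rebuilds the sample data from the tables above. The scoring mirrors the loop as written: m counts pairs where A is later than B, n counts pairs where B is later than A.

```r
# Sample data rebuilt from the question; the column name tStamp and the
# UTC time zone are assumptions made for this sketch.
A <- data.frame(
  ID = "a",
  tStamp = as.POSIXct(c(
    "2018-04-17 10:47:45", "2018-04-17 10:47:48", "2018-04-17 10:47:48",
    "2018-04-17 10:47:48", "2018-04-17 10:49:23", "2018-04-17 10:50:02",
    "2018-04-17 10:51:34", "2018-04-17 10:51:36", "2018-04-17 10:51:38"
  ), tz = "UTC")
)
B <- data.frame(
  ID = "b",
  tStamp = as.POSIXct(c(
    "2018-04-17 10:32:17", "2018-04-17 10:46:18", "2018-04-17 10:47:18",
    "2018-04-17 10:49:20", "2018-04-17 10:52:22", "2018-04-17 10:55:25",
    "2018-04-17 10:57:29"
  ), tz = "UTC")
)

# Matrix of signed differences in seconds:
# d[i, j] = A$tStamp[i] - B$tStamp[j]
d <- outer(as.numeric(A$tStamp), as.numeric(B$tStamp), "-")

# Same scoring as the nested loop, computed on the whole matrix at once.
m <- 10 * sum(d > 0 & d < 300) + 5 * sum(d == 0)
n <- 10 * sum(d < 0 & d > -300)
```

Note that outer() allocates a full nrow(A) by nrow(B) matrix, so for very large inputs memory may become the limiting factor; processing one dataframe in chunks is one possible workaround.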

https://stackoverflow.com/questions/65802756/all-vs-all-comparisons-between-two-dataframes-using-apply January 20, 2021 at 11:40AM
