Hi I am new with R and Good with Excel.
Problem: I have two(2) datasets: a. Masterlist which contain ID and NEW ID I want to refer to other dataset. b. Transaction dataset which contain a column of mix ID and old ID name OFFICER ID.
Desired Output to achieved: I want to create a new column name MERGE ID in Transaction dataset to only refer to one ID.
Approach: In ExceL (Please - Igbore and Read the Translated Formula) - for ref purpose:
=IF(ISNA(INDEX(ID,MATCH(OFFICER ID,ID,0))),INDEX(ID,MATCH(OFFICER ID,NEWID,0)),H2)
Translate as:
NEW MERGE ID COLUMN = IF RESULT OF LEFT_JOIN(Transaction, MasterList, by = "OFFICER ID" = "ID") EQUAL NA is TRUE THEN
OVERWRITE/REPLACE THE NA VALUE WITH LEFT_JOIN(Transaction, MasterList, by = "OFFICER ID" = "NEW ID")
my Working solution in R:
R Code:
id_list <- masterLIST_dataset %>% select("ID", "NEW ID")      MERGE_ID <- function({       output1 <- Transaction_dataset %>%              left_join(id_list, by = c("OFFICER ID" = "ID"), keep = TRUE) %>%               setnames("ID", "MERGE ID")         output2 <- output1 %>%               left_join(id_list, by = c("OFFICER ID" = "NEW ID")) %>%              setnames("NEW ID", "MERGE ID") %>% select(-last_col())          if (which(is.na(output1$"MERGE ID"), FALSE))       output$"MERGE ID" <- output1$"MERGE ID"     else if (which(is.na(output1$"MERGE ID"), TRUE))       output$"MERGE ID" <- replace(which(is.na(output1$), TRUE), output2$"MERGE ID")      output                     })   
没有评论:
发表评论