I have a series of text files. Each of them has three similar rows; see the example below:
The probability of being a carrier is 0.07457166
an BRCA1 carrier 0.03181885
an BRCA2 carrier 0.04273394
I need to get the number that follows a specific string on each row. I have code with a function like this:
dir <- 'W:/project/_help/temp/'
files <- list.files(dir, pattern = '*.txt')
filepath <- list.files(dir, pattern = '*.txt', full.names = TRUE)

try <- function(file, xx) {
  aa <- readLines(file)
  bb <- grep(xx, aa, value = TRUE)
  cc <- readr::parse_number(bb)
  return(cc)
}

overall <- lapply(filepath, try, "being a carrier is")
Brca1 <- lapply(filepath, try, "an BRCA1 carrier")
Brca2 <- lapply(filepath, try, "an BRCA2 carrier")
The call:
result <- lapply(filepath, try, "The probability of being a carrier is")
works fine: I get the number from the first row. But I also want the numbers from the second and third rows, so I ran:
result <- lapply(filepath, try, "an BRCA1 carrier")
result <- lapply(filepath, try, "an BRCA2 carrier")
But these return 1 and 2. I guess the code picks up the 1 or 2 from the string BRCA1 or BRCA2. What I actually want is the number that comes after the entire string "an BRCA1 carrier" or "an BRCA2 carrier". How can I modify the function to do this? Additionally, some of the text files may have NA values after those strings, such as:
The probability of being a carrier is NA
an BRCA1 carrier NA
an BRCA2 carrier 0.04273394
I also need the function to handle those missing values. Thank you. TGG
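One possible direction, as a minimal sketch and not a tested solution for the real files: strip the matched label from the line before parsing, so the digit inside "BRCA1" or "BRCA2" is never seen, and return NA when no number follows. The helper name `extract_after` and the variable names below are hypothetical and not part of the original code; the sketch uses base R only and takes the lines as a character vector instead of a file path.

```r
# Hypothetical helper (not part of the original code): return the number that
# follows `label` in `lines`, or NA if the label is absent or no number follows.
extract_after <- function(lines, label) {
  # Literal (non-regex) match, so the label is taken exactly as written.
  hit <- grep(label, lines, value = TRUE, fixed = TRUE)
  if (length(hit) == 0) return(NA_real_)

  # Cut off everything up to the end of the label, so digits inside the
  # label itself (the "1" in "BRCA1") cannot be parsed by mistake.
  pos <- regexpr(label, hit[1], fixed = TRUE)
  rest <- trimws(substring(hit[1], as.integer(pos) + attr(pos, "match.length")))

  # Take a leading decimal number if there is one; "NA" or empty text gives NA.
  num <- regmatches(rest, regexpr("^[-+]?[0-9]*\\.?[0-9]+", rest))
  if (length(num) == 0) NA_real_ else as.numeric(num)
}

# The two example inputs from the post, one row per element.
complete <- c("The probability of being a carrier is 0.07457166",
              "an BRCA1 carrier 0.03181885",
              "an BRCA2 carrier 0.04273394")
with_na  <- c("The probability of being a carrier is NA",
              "an BRCA1 carrier NA",
              "an BRCA2 carrier 0.04273394")

brca1_complete <- extract_after(complete, "an BRCA1 carrier")
brca1_missing  <- extract_after(with_na, "an BRCA1 carrier")
brca2_missing  <- extract_after(with_na, "an BRCA2 carrier")
```

With the files from the question, this could presumably be applied as `lapply(filepath, function(f) extract_after(readLines(f), "an BRCA1 carrier"))`, assuming each file has one row per label. Only the first matching row is used, which is an assumption about the file layout.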
https://stackoverflow.com/questions/66094743/r-function-update-to-fit-more-situation February 08, 2021 at 09:07AM