I have a very large tibble with a column of concatenated variables. Because this is repeated-measures data, the same few combinations of concatenated values occur many times, and as a result, this code:
df %>% group_by(var_col) %>% nest() %>% separate(var_col, into=c("var1","var2"),sep="_") %>% unnest(data)
is many times faster than this code:
df %>% separate(var_col, into=c("var1","var2"),sep="_")
This seems like a rather hackish way to get such a large speed-up. Is there a better way to take advantage of the fact that my data is repeated like this?
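For comparison, one alternative that exploits the same repetition is to split only the distinct values of var_col and join the result back onto the full table. The following is a minimal sketch under assumptions, not a benchmarked solution: the toy df, its value column, and the name lookup are made up for illustration, and whether it beats the nest/unnest version on real data would need to be measured.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the large tibble (hypothetical data, not from the question)
df <- tibble(
  var_col = rep(c("a_x", "b_y", "c_z"), times = 4),
  value   = seq_len(12)
)

# Split each distinct key only once; remove = FALSE keeps var_col for the join
lookup <- df %>%
  distinct(var_col) %>%
  separate(var_col, into = c("var1", "var2"), sep = "_", remove = FALSE)

# Attach the split columns to every row, then drop the original key
result <- df %>%
  left_join(lookup, by = "var_col") %>%
  select(-var_col)
```

Unlike the nest/unnest version, left_join keeps the rows in their original order, but var1 and var2 end up after the other columns rather than in the position var_col occupied.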
https://stackoverflow.com/questions/67352394/tidyr-separate-speed-up-when-many-entries-are-the-same May 02, 2021 at 10:05AM