Thursday, January 7, 2021

Data cleaning in R: grouping by number and then by name

A small sample of my dataset looks something like this:

x <- c(1, 2, 3, 4, 1, 7, 1)
y <- c("A", "b", "a", "F", "A", ".A.", "B")
data <- cbind(x, y)
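One thing worth checking before sorting: `cbind()` on a numeric and a character vector produces a character matrix, so the numbers in `x` become strings. The sketch below compares that with a data frame, which keeps each column's own type. The names `m` and `df` are only illustrative and do not appear in the original question.

```r
x <- c(1, 2, 3, 4, 1, 7, 1)
y <- c("A", "b", "a", "F", "A", ".A.", "B")

# cbind() builds a matrix, and a matrix holds a single type,
# so the numeric x is coerced to character
m <- cbind(x, y)

# A data frame keeps x numeric alongside the character column y
df <- data.frame(x, y, stringsAsFactors = FALSE)

class(m[, "x"])  # "character"
class(df$x)      # "numeric"
```

This matters for larger data because character sorting would place "10" before "2".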

My goal is to group rows that share the same number together, and then group rows that share the same name together (for my purposes, A, a, and .A. count as the same name). In other words, the final output should look something like this:

xnew <- c(1, 1, 3, 7, 1, 2, 4)
ynew <- c("A", "A", "a", ".A.", "B", "b", "F")
datanew <- cbind(xnew, ynew)

Currently, I am only able to group by number in the column labelled x. I am unable to group by name yet. I would appreciate any help given.

Note: I need an automated solution as my raw dataset contains over 10,000 lines for the x and y columns.
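One possible approach, offered as a minimal sketch rather than a definitive answer: build a normalized version of each name, then order the rows by that key and by the number. It assumes that "the same name" means equal after removing non-letter characters and ignoring case, which is inferred from the A / a / .A. example. The column `key` and the objects `df` and `sorted` are hypothetical names introduced here.

```r
# Sample data from the question, kept in a data frame so x stays numeric
df <- data.frame(
  x = c(1, 2, 3, 4, 1, 7, 1),
  y = c("A", "b", "a", "F", "A", ".A.", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: strip non-letters and uppercase the rest,
# so "A", "a" and ".A." all map to the key "A"
df$key <- toupper(gsub("[^A-Za-z]", "", df$y))

# Order rows by normalized name, then by number within each name,
# and keep only the original columns
sorted <- df[order(df$key, df$x), c("x", "y")]
rownames(sorted) <- NULL
sorted
```

On the sample data this yields the x values 1, 1, 3, 7, 1, 2, 4 with the names A, A, a, .A., B, b, F, matching the desired output. Because `order()` is vectorized, the same lines should work unchanged on 10,000+ rows; if real names need a different notion of "same" (for example digits or accented letters), the pattern passed to `gsub()` would have to be adjusted.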

https://stackoverflow.com/questions/65623271/data-cleaning-in-r-grouping-by-number-and-then-by-name January 08, 2021 at 11:59AM
