Say I have two data frames that I want to merge. df1 has repeated measures per sample (I do not know in advance how many, and the number can differ from sample to sample), while df2 has only one measure for each of the same samples.
As an MWE, something like this:

> df1=data.frame(letter=rep(LETTERS[1:5],each=3), val1=1:15)
> df2=data.frame(letter=LETTERS[1:5], val2=16:20)
> df1
   letter val1
1       A    1
2       A    2
3       A    3
4       B    4
5       B    5
6       B    6
7       C    7
8       C    8
9       C    9
10      D   10
11      D   11
12      D   12
13      E   13
14      E   14
15      E   15
> df2
  letter val2
1      A   16
2      B   17
3      C   18
4      D   19
5      E   20

I want to merge them in such a way that this is reflected. As of now I can do:

> merge(df1, df2)
   letter val1 val2
1       A    1   16
2       A    2   16
3       A    3   16
4       B    4   17
5       B    5   17
6       B    6   17
7       C    7   18
8       C    8   18
9       C    9   18
10      D   10   19
11      D   11   19
12      D   12   19
13      E   13   20
14      E   14   20
15      E   15   20

But ideally, I would need this:

> merge(df1, df2, all=T)
   letter rep val1 val2
1       A   1    1   16
2       A   2    2   NA
3       A   3    3   NA
4       B   1    4   17
5       B   2    5   NA
6       B   3    6   NA
7       C   1    7   18
8       C   2    8   NA
9       C   3    9   NA
10      D   1   10   19
11      D   2   11   NA
12      D   3   12   NA
13      E   1   13   20
14      E   2   14   NA
15      E   3   15   NA

But I do not have that rep column to begin with, so I would have to add it post hoc, and I do not know how. Alternatively, maybe merge has some option to fill in val2 only for the first match of each letter.
Any help? It should be easy, but I keep ending up with loops and checks to add that rep column, and that is probably not the right way.
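
For reference, here is a minimal sketch of one way the rep column could be added post hoc without explicit loops, using base R only. It assumes the single measure in df2 belongs to the first replicate of each letter, as the desired output above suggests. The names `res` and `val2_first` are made up for illustration.

```r
df1 <- data.frame(letter = rep(LETTERS[1:5], each = 3), val1 = 1:15)
df2 <- data.frame(letter = LETTERS[1:5], val2 = 16:20)

# Number the rows within each letter (1, 2, 3, ...); group sizes may differ
df1$rep <- ave(seq_along(df1$letter), df1$letter, FUN = seq_along)

# Assumption: the single measure in df2 corresponds to replicate 1
df2$rep <- 1L

# Left join on both keys: val2 is filled where rep == 1 and NA elsewhere
res <- merge(df1, df2, by = c("letter", "rep"), all.x = TRUE)
res <- res[order(res$letter, res$rep), c("letter", "rep", "val1", "val2")]
rownames(res) <- NULL

# Alternative without a rep column: look up val2 for every row,
# then blank out all but the first row of each letter
val2_first <- df2$val2[match(df1$letter, df2$letter)]
val2_first[duplicated(df1$letter)] <- NA
```

The `ave()` call handles the within-group numbering, so no loop over samples is needed. As far as I know, `merge()` itself has no option to keep only the first match, which is why the second variant goes through `match()` and `duplicated()` instead.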