Good evening Stackoverflow experts.
How can I transpose only a part of a dataframe to another part of the same dataframe conditioned to some particular values of some cells in the dataframe?
I have a dataframe in r with the following structure:
str(df) # 'data.frame': 20 obs. of 11 variables: # $ Code1 : int 1 1 1 1 2 2 3 3 4 4 ... # $ SpeciesG1: chr "Sp1" "Sp1" "Sp1" "Sp1" ... # $ Family : chr "Fam1" "Fam2" "Fam3" "Fam4" ... # $ Pct_feed : num 55.4 12.56 1.35 30.69 62.93 ... # $ Code2 : int 0 0 0 0 0 0 0 0 0 0 ... # $ SpeciesG2: chr "0" "0" "0" "0" ... # $ Fam1 : int 0 0 0 0 0 0 0 0 0 0 ... # $ Fam2 : int 0 0 0 0 0 0 0 0 0 0 ... # $ Fam3 : int 0 0 0 0 0 0 0 0 0 0 ... # $ Fam4 : int 0 0 0 0 0 0 0 0 0 0 ... # $ Fam5 : int 0 0 0 0 0 0 0 0 0 0 ...
The data looks as follows (this is only a small sample of a df that is several hundreds rows long):
df # Code1 SpeciesG1 Family Pct_feed Code2 SpeciesG2 Fam1 Fam2 Fam3 Fam4 Fam5 # 1 1 Sp1 Fam1 55.40 0 0 0 0 0 0 0 # 2 1 Sp1 Fam2 12.56 0 0 0 0 0 0 0 # 3 1 Sp1 Fam3 1.35 0 0 0 0 0 0 0 # 4 1 Sp1 Fam4 30.69 0 0 0 0 0 0 0 # 5 2 Sp1 Fam2 62.93 0 0 0 0 0 0 0 # 6 2 Sp1 Fam3 37.07 0 0 0 0 0 0 0 # 7 3 Sp2 Fam4 17.49 0 0 0 0 0 0 0 # 8 3 Sp2 Fam5 82.51 0 0 0 0 0 0 0 # 9 4 Sp2 Fam1 36.97 0 0 0 0 0 0 0 # 10 4 Sp2 Fam2 51.46 0 0 0 0 0 0 0 # 11 4 Sp2 Fam3 11.57 0 0 0 0 0 0 0 # 12 5 Sp3 Fam1 41.81 0 0 0 0 0 0 0 # 13 5 Sp3 Fam2 9.64 0 0 0 0 0 0 0 # 14 5 Sp3 Fam3 31.74 0 0 0 0 0 0 0 # 15 5 Sp3 Fam4 5.12 0 0 0 0 0 0 0 # 16 5 Sp3 Fam5 11.69 0 0 0 0 0 0 0 # 17 5 Sp4 Fam2 41.16 0 0 0 0 0 0 0 # 18 5 Sp4 Fam3 40.04 0 0 0 0 0 0 0 # 19 5 Sp4 Fam4 4.32 0 0 0 0 0 0 0 # 20 5 Sp4 Fam5 14.48 0 0 0 0 0 0 0
I need to replace some of the zero values of columns 7-11 ('Fam1' ... 'Fam5'), in accordance to the following 3-step procedure:
Step 1: For all possible pairs of 'Code1' and 'SpeciesG1' I want the Pct_feed values replacing the zero values **matching ** the family names columns in columns 7→11 (i.e., 'Fam1', 'Fam2', 'Fam3', 'Fam4' and' Fam5'), but expanding those values horizontally (or transposing, that is, all values in the same row). Step 2: In that same single row created in Step 1, the values of 'Code1' and 'SpeciesG1' should replace the '0' values under columns 'Code2' and ' SpeciesG2'. Step 3: All the rows, except the single row created in Step 1, with the same pair of values of 'Code1' and 'SpeciesG1', should be deleted. The end result of this 3-step operation, as applied to df, should look as follows: #Code1 SpeciesG1 Family Pct_feed Code2 SpeciesG2 Fam1 Fam2 Fam3 Fam4 Fam5 #1 Sp1 Fam1 55.40 1 Sp1 55.40 12.56 1.35 30.69 0 #2 Sp1 Fam2 62.93 2 Sp1 0 62.93 37.07 0 0 #3 Sp2 Fam4 6.11 3 Sp2 0 0 0 17.49 82.51 #4 Sp2 Fam1 36.97 4 Sp2 36.97 51.46 11.57 0 0 #5 Sp3 Fam1 41.81 5 Sp3 41.81 9.64 31.74 5.12 11.69 #5 Sp4 Fam2 41.16 5 Sp4 0 41.16 40.04 4.32 14.48 NOTE1: As a check, the sum of all Pct_feed values of a given pair of values of 'Code1' and 'SpeciesG1' should be 100. NOTE2: I have looked into the dplyr package but I couldn't find a way of obtaining the results I need. I would very much appreciate any help provided to solve this need. Thank you very much Jorge
https://stackoverflow.com/questions/67394189/transposing-partially-only-some-elements-of-a-dataframe-conditioned-to-matching May 05, 2021 at 09:47AM
没有评论:
发表评论