Wednesday, March 17, 2021

Making a function to get percentage of multiple rows in dataframe R

I have a data frame with hundreds of samples and thousands of variables, but here I give a simple data frame (my_data) as an illustration. I want to get each gene's count as a percentage of the sum of gene_count, based on which cluster the gene belongs to (a gene can be in multiple clusters). I know how to get each percentage using data frame operations. However, since I am new to coding, I am trying to write a function to do it. Could anyone help me get the percentage for each gene using my function (percent)? The result should match the values in the "percent" column. Thank you very much.

gene = c("CD63", "PTN", "MT2A", "PTGDS", "DBI", "TIMP1", "COX6C", "APLP2", "PTN", "GPC1")
gene_count = c(10, 15, 5, 15, 10, 25, 5, 5, 5, 5)
cluster = c(1, 2, 3, 5, 7, 8, 9, 3, 6, 4)
percent = c(0.1, 0.15, 0.5, 0.15, 0.1, 0.25, 0.05, 0.05, 0.05, 0.05)
my_data = data.frame(gene, gene_count, cluster, percent)
my_data

percent = function(gene, cluster){
  for (gene in c(data$gene)){
    if (data$gene == gene & data$cluster == cluster)
      print(data$gene_count[which(data$gene == gene & data$cluster == cluster)]/sum(data$gene_count))
    else print("Gene is not expressed in this cluster")
  }
}
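
One possible way to get the posted function working is sketched below. It is only a sketch under a few assumptions: the denominator is the sum of gene_count over all rows (as in the posted attempt), the data frame is passed in as an argument instead of being read from a global called `data`, and the name `gene_fraction` is a hypothetical one chosen here so that it does not overwrite the `percent` vector. The loop is dropped because the logical comparison already covers every row at once.

```r
# Example data from the question (the expected `percent` column is left out).
my_data <- data.frame(
  gene       = c("CD63", "PTN", "MT2A", "PTGDS", "DBI",
                 "TIMP1", "COX6C", "APLP2", "PTN", "GPC1"),
  gene_count = c(10, 15, 5, 15, 10, 25, 5, 5, 5, 5),
  cluster    = c(1, 2, 3, 5, 7, 8, 9, 3, 6, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the original post): fraction of the total
# gene_count contributed by one gene in one cluster.
# Returns NA when the gene has no row in that cluster.
gene_fraction <- function(data, gene, cluster) {
  hit <- data$gene == gene & data$cluster == cluster
  if (!any(hit)) {
    return(NA_real_)
  }
  sum(data$gene_count[hit]) / sum(data$gene_count)
}

# Apply the helper to every gene/cluster pair in the data frame.
my_data$fraction <- mapply(
  gene_fraction,
  gene     = my_data$gene,
  cluster  = my_data$cluster,
  MoreArgs = list(data = my_data),
  USE.NAMES = FALSE
)

gene_fraction(my_data, "PTN", 2)   # PTN in cluster 2
gene_fraction(my_data, "CD63", 2)  # CD63 has no row in cluster 2, so NA
```

If each gene/cluster pair occupies exactly one row, as in this example, the same column can presumably be obtained without a helper as `my_data$gene_count / sum(my_data$gene_count)`.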

https://stackoverflow.com/questions/66635072/making-a-function-to-get-percentage-of-multiple-rows-in-dataframe-r March 15, 2021 at 05:04PM
