I have a data frame with hundreds of samples and thousands of variables, but here I give a simple data frame (my_data) as an illustration. I want to express each gene's gene_count as a percentage of the sum of gene_count, looked up by the cluster the gene belongs to (a gene can be in multiple clusters). I know how to get the percentages using data frame operations. However, since I am new to coding, I am trying to write a function to do it. Could anyone help me get the percentage for each gene using my function (percent)? The result should match the values in the "percent" column. Thank you very much.

gene = c("CD63", "PTN", "MT2A", "PTGDS", "DBI", "TIMP1", "COX6C", "APLP2", "PTN", "GPC1")
gene_count = c(10, 15, 5, 15, 10, 25, 5, 5, 5, 5)
cluster = c(1, 2, 3, 5, 7, 8, 9, 3, 6, 4)
# expected result: gene_count / sum(gene_count)
percent = c(0.1, 0.15, 0.05, 0.15, 0.1, 0.25, 0.05, 0.05, 0.05, 0.05)
my_data = data.frame(gene, gene_count, cluster, percent)
my_data

percent = function(gene, cluster) {
  # rows of my_data that match both the gene and the cluster
  idx = which(my_data$gene == gene & my_data$cluster == cluster)
  if (length(idx) > 0) {
    # share of the total gene_count
    print(my_data$gene_count[idx] / sum(my_data$gene_count))
  } else {
    print("Gene is not expressed in this cluster")
  }
}

https://stackoverflow.com/questions/66635072/making-a-function-to-get-percentage-of-multiple-rows-in-dataframe-r
March 15, 2021 at 05:04PM
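
One possible way to approach this is sketched below. It is only a sketch under a few assumptions: the denominator is taken to be the sum of gene_count over all rows (100 in the example), the helper name gene_percent and its data argument are hypothetical (they are not from the question), and NA is returned instead of printing a message when the gene is not found in the cluster. The data frame is passed in as an argument so the function does not depend on a global variable, and the value is returned rather than printed so it can be stored in a column.

```r
# Example data from the question (without the expected "percent" column)
my_data <- data.frame(
  gene = c("CD63", "PTN", "MT2A", "PTGDS", "DBI", "TIMP1", "COX6C", "APLP2", "PTN", "GPC1"),
  gene_count = c(10, 15, 5, 15, 10, 25, 5, 5, 5, 5),
  cluster = c(1, 2, 3, 5, 7, 8, 9, 3, 6, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of one gene in one cluster, relative to the
# total gene_count of the whole data frame (assumed denominator).
# Returns NA when the gene does not occur in that cluster.
gene_percent <- function(data, gene, cluster) {
  idx <- data$gene == gene & data$cluster == cluster
  if (!any(idx)) {
    return(NA_real_)
  }
  sum(data$gene_count[idx]) / sum(data$gene_count)
}

# Single lookups
gene_percent(my_data, "PTN", 2)   # 0.15
gene_percent(my_data, "PTN", 6)   # 0.05
gene_percent(my_data, "CD63", 2)  # NA, CD63 is not in cluster 2

# Apply the helper to every (gene, cluster) pair in the data frame
my_data$percent <- mapply(
  gene_percent,
  gene = my_data$gene,
  cluster = my_data$cluster,
  MoreArgs = list(data = my_data),
  USE.NAMES = FALSE
)

# For this data the same column can be computed without a function
vectorised <- my_data$gene_count / sum(my_data$gene_count)
```

If only the whole column is needed, the vectorised last line is likely the simpler choice; the function is mainly useful for looking up a single gene and cluster.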
