I have a data.frame
with observed success/failure outcomes per three groups along with expected probabilities:
library(dplyr) observed.probability.df <- data.frame(group = c("A","B","C"), p = c(0.5,0.2,0.3)) expected.probability.df <- data.frame(group = c("A","B","C"), p = qlogis(c(0.4,0.3,0.3))) observed.data.df <- do.call(rbind,lapply(c("A","B","C"), function(g) data.frame(group = g, value = c(rep(0,1000*sum(dplyr::filter(observed.probability.df, group != g)$p)),rep(1,1000*dplyr::filter(observed.probability.df, group == g)$p))) )) %>% dplyr::left_join(expected.probability.df) observed.probability.df$group <- factor(observed.probability.df$group, levels = c("A","B","C")) observed.data.df$group <- factor(observed.data.df$group, levels = c("A","B","C"))
I'm fitting a logistic regression (binomial
glm
with a logit
link function
) to these data with the offset
term:
fit <- glm(value ~ group + offset(p), data = observed.data.df, family = binomial(link = 'logit'))
Now, I'd like to plot these data as a bar graph using ggplot2
's geom_bar
, color-coded by group, and to add to that the fit's estimated trend line (see post on how to also add the shaded standard error area for group with two levels).
If it wasn't for the offset
term I'd just use stat_smooth
, but sems that I have to resort to alternative ways.
I was thinking of using stat_function
calling this function for getting the trend line:
trendLine <- function(x, ests) plogis(ests[1] + ests[2] * x)
But I don't know how make this work such that two trend lines are added, corresponding to each contrast:
- Going from
group
A
togroup
B
, callingtrendLine
withcoef(fit)[c(1,2)])
- Going from
group
A
togroup
C
, callingtrendLine
withcoef(fit)[c(1,3)])
I tried this:
library(ggplot2) ggplot(observed.probability.df, aes(x = group, y = p, fill = group)) + geom_bar(stat = 'identity') + stat_function(fun = slope.est,args=list(ests=coef(fit)[c(1,2)]),size=2,color="black") + stat_function(fun = slope.est,args=list(ests=coef(fit)[c(1,3)]),size=2,color="black") + scale_x_discrete(name = NULL,labels = levels(observed.probability.df$group), breaks = sort(unique(observed.probability.df$group))) + theme_minimal() + theme(legend.title = element_blank()) + ylab("Fraction of cells")
But it looks like the bottom line is a curve which also runs all the way to the bar of group
C
.
So my question is if theres a principled to add these contrast-wise trend lines to the bar plot with their respective x-axis starts and ends. It does not necessarily have to be with stat_function
, but I'm not aware of an alternative.
没有评论:
发表评论