Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p_function() plot #249

Open
strengejacke opened this issue Nov 2, 2022 · 15 comments
Open

p_function() plot #249

strengejacke opened this issue Nov 2, 2022 · 15 comments
Labels
enhancement 🔥 New feature or request

Comments

@strengejacke
Copy link
Member

strengejacke commented Nov 2, 2022

Like in https://journals.sagepub.com/doi/10.1177/02683962221105904

WDYT? @easystats/core-team

library(easystats)
#> # Attaching packages: easystats 0.5.2.13
#> ✔ bayestestR  0.13.0.1    ✔ correlation 0.8.3    
#> ✔ datawizard  0.6.3.3     ✔ effectsize  0.8.2    
#> ✔ insight     0.18.6.2    ✔ modelbased  0.8.5.1  
#> ✔ performance 0.10.0.11   ✔ parameters  0.19.0.14
#> ✔ report      0.5.5.3     ✔ see         0.7.3.2
library(ggplot2)

p_function_plot <- function(model, ci_method = NULL, param) {
  x <- suppressMessages(ci(model, ci_method = ci_method, ci = seq(0, 1, .01), verbose = FALSE))
  
  # data for plotting
  out <- data_filter(x, !is.infinite(CI_low) & !is.infinite(CI_high))
  out <- out[out$Parameter == param, ]
  out$CI <- round(out$CI, 2)

  # CI level emphasize
  line_size <- c(0.8, 0.8, 0.8, 0.9)

  # transform non-Gaussian
  info <- insight::model_info(model)
  if (info$is_linear) {
    delta <- 0
  } else {
    delta <- 1
    out$CI_low <- exp(out$CI_low)
    out$CI_high <- exp(out$CI_high)
  }

  # data for p_function ribbon
  data_ribbon <- data_to_long(out, select = c("CI_low", "CI_high"), values_to = "x")
  
  # data for vertical CI level lines
  data_ci_segments <- data_filter(out, CI %in% c(.25, .5, .75, .95))
  data_ci_segments$color <- "green"
  data_ci_segments$color[data_ci_segments$CI == 0.95] <- "red"
  
  # most plausible value (point estimate)
  point_estimate <- out$CI_low[which.min(out$CI)]

  ggplot() +
    geom_ribbon(
      data = data_ribbon,
      mapping = aes(x = x, ymin = 0, ymax = 1 - CI),
      color = "#1b6ca8",
      fill = "#1b6ca8",
      alpha = .1
    ) +
    geom_point(
      data = data_ci_segments,
      mapping = aes(x = CI_low, y = 1 - CI, colour = color)
    ) +
    geom_point(
      data = data_ci_segments,
      mapping = aes(x = CI_high, y = 1 - CI, colour = color)
    ) +
    geom_segment(
      data = data_ci_segments,
      mapping = aes(x = CI_low, y = 1 - CI, xend = CI_high, yend = 1 - CI, colour = color),
      size = line_size
    ) +
    geom_segment(
      mapping = aes(x = point_estimate, xend = point_estimate, y = 0, yend = 1),
      linetype = "dashed",
      color = "#cccccc"
    ) +
    annotate(
      geom = "text",
      x = point_estimate,
      y = 0.025,
      label = sprintf("%.3f", point_estimate)
    ) +
    scale_y_continuous(
      limits = c(0, 1),
      breaks = seq(0, 1, .05),
      sec.axis = sec_axis(
        trans = ~ 1 - .,
        name = "Compatibility Interval",
        breaks = seq(0, 1, .05),
        labels = sprintf("%g%%", round(100 * seq(0, 1, .05)))
      ),
      expand = c(0, 0)
    ) +
    scale_color_manual(
      values = c("#3aaf85", "#cd201f"),
      guide = "none"
    ) + 
    labs(y = "P value", x = "Range of Estimates", colour = NULL) +
    theme_lucid() +
    theme(panel.grid.minor.y = element_blank())
}

model <- glm(vs ~ mpg * cyl, data = mtcars, family = "binomial")
p_function_plot(model, ci_method = "wald", param = "mpg")

model <- lm(Sepal.Length ~ Species, data = iris)
p_function_plot(model, param = "Speciesversicolor")

Created on 2022-11-02 with reprex v2.0.2

@strengejacke strengejacke added the enhancement 🔥 New feature or request label Nov 2, 2022
@strengejacke

This comment was marked as outdated.

@DominiqueMakowski
Copy link
Member

It's really good! I'd by default extend the plot to include the null hypothesis (e.g., 0) and have a vertical bar there (like in the paper) to show the position of the p-value reported by the model in regard to the other plausible values. It would give a sense of perspective

@rempsyc
Copy link
Sponsor Member

rempsyc commented Nov 3, 2022

Neat! I like it. Agree with Dom. Would use lower-case italicized p, but that's just an APA convention.

strengejacke added a commit that referenced this issue Nov 3, 2022
@strengejacke strengejacke mentioned this issue Nov 3, 2022
8 tasks
@strengejacke

This comment was marked as outdated.

@strengejacke

This comment was marked as resolved.

@strengejacke

This comment was marked as outdated.

@strengejacke
Copy link
Member Author

strengejacke commented Nov 4, 2022

  • get rid of one of the two legends when we have no facets
  • add back one legend title when we have no facets
  • remove legend for facets
  • how to have two color scales when we have no facets? I.e. one color-scale for the ribbons, and one for the vertical bars?
  • drop intercept by default?
  • do we need to highlight the 95% CI at all?
  • include the null hypothesis (e.g., 0) and have a vertical bar there
  • italic lower case p

@bwiernik
Copy link
Contributor

bwiernik commented Nov 4, 2022

  • get rid of one of the two legends when we have no facets
  • add back one legend title when we have no facets
  • remove legend for facets
  • how to have two color scales when we have no facets? i.e. one color-scale for the ribbons, and one for the vertical bars?

I think rather than color, the lines for different confidence levels should be differentiated by line thickness. Make 25/50/75 thin lines and the focal hypothesis line (95) a thick line. Don't include a legend for either of these (with show.legend = FALSE).

  • drop intercept by default?

Yes

  • do we need to highlight the 95% CI at all?

Yes, I think including a focal hypothesis line makes this sort of plot a lot more accessible to people new to it. We can include a conf_level argument to let users customize which levels they want highlighted.

@strengejacke

This comment was marked as outdated.

@strengejacke

This comment was marked as outdated.

@bwiernik
Copy link
Contributor

bwiernik commented Nov 5, 2022

Given that we use "grid" to mean data grids elsewhere, maybe name that argument "facet"?

@strengejacke
Copy link
Member Author

I think we have n_columns as argument to activate facet_wrap() or have an integrated plot. But I'll change grid to n_columns.

@strengejacke
Copy link
Member Author

library(parameters)
library(see)

model <- lm(Sepal.Length ~ Species + Sepal.Width + Petal.Length, data = iris)
p <- p_function(model)

plot(p)

plot(p, show_labels = FALSE)

plot(p, n_columns = 1)

plot(p, n_columns = 2)

Created on 2022-11-05 with reprex v2.0.2

@strengejacke
Copy link
Member Author

@bwiernik How would you describe the p value here? What would be a proper formulation? The point estimate has a p-value of 1, but of course we cannot say that with 100% prob. the point estimate is correct. Rather, I would say that "With a probability of p, the range of estimates most compatible with our data, given a correct model, is from [CI_low] to [CI_high]"? I think we need to add something like this to the docs of the parameters::p_function().

@mattansb
Copy link
Member

mattansb commented Nov 5, 2022

@strengejacke These look so good!

strengejacke added a commit that referenced this issue Nov 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🔥 New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants