Add Max/Min of differences for numeric variables in "Differences" table of comparedf #369

Open
sheramin opened this issue Jun 17, 2024 · 0 comments


sheramin commented Jun 17, 2024

I'm using the comparedf function to cross-check two tables that usually contain calculated variables, such as BMI. Because the two tables were built by two different programmers, they may have used different rounding, so we can get differences that aren't really significant: just rounding and decimal differences. I think adding min.diff and max.diff columns to the diffs.byvar.table table for numeric variables would be really helpful in such cases. They would show the range of differences, so if the differences are reasonable we can skip the variable, and otherwise check the details. I know we can print the whole table by setting the control option and check it visually, but that's time-consuming when you have a ton of variables. Below is a visual example of what I mean. (The example datasets come from admiral_vs in the admiral.test package; BMI was calculated by reshaping the dataset.)
image

Below I tried to implement a solution in a very basic and simple way:

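library(dplyr)

add_xy_diff <- compareres$diffs.table %>% 
  mutate(
    # values.x and values.y are list columns, so test each pair on its own:
    # unlist() would coerce every value to character if any variable is non-numeric
    xy.diff = mapply(
      function(x, y) if (is.numeric(x) && is.numeric(y)) x - y else NA_real_,
      values.x, values.y
    )
  )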


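xy_diff_smry <- add_xy_diff %>% 
  # keep only rows with a numeric difference; an all-NA group would
  # otherwise return Inf/-Inf with a warning
  filter(!is.na(xy.diff)) %>% 
  group_by(var.x, var.y) %>% 
  summarise(
    min.diff = min(xy.diff, na.rm = TRUE),
    max.diff = max(xy.diff, na.rm = TRUE),
    .groups = "drop"
  )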


compareres$diffs.byvar.table %>% 
  left_join(xy_diff_smry, by = c("var.x", "var.y"))

image
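
To make the idea easy to try without the real datasets, here is a minimal, dependency-free sketch of the same steps in base R. The `diffs` and `byvar` data frames are hypothetical stand-ins that I made up to resemble `compareres$diffs.table` and `compareres$diffs.byvar.table`; the values are invented for illustration and are not taken from admiral_vs.

```r
# Hypothetical stand-in for compareres$diffs.table: one row per differing cell,
# with values.x / values.y stored as list columns (invented values).
diffs <- data.frame(
  var.x = c("BMI", "BMI", "BMI", "SEX"),
  var.y = c("BMI", "BMI", "BMI", "SEX"),
  stringsAsFactors = FALSE
)
diffs$values.x <- list(24.5, 30, 18.25, "M")
diffs$values.y <- list(24, 30.5, 18, "F")

# Difference per row; NA when either side is not numeric.
diffs$xy.diff <- mapply(
  function(x, y) if (is.numeric(x) && is.numeric(y)) x - y else NA_real_,
  diffs$values.x, diffs$values.y
)

# Range of differences per variable pair, using numeric rows only.
num <- diffs[!is.na(diffs$xy.diff), c("var.x", "var.y", "xy.diff")]
xy_diff_smry <- merge(
  aggregate(xy.diff ~ var.x + var.y, data = num, FUN = min),
  aggregate(xy.diff ~ var.x + var.y, data = num, FUN = max),
  by = c("var.x", "var.y"), suffixes = c(".min", ".max")
)
names(xy_diff_smry)[3:4] <- c("min.diff", "max.diff")

# Hypothetical stand-in for compareres$diffs.byvar.table (invented counts).
byvar <- data.frame(
  var.x = c("BMI", "SEX"),
  var.y = c("BMI", "SEX"),
  n = c(3L, 1L),
  NAs = c(0L, 0L),
  stringsAsFactors = FALSE
)

# Left join: non-numeric variables keep NA in min.diff / max.diff.
result <- merge(byvar, xy_diff_smry, by = c("var.x", "var.y"), all.x = TRUE)
print(result)
```

In this toy case BMI gets a range and SEX keeps NA in both new columns, which is the behaviour I would expect from min.diff and max.diff in the real table.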
