[Feature Request] case statements needed for grouped regression #26

alexhallam · 2018-08-11T02:46:44Z

I use grouped regressions often (aka Many Models). Though your package makes it mostly painless to create grouped model sql code, maybe it would be nice to have a convenience function to do this.

Hopefully the code below illustrates what I am thinking.

# load libs
library(tidyverse)
library(tidypredict)
library(gapminder)

# make model function
country_model <- function(df) {
  lm(lifeExp ~ year, data = df)
}

# nest data for grouped regression
by_country <- gapminder %>% 
  group_by(country, continent) %>% 
  nest() %>% 
  mutate(model = map(data, country_model)) %>% 
  # use tidypredict to make sql strings by model
  mutate(sql_txt = map(model,~tidypredict_sql(., dbplyr::simulate_mssql()))) %>% 
  # add some case statements -- more work needed here
  mutate(case_string = paste0("WHEN COUNTRY = ", "'", country, "'", " THEN ")) %>% 
  # paste case statements to model sql
  mutate(full_string = paste0(case_string, sql_txt))
  
# paste all of the models together with case statements and return one big glob
# of text for sql
by_country %>% 
  unnest(full_string) %>% 
  summarise(full_sql_text = toString(full_string)) %>% 
  as.list()

The first few entries are shown below. I am sure there are many ways to implement this in a cleaner fashion than what I did.

WHEN COUNTRY = 'Afghanistan' THEN -507.534271561773 + (`year`) * (0.275328671328672), 
WHEN COUNTRY = 'Albania' THEN -594.072510955711 + (`year`) * (0.334683216783217),
WHEN COUNTRY = 'Algeria' THEN -1067.85903962704 + (`year`) * (0.569279720279723)
...

Would it be possible to add some convenience function to do some lifting for building case statements needed for grouped regression?

edgararuiz-zz · 2018-08-14T13:34:02Z

Hi Alex, thanks for the suggestion, let me take a look and see what we could do via the package.

edgararuiz-zz · 2018-08-16T02:37:26Z

Actually, I have another package called modeldb that is able to fit models inside databases, and it also respects grouped data:

library(tidyverse)
library(gapminder)
library(modeldb)
gapminder %>%
  group_by(country) %>%
  select(lifeExp, year) %>%
  linear_regression_db(lifeExp)
#> Adding missing grouping variables: `country`
#> # A tibble: 142 x 3
#>    country     `(Intercept)`  year
#>    <fct>               <dbl> <dbl>
#>  1 Afghanistan         -508. 0.275
#>  2 Albania             -594. 0.335
#>  3 Algeria            -1068. 0.569
#>  4 Angola              -377. 0.209
#>  5 Argentina           -390. 0.232
#>  6 Australia           -376. 0.228
#>  7 Austria             -406. 0.242
#>  8 Bahrain             -860. 0.468
#>  9 Bangladesh          -936. 0.498
#> 10 Belgium             -340. 0.209
#> # ... with 132 more rows

This package works with tidypredict but at this time, only with un-grouped data. If modeldb fits the models that you ultimately need, maybe we should focus on adding grouped model capability in tidypredict but based on the output from modeldb.

Thanks!

alexhallam · 2018-08-19T23:07:08Z

Does this translate to sql text?

edgararuiz-zz · 2018-08-20T23:36:31Z

Not yet, that's the portion I need to code, but wanted first to find out if it would be useful for anyone.

topepo added the feature a feature request or enhancement label Apr 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature Request] case statements needed for grouped regression #26

[Feature Request] case statements needed for grouped regression #26

alexhallam commented Aug 11, 2018 •

edited

Loading

edgararuiz-zz commented Aug 14, 2018

edgararuiz-zz commented Aug 16, 2018 •

edited

Loading

alexhallam commented Aug 19, 2018

edgararuiz-zz commented Aug 20, 2018

[Feature Request] case statements needed for grouped regression #26

[Feature Request] case statements needed for grouped regression #26

Comments

alexhallam commented Aug 11, 2018 • edited Loading

edgararuiz-zz commented Aug 14, 2018

edgararuiz-zz commented Aug 16, 2018 • edited Loading

alexhallam commented Aug 19, 2018

edgararuiz-zz commented Aug 20, 2018

alexhallam commented Aug 11, 2018 •

edited

Loading

edgararuiz-zz commented Aug 16, 2018 •

edited

Loading