I use grouped regressions often (aka Many Models). Although your package makes it mostly painless to generate SQL code for grouped models, a convenience function for this would be nice.
Hopefully the code below illustrates what I am thinking.
# load libs
library(tidyverse)
library(tidypredict)
library(gapminder)
# make model function
country_model <- function(df) {
  lm(lifeExp ~ year, data = df)
}
# nest data for grouped regression
by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  mutate(model = map(data, country_model)) %>%
  # use tidypredict to make sql strings by model
  mutate(sql_txt = map(model, ~ tidypredict_sql(., dbplyr::simulate_mssql()))) %>%
  # add some case statements -- more work needed here
  mutate(case_string = paste0("WHEN COUNTRY = ", "'", country, "'", " THEN ")) %>%
  # paste case statements to model sql
  mutate(full_string = paste0(case_string, sql_txt))
# paste all of the models together with case statements and return one big glob
# of text for sql
by_country %>%
  unnest(full_string) %>%
  summarise(full_sql_text = toString(full_string)) %>%
  as.list()
The first few entries are shown below. I am sure there are many ways to implement this in a cleaner fashion than what I did.
Actually, I have another package called modeldb that is able to fit models inside databases, and it also respects grouped data:
library(tidyverse)
library(gapminder)
library(modeldb)
gapminder %>%
  group_by(country) %>%
  select(lifeExp, year) %>%
  linear_regression_db(lifeExp)
#> Adding missing grouping variables: `country`
#> # A tibble: 142 x 3
#>    country     `(Intercept)`  year
#>    <fct>               <dbl> <dbl>
#>  1 Afghanistan         -508. 0.275
#>  2 Albania             -594. 0.335
#>  3 Algeria            -1068. 0.569
#>  4 Angola              -377. 0.209
#>  5 Argentina           -390. 0.232
#>  6 Australia           -376. 0.228
#>  7 Austria             -406. 0.242
#>  8 Bahrain             -860. 0.468
#>  9 Bangladesh          -936. 0.498
#> 10 Belgium             -340. 0.209
#> # ... with 132 more rows
This package works with tidypredict, but at this time only with ungrouped data. If modeldb fits the models that you ultimately need, maybe we should focus on adding grouped-model capability to tidypredict, based on the output from modeldb.
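To make that idea concrete, here is a rough sketch in base R of how a table of per-group coefficients, shaped like the modeldb output above, could be turned into a single CASE expression. The function name `coefs_to_case_sql()` is hypothetical and not part of tidypredict or modeldb, and the two rows use the rounded coefficients printed above, so the resulting predictions would only be approximate.

```r
# Per-group coefficients shaped like the modeldb output above.
# Values are the rounded numbers from the printed tibble (illustration only).
coefs <- data.frame(
  country = c("Afghanistan", "Albania"),
  `(Intercept)` = c(-508, -594),
  year = c(0.275, 0.335),
  check.names = FALSE
)

# Hypothetical helper (not part of tidypredict or modeldb): build one
# "WHEN <group> THEN <linear formula>" branch per row, wrapped in CASE ... END.
coefs_to_case_sql <- function(coefs, group_column) {
  # Every column except the group and the intercept is treated as a predictor
  predictors <- setdiff(names(coefs), c(group_column, "(Intercept)"))
  branches <- vapply(seq_len(nrow(coefs)), function(i) {
    slopes <- as.character(unlist(coefs[i, predictors, drop = FALSE]))
    linear <- paste0("(", predictors, " * ", slopes, ")", collapse = " + ")
    # Double any single quotes so the group value stays a valid SQL literal
    group <- gsub("'", "''", as.character(coefs[[group_column]][i]), fixed = TRUE)
    paste0("WHEN ", group_column, " = '", group, "' THEN ",
           coefs[["(Intercept)"]][i], " + ", linear)
  }, character(1))
  paste0("CASE\n  ", paste(branches, collapse = "\n  "), "\nEND")
}

grouped_sql <- coefs_to_case_sql(coefs, "country")
cat(grouped_sql, "\n")
```

This only covers plain linear models with numeric predictors; categorical predictors and identifier quoting for a specific database would need more work.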
Would it be possible to add some convenience function to do some lifting for building case statements needed for grouped regression?
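As a minimal sketch of what such a helper might look like, the base R function below takes the group values and one SQL expression per group (for example the strings returned by `tidypredict_sql()`) and wraps them in a single CASE expression. The name `build_case_sql()` and its arguments are hypothetical, not an existing tidypredict function, and the SQL expressions in the example are made up for illustration. Unlike the `paste0()` approach in the issue, it also doubles single quotes, so a group value such as Cote d'Ivoire would not break the statement.

```r
# Hypothetical helper, not part of tidypredict: combine one SQL expression
# per group into a single CASE expression.
build_case_sql <- function(group_values, sql_exprs, group_column) {
  stopifnot(length(group_values) == length(sql_exprs))
  # Accept either a character vector or a list of length-one SQL strings
  exprs <- vapply(sql_exprs, as.character, character(1))
  # Double single quotes so values with apostrophes stay valid SQL literals
  quoted <- paste0("'", gsub("'", "''", as.character(group_values), fixed = TRUE), "'")
  branches <- paste0("WHEN ", group_column, " = ", quoted, " THEN ", exprs)
  paste0("CASE\n  ", paste(branches, collapse = "\n  "), "\nEND")
}

# Example with made-up per-group expressions
case_sql <- build_case_sql(
  group_values = c("Albania", "Cote d'Ivoire"),
  sql_exprs = list("a1 + (year) * (b1)", "a2 + (year) * (b2)"),
  group_column = "country"
)
cat(case_sql, "\n")
```

With the nested tibble from the issue, the call would presumably look like `build_case_sql(by_country$country, by_country$sql_txt, "country")`, assuming each element of `sql_txt` is a single SQL string.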