@cameronpat I have wondered about this too! Especially since GAMs seem like a natural progression from "ordinary" linear models. Is it choosing the basis or interpreting the coefficients that's a turn-off? But those aren't decisions specific to #biostatistics. Perhaps it's just a lack of awareness? I've found #mgcv in #rstats super easy to use.
#RStats issues I'm struggling with that seem impossible to Google: Building a {brms} model within the {tidymodels} framework using {bayesian}.
The formula is inherently too complex (it includes splines and random effects) for the typical tidymodels workflow that involves recipes etc., so it must be added in at a later step. Three things:
1. Complex {brms} formulas don't seem to be possible with {tidymodels}, e.g., a genuinely multivariate model, or adding a formula for phi after my main formula via brms::bf(). It simply errors. :( This may just need some tweaking of {bayesian}'s scripts, or waiting for an update, since it's still fairly young.
2. Using {mgcv} random effect syntax like s(cat1, cat2, bs = "re") doesn't seem to get picked up as random effects in the model...I think? And I have never figured out whether this creates hierarchical random effects or not -- or whether multilevel random effects just aren't possible in this syntax(?).
3. Using an {lme4} random effect like (1 | cat1 / cat2) to ensure the hierarchy is preserved *does* retain random effects that I can pull out of the model later using `ranef`, but I cannot run this model through cross-validation or a myriad of other later steps, because it seems to force-create a complex web of interacting factor levels that don't exist. E.g., if my random effects are '(1 | realm / biome)', this eventually fails because it'll look for tundra biome types in Africa for some absurd reason.*
I noticed this while trying to solve *separate* issues with broom.mixed:::tidy.brmsfit(): it seems to delete the names of all the fixed effects and return them as 'NULL' character strings (???), and its reliance on `ranef` means it doesn't find random effects specified with {mgcv} syntax.
That's my rambling mess of an essay for the day. Not sure how many of these are real issues or me simply not understanding how these packages differ or wot.
* Almost wondering if this might even be a separate {tidymodels} issue right now. Every recipe seems to convert every single character column to a factor, regardless of how the recipe is built. Hmmmm.
#rstats #brms #mgcv #tidymodels
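On point 2, here is a small sketch in plain {mgcv} (not {brms}, so whether {brms} treats the term the same way is an assumption worth checking). Per the mgcv documentation for the "re" basis, s(cat1, cat2, bs = "re") builds its model matrix from ~ cat1:cat2 - 1, i.e. an interaction-only random effect, so the nested structure of (1 | cat1 / cat2) would need s(cat1, bs = "re") + s(cat1, cat2, bs = "re"). The data frame `d`, its realm/biome levels, and the effect sizes are made up for illustration.

```r
library(mgcv)

set.seed(1)
# Hypothetical toy data: each biome occurs in exactly one realm (nested)
d <- data.frame(
  realm = factor(rep(c("Africa", "Arctic"), each = 100)),
  biome = factor(rep(c("savanna", "desert", "tundra", "taiga"), each = 50))
)
realm_eff <- c(Africa = 1, Arctic = -1)
biome_eff <- c(savanna = 0.5, desert = -0.5, tundra = 0.8, taiga = -0.8)
d$y <- unname(realm_eff[as.character(d$realm)] +
              biome_eff[as.character(d$biome)]) +
  rnorm(nrow(d), sd = 0.5)

# Design matrix of the same form the "re" basis uses for two factors:
# one column per realm:biome combination, including combinations
# that never occur in the data (e.g. tundra in Africa)
X_int <- model.matrix(~ realm:biome - 1, data = d)
n_cols <- ncol(X_int)                   # all realm x biome combinations
n_empty <- sum(colSums(X_int) == 0)     # combinations with no observations

# Nested random intercepts, analogous to (1 | realm / biome) in lme4 syntax
m <- gam(y ~ s(realm, bs = "re") + s(realm, biome, bs = "re"),
         data = d, method = "REML")
summary(m)
```

The all-zero columns are a reminder that a crossed design matrix carries every factor-level combination, observed or not; whether that is related to the "tundra in Africa" failure in point 3 is only a guess.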
Absolutely gaga over this new preprint by Nick Clark and the @weecology group. So many methodological threads - long-term ecological monitoring, an open data system, careful semi-parametric models, simulation-based inference and forecasting rigor - combine into predicting complex multispecies dynamics while learning about their relationships + drivers
https://ecoevorxiv.org/repository/view/5143/, code at https://github.com/nicholasjclark/portal_VAR
Thread from Nick at: https://twitter.com/nj_clark/status/1635417591157260288
#ecology #forecasting #efi #mgcv #rstats
I'm working with a very large dataset, and am fitting smoothing splines with the `bam()` function and `discrete = TRUE` (which is an amazing speed boost!)
When I want to predict for new data from the fitted model, is there any reason why I can't set `discrete = FALSE` in `predict.bam()`? That is, the fitted bam model is still just a fitted gam model, right?
(I have *many* levels for a random effect, and the discretized predictions are erroring out with "data is too long")
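A minimal sketch of that comparison, assuming simulated data from mgcv's gamSim() in place of the real dataset. predict.bam() does take a `discrete` argument, and setting it to FALSE asks for regular (non-discretized) prediction. The two sets of predictions should agree closely, though not necessarily exactly, since the discrete method works with discretized covariates.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in for the real (much larger) dataset
dat <- gamSim(1, n = 2000, verbose = FALSE)

# Fit with the discretized, fast method
m <- bam(y ~ s(x0) + s(x1) + s(x2), data = dat, discrete = TRUE)

newd <- dat[1:10, ]

# Default for a model fitted with discrete = TRUE: discrete prediction
p_disc <- predict(m, newdata = newd)

# Regular prediction from the same fitted model
p_full <- predict(m, newdata = newd, discrete = FALSE)

# Largest absolute difference between the two prediction methods
max_diff <- max(abs(p_disc - p_full))
```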
#rstats friends
two bots of potential interest
@mgcv_updates tells you about what's new in #mgcv
and
@rverbsr is a silly bot that toots "verb that noun" phrases where the verbs are functions in R base and the nouns are R types
enjoy!
OK, a first convening of team #gams #mgcv here: @ericJpedersen @gavinsimpson @millerdl.
If I want to fit a spline but constrain it to go through certain points (e.g., the start and end of an epicurve should be zero), what's the best way? I'm thinking of adding points to the data at the ends of the range with very high weights. Not sure what the consequences of that would be. #rstats
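A rough sketch of the weighted pseudo-observation idea, assuming a Gaussian response and a made-up curve that is zero at both ends; the anchor weight of 1e4 is arbitrary. One likely consequence is that the heavily weighted rows enter the likelihood like real data, so the scale estimate and the uncertainty near the ends should not be taken at face value. For a single point, mgcv's s() also has a `pc` argument that forces the smooth through zero at a given covariate value, which may be a cleaner route.

```r
library(mgcv)

set.seed(42)
# Hypothetical data: noisy curve whose true value is 0 at x = 0 and x = 1
x <- seq(0, 1, length.out = 100)
y <- sin(pi * x) + rnorm(100, sd = 0.1)
dat <- data.frame(x = x, y = y, w = 1)

# Pseudo-observations pinning the curve to zero at both ends,
# with a very large (arbitrary) prior weight
anchors <- data.frame(x = c(0, 1), y = c(0, 0), w = 1e4)
dat_aug <- rbind(dat, anchors)

fit <- gam(y ~ s(x), data = dat_aug, weights = w)

# Fitted values at the two ends of the range
end_fit <- predict(fit, newdata = data.frame(x = c(0, 1)))
```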
I can feel the planet heating...
RT @ucfagls@twitter.com
A beautiful sight, when fitting a BAM to ~4 million rows of data #rstats #mgcv
https://twitter.com/ucfagls/status/1467792800922099715