diff --git a/inst/WORDLIST b/inst/WORDLIST
index d206f94e2..8bc76e696 100644
--- a/inst/WORDLIST
+++ b/inst/WORDLIST
@@ -1,3 +1,5 @@
+’
+’s
 ANCOVAs
 ATEs
 ATT
diff --git a/pkgdown/_pkgdown.yml b/pkgdown/_pkgdown.yml
index d0411ec7a..791ba8c38 100644
--- a/pkgdown/_pkgdown.yml
+++ b/pkgdown/_pkgdown.yml
@@ -103,6 +103,10 @@ navbar:
           href: articles/introduction_comparisons_4.html
         - text: "Contrasts and comparisons for zero-inflation models"
           href: articles/introduction_comparisons_5.html
+        - text: -------
+        - text: "Comparison to other R packages"
+        - text: "Packages related to predictions, marginal means and effects"
+          href: articles/practical_comparison.html
     - icon: fa fa-newspaper
       text: News
       href: news/index.html
diff --git a/vignettes/overview_of_vignettes.Rmd b/vignettes/overview_of_vignettes.Rmd
index 4d37246c1..502b7192f 100644
--- a/vignettes/overview_of_vignettes.Rmd
+++ b/vignettes/overview_of_vignettes.Rmd
@@ -71,3 +71,7 @@ All package vignettes are available at [https://easystats.github.io/modelbased/]
 * [Slopes, floodlight and spotlight analysis (Johnson-Neyman intervals)](https://easystats.github.io/modelbased/articles/introduction_comparisons_3.html)
 * [Contrasts and comparisons for generalized linear models](https://easystats.github.io/modelbased/articles/introduction_comparisons_4.html)
 * [Contrasts and comparisons for zero-inflation models](https://easystats.github.io/modelbased/articles/introduction_comparisons_5.html)
+
+### Comparison to other R packages
+
+* [Comparison of R packages related to predictions, marginal means and effects](https://easystats.github.io/modelbased/articles/practical_comparison.html)
diff --git a/vignettes/practical_comparison.Rmd b/vignettes/practical_comparison.Rmd
new file mode 100644
index 000000000..83e68f8e9
--- /dev/null
+++ b/vignettes/practical_comparison.Rmd
@@ -0,0 +1,155 @@
+---
+title: "Case Study: Comparison of R packages related to predictions, marginal means and effects"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Case Study: Comparison of R packages related to predictions, marginal means and effects}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+bibliography: bibliography.bib
+---
+
+```{r set-options, echo = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  dev = "png",
+  out.width = "100%",
+  dpi = 300,
+  message = FALSE,
+  warning = FALSE,
+  package.startup.message = FALSE
+)
+
+pkgs <- c("emmeans", "marginaleffects", "ggeffects")
+if (!all(insight::check_if_installed(pkgs, quietly = TRUE))) {
+  knitr::opts_chunk$set(eval = FALSE)
+}
+if (getRversion() < "4.1.0") {
+  knitr::opts_chunk$set(eval = FALSE)
+}
+```
+
+This vignette compares the **modelbased** package to other common packages that can be used to compute adjusted predictions, marginal means, marginal effects, or contrasts and pairwise comparisons.
+
+**modelbased** is built on top of what are probably the two most popular R packages for extracting marginal means and effects, namely **emmeans** [@russell2024emmeans] and **marginaleffects** [@arel2024interpret]. Thus, you obtain the same results from **modelbased** as from either of the other two packages. This vignette shows how to replicate results using the different packages.
+
+```{r}
+# to create data grids, we use `insight::get_datagrid()`
+library(insight)
+
+# the four packages we compare
+library(modelbased)
+library(emmeans)
+library(marginaleffects)
+library(ggeffects)
+```
+
+The package design is built around the following three questions:
+
+1. Predictor of Interest: Which variable's effect on the outcome do you want to analyze? This is specified with the `by`, `contrast`, or `slope` arguments.
+
+2. Evaluation Points: At which specific values should the predictor be evaluated? This can also be defined in the `by` argument; additionally, the arguments `length` and `range` are available, especially for continuous predictors. For more refined control over the evaluation points, see the [data grids](https://easystats.github.io/insight/reference/get_datagrid.html) vignette.
+
+3. Target Population: What population should the inferences generalize to? The `estimate` argument controls this by defining whether predictions are for a typical individual, an average of the sample, or an average of a broader population.
+
+# Estimated marginal means
+
+We start with the default `estimate` option (`"typical"`), which is the same as if we were estimating marginal means using the **emmeans** package.
+
+## Categorical predictors
+
+```{r}
+# a very simple model
+data(iris)
+model <- lm(Petal.Length ~ Species, data = iris)
+
+# modelbased
+estimate_means(model, by = "Species")
+
+# emmeans
+emmeans(model, "Species")
+
+# marginaleffects
+avg_predictions(model, by = "Species")
+
+# ggeffects
+predict_response(model, "Species")
+```
+
+## Continuous predictors
+
+```{r}
+# a very simple model
+data(iris)
+model <- lm(Petal.Length ~ Sepal.Length, data = iris)
+
+# create a range of representative values
+grid <- get_datagrid(model, by = "Sepal.Length")
+grid
+
+# modelbased - by default, creates a range of 10 values from
+# minimum to maximum for numeric focal predictors
+estimate_means(model, by = "Sepal.Length")
+
+# emmeans
+emmeans(
+  model,
+  "Sepal.Length",
+  at = list(Sepal.Length = grid$Sepal.Length)
+)
+
+# marginaleffects
+avg_predictions(
+  model,
+  by = "Sepal.Length",
+  newdata = data.frame(Sepal.Length = grid$Sepal.Length)
+)
+
+# ggeffects
+predict_response(model, "Sepal.Length [4.3:7.9 by=0.4]")
+```
+
+## Interaction between continuous and categorical
+
+```{r}
+# a very simple model
+data(iris)
+model <- lm(Petal.Length ~ Sepal.Length * Species, data = iris)
+
+# create a range of representative values
+grid <- get_datagrid(
+  model,
+  by = c("Species", "Sepal.Length"),
+  range = "grid",
+  preserve_range = FALSE
+)
+grid
+
+# modelbased
+estimate_means(model, by = c("Species", "Sepal.Length"), range = "grid")
+
+# alternative notation - for "Sepal.Length", we want mean and +/- SD
+estimate_means(model, by = c("Species", "Sepal.Length = [meansd]"))
+
+# we could also pass a data grid to the `newdata` argument...
+estimate_means(model, by = c("Species", "Sepal.Length"), newdata = grid)
+
+# emmeans
+emmeans(
+  model,
+  c("Species", "Sepal.Length"),
+  at = lapply(grid, unique)
+)
+
+# marginaleffects
+avg_predictions(
+  model,
+  by = c("Species", "Sepal.Length"),
+  newdata = grid
+)
+
+# ggeffects
+predict_response(model, c("Species", "Sepal.Length"))
+```
+
+# References