--- title: "breakDown plots for the generalised linear models" author: "Przemyslaw Biecek" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{breakDown plots for the generalised linear model} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` Here we will use the HR churn data (https://www.kaggle.com/) to present the breakDown package for `glm` models. The data is in the `breakDown` package ```{r} library(breakDown) head(HR_data, 3) ``` Now let's create a logistic regression model for churn, the `left` variable. ```{r} model <- glm(left~., data = HR_data, family = "binomial") ``` But how to understand which factors drive predictions for a single observation? With the `breakDown` package! Explanations for the linear predictor. ```{r, fig.width=7} library(ggplot2) predict(model, HR_data[11,], type = "link") explain_1 <- broken(model, HR_data[11,]) explain_1 plot(explain_1) + ggtitle("breakDown plot for linear predictors") ``` Explanations for the probability with intercept set as an origin. ```{r, fig.width=7} predict(model, HR_data[11,], type = "response") explain_1 <- broken(model, HR_data[11,], baseline = "intercept") explain_1 plot(explain_1, trans = function(x) exp(x)/(1+exp(x))) + ggtitle("Predicted probability of leaving the company")+ scale_y_continuous( limits = c(0,1), name = "probability", expand = c(0,0)) ```