7.3. Multiple Logistic Regression

Much like linear regression, logistic regression can include multiple independent (\(X\)) variables. When there is more than one \(X\), we assume the following relationship:

\[P(Y = 1 | X) = \frac{e^{\beta_0+\beta_1X_1+\beta_2X_2+...+\beta_pX_p}}{1+e^{\beta_0+\beta_1X_1+\beta_2X_2+...+\beta_pX_p}}\]
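
An equivalent way to write this relationship, which helps when reading the coefficient estimates later in this section, is in terms of the log-odds. Rearranging the expression above (a standard algebraic step, not something specific to this data set) gives:

\[\log\left(\frac{P(Y = 1 | X)}{1 - P(Y = 1 | X)}\right) = \beta_0+\beta_1X_1+\beta_2X_2+...+\beta_pX_p\]

In this form, a one-unit increase in \(X_j\), with the other variables held fixed, adds \(\beta_j\) to the log-odds and therefore multiplies the odds \(P(Y = 1 | X) / (1 - P(Y = 1 | X))\) by \(e^{\beta_j}\).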

Below we fit a multiple logistic regression using several of the independent variables in our data set:

depositLogMultiple <- glm(subscription ~ duration + campaign + loan + marital, 
                          data = deposit, family = "binomial")
summary(depositLogMultiple)

Call:
glm(formula = subscription ~ duration + campaign + loan + marital, 
    family = "binomial", data = deposit)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.9111  -0.4469  -0.3560  -0.2684   2.7765  

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -2.6923460  0.1629547 -16.522  < 2e-16 ***
duration        0.0036105  0.0001753  20.594  < 2e-16 ***
campaign       -0.0995488  0.0256961  -3.874 0.000107 ***
loanyes        -0.8927075  0.1821953  -4.900  9.6e-07 ***
maritalmarried -0.3825457  0.1547694  -2.472 0.013447 *  
maritalsingle  -0.0539759  0.1668874  -0.323 0.746372    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3231  on 4520  degrees of freedom
Residual deviance: 2641  on 4515  degrees of freedom
AIC: 2653

Number of Fisher Scoring iterations: 6
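
The estimates in this output are changes in the log-odds of a deposit, which are hard to read directly. A common way to make them more interpretable is to exponentiate them into odds ratios. The sketch below copies the estimates from the output by hand into a vector (here called coefs, a name chosen only for this illustration) so that it runs without the deposit data; with the fitted model available, exp(coef(depositLogMultiple)) should give the same values.

```r
# Coefficient estimates copied by hand from the summary output above
coefs <- c(
  "(Intercept)"  = -2.6923460,
  duration       =  0.0036105,
  campaign       = -0.0995488,
  loanyes        = -0.8927075,
  maritalmarried = -0.3825457,
  maritalsingle  = -0.0539759
)

# Exponentiating a log-odds coefficient gives an odds ratio:
# the factor by which the odds of a deposit are multiplied for a
# one-unit increase in that variable, holding the others fixed
oddsRatios <- exp(coefs)
round(oddsRatios, 3)
```

An odds ratio below 1 (as for campaign and loanyes) means the odds of a deposit go down as that variable increases, and a ratio above 1 (as for duration) means they go up.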

We interpret these estimated coefficients as follows:

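  • On average, a one-unit increase in duration corresponds to an increase in the probability that the contacted person makes a deposit, assuming all other variables in the model are held constant. The estimate of 0.0036 is a change in the log-odds, so each additional unit of duration multiplies the odds of a deposit by about \(e^{0.0036} \approx 1.004\).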

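  • On average, a one-unit increase in campaign corresponds to a decrease in the probability that the contacted person makes a deposit, assuming all other variables in the model are held constant. Each additional unit of campaign multiplies the odds of a deposit by about \(e^{-0.0995} \approx 0.905\), a decrease of roughly 9.5% in the odds.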

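  • On average, contacted persons with a personal loan are less likely to make a deposit than contacted persons without a personal loan, assuming all other variables in the model are held constant. Their odds of a deposit are about \(e^{-0.893} \approx 0.41\) times the odds for persons without a loan.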

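  • On average, contacted persons who are married are less likely to make a deposit than contacted persons who are divorced, assuming all other variables in the model are held constant. Their odds of a deposit are about \(e^{-0.383} \approx 0.68\) times the odds for divorced persons. Note that marital can take on three possible values (divorced, married, and single). By default, R treats the first level of a categorical variable (here divorced, the first alphabetically) as the baseline and creates dummy variables for the remaining categories (maritalmarried and maritalsingle), so each marital coefficient is a comparison against divorced.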

  • On average, contacted persons who are single are estimated to be slightly less likely to make a deposit than contacted persons who are divorced, assuming all other variables in the model are held constant. However, the p-value on this coefficient is large (about 0.75), so we cannot conclude that single and divorced persons actually differ in their likelihood of making a deposit.
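
To see how these interpretations combine, the sketch below plugs the estimated coefficients into the logistic formula for a hypothetical contacted person. The coefficient vector is copied by hand from the summary output, and the helper predictDeposit and the input values (duration of 300, campaign of 2, married) are made up for illustration. With the fitted model and data available, predict(depositLogMultiple, newdata = ..., type = "response") is the usual way to get the same kind of prediction.

```r
# Coefficient estimates copied by hand from the summary output above
coefs <- c(
  intercept      = -2.6923460,
  duration       =  0.0036105,
  campaign       = -0.0995488,
  loanyes        = -0.8927075,
  maritalmarried = -0.3825457,
  maritalsingle  = -0.0539759
)

# Hypothetical helper: predicted probability of a deposit.
# loan, married, and single are 0/1 dummy variables; a divorced
# person (the baseline) has married = 0 and single = 0.
predictDeposit <- function(duration, campaign, loan, married, single) {
  eta <- coefs[["intercept"]] +
    coefs[["duration"]] * duration +
    coefs[["campaign"]] * campaign +
    coefs[["loanyes"]] * loan +
    coefs[["maritalmarried"]] * married +
    coefs[["maritalsingle"]] * single
  plogis(eta)  # inverse logit: exp(eta) / (1 + exp(eta))
}

# Same hypothetical person, without and with a personal loan
pNoLoan <- predictDeposit(300, 2, loan = 0, married = 1, single = 0)
pLoan   <- predictDeposit(300, 2, loan = 1, married = 1, single = 0)
round(c(noLoan = pNoLoan, loan = pLoan), 3)
```

For these made-up inputs the predicted probability is roughly 0.10 without a loan and 0.04 with one. The odds change by the same factor (about 0.41) for any such person, but the change in probability depends on the values of the other variables.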