7.3. Multiple Logistic Regression
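Much like linear regression, logistic regression can include multiple independent (\(X\)) variables. When there is more than one \(X\), we assume the following relationship:
\[\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k\]
where \(p\) is the probability that the outcome occurs and \(X_1, \ldots, X_k\) are the \(k\) independent variables.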
Below we fit a multiple logistic regression using several of the independent variables in our data set:
# Fit a logistic regression of subscription on four independent variables
depositLogMultiple <- glm(subscription ~ duration + campaign + loan + marital,
                          data = deposit, family = "binomial")
summary(depositLogMultiple)
Call:
glm(formula = subscription ~ duration + campaign + loan + marital,
    family = "binomial", data = deposit)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.9111  -0.4469  -0.3560  -0.2684   2.7765

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)    -2.6923460  0.1629547 -16.522  < 2e-16 ***
duration        0.0036105  0.0001753  20.594  < 2e-16 ***
campaign       -0.0995488  0.0256961  -3.874 0.000107 ***
loanyes        -0.8927075  0.1821953  -4.900  9.6e-07 ***
maritalmarried -0.3825457  0.1547694  -2.472 0.013447 *
maritalsingle  -0.0539759  0.1668874  -0.323 0.746372
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3231  on 4520  degrees of freedom
Residual deviance: 2641  on 4515  degrees of freedom
AIC: 2653

Number of Fisher Scoring iterations: 6
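To see how these estimates translate into a predicted probability, the sketch below plugs the printed coefficients into the inverse logit by hand. The person described is hypothetical (the values 300 and 2 are illustrative assumptions, not taken from the data set), and the names beta, x, and x_loan are introduced only for this example.

```r
# Coefficient estimates copied from the summary() output above
beta <- c(intercept = -2.6923460, duration = 0.0036105, campaign = -0.0995488,
          loanyes = -0.8927075, maritalmarried = -0.3825457,
          maritalsingle = -0.0539759)

# Hypothetical person (assumed values): duration of 300, campaign of 2,
# no personal loan, married. Dummy variables are coded 0/1.
x <- c(intercept = 1, duration = 300, campaign = 2,
       loanyes = 0, maritalmarried = 1, maritalsingle = 0)

log_odds <- sum(beta * x)   # linear predictor on the log-odds scale
prob <- plogis(log_odds)    # inverse logit: exp(z) / (1 + exp(z))
prob

# The same hypothetical person, but with a personal loan
x_loan <- replace(x, "loanyes", 1)
prob_loan <- plogis(sum(beta * x_loan))
prob_loan
```

With the fitted model object available, predict(depositLogMultiple, newdata = ..., type = "response") should give the same kind of result without copying coefficients by hand.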
We interpret these estimated coefficients as follows:
On average, a one-unit increase in duration corresponds to an increase in the probability that the contacted person makes a deposit, assuming all other variables in the model are kept constant.
On average, a one-unit increase in campaign corresponds to a decrease in the probability that the contacted person makes a deposit, assuming all other variables in the model are kept constant.
On average, contacted persons with a personal loan are less likely to make a deposit than contacted persons without a personal loan, assuming all other variables in the model are kept constant.
On average, contacted persons who are married are less likely to make a deposit than contacted persons who are divorced, assuming all other variables in the model are kept constant. Note that marital can take on three possible values (divorced, married, and single), so divorced is our baseline because the model created explicit dummies for the other two categories (maritalmarried and maritalsingle).
On average, contacted persons who are single are less likely to make a deposit than contacted persons who are divorced, assuming all other variables in the model are kept constant. However, the p-value on this coefficient is quite large, so we cannot conclude that this difference is statistically significant.
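The direction of each effect described above can also be read from the exponentiated coefficients, which express each estimate as a multiplicative change in the odds of making a deposit. The following is a minimal sketch that copies the estimates from the printed output. The names coefs, odds_ratios, and direction are introduced only for this illustration.

```r
# Slope estimates copied from the summary() output (intercept omitted)
coefs <- c(duration = 0.0036105, campaign = -0.0995488, loanyes = -0.8927075,
           maritalmarried = -0.3825457, maritalsingle = -0.0539759)

# Exponentiating a log-odds coefficient gives an odds ratio
odds_ratios <- exp(coefs)
round(odds_ratios, 3)

# An odds ratio above 1 means higher odds of a deposit, below 1 means lower
# odds, holding the other variables constant
direction <- ifelse(odds_ratios > 1, "increase", "decrease")
direction
```

Only duration has an odds ratio above 1, matching the interpretations listed above. With the fitted model in memory, exp(coef(depositLogMultiple)) would produce the same quantities directly. The marital odds ratios are relative to the divorced baseline.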