{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "e989fbf9", "metadata": { "tags": [ "remove-cell" ] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Registered S3 methods overwritten by 'ggplot2':\n", " method from \n", " [.quosures rlang\n", " c.quosures rlang\n", " print.quosures rlang\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Registered S3 method overwritten by 'rvest':\n", " method from\n", " read_xml.response xml2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "-- Attaching packages --------------------------------------- tidyverse 1.2.1 --\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "v ggplot2 3.1.1 v purrr 0.3.2 \n", "v tibble 2.1.1 v dplyr 0.8.0.1\n", "v tidyr 0.8.3 v stringr 1.4.0 \n", "v readr 1.3.1 v forcats 0.4.0 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "-- Conflicts ------------------------------------------ tidyverse_conflicts() --\n", "x dplyr::filter() masks stats::filter()\n", "x dplyr::lag() masks stats::lag()\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Parsed with column specification:\n", "cols(\n", " age = col_double(),\n", " marital = col_character(),\n", " education = col_character(),\n", " default = col_character(),\n", " housing = col_character(),\n", " loan = col_character(),\n", " contact = col_character(),\n", " duration = col_double(),\n", " campaign = col_double(),\n", " previous = col_double(),\n", " poutcome = col_character(),\n", " subscription = col_double()\n", ")\n" ] } ], "source": [ "library(tidyverse)\n", "library(ggplot2)\n", "deposit <- read_csv(\"../_build/data/marketing.csv\")" ] }, { "cell_type": "markdown", "id": "16f6c8c4", "metadata": {}, "source": [ "# Why Not Linear Regression?\n", "\n", "At this point, a natural question might be why one cannot use linear regression to model categorical outcomes. To understand why, let's look at an example. 
We have a data set related to a telephone marketing campaign of a Portuguese banking institution. Suppose we would like to model the likelihood that a phone call recipient will make a term deposit at the bank. The data is saved in a data frame called `deposit`, and the first few observations are shown below:" ] }, { "cell_type": "code", "execution_count": 2, "id": "36636052", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
age  marital  education  default  housing  loan  contact   duration  campaign  previous  poutcome  subscription
30   married  primary    no       no       no    cellular  79        1         0         unknown   0
33   married  secondary  no       yes      yes   cellular  220       1         4         failure   0
35   single   tertiary   no       yes      no    cellular  185       1         1         failure   0
30   married  tertiary   no       yes      yes   unknown   199       4         0         unknown   0
59   married  secondary  no       yes      no    unknown   226       1         0         unknown   0
35   single   tertiary   no       no       no    cellular  141       2         3         failure   0
\n" ], "text/latex": [ "\\begin{tabular}{r|llllllllllll}\n", " age & marital & education & default & housing & loan & contact & duration & campaign & previous & poutcome & subscription\\\\\n", "\\hline\n", "\t 30 & married & primary & no & no & no & cellular & 79 & 1 & 0 & unknown & 0 \\\\\n", "\t 33 & married & secondary & no & yes & yes & cellular & 220 & 1 & 4 & failure & 0 \\\\\n", "\t 35 & single & tertiary & no & yes & no & cellular & 185 & 1 & 1 & failure & 0 \\\\\n", "\t 30 & married & tertiary & no & yes & yes & unknown & 199 & 4 & 0 & unknown & 0 \\\\\n", "\t 59 & married & secondary & no & yes & no & unknown & 226 & 1 & 0 & unknown & 0 \\\\\n", "\t 35 & single & tertiary & no & no & no & cellular & 141 & 2 & 3 & failure & 0 \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| age | marital | education | default | housing | loan | contact | duration | campaign | previous | poutcome | subscription |\n", "|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| 30 | married | primary | no | no | no | cellular | 79 | 1 | 0 | unknown | 0 |\n", "| 33 | married | secondary | no | yes | yes | cellular | 220 | 1 | 4 | failure | 0 |\n", "| 35 | single | tertiary | no | yes | no | cellular | 185 | 1 | 1 | failure | 0 |\n", "| 30 | married | tertiary | no | yes | yes | unknown | 199 | 4 | 0 | unknown | 0 |\n", "| 59 | married | secondary | no | yes | no | unknown | 226 | 1 | 0 | unknown | 0 |\n", "| 35 | single | tertiary | no | no | no | cellular | 141 | 2 | 3 | failure | 0 |\n", "\n" ], "text/plain": [ " age marital education default housing loan contact duration campaign\n", "1 30 married primary no no no cellular 79 1 \n", "2 33 married secondary no yes yes cellular 220 1 \n", "3 35 single tertiary no yes no cellular 185 1 \n", "4 30 married tertiary no yes yes unknown 199 4 \n", "5 59 married secondary no yes no unknown 226 1 \n", "6 35 single tertiary no no no cellular 141 2 \n", " previous poutcome subscription\n", "1 0 unknown 0 \n", "2 4 failure 0 
\n", "3 1 failure 0 \n", "4 0 unknown 0 \n", "5 0 unknown 0 \n", "6 3 failure 0 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "head(deposit)" ] }, { "cell_type": "markdown", "id": "3a43cf2a", "metadata": {}, "source": [ "These variables are defined as follows:\n", "\n", "+ `age`: The age of the person contacted.\n", "+ `marital`: The marital status of the person contacted.\n", "+ `education`: The education level of the person contacted.\n", "+ `default`: Whether the person contacted has credit in default.\n", "+ `housing`: Whether the person contacted has a housing loan.\n", "+ `loan`: Whether the person contacted has a personal loan.\n", "+ `contact`: How the person was contacted (`cellular` or `telephone`; recorded as `unknown` in some rows, as in the output above).\n", "+ `duration`: The duration of the contact, in seconds.\n", "+ `campaign`: The number of contacts performed during this campaign for this client.\n", "+ `previous`: The number of contacts performed before this campaign for this client.\n", "+ `poutcome`: The outcome of the previous marketing campaign (such as `failure` or `success`; the rows above show `unknown` for clients with no previous contacts).\n", "+ `subscription`: Whether the person contacted made a term deposit (`1` if the person made a deposit and `0` if not).\n", "\n", "Let's try using linear regression to model `subscription` as a function of `duration`:" ] }, { "cell_type": "code", "execution_count": 3, "id": "28619470", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "Call:\n", "lm(formula = subscription ~ duration, data = deposit)\n", "\n", "Residuals:\n", " Min 1Q Median 3Q Max \n", "-1.47629 -0.11329 -0.05758 -0.01963 1.00009 \n", "\n", "Coefficients:\n", " Estimate Std. Error t value Pr(>|t|) \n", "(Intercept) -1.488e-02 6.203e-03 -2.399 0.0165 * \n", "duration 4.930e-04 1.675e-05 29.436 <2e-16 ***\n", "---\n", "Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n", "\n", "Residual standard error: 0.2926 on 4519 degrees of freedom\n", "Multiple R-squared: 0.1609,\tAdjusted R-squared: 0.1607 \n", "F-statistic: 866.5 on 1 and 4519 DF, p-value: < 2.2e-16\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "depositLinear <- lm(subscription ~ duration, data = deposit)\n", "summary(depositLinear)" ] }, { "cell_type": "markdown", "id": "29c355fe", "metadata": {}, "source": [ "Based on this output, our estimated regression equation is:\n", "\n", "$$predicted \\;subscription = \\hat{y} = -0.01488 + 0.0004930(duration)$$\n", "\n", "If we plot this line on top of our data, we can start to see the problem:" ] }, { "cell_type": "code", "execution_count": 4, "id": "31985c6a", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAAAY1BMVEUAAAAzMzMzZv9NTU1o\naGh8fHyDg4OMjIyVlZWampqjo6Onp6evr6+ysrK5ubm9vb3BwcHHx8fJycnQ0NDR0dHY2NjZ\n2dne3t7h4eHk5OTp6enq6urr6+vv7+/w8PD19fX///94A8tMAAAACXBIWXMAABJ0AAASdAHe\nZh94AAAgAElEQVR4nO2d6ULbyhKE5yJwgHAChIQkrH7/p7xY3m0ts091d9WPgzG2v6lG3zGW\nZMctGYZJjmu9AIbRENd6AQyjIa71AhhGQ1zrBTCMhrjWC2AYDXGtF8AwGuJaL4BhNMS1XgDD\naIjL9Dj/YxiDyS/S7C3eMpG0I1SUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSU\nsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JI\nuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHC\nypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLh\nIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMKUKkX93xt9fd4uFz6BHjV5UcFQgVJazM\nKVyk1+5IpIdulcXeJIokh6ADAVEiWKTXxZFIr92Pz9WT1I+BR4xfVXJUIFSUsDKnUJF+dbdH\nIt2tvzm4jiLJIehAQJQIFal7WB7/abe5miJJJOhAQJQIFel1OSTSZ3c78Ijxq0qOCoSKElbm\nFLHXbkCkX92f/sH6vDGMuWQR6X1xt/+Gz0hyCDoQECVyiPS5uD34jiLJIehAQJTIIdLt9fAj\njgWiuACEihJW5pQu0vv17fvwI44ForgAhIoS2RAXF6UJE6kg0p/u9vjHFEkOQRDi4mLUJIgS\nqSK9n3pEkQQRBCEuLkZVgiiRIFL/9Ue3ycAjxq8qOSoQKkpkQlxcjJsEUSJVpI4iCSaIQUx5\nhFGC70fCRagokQUx6RFGCYqEi1BRgiLFhiLJIQhBTHuEUYIi4SJUlMiAmPEIowRF
wkWoKJGO\nmPMIowRFwkWoKJGMuKBIY4EoLgChokQqYt4jjBIUCRehokQiwsMjjBIUCRehokQmkcoRvEKR\nJCNUlEhD+HiEUYIi4SJUlEhCeHmEUYIi4SJUlEhB+HmEUYIi4SJUlEhA+OxoSCN4hyJJRqgo\nkUGkcgTvUCTJCBUl4hG+HmGUoEi4CBUlohHeHmGUoEi4CBUlUkUqSAgIRZKMUFEiFuH/hIRR\ngiLhIlSUiEQEeIRRgiLhIlSUSBOpICEoFEkyQkWJOETIExJGCYqEi1BRIgoR5BFGCYqEi1BR\nIgYR5hFGCYqEi1BRgiLFhiLJIWAiAj3CKEGRcBEqSoQjQj3CKEGRcBEqSgQjgj3CKEGRcBEq\nSlCk2FAkOQRARLhHGCUoEi5CRYlARIRHGCUoEi5CRYkwRIxHGCUoEi5CRYkghO+by+MJkaFI\nkhEqSkSJVI4QGYokGaGiRAgiziOMEhQJF6GiRAAi0iOMEhQJF6GihD8i1iOMEhQJF6GiBEWK\nDUWSQ0BCRHuEUYIi4SJUlPBFxHuEUYIi4SJUlPBEJHiEUYIi4SJUlPBDxB2JDSGkhSJJRqgo\nESZSOUJaKJJkhIoSXogkjzBKUCRchIoSPog0jzBKUCRchIoSHohEjzBKUCRchIoSFCk2FEkO\nAQKR6hFECYoEjFBRYhaR7BFCCYqEjFBRYg6R7hFAiSVFQkaoKDGDSDoS60XIEookGaGihKdI\n5QhZQpEkI1SUmEbk8Kh5iT4UCRehosQkIotHrUusQ5FwESpKTCHyeIQxJ4qEi1BRgiLFhiLJ\nIbRFZPIIY04UCRehosQ4IpdHGHOiSLgIFSVGEdk8wpgTRcJFqCgxhshxJHaakDMUSTJCRYkR\nREaPMOZEkXARKkoMI3J6hDEnioSLUFFiWqRyhLyhSJIRKkoMIrJ6hDEnioSLUFFiCJHXI4w5\nUSRchIoSA4jMHmHMiSLhIlSUOEdk3dEwSMgfiiQZoaLEGSK7Rxhzoki4CBUlRkUqRygQiiQZ\noaLEKSK/Rxhzoki4CBUlThAFPMKYE0XCRagocYwo4RHGnCgSLkJFCYoUG4okh1AbUcQjjDlR\nJFyEihKHiDIeYcyJIuEiVJQ4QBTyCGNOFAkXoaLEHpH/SOwpoVwokmSEihLnIpUjlAtFkoxQ\nUWKHKOYRxpwoEi5CRYktopxHGHOiSLgIFSU2iIIeYcyJIuEiVJSgSLGhSHII1RAlPcKYE0XC\nRago0SOKeoQxJ4qEi1BRYoUo6xHGnCgSLkJFiS9EsSOxO0LxUCTJCBUlDkQqRygeiiQZoaLE\n8q20Rxhz0iySc279nROeowbjdfY/Oeh+MIf116N7vC0P77C7wdE9D+82OvCBn22uOvVo6lGC\nsnugmO0pcBWmRdr95n22VY3ZzGJ7ef31+CZv53c6uOvptW506xv42eaqrUenq0nO/oEitqfQ\nVVgWafebD9r4VKUfxfZy+F3HH/M0Az/bXrUTyY3dMi4HDxS+PQWvoolIbxjZDOvt/H+6ZnI0\nh/C7jj/m6KzPrtp75MZumfjrrX7n4fAZSXH6UWwvh991/DFPM/Cz9TUHHrmxW8bl4IG0PiPN\n3oKvkepkM4vt5fXX45sUfo105hFfI/kHRqTlwAYhM1L32l0ciDR1y7jsHoh77cpFBUJ8ieJH\nkNaBmBNFwkVIL1HJI4w5USRchPAStTzCmBNFwkXILrH1yMicKBIuQnYJipQYiiSHUBCx+8PO\nyJwoEi5Ccon9CyQjc6JIuAjBJQ52NBiZE0XCRQguQZHSQ5HkEEohDvd8G5kTRcJFiC1xdATJ\nyJwoEi5CbImjI7FG5kSRcBFSS1xQpByhSHIIRRAnpwYZmRNFwkUILXFyip2ROVEkXITMEqfn\nqhqZE0XCRYgscXbOt5E5USRchMQS5++dMDInioSLkFiCIlEkOITAEgNv5jMyJ4qEi5
BXYuhN\nsUbmRJFwEeJKDL653MicKBIuQlwJirSKy8SkSHIIeRHDn3ZiZE4UCRchrMSwR1bmRJFwEbJK\njHhkZU4UCRchqsQFRdrEZWJSJDmEfIhRj6zMiSLhIgSVGPfIypwoEi5CUIlxj6zMiSLhIuSU\nmPDIypwoEi5CTIkpj6zMiSLhIqSUmPTIypwoEi5CSImJHQ25ENOBmBNFwkXIKDHjkZU5USRc\nhIwSMx5ZmRNFwkWIKDHnkZU5USRchIQSsx5ZmRNFwkUIKDHvkZU5USRchIASFGkbioSLwC/h\n4ZGVOVEkXAR8CR+PrMyJIuEi0Et4eWRlThQJFwFeYu5IbAaEXyDmRJFwEeAl/DyyMieKhIvA\nLuHpkZU5USRcBHQJX4+szIki4SKQS3h7ZGVOFAkXgVyCIp2EIuEigEv4e2RlThQJF4FbIsAj\nK3OiSLgI2BIhHlmZE0XCRaCW8DwSm4IICsScKBIuArVEkEdW5kSRcBGgJcI8sjInioSLwCwR\n6JGVOVEkXARkiVCPrMyJIuEiIEtQpOFQJFwEYolgj6zMiSLhIgBLhHtkZU4UCReBVyLCIytz\noki4CLgSYUdioxAxgZgTRcJFwJWI8cjKnCgSLgKtRJRHVuZEkXARYCXiPLIyJ4qEi8AqEemR\nlTlRJFwEVgmKNBmKhIuAKhHrkZU5USRcBFKJaI+szIki4SKASsR7ZGVOFAkXgVMi6khsGCIh\nEHOiSLgInBIJHlmZE0XCRcCUSPHIypwoEi4CpUSSR1bmRJFwESAl0jyyMieKhIsAKUGRKJJs\nBEaJRI+szIki4SIgSqR6ZGVOFAkXgVAi2SMjc6JIwAiAEilHYj0R6QGYE0VCRrQvkcEjE3Na\nUiRkRPMSOTyyMKdVKBIuonmJHB5ZmNMqFAkX0bpEFo8MzKkPRcJFNC6RxyP9c1qHIuEi2pbI\n5JH6OW1CkXARTUtk2dEwjcgViF82RcJFtCyRzSPlc9qFIuEiEEQqiMgWiF82RcJFNCyRzyPd\nc9onWKSHRbd4+Nx//3l6BUWSQxhDZPRI9ZwOEirSbbfK9e7790V/xeJ94BHjV5UcFYhmJXJ6\npHlOhwkU6V+3eF2+Lrp/2yt+dA9f/33ofgw8YvyqkqMCQZFQCPlFeuj+fP33d/dze0XXHX45\nfsT4VSVHBaJViaweKZ7TUQJFuutWf8O9dnfbKxYbkRYDjxi/quSoQDQqkdcjvXM6TqBIZ09A\nPzd/2u2eoiiSIMIQIrNHaud0klSRlr9WexsWv9YP1ueNEZzdkdjWCxGWZJF+9nvt9k9IfEYS\nRBhA5H5C0jqn06SK9Gv1p93nj+7XwCPGryo5KhAtSmT3SOmczhIo0uJUpOtudSz28+DIEkWS\nQzhD5PdI55zOE7XX7n2/1467vyUTThEFPFI5p4EEivSzP470p99V12f9FPXJ3d8iCRQpGyL1\nzIaHbnWe3cPeLIokiHCCKOGRxjkNJfRcu+t+J93t6uL6z7nb/RWnjxi/quSoQNQuUcQjhXMa\nTKhI65O9+4ub10X7K04fMX5VyVGBqFyijEf65jQcvh8JF0GRUAgUSTSibolCHqmb00goEi6i\naolSHmmb01goEi6ihUglEaUC8cumSLiImiWKPSEpm9NoKBIuomKJch7pmtN4KBIuor5IJRHl\nAvHLpki4iHolCj4hqZrTRCgSLqJaiZIeaZrTVCgSLqJWiaIeKZrTZCgSLoIioRAokmhEpRJl\nPdIzp+lQJFxEnRKFPVIzp5lQJFxElRKlPdIyp7lQJFwERUIhUCTRiBolinukZE6zt6BIuIgK\nJcp7pGNOFEkyojyhgkcq5kSRRCOKEy4oUjYERcJFVBOpKETBnCiSbERpQhWPFMzJC0GRcBGF\nCXU8kj8nPwRFwkWUJVTySPycPBEUCRdBkVAIFEk0oiihlkfS5+SLoEi4iJKEah4Jn5M3
giLh\nIgoSth5xTrkQFAkXUY5wQZFyIygSLqK8SJxTNgRFwkUUI+xfIHFOuRAUCRdRinCwo4FzyoWg\nSLiIQoTDHXacUy4ERcJFUCQUAkUSjShDODqCxDnlQlAkXEQRwvGRWM4pF4Ii4SJKEE7OaOCc\nciEoEi6iAOH0PbGcUy4ERcJF5Cecvbecc8qFoEi4iOyE889o4JxyISgSLqKYSAUR5xE4pxgE\nRcJF5CYMvHWCc8qFoEi4iMyEobcgcU65EBQJF5GXMPhWPs4pF4Ii4SKyEs53NGRHDEfYnGIR\nFAkXkZMw7BHnlA1BkXARJUQqiBiJrDlFIygSLiIjYcQjzikbgiLhIvIRxjzinLIhKBIuIhth\n1CPOKRuCIuEiKBIKgSKJRuQijHvEOWVDUCRcRCbChEecUzYERcJF5CFMecQ5ZUNQJFxEFsLI\nkdiciOkImVMqgiLhIrKKVBAxHSFzSkVQJFxEDsK0R5xTNgRFwkVkIMx4xDllQ1AkXEQ6Yc4j\nzikbgiLhIigSCoEiiUYkE2Y94pyyISgSLiKVMO8R55QNQZFwEYkED484p2wIioSLSCNMH4nN\ngvAK+pwyISgSLiKPSAURXkGfUyYERcJFJBG8POKcsiEoEi4iheDnEeeUDUGRcBEJBE+PzM8p\nH4Ii4SIoEgqBIolGxBN8PbI+p4wIioSLiCZ4e2R8TjkRFAkXEUvw98j2nLIiKBIuIpLgdSQ2\nDRES2DnlRVAkXESiSAURIYGdU17ElEjPV26bACZFaksI8cjynDIjJkR6do4itUREEYI8Mjyn\n3IgJka4oUltEDCHMI7tzyo6YEMm5m48IJkVqSaBIjRCTIkUxKVJDQqBHZueUHzEhUnd2jVco\nUjtCqEdW51QAMSHSvXuKYVKkZoRgj4zOqQRiavf3VRdjEkVqRQg5EhuJiAjenIogJl8jca9d\nU0SsSAUREcGbUxEERcJFBBIiPDI5pzIIioSLCCPEeGRxToUQPNcOFxFEiPLI4JxKISgSLoIi\noRAokmhECCHOI3tzKoaYFunxm3PfHsOYFKkBIdIjc3Mqh5gS6WVz2urVSwiTItUnxHpkbU4F\nEVMiddt9dpchTIpUnRBxJDYUER+gOZVETIj0n3P3X89FL/dfXwOYFKk6IdojY3MqiZgQ6dJt\nXh09uqsA5v/emLrZetR6Habj9TYKHpBtgvAkxD8f2ZpTWQRFwkX4EVI8sjSnwojJt5rH/Wk3\newuI4gIQFAmFkPp+JO5saIrwIiR5ZGhOpRFeu7+7ECZFqklI88jOnIojJg/IbkzqeEC2CcKD\nkOiRmTmVR/AUIVwERUIh8KRV0Yh5QqpHVuZUAUGRcBGzhGSPjMypBmJYpP49sXyHbGPEHCHd\nIxtzqoKgSLgIioRCoEiiETOEDB6ZmFMdBF8j4SKmCTk8sjCnSgiKhIuYJGTxyMCcaiEoEi5i\nihD/Xj5vRKZQpN0ro4+bbwFMilSDkMcj/XOqhuDbKHARE4RMHqmfUz3EiEg37jgBTIpUnpDL\nI+1zqogYe0bqjjzin3YtEKOEbB4pn1NNxJhIh/8Sc3cTwqRIxQkUqTIh12ukoFCk0oR8Hume\nU1UERcJFjBAyeqR6TnURPI6Eixgm5PRI85wqIyZF+nvTOXf5/W8QkyIVJWQ6EjuFyBuKtPy+\n3dvwXwiTIhUlZPVI8ZxqIyZEetzvt3sOYFKkkoS8HumdU3XE5EcWd08fy+XHU8fPtWuCGCBk\n9kjtnOojJvfabV4c/eWZDU0Q54TcHmmdUwPEhEjd7pqgD7ajSOUIFKkNIU2k79tnpGd+0moT\nxBkhu0dK59QCMbXX7qp/jfRy34W8RKJIxQj5PdI5pyaIyddIUaeAU6RChAIeqZxTGwRFwkUc\nE/IeiR1EFAlFokiNEcMiFUQUCUWKDEUqQijikcI5tUJQJFzEIaGMR/rm1AxBkXARB4RCHq
mb\nUzsEP2kVF0GRUAgUSTRiTyjlkbY5NURQJFzEjlDMI2VzaongayRcxJZQziNdc2qKoEi4iA2h\nyJHYY0TJUKRvIaeqDj3iWCCKC0CciFQQUTIUiZ8i1BixJpT0SNOcGiMm3yH7EcOkSFkJRT1S\nNKfWiAmRXi6vniJUokg5CWU90jOn5gjPk1YDmBQpJ4EiIRAokmjEW3GP1MypPYIi4SLeinuk\nZU4ACB5HwkW8FfdIyZyKEyiSaETJI7GbqJgTRAmKhIso75GOOUGUmBTp6XL138uwMxwoUqZU\n8EjFnDBKzHyIfv8D9z2ESZHypIZHGuYEUmJCpCe3Fck9BTApUpZU8UjBnKoQ0kS6clf9iQ0f\nV/zHmOsjKBIQIfU40uYEoQ8eR6qOqOOR/DnVIeQ6+5si1UZU8kj8nCoR0kTq9h+iz3+Noi6i\nlkfS51SLkCbSd9f996XS3+9hu+0oUnIqHIndRPacqhHSRProdqfavQQwKVJq6nkke071CInH\nkf5uTOpC/glZipSanUeSS9REQJSYPkXo8Ztz3x7DmBQpMbvnI8klaiIgSvBcOzTE/u86wSWq\nIiBKUCQwxMHrI7kl6iIgSkyKdH+5XD517ibokxsoUkoO9zOILVEZAVFi8t+QdW750u9tCGFS\npIRcUCREQppIj6uTVi/7/Xb8V83rII53fAstUR0BUWLqk1bd1cfL6uno0YX8s+YUKT7HB5CE\nlqiOgCgxc9LqfX9WA8+1q4M4ORArs0R9BESJmZNWvzn3SJEqIU5PaBBZogECosSMSP3pQS8U\nqQbi7MQgiSVaICBKTH729/PT6iXSx5W7CWBSpMhQJFhC6tnfq3znW83rIM7PVBVYogkCosTs\n2d8f/PCTKoiBM77llWiDgCgxdUD25cpdPn/9iRfyfESR4jL0zglxJRohIErwXDsIxOBbkKSV\naIWAKEGRIBCDb+WTVqIVAqLEpEjPN18vky6//w1iUqTwDL8lVliJZgiIEjOftOo2e+4CQpGC\nM/LWclkl2iEgSsyctLpJyJtkKVJoxj6iQVSJhgiIEpMHZLvVvyH78dS5ywAmRQoNRUInpJ4i\ntHlx9JenCJVEjHkkqkRLBESJyQ+I3P2IHxBZDjHqkaQSTREQJSZE+m//Sav8gMhiiHGPBJVo\ni4AoMflW8251TsPHfRfyvj6KFJTBI7FZCZNRgYAoMSySO00AkyKFZMIjOSUaIyBKUKSmiCmP\nxJRojYAoQZFaIiY9klKiOQKiBM+1a4iY9khIifYIiBIUqSGCIgkhUCRoxIxHMkoAICBKTJ7Z\nwNdIJRFzHokogYCAKBEs0sOiWzx8Ht/y3+GZDxTJL7MeSSgBgYAoESrSbbfK9dENPxcUKTwU\nSQ4hz2uk54MzG/51i9fl66L7d3iDu44iBWfeIwElMBAQJXx2NjzvP0T/ofvz9d/f3c+DH//u\nKFJwPDzCLwGCgCjhtddu/36ku+7967+v3d3+h+/dLUUKjodH+CVAEBAl/ETa/WyjzKE5t907\nRQqNzxMSfAkUBEQJD5E+vk+J9LP7vfv2f33emLlsPWq9DiZb/Pba7fY2nInU/5XHZ6TA+Dwf\nwZeAQUCU8BNp94FcZyJdLz4pUmi8/rBDL4GDgCjhI9K35911ixORfvR78ShSUDw9wi4BhIAo\nEXiu3Xqv3ftur123y8Ajxq8qOcgIX4+gSyAhIEoEivSzfwb60z1svqdI4aFI4gj5RRo8s4F/\n2oXE2yPkElAIiBKTIt1fLpdPnbv52F913T/93K4u7vWhSP7x9wi4BBYCosTkpwg5t/r3Y93h\nx9p99md/9xcpUkQCPMItAYaAKDHz2d/Ly36/3f3pz7weMX5VyYFFUCSJhDSRvrmrj5fV09Gj\nC/lgO4o0kRCPYEugISBKTB5H+lje9/+mC98hmwkR5BFqCTgERIlJkVbPSqt/0oUi5UGEeQRa
\nAg8BUWJGpK/XRy+rHQ4BTIo0lguKJJSQJtKle35avUT6uHI3AUyKNJJQjyBLICIgSkyItP6X\nL7+vnpWeApgUaTjBHiGWgERAlJgQ6aNbifSx5L8hmwUR7BFiCUgERImpA7IvV+7y+etPvJDn\nI4o0knCPAEtgIiBK8JNW6yAiPMIrAYqAKEGRqiBiPIIrgYqAKEGRaiDCdzSEEiKjAgFRgiJV\nQMR5BFYCFwFRgiJVQMR5BFYCFwFRgiKVR0R6hFUCGAFRgiIVR8R6BFUCGQFRgiKVRkR7hFQC\nGgFRgiKVRlAk+QSK1B4R7xFQCWwERAmKVBaR4BFOCXAERAmKVBSR4hFMCXQERAmKVBIReSQ2\ngJAYFQiIEhSpJCLJI5QS8AiIEhSpICLNI5AS+AiIEhSpHCLRI4wSAhAQJShSMUSqRxAlJCAg\nSlCkYgiKVAkBUYIilUIke4RQQgQCogRFKoRI9wighAwERAmKVAaRwaP2JYQgIEpQpCKItCOx\nPoQ8UYGAKEGRiiByeNS8hBQERAmKVAKRxaPWJcQgIEpQpAKIPB7pn5MYAkVqgsjkkfo5ySFQ\npCYIilQXAVGCImVH5PJI+5wEEShSA0Q2j5TPSRKBItVH5PNI95xEEShSdUSWI7GThKxRgYAo\nQZHyIjJ6pHpOsggUqTYip0ea5ySMQJEqI7J6pHhO0ggUqS4ir0d65ySOQJHqIihSEwRECYqU\nD5HZI7VzkkegSDURuT3SOieBBIpUEZHdI6VzkkigSPUQOY/EDhNKRAUCogRFyoTI75HOOYkk\nUKRqiAIeqZyTTAJFqoUo4ZHGOQklUKRKiCIeKZyTVAJFqoSgSC0RECUoUgZEGY/0zUksgSJV\nQRTySN2c5BIoUg1EKY+0zUkwgSJVQBQ4EntCKBgVCIgSFCkVUcwjZXOSTKBI5RHlPNI1J9EE\nilQcUdAjVXOSTaBIpRElPdI0J+EEilQaQZEAEBAlKFIKoqhHiuYknUCRyiLKeqRnTuIJFKko\norBHauYkn0CRSiLKHYndEko9sDIERAmKFJviHimZE0WKjBGRynukY04UKTbGRCqIUDEnihQb\nGyJV8EjFnChSdEyIVMMjDXOqgoAoQZFiUsUjBXOqg4AoQZEiUmFHwyri51QJAVGCIoWnkkfi\n51QLAVGCIoWnkkfi51QLAVGCIgWnlkfS51QNAVGCIoVm65GRDUQAAqIERQrM7vnIyAYiAAFR\ngiIFhiLBISBKUKSw7F8gGdlABCAgSlCkoBzsaDCygQhAQJSgSCE53GFnZAMRgIAoQZFCQpEQ\nERAlKFJAjo4gGdlABCAgSlAk/xwfiTWygQhAQJSgSP45PqPByAYiAAFRgiJ554IiYSIgSlAk\n35yeYmdkAxGAgChBkXxzeqqqkQ1EAAKiRAGR3lRmd64qwwyEz0h+OX/vhJH/0wpAQJSgSF4Z\neA+SkQ1EAAKiBEXyCkUCRkCUoEg+GXpTrJENRAACogRF8sjgm8uNbCACEBAlKNJ8hj+kwcgG\nIgABUYIizYciYSMgSlCk2Qx7ZGUDEYCAKEGR5jLikZUNRAACogRFmsmYR1Y2EAEIiBIUaToX\nFAkeAVGCIk1n1CMrG4gABEQJijSZcY+sbCACEBAlKNJUJjyysoEIQECUoEgTmfLIygYiAAFR\ngiJNhCKJQECUoEjjmfTIygYiAAFRgiKNZtojKxuIAARECYo0lhmPrGwgAhAQJSjSSMaPxGZD\nzEXEnAAQECUo0kjmPLKygQhAQJSgSMOZ9cjKBiIAAVGCIg1m3iMrG4gABEQJijQUD4+sbCAC\nEBAlKNJQKJIkBEQJijQQH4+sbCACEBAlKNJ5vDyysoEIQECUoEhn8fPIygYiAAFRgiKdZvZI\nbDrCM9hzwkFAlKBIJ/H1yMoGIgABUYIiHcfbIysbiAAERAmKdBxvj6xsIAIQECUo0lH8PbKy\n
gQhAQJSgSIcJ8MjKBiIAAVGCIh0kxCMrG4gABEQJirSP/46GaERQUOeEhoAoQZF2CfPIygYi\nAAFRgiLtEuaRlQ1EAAKiBEXaJtAjKxuIAARECYq0SahHVjYQAQiIEhRpnWCPrGwgAhAQJSjS\nOhRJLgKiBEXqE+6RlQ1EAAKiBEVaJcIjKxuIAARECYq0jPPIygYiAAFRgiIFH4mNQUQFbU6o\nCIgSFCnyCcnKBiIAAVGCIkV6ZGUDEYCAKEGRIj2ysoEIQECUMC9SrEdWNhABCIgSFMGykVUA\nAAx0SURBVIkiSUdAlLAuUrRHVjYQAQiIEsZFivfIygYiAAFRwrZICR5Z2UAEICBKmBYp7khs\nECIlMHMCR0CUoEiRHlnZQAQgIEpYFinJIysbiAAERAnDIqV5ZGUDEYCAKGFXpESPrGwgAhAQ\nJShSQURiMOaEj4AoYVakVI+sbCACEBAlrIqU7JGVDUQAAqKEUZHSPbKygQhAQJSwKVLSkVg/\nRIa0n5MMBEQJ2yIVRGRI+znJQECUMClSDo+sbCACEBAlLIqUxSMrG4gABEQJgyLl8cjKBiIA\nAVGCIpVA5AnEBiIAAVHCnkiZPLKygQhAQJQwJ1Iuj6xsIAIQECWsiZTNIysbiAAERAljIuU4\nEjuDyBeIDUQAAqKEUZEKIvIFYgMRgIAoYUukjB5Z2UAEICBKmBIpp0dWNhABCIgSlkTK6pGV\nDUQAAqIERcqKyBqIDUQAAqKEIZHyemRlAxGAgChhR6TMHlnZQAQgIEqYESm3R1Y2EAEIiBIU\nKR8idyA2EAEIiBJWRMrukZUNRAACooQRkfJ7ZGUDEYCAKGFLpJKI/IHYQAQgIErYEKnAE5KV\nDUQAAqKECZFKeGRlAxGAgChhSaSSiBKB2EAEICBKWBCpyBOSlQ1EAAKihAGRynhkZQMRgIAo\noV+kQh5Z2UAEICBKqBcp45vLxxClArGBCEBAlNAuUjGPrGwgAhAQJYJFelh0i4fPiSswRSqI\nKBaIDUQAAqJEqEi33SrXE1dAiVTOIysbiAAERIlAkf51i9fl66L7N3oFlEgFPbKygQhAQJQI\nFOmh+/P139/dz9ErMERyfbYeOfhsV3x03dvJ9/3tzm54cNXmFkdjWP9n/4PNXU4ezGegy9OH\n97jf100o0lDuuvev/752d6NXLBFEWm8dF3JE8s5y9qqzMRz9YHvx9J5eAz2+ocf9fB46PRJF\n6rrDL0NXLAFEWv/KNXrkk5MxHP3A417jAz2+ocf9fB46Q/SJ9L8+b82z/v0Z9cidjOHoBx73\nGh/o8Q097ufz0Fqi9xnJqkfueAzHP/C41+hAT27ocT+fh84Qfc9IJ48Yv6rUOLUeLWevOh7D\n6Q+2F0/vOT/Qsxt63M/nodMjUaTFqTdnVywRRFpK84h77RIiUaT1Trr3071272B77ZZvJY8g\nbRDlHroaQQcCokSgSD/7w0Z/uofRKzBEKu+RlQ1EAAKihM4zGyp4ZGUDEYCAKBF6rt11f2rd\n7eri+nXRwRWnjxi/quSU98jKBiIAAVEiVKTP/mTv/uJapIMrTh8xflXJeaNIdhAQJfS+H6mw\nR1Y2EAEIiBJ6RZKPUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5C\nRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDIn\nioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgV\nJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwo\nEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSU\nsD
InioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JI\nuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHC\nypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLh\nIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkr\nc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSL\nUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazM\niSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5C\nRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDIn\nioSLUFHCypwoEi5CRQkrc6JIuAgVJazMiSLhIlSUsDInioSLUFHCypwoEi5CRQkrcyog0hvD\nmAufkXARKkpYmRNFwkWoKGFlThQJF6GihJU5USRchIoSVuZEkXARKkpYmRNFwkWoKGFlThQJ\nF6GihJU5USRchIoSVuZEkXARKkpYmRNFwkWoKGFlThQJF6GihJU5USRchIoSVuZEkXARKkpY\nmRNFwkWoKGFlThQJF6GihJU5USRchIoSVuZEkXARKkpYmRNFwkWoKGFlThQJF6GihJU5USRc\nhIoSVuZEkXARKkpYmRNFwkWoKGFlThQJF6GihJU5USRchIoSVuZEkXARKkpYmRNFwkWoKGFl\nThQJF6GihJU5USRchIoSVuZEkXARKkpYmRNFwkWoKGFlThQJF6GihJU5USRchIoSVuZEkXAR\nKkpYmRNFwkWoKGFlTgVEYhiDyS4SROafFgVERQkdLUJKuFKLaBFzvz3cqGhBkSRHRQkdLcyK\nxDCt4lovgGE0xLVeAMNoiGu9AIbRENd6AQyjIa71AhhGQ1zrBWTLw6JbPHy2XkVwfnWbCwfr\nH76Im1/XM0sX0OLzR9f9eF1fjinhiq+wUm67Va5bLyM0r91GpIP1D1/EzUO/yMVqG5PbYtGv\nsTcpqoSrssry+dctXpevi+5f64WE5WvFa5EO1j98ETev3Y/P1TPrD8ktHlbLf+julrElXMXF\nlsxD9+frv7+7n60XEpRf3e1GpIP1D1/Ezd26waqI3BaLbvWE2v8y4kq4Sgstnbvufbn6n+Nd\n64UEpXtYbkQ6WP/wRfisikhv0S2WsSVcrTUWTtcdfpGS19OFr74MX0TPZ3crvsVD92sZW8KV\nX16ViPllnUaLSL9Wf/qIbvG7+/oDYUmRDr8IihKR3herv3lEt/h1t+hfAVGkJf4v6zw6RPpc\n3K6+CG+x/LH62862SAsxv6yTbFZ8sP7hi9i5XR9fEd7i64XeIraEq7C8GlnvVHkXsGfoJEd7\n7d73+4dOLyLn/fr2vb8gusUq+12PoSVctTWWzc9+N/+f9ctFSdmIdLD+4YvA+dPdbi7JbbE+\njvS+OnMhroSrtdLCkXD0fDDyz2x433kkuEV/ZsPn3eo1ku0zG5bX/alQt/M3BMv2j+6D9Q9f\nhM2PbpOl5BaLuZXPlHBVVlkhn/3Jua1XEZ6tSAfrH74Im+5AJLktVid3X//qL0WVcMUXyDAG\n4lovgGE0xLVeAMNoiGu9AIbRENd6AQyjIa71AhhGQ1zrBTCMhrjWC2AYDXGtF8AwGuJaL4AZ\nj3Ouyn2Y9LjWC2DGEybFTcR9mFxxrRfAjCdEiufOBd+HyRfXegHMeEKkoEBt41ovgBkPRZIT\n13oB
zFAeL13331aOrSP7b18u3bevS/eXX5e/Pa+vc+sf74S6v3Lu8r/l7p73nXNXzxU72Ipr\nvQBmIN97La5GRfoS6H65vNro8zQg0ku3vqZ72dxlc2OaVCiu9QKY8zx/CfB3+bcbFenq4+vr\nff/l/uu7gRt93ffx64lt/8PuS7ebzbdM/rjWC2DOc9M/yyyfRkVa/XR56V6Orz240WP/lNVr\n9ri/ywdfSBWLa70A5jzdenv/GBXpY3PDj+f/hv/+++bWlr24/sXUySMw+eNaL4A5z7k5Q98u\nn7cvks5vtLvN8bcUqVhc6wUw5/ETafWX383jC0WCiGu9AOY8l+s/3sb/tNve6u/5tfzTrk1c\n6wUw5/m+3jfwePyi6HlQmeFnpPOdDcvDGzHZ41ovgDnP3+Pd319PL9/7k+nOnpHul3+3OxtW\nOyg+9j892/29PPzKZI9rvQBmIMcHZJ/d0bdbGzZXb/7Cu1m9Ytr/dHtA9vJwFzlFKhfXegHM\nUFanCH3fbferp53L+8G9dldPL2uBVs9bhy+H+lOEru7XlylS8bjWC2AYDXGtF8AwGuJaL4Bh\nNMS1XgDDaIhrvQCG0RDXegEMoyGu9QIYRkNc6wUwjIa41gtgGA1xrRfAMBriWi+AYTTEtV4A\nw2jI/wEJ3so6O4LxRwAAAABJRU5ErkJggg==", "text/plain": [ "plot without title" ] }, "metadata": { "filenames": { "image/png": "C:\\Users\\phamilton\\Documents\\Projects\\dsm\\dsm_website\\_build\\jupyter_execute\\07_logistic_regression\\why_not_lin_6_0.png" } }, "output_type": "display_data" } ], "source": [ "suppressWarnings(print(ggplot(deposit, aes(x=duration, y=subscription)) + geom_point() +\n", " geom_smooth(method='lm', se=FALSE, size=2) + theme_bw() +\n", " xlim(-500, 3000) + ylim(-0.2, 1.2) +\n", " theme(axis.text=element_text(size=12),\n", " axis.title=element_text(size=14,face=\"bold\"))))" ] }, { "cell_type": "markdown", "id": "863807b1", "metadata": {}, "source": [ "Our dependent variable `subscription` can only take on two possible values, `0` and `1`. However, the line we fit is not bounded between zero and one. 
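\n", "\n", "To see how far the fitted line can stray outside the range from zero to one, here is a minimal sketch that evaluates it at a few call lengths. It uses the rounded coefficients printed in the summary above rather than the `depositLinear` object, and the durations are illustrative assumptions, not values singled out in the data:\n", "\n", "
```r
# Coefficients copied from the lm() summary above (rounded as printed there).
intercept <- -0.01488
slope <- 0.0004930

# Illustrative call lengths in seconds; these are assumptions chosen for the sketch.
durations <- c(0, 10, 500, 2500)

# Value of the fitted straight line at each duration.
predicted <- intercept + slope * durations

# Flag which fitted values could be read as a probability (between 0 and 1).
check <- data.frame(
  duration = durations,
  predicted = predicted,
  in_range = predicted >= 0 & predicted <= 1
)
print(check)
```
\n", "\n", "With the fitted object available, `predict(depositLinear, newdata = data.frame(duration = durations))` should return the same values up to rounding.\n", "\n", "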
If `duration` equals 2,500 seconds, for example, the model would predict a value greater than one:\n", "\n", "$$\\begin{aligned}predicted \\;subscription = \\hat{y} & \\approx -0.01488 + 0.0004930(2500) \\\\ & \\approx -0.01488 + 1.2325 \\\\ & \\approx 1.2176\\end{aligned}$$\n", "\n", "The same problem appears in the opposite direction. Because the estimated intercept is negative, the fitted line predicts a value below zero for any call shorter than about 30 seconds ($0.01488 / 0.0004930 \\approx 30.2$). To overcome this issue, logistic regression models the probability that the dependent variable $Y$ equals one using the logistic function, which is bounded between zero and one." ] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all", "formats": "md:myst", "text_representation": { "extension": ".md", "format_name": "myst", "format_version": 0.13, "jupytext_version": "1.11.5" } }, "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.6.1" }, "source_map": [ 16, 21, 27, 29, 48, 51, 59, 66 ] }, "nbformat": 4, "nbformat_minor": 5 }