{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "039fc568",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Registered S3 methods overwritten by 'ggplot2':\n",
" method from \n",
" [.quosures rlang\n",
" c.quosures rlang\n",
" print.quosures rlang\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Registered S3 method overwritten by 'rvest':\n",
" method from\n",
" read_xml.response xml2\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"-- Attaching packages --------------------------------------- tidyverse 1.2.1 --\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"v ggplot2 3.1.1 v purrr 0.3.2 \n",
"v tibble 2.1.1 v dplyr 0.8.0.1\n",
"v tidyr 0.8.3 v stringr 1.4.0 \n",
"v readr 1.3.1 v forcats 0.4.0 \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"-- Conflicts ------------------------------------------ tidyverse_conflicts() --\n",
"x dplyr::filter() masks stats::filter()\n",
"x dplyr::lag() masks stats::lag()\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Parsed with column specification:\n",
"cols(\n",
" ID = col_double(),\n",
" Name = col_character(),\n",
" Gender = col_character(),\n",
" Age = col_double(),\n",
" Rating = col_double(),\n",
" Degree = col_character(),\n",
" Start_Date = col_character(),\n",
" Retired = col_logical(),\n",
" Division = col_character(),\n",
" Salary = col_character()\n",
")\n"
]
}
],
"source": [
"library(tidyverse)\n",
"employees <- read_csv(\"../../_build/data/employee_data.csv\")\n",
"employees$Salary <- parse_number(employees$Salary)\n",
"employees$Start_Date <- parse_date(employees$Start_Date, format = \"%m/%d/%Y\")"
]
},
{
"cell_type": "markdown",
"id": "1b9b39b3",
"metadata": {},
"source": [
"# Filtering Rows\n",
"\n",
"Instead of just sorting the rows in your data, you might want to filter out rows based on a set of conditions. You can do this with the `filter()` function, which uses the following syntax:\n",
"\n",
"```{admonition} Syntax\n",
"`tidyverse::filter(df, condition1, condition2, condition3, ...)`\n",
"+ *Required arguments*\n",
" - `df`: The tibble (data frame) with the data you would like to filter. \n",
" - `condition1`: The logical condition that identifies the rows you would like to keep.\n",
"+ *Optional arguments*\n",
" - `condition2, condition3, ...`: Any additional conditions that identify the rows you would like to keep.\n",
"```\n",
" \n",
"The conditions specified in `filter()` can use a variety of **comparison operators**:\n",
"\n",
"+ `>` (greater than), `<` (less than)\n",
"+ `>=` (greater than or equal to), `<=` (less than or equal to)\n",
"+ `==` (equal to); note that a single equals sign (`=`) will *not* work\n",
"+ `!=` (not equal to)\n",
"\n",
"For example, imagine we wanted to create a new data frame with only those employees who are retired. We could do this with `filter()` by writing a condition that specifies that the `Retired` variable equals `TRUE`:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2a618398",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |
\n",
"\t9259 | Armantrout, Allison | Female | 61 | 7 | Associate's | 1980-02-23 | TRUE | Operations | NA |
\n",
"\t6223 | Ali, Michelle | Female | 60 | 7 | Ph.D | 1982-02-23 | TRUE | Corporate | NA |
\n",
"\t5955 | al-Younes, Ashqar | Male | 63 | 7 | High School | 1979-02-23 | TRUE | Sales | NA |
\n",
"\t6195 | Tolbert, Kinae | Female | 60 | 8 | High School | 1983-02-23 | TRUE | Sales | NA |
\n",
"\t9620 | Medina, Rogelio | Male | 61 | 2 | Master's | 1980-02-23 | TRUE | Accounting | NA |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 1933 & Medina, Brandy & Female & 62 & 7 & Associate's & 1979-02-23 & TRUE & Sales & NA \\\\\n",
"\t 9259 & Armantrout, Allison & Female & 61 & 7 & Associate's & 1980-02-23 & TRUE & Operations & NA \\\\\n",
"\t 6223 & Ali, Michelle & Female & 60 & 7 & Ph.D & 1982-02-23 & TRUE & Corporate & NA \\\\\n",
"\t 5955 & al-Younes, Ashqar & Male & 63 & 7 & High School & 1979-02-23 & TRUE & Sales & NA \\\\\n",
"\t 6195 & Tolbert, Kinae & Female & 60 & 8 & High School & 1983-02-23 & TRUE & Sales & NA \\\\\n",
"\t 9620 & Medina, Rogelio & Male & 61 & 2 & Master's & 1980-02-23 & TRUE & Accounting & NA \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |\n",
"| 9259 | Armantrout, Allison | Female | 61 | 7 | Associate's | 1980-02-23 | TRUE | Operations | NA |\n",
"| 6223 | Ali, Michelle | Female | 60 | 7 | Ph.D | 1982-02-23 | TRUE | Corporate | NA |\n",
"| 5955 | al-Younes, Ashqar | Male | 63 | 7 | High School | 1979-02-23 | TRUE | Sales | NA |\n",
"| 6195 | Tolbert, Kinae | Female | 60 | 8 | High School | 1983-02-23 | TRUE | Sales | NA |\n",
"| 9620 | Medina, Rogelio | Male | 61 | 2 | Master's | 1980-02-23 | TRUE | Accounting | NA |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 1933 Medina, Brandy Female 62 7 Associate's 1979-02-23 TRUE \n",
"2 9259 Armantrout, Allison Female 61 7 Associate's 1980-02-23 TRUE \n",
"3 6223 Ali, Michelle Female 60 7 Ph.D 1982-02-23 TRUE \n",
"4 5955 al-Younes, Ashqar Male 63 7 High School 1979-02-23 TRUE \n",
"5 6195 Tolbert, Kinae Female 60 8 High School 1983-02-23 TRUE \n",
"6 9620 Medina, Rogelio Male 61 2 Master's 1980-02-23 TRUE \n",
" Division Salary\n",
"1 Sales NA \n",
"2 Operations NA \n",
"3 Corporate NA \n",
"4 Sales NA \n",
"5 Sales NA \n",
"6 Accounting NA "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\t- 80
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 80\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 80\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 80 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"retiredEmployees <- filter(employees, Retired == TRUE)\n",
"head(retiredEmployees)\n",
"dim(retiredEmployees)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "64191079",
"metadata": {
"tags": [
"remove-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- 80
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 80\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 80\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 80 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(retiredEmployees)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "097bc54d",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 80 10\n"
]
}
],
"source": [
"print(dim(retiredEmployees))"
]
},
{
"cell_type": "markdown",
"id": "01f5562b",
"metadata": {},
"source": [
"By specifying multiple conditions in our call to `filter()`, we can filter by more than one rule. Let's say we wanted a data set with employees who:\n",
"\n",
"+ Are still working, *and*\n",
"+ Started on or after January 1, 1995, *and*\n",
"+ Are *not* in human resources."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "fe960340",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |
\n",
"\t7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |
\n",
"\t5915 | Rogers, Arielle | Female | 27 | 2 | Bachelor's | 2013-02-23 | FALSE | Human Resources | 79183 |
\n",
"\t9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |
\n",
"\t2828 | el-Ullah, Saabira | Female | 42 | 5 | Bachelor's | 1999-02-23 | FALSE | Sales | 138973 |
\n",
"\t2836 | Ochoa, Luis | Male | 41 | 7 | Bachelor's | 1999-02-23 | FALSE | Corporate | 152011 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 2671 & Lewis, Austin & Male & 34 & 4 & Ph.D & 2007-02-23 & FALSE & Engineering & 182343 \\\\\n",
"\t 7821 & Holleman, Shaquaisha & Female & 43 & 8 & Master's & 1999-02-23 & FALSE & Human Resources & 149468 \\\\\n",
"\t 5915 & Rogers, Arielle & Female & 27 & 2 & Bachelor's & 2013-02-23 & FALSE & Human Resources & 79183 \\\\\n",
"\t 9871 & Sontag, Taylor & Female & 30 & 7 & Ph.D & 2011-02-23 & FALSE & Engineering & 164384 \\\\\n",
"\t 2828 & el-Ullah, Saabira & Female & 42 & 5 & Bachelor's & 1999-02-23 & FALSE & Sales & 138973 \\\\\n",
"\t 2836 & Ochoa, Luis & Male & 41 & 7 & Bachelor's & 1999-02-23 & FALSE & Corporate & 152011 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |\n",
"| 7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |\n",
"| 5915 | Rogers, Arielle | Female | 27 | 2 | Bachelor's | 2013-02-23 | FALSE | Human Resources | 79183 |\n",
"| 9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |\n",
"| 2828 | el-Ullah, Saabira | Female | 42 | 5 | Bachelor's | 1999-02-23 | FALSE | Sales | 138973 |\n",
"| 2836 | Ochoa, Luis | Male | 41 | 7 | Bachelor's | 1999-02-23 | FALSE | Corporate | 152011 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 2671 Lewis, Austin Male 34 4 Ph.D 2007-02-23 FALSE \n",
"2 7821 Holleman, Shaquaisha Female 43 8 Master's 1999-02-23 FALSE \n",
"3 5915 Rogers, Arielle Female 27 2 Bachelor's 2013-02-23 FALSE \n",
"4 9871 Sontag, Taylor Female 30 7 Ph.D 2011-02-23 FALSE \n",
"5 2828 el-Ullah, Saabira Female 42 5 Bachelor's 1999-02-23 FALSE \n",
"6 2836 Ochoa, Luis Male 41 7 Bachelor's 1999-02-23 FALSE \n",
" Division Salary\n",
"1 Engineering 182343\n",
"2 Human Resources 149468\n",
"3 Human Resources 79183\n",
"4 Engineering 164384\n",
"5 Sales 138973\n",
"6 Corporate 152011"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\t- 376
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 376\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 376\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 376 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employeesSub <- filter(employees, \n",
" Retired == FALSE, Start_Date >= \"1995-01-01\", Division != \"Operations\")\n",
"head(employeesSub)\n",
"dim(employeesSub)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b0f1e781",
"metadata": {
"tags": [
"remove-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- 376
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 376\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 376\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 376 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(employeesSub)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "286519d9",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 376 10\n"
]
}
],
"source": [
"print(dim(employeesSub))"
]
},
{
"cell_type": "markdown",
"id": "1a6fdf1c",
"metadata": {},
"source": [
"When you list multiple conditions in `filter()`, those conditions are combined with \"and\". In the previous example, our new data frame `employeesSub` contains the `r nrow(employeesSub)` employees who are still working, *and* who started on or after January 1, 1995, *and* who are not in human resources. \n",
"\n",
"However, you might want to filter based on one condition *or* another. For example, imagine we wanted to find all employees who have a master's, *or* who started on or before December 31, 2000, *or* who make less than $100,000. To do this, instead of listing each condition as a separate argument, we combine the conditions with the `|` character, which evaluates to \"or\". For example:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "1f43e29a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |
\n",
"\t8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |
\n",
"\t2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |
\n",
"\t2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |
\n",
"\t1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |
\n",
"\t3570 | Troftgruben, Tierra | Female | 47 | 8 | High School | 1995-02-23 | FALSE | Operations | 101138 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 6881 & al-Rahimi, Tayyiba & Female & 51 & 10 & High School & 1990-02-23 & FALSE & Operations & 108804 \\\\\n",
"\t 8925 & el-Jaffer, Manaal & Female & 50 & 10 & Master's & 1991-02-23 & FALSE & Engineering & 206770 \\\\\n",
"\t 2769 & Soto, Michael & Male & 52 & 10 & High School & 1987-02-23 & FALSE & Sales & 183407 \\\\\n",
"\t 2658 & al-Ebrahimi, Mamoon & Male & 55 & 8 & Ph.D & 1985-02-23 & FALSE & Corporate & 236240 \\\\\n",
"\t 1933 & Medina, Brandy & Female & 62 & 7 & Associate's & 1979-02-23 & TRUE & Sales & NA \\\\\n",
"\t 3570 & Troftgruben, Tierra & Female & 47 & 8 & High School & 1995-02-23 & FALSE & Operations & 101138 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |\n",
"| 8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |\n",
"| 2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |\n",
"| 2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |\n",
"| 1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |\n",
"| 3570 | Troftgruben, Tierra | Female | 47 | 8 | High School | 1995-02-23 | FALSE | Operations | 101138 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 6881 al-Rahimi, Tayyiba Female 51 10 High School 1990-02-23 FALSE \n",
"2 8925 el-Jaffer, Manaal Female 50 10 Master's 1991-02-23 FALSE \n",
"3 2769 Soto, Michael Male 52 10 High School 1987-02-23 FALSE \n",
"4 2658 al-Ebrahimi, Mamoon Male 55 8 Ph.D 1985-02-23 FALSE \n",
"5 1933 Medina, Brandy Female 62 7 Associate's 1979-02-23 TRUE \n",
"6 3570 Troftgruben, Tierra Female 47 8 High School 1995-02-23 FALSE \n",
" Division Salary\n",
"1 Operations 108804\n",
"2 Engineering 206770\n",
"3 Sales 183407\n",
"4 Corporate 236240\n",
"5 Sales NA\n",
"6 Operations 101138"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\t- 741
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 741\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 741\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 741 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employeesSubOr <- filter(employees, \n",
" Degree == \"Master's\" | Start_Date <= \"2000-12-31\" | Salary < 100000)\n",
"head(employeesSubOr)\n",
"dim(employeesSubOr)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "20f7d48d",
"metadata": {
"tags": [
"remove-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- 741
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 741\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 741\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 741 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(employeesSubOr)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "cffa42eb",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 741 10\n"
]
}
],
"source": [
"print(dim(employeesSubOr))"
]
},
{
"cell_type": "markdown",
"id": "d2ef30ba",
"metadata": {},
"source": [
"Now let's create a data frame with the employees who have a Master's or a Ph.D. We could do this using the or operator `|`:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "87f2d4fb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |
\n",
"\t8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |
\n",
"\t2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |
\n",
"\t7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |
\n",
"\t9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |
\n",
"\t2687 | Benavidez, Juan | Male | 45 | 8 | Ph.D | 1997-02-23 | FALSE | Engineering | 181924 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 2671 & Lewis, Austin & Male & 34 & 4 & Ph.D & 2007-02-23 & FALSE & Engineering & 182343 \\\\\n",
"\t 8925 & el-Jaffer, Manaal & Female & 50 & 10 & Master's & 1991-02-23 & FALSE & Engineering & 206770 \\\\\n",
"\t 2658 & al-Ebrahimi, Mamoon & Male & 55 & 8 & Ph.D & 1985-02-23 & FALSE & Corporate & 236240 \\\\\n",
"\t 7821 & Holleman, Shaquaisha & Female & 43 & 8 & Master's & 1999-02-23 & FALSE & Human Resources & 149468 \\\\\n",
"\t 9871 & Sontag, Taylor & Female & 30 & 7 & Ph.D & 2011-02-23 & FALSE & Engineering & 164384 \\\\\n",
"\t 2687 & Benavidez, Juan & Male & 45 & 8 & Ph.D & 1997-02-23 & FALSE & Engineering & 181924 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |\n",
"| 8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |\n",
"| 2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |\n",
"| 7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |\n",
"| 9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |\n",
"| 2687 | Benavidez, Juan | Male | 45 | 8 | Ph.D | 1997-02-23 | FALSE | Engineering | 181924 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 2671 Lewis, Austin Male 34 4 Ph.D 2007-02-23 FALSE \n",
"2 8925 el-Jaffer, Manaal Female 50 10 Master's 1991-02-23 FALSE \n",
"3 2658 al-Ebrahimi, Mamoon Male 55 8 Ph.D 1985-02-23 FALSE \n",
"4 7821 Holleman, Shaquaisha Female 43 8 Master's 1999-02-23 FALSE \n",
"5 9871 Sontag, Taylor Female 30 7 Ph.D 2011-02-23 FALSE \n",
"6 2687 Benavidez, Juan Male 45 8 Ph.D 1997-02-23 FALSE \n",
" Division Salary\n",
"1 Engineering 182343\n",
"2 Engineering 206770\n",
"3 Corporate 236240\n",
"4 Human Resources 149468\n",
"5 Engineering 164384\n",
"6 Engineering 181924"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\t- 400
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 400\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 400\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 400 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employeesGrad <- filter(employees, Degree == \"Master's\" | Degree == \"Ph.D\")\n",
"head(employeesGrad)\n",
"dim(employeesGrad)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8e32fdae",
"metadata": {
"tags": [
"remove-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- 400
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 400\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 400\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 400 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(employeesGrad)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "7848ec83",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 400 10\n"
]
}
],
"source": [
"print(dim(employeesGrad))"
]
},
{
"cell_type": "markdown",
"id": "96b55d43",
"metadata": {},
"source": [
"Now imagine we wanted to find all employees who have a Master's, a Ph.D, or a Bachelor's. We could add another `|` to our condition and specify that `Degree == \"Bachelor's\"`. Alternatively, we could make our code more compact by re-writing the condition as `var_name %in% values`. This will filter to only those rows where `var_name` is equal to one of the values specified in the atomic vector `values`. For example:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "84c03d6f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |
\n",
"\t8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |
\n",
"\t2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |
\n",
"\t7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |
\n",
"\t5915 | Rogers, Arielle | Female | 27 | 2 | Bachelor's | 2013-02-23 | FALSE | Human Resources | 79183 |
\n",
"\t9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 2671 & Lewis, Austin & Male & 34 & 4 & Ph.D & 2007-02-23 & FALSE & Engineering & 182343 \\\\\n",
"\t 8925 & el-Jaffer, Manaal & Female & 50 & 10 & Master's & 1991-02-23 & FALSE & Engineering & 206770 \\\\\n",
"\t 2658 & al-Ebrahimi, Mamoon & Male & 55 & 8 & Ph.D & 1985-02-23 & FALSE & Corporate & 236240 \\\\\n",
"\t 7821 & Holleman, Shaquaisha & Female & 43 & 8 & Master's & 1999-02-23 & FALSE & Human Resources & 149468 \\\\\n",
"\t 5915 & Rogers, Arielle & Female & 27 & 2 & Bachelor's & 2013-02-23 & FALSE & Human Resources & 79183 \\\\\n",
"\t 9871 & Sontag, Taylor & Female & 30 & 7 & Ph.D & 2011-02-23 & FALSE & Engineering & 164384 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |\n",
"| 8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |\n",
"| 2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |\n",
"| 7821 | Holleman, Shaquaisha | Female | 43 | 8 | Master's | 1999-02-23 | FALSE | Human Resources | 149468 |\n",
"| 5915 | Rogers, Arielle | Female | 27 | 2 | Bachelor's | 2013-02-23 | FALSE | Human Resources | 79183 |\n",
"| 9871 | Sontag, Taylor | Female | 30 | 7 | Ph.D | 2011-02-23 | FALSE | Engineering | 164384 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 2671 Lewis, Austin Male 34 4 Ph.D 2007-02-23 FALSE \n",
"2 8925 el-Jaffer, Manaal Female 50 10 Master's 1991-02-23 FALSE \n",
"3 2658 al-Ebrahimi, Mamoon Male 55 8 Ph.D 1985-02-23 FALSE \n",
"4 7821 Holleman, Shaquaisha Female 43 8 Master's 1999-02-23 FALSE \n",
"5 5915 Rogers, Arielle Female 27 2 Bachelor's 2013-02-23 FALSE \n",
"6 9871 Sontag, Taylor Female 30 7 Ph.D 2011-02-23 FALSE \n",
" Division Salary\n",
"1 Engineering 182343\n",
"2 Engineering 206770\n",
"3 Corporate 236240\n",
"4 Human Resources 149468\n",
"5 Human Resources 79183\n",
"6 Engineering 164384"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\t- 600
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 600\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 600\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 600 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employeesCollege <- filter(employees, Degree %in% c(\"Bachelor's\", \"Master's\", \"Ph.D\"))\n",
"head(employeesCollege)\n",
"dim(employeesCollege)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "24a26596",
"metadata": {
"tags": [
"remove-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- 600
\n",
"\t- 10
\n",
"
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 600\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 600\n",
"2. 10\n",
"\n",
"\n"
],
"text/plain": [
"[1] 600 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(employeesCollege)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "40be76da",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 600 10\n"
]
}
],
"source": [
"print(dim(employeesCollege))"
]
},
{
"cell_type": "markdown",
"id": "f6998fd6",
"metadata": {},
"source": [
"```{warning}\n",
"Be careful filtering data when you have missing values (`NA`).\n",
"```\n",
"\n",
"The `filter()` function keeps only those rows where the specified condition(s) evaluate(s) to `TRUE`. This is complicated by the presence of missing values, as it is impossible to determine whether a condition is `TRUE` or `FALSE` if the relevant information is missing. In our employees data set, we are missing a `Salary` for around 5% of the employees. If we wanted to filter to only those individuals who made more than \\$100,000, how would `filter()` treat the 5% of employees with `NA` values for `Salary`? Because the condition does not evaluate to `TRUE` for these rows, they are dropped."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "befed129",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |
\n",
"\t2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |
\n",
"\t8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |
\n",
"\t2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |
\n",
"\t2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |
\n",
"\t3570 | Troftgruben, Tierra | Female | 47 | 8 | High School | 1995-02-23 | FALSE | Operations | 101138 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 6881 & al-Rahimi, Tayyiba & Female & 51 & 10 & High School & 1990-02-23 & FALSE & Operations & 108804 \\\\\n",
"\t 2671 & Lewis, Austin & Male & 34 & 4 & Ph.D & 2007-02-23 & FALSE & Engineering & 182343 \\\\\n",
"\t 8925 & el-Jaffer, Manaal & Female & 50 & 10 & Master's & 1991-02-23 & FALSE & Engineering & 206770 \\\\\n",
"\t 2769 & Soto, Michael & Male & 52 & 10 & High School & 1987-02-23 & FALSE & Sales & 183407 \\\\\n",
"\t 2658 & al-Ebrahimi, Mamoon & Male & 55 & 8 & Ph.D & 1985-02-23 & FALSE & Corporate & 236240 \\\\\n",
"\t 3570 & Troftgruben, Tierra & Female & 47 & 8 & High School & 1995-02-23 & FALSE & Operations & 101138 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |\n",
"| 2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |\n",
"| 8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |\n",
"| 2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |\n",
"| 2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |\n",
"| 3570 | Troftgruben, Tierra | Female | 47 | 8 | High School | 1995-02-23 | FALSE | Operations | 101138 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 6881 al-Rahimi, Tayyiba Female 51 10 High School 1990-02-23 FALSE \n",
"2 2671 Lewis, Austin Male 34 4 Ph.D 2007-02-23 FALSE \n",
"3 8925 el-Jaffer, Manaal Female 50 10 Master's 1991-02-23 FALSE \n",
"4 2769 Soto, Michael Male 52 10 High School 1987-02-23 FALSE \n",
"5 2658 al-Ebrahimi, Mamoon Male 55 8 Ph.D 1985-02-23 FALSE \n",
"6 3570 Troftgruben, Tierra Female 47 8 High School 1995-02-23 FALSE \n",
" Division Salary\n",
"1 Operations 108804\n",
"2 Engineering 182343\n",
"3 Engineering 206770\n",
"4 Sales 183407\n",
"5 Corporate 236240\n",
"6 Operations 101138"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employees100k <- filter(employees, Salary >= 100000)\n",
"head(employees100k)"
]
},
{
"cell_type": "markdown",
"id": "7b6bbb11",
"metadata": {},
"source": [
"In this example, `employees100k` would *not* include any of the employees whose salary is unknown. Note that R provides no warning that these rows are being excluded as well; it is up to the user to recognize that the data contains missing values, and that this will affect how data are filtered. \n",
"\n",
"If you did not want R to drop the missing values, you could explicitly state in the condition to keep all rows where `Salary` is greater than or equal to $100,000, *or* where `Salary` is missing. As we saw before, we can combine conditions with \"or\" using the `|` character. It may be tempting to assume we should add `Salary == NA` in order to capture the rows where `Salary` is missing; however, this will not work. We cannot use logical operators like `==` to compare something to `NA`. Instead have to use the `is.na()`, as follows:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "4c47d93e",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |
\n",
"\t2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |
\n",
"\t8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |
\n",
"\t2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |
\n",
"\t2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |
\n",
"\t1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 6881 & al-Rahimi, Tayyiba & Female & 51 & 10 & High School & 1990-02-23 & FALSE & Operations & 108804 \\\\\n",
"\t 2671 & Lewis, Austin & Male & 34 & 4 & Ph.D & 2007-02-23 & FALSE & Engineering & 182343 \\\\\n",
"\t 8925 & el-Jaffer, Manaal & Female & 50 & 10 & Master's & 1991-02-23 & FALSE & Engineering & 206770 \\\\\n",
"\t 2769 & Soto, Michael & Male & 52 & 10 & High School & 1987-02-23 & FALSE & Sales & 183407 \\\\\n",
"\t 2658 & al-Ebrahimi, Mamoon & Male & 55 & 8 & Ph.D & 1985-02-23 & FALSE & Corporate & 236240 \\\\\n",
"\t 1933 & Medina, Brandy & Female & 62 & 7 & Associate's & 1979-02-23 & TRUE & Sales & NA \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 6881 | al-Rahimi, Tayyiba | Female | 51 | 10 | High School | 1990-02-23 | FALSE | Operations | 108804 |\n",
"| 2671 | Lewis, Austin | Male | 34 | 4 | Ph.D | 2007-02-23 | FALSE | Engineering | 182343 |\n",
"| 8925 | el-Jaffer, Manaal | Female | 50 | 10 | Master's | 1991-02-23 | FALSE | Engineering | 206770 |\n",
"| 2769 | Soto, Michael | Male | 52 | 10 | High School | 1987-02-23 | FALSE | Sales | 183407 |\n",
"| 2658 | al-Ebrahimi, Mamoon | Male | 55 | 8 | Ph.D | 1985-02-23 | FALSE | Corporate | 236240 |\n",
"| 1933 | Medina, Brandy | Female | 62 | 7 | Associate's | 1979-02-23 | TRUE | Sales | NA |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 6881 al-Rahimi, Tayyiba Female 51 10 High School 1990-02-23 FALSE \n",
"2 2671 Lewis, Austin Male 34 4 Ph.D 2007-02-23 FALSE \n",
"3 8925 el-Jaffer, Manaal Female 50 10 Master's 1991-02-23 FALSE \n",
"4 2769 Soto, Michael Male 52 10 High School 1987-02-23 FALSE \n",
"5 2658 al-Ebrahimi, Mamoon Male 55 8 Ph.D 1985-02-23 FALSE \n",
"6 1933 Medina, Brandy Female 62 7 Associate's 1979-02-23 TRUE \n",
" Division Salary\n",
"1 Operations 108804\n",
"2 Engineering 182343\n",
"3 Engineering 206770\n",
"4 Sales 183407\n",
"5 Corporate 236240\n",
"6 Sales NA"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"employees100kNA <- filter(employees, Salary >= 100000 | is.na(Salary))\n",
"head(employees100kNA)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "b0f2b17a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |
\n",
"\n",
"\t5271 | Leschinsky, Johanna | Female | 36 | 4 | Ph.D | 2004-02-23 | FALSE | Corporate | 160364 |
\n",
"\t6681 | Bruns, Austin | Male | 41 | 8 | Ph.D | 2002-02-23 | FALSE | Engineering | 188656 |
\n",
"\t2031 | Martinez, Caleb | Male | 57 | 8 | Ph.D | 1984-02-23 | FALSE | Engineering | 218430 |
\n",
"\t3239 | Larson, Trusten | Male | 37 | 5 | Bachelor's | 2002-02-23 | FALSE | Human Resources | 149789 |
\n",
"\t3717 | Levy-Minter, Quintin | Male | 53 | 10 | Bachelor's | 1989-02-23 | FALSE | Operations | 172703 |
\n",
"\t4209 | Dena, Gianna | Female | 49 | 6 | Master's | 1991-02-23 | FALSE | Accounting | 185445 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" ID & Name & Gender & Age & Rating & Degree & Start\\_Date & Retired & Division & Salary\\\\\n",
"\\hline\n",
"\t 5271 & Leschinsky, Johanna & Female & 36 & 4 & Ph.D & 2004-02-23 & FALSE & Corporate & 160364 \\\\\n",
"\t 6681 & Bruns, Austin & Male & 41 & 8 & Ph.D & 2002-02-23 & FALSE & Engineering & 188656 \\\\\n",
"\t 2031 & Martinez, Caleb & Male & 57 & 8 & Ph.D & 1984-02-23 & FALSE & Engineering & 218430 \\\\\n",
"\t 3239 & Larson, Trusten & Male & 37 & 5 & Bachelor's & 2002-02-23 & FALSE & Human Resources & 149789 \\\\\n",
"\t 3717 & Levy-Minter, Quintin & Male & 53 & 10 & Bachelor's & 1989-02-23 & FALSE & Operations & 172703 \\\\\n",
"\t 4209 & Dena, Gianna & Female & 49 & 6 & Master's & 1991-02-23 & FALSE & Accounting & 185445 \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"| ID | Name | Gender | Age | Rating | Degree | Start_Date | Retired | Division | Salary |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 5271 | Leschinsky, Johanna | Female | 36 | 4 | Ph.D | 2004-02-23 | FALSE | Corporate | 160364 |\n",
"| 6681 | Bruns, Austin | Male | 41 | 8 | Ph.D | 2002-02-23 | FALSE | Engineering | 188656 |\n",
"| 2031 | Martinez, Caleb | Male | 57 | 8 | Ph.D | 1984-02-23 | FALSE | Engineering | 218430 |\n",
"| 3239 | Larson, Trusten | Male | 37 | 5 | Bachelor's | 2002-02-23 | FALSE | Human Resources | 149789 |\n",
"| 3717 | Levy-Minter, Quintin | Male | 53 | 10 | Bachelor's | 1989-02-23 | FALSE | Operations | 172703 |\n",
"| 4209 | Dena, Gianna | Female | 49 | 6 | Master's | 1991-02-23 | FALSE | Accounting | 185445 |\n",
"\n"
],
"text/plain": [
" ID Name Gender Age Rating Degree Start_Date Retired\n",
"1 5271 Leschinsky, Johanna Female 36 4 Ph.D 2004-02-23 FALSE \n",
"2 6681 Bruns, Austin Male 41 8 Ph.D 2002-02-23 FALSE \n",
"3 2031 Martinez, Caleb Male 57 8 Ph.D 1984-02-23 FALSE \n",
"4 3239 Larson, Trusten Male 37 5 Bachelor's 2002-02-23 FALSE \n",
"5 3717 Levy-Minter, Quintin Male 53 10 Bachelor's 1989-02-23 FALSE \n",
"6 4209 Dena, Gianna Female 49 6 Master's 1991-02-23 FALSE \n",
" Division Salary\n",
"1 Corporate 160364\n",
"2 Engineering 188656\n",
"3 Engineering 218430\n",
"4 Human Resources 149789\n",
"5 Operations 172703\n",
"6 Accounting 185445"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"tail(employees100kNA)"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"formats": "md:myst",
"text_representation": {
"extension": ".md",
"format_name": "myst",
"format_version": 0.13,
"jupytext_version": "1.11.5"
}
},
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.6.1"
},
"source_map": [
16,
22,
46,
52,
57,
60,
68,
75,
80,
83,
89,
96,
101,
104,
108,
114,
119,
122,
126,
132,
137,
140,
148,
151,
157,
162
]
},
"nbformat": 4,
"nbformat_minor": 5
}