Sorting Data

Often you would like to sort your data based on one or more of the columns in your data set. This can be done using the arrange() function, which uses the following syntax:

Syntax

dplyr::arrange(df, var1, var2, var3, ...)

  • Required arguments

    • df: The tibble (data frame) with the data you would like to sort.

    • var1: The name of the column to use to sort the data.

  • Optional arguments

    • var2, var3, ...: The names of additional columns to use to sort the data. When multiple columns are specified, each additional column is used only to break ties that remain after sorting by the columns before it.

By default, arrange() sorts numeric variables from smallest to largest and character variables alphabetically. You can reverse the order of the sort by surrounding the column name with desc() in the function call.
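As a minimal sketch of the default and reversed orderings, the following uses a small made-up tibble. The staff object and its values are assumptions for illustration only and are not part of the employees data used in this chapter.

```r
library(dplyr)

# Hypothetical toy data (not the employees data set from this chapter)
staff <- tibble(
  Name = c("Cruz, Ana", "Baker, Lee", "Adams, Kim"),
  Age  = c(41, 29, 35)
)

# Default: a numeric column is sorted from smallest to largest
youngestFirst <- arrange(staff, Age)
youngestFirst$Age     # 29 35 41

# desc() reverses the sort order for the column it wraps
oldestFirst <- arrange(staff, desc(Age))
oldestFirst$Age       # 41 35 29

# A character column is sorted alphabetically by default
byName <- arrange(staff, Name)
byName$Name           # "Adams, Kim" "Baker, Lee" "Cruz, Ana"
```

Note that arrange() returns a new sorted tibble and leaves staff itself unchanged, which is why each result is assigned to its own name.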

First, let’s create a new version of the data frame called employeesSortedAge, with the employees sorted from youngest to oldest.

employeesSortedAge <- arrange(employees, Age)
head(employeesSortedAge)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
7068  Dimas, Roman           Male    25   8       High School  2017-02-23  FALSE    Operations       84252
5464  al-Pirani, Rajab       Male    25   3       Associate's  2016-02-23  FALSE    Operations       37907
7910  Hopper, Summer         Female  25   7       Bachelor's   2017-02-23  FALSE    Engineering      100688
6784  al-Siddique, Zaitoona  Female  25   4       Master's     2015-02-23  FALSE    Human Resources  127618
3240  Steggall, Shai         Female  25   7       Master's     2017-02-23  FALSE    Operations       117062
1413  Tanner, Sean           Male    25   2       Associate's  2016-02-23  FALSE    Operations       61869
tail(employeesSortedAge)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
6798  Werkele, Jakob         Male    65   7       Ph.D         1976-02-23  TRUE     Engineering      NA
6291  Anderson, Collyn       Male    65   6       High School  1977-02-23  FALSE    Operations       179634
8481  Phillips, Jasmyn       Female  65   5       High School  1975-02-23  TRUE     Sales            NA
4600  Olivas, Julian         Male    65   2       Ph.D         1976-02-23  FALSE    Engineering      204576
6777  Mortimer, Kendall      Female  65   7       Master's     1977-02-23  FALSE    Corporate        248925
2924  Mills, Tasia           Female  65   8       High School  1977-02-23  FALSE    Operations       138212

We can instead sort the data from oldest to youngest by adding desc() around Age:

employeesSortedAgeDesc <- arrange(employees, desc(Age))
head(employeesSortedAgeDesc)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
8060  al-Morad, Mastoor      Male    65   8       Ph.D         1977-02-23  FALSE    Corporate        213381
9545  Lloyd, Devante         Male    65   9       Bachelor's   1974-02-23  FALSE    Accounting       243326
7305  Law, Charisma          Female  65   8       Associate's  1976-02-23  FALSE    Human Resources  214788
4141  Herrera, Yarabbi       Female  65   8       High School  1975-02-23  FALSE    Operations       143728
2559  Holiday, Emma          Female  65   7       Bachelor's   1975-02-23  TRUE     Operations       NA
4407  Ross, Caitlyn          Female  65   7       Bachelor's   1975-02-23  TRUE     Corporate        NA
tail(employeesSortedAgeDesc)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
1413  Tanner, Sean           Male    25   2       Associate's  2016-02-23  FALSE    Operations       61869
8324  Bancroft, Isaiah       Male    25   7       Master's     2017-02-23  FALSE    Corporate        135935
1230  Kirgis, Arissa         Female  25   8       Bachelor's   2015-02-23  FALSE    Operations       113573
6308  Barnett, Marquise      Male    25   8       Master's     2016-02-23  FALSE    Operations       103798
3241  Byrd, Sydny            Female  25   6       Ph.D         2016-02-23  FALSE    Engineering      126366
9249  Lopez, Karissa         Female  25   8       Associate's  2016-02-23  FALSE    Sales            75689

Now imagine that we wanted to perform a multi-level sort, where we first sort the employees from oldest to youngest, and then within each age sort the names alphabetically. We can do this by adding the Name column to our function call:

employeesSortedAgeDescName <- arrange(employees, desc(Age), Name)
head(employeesSortedAgeDescName)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
8060  al-Morad, Mastoor      Male    65   8       Ph.D         1977-02-23  FALSE    Corporate        213381
6291  Anderson, Collyn       Male    65   6       High School  1977-02-23  FALSE    Operations       179634
3661  el-Meskin, Asad        Male    65   9       Bachelor's   1977-02-23  FALSE    Engineering      177504
5245  Gowen, Hannah          Female  65   7       Bachelor's   1975-02-23  FALSE    Accounting       191765
4141  Herrera, Yarabbi       Female  65   8       High School  1975-02-23  FALSE    Operations       143728
2559  Holiday, Emma          Female  65   7       Bachelor's   1975-02-23  TRUE     Operations       NA
tail(employeesSortedAgeDescName)
ID    Name                   Gender  Age  Rating  Degree       Start_Date  Retired  Division         Salary
7068  Dimas, Roman           Male    25   8       High School  2017-02-23  FALSE    Operations       84252
7910  Hopper, Summer         Female  25   7       Bachelor's   2017-02-23  FALSE    Engineering      100688
1230  Kirgis, Arissa         Female  25   8       Bachelor's   2015-02-23  FALSE    Operations       113573
9249  Lopez, Karissa         Female  25   8       Associate's  2016-02-23  FALSE    Sales            75689
3240  Steggall, Shai         Female  25   7       Master's     2017-02-23  FALSE    Operations       117062
1413  Tanner, Sean           Male    25   2       Associate's  2016-02-23  FALSE    Operations       61869
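
Because the employees data is large, the tie-breaking behavior can be easier to see on a handful of rows. The sketch below uses a made-up tibble with tied ages. The staff object and its values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy data with two employees at each age
staff <- tibble(
  Name = c("Young, Pat", "Diaz, Sam", "Chen, Lu", "Abbot, Max"),
  Age  = c(30, 45, 30, 45)
)

# Sort oldest to youngest; Name is only consulted to order rows with equal Age
sorted <- arrange(staff, desc(Age), Name)
sorted$Age    # 45 45 30 30
sorted$Name   # "Abbot, Max" "Diaz, Sam" "Chen, Lu" "Young, Pat"
```

The names are alphabetical within each age group but not across the whole result, since Age remains the primary sort key.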