Essential R Commands

Section

Command

Explanation

Link

R Basics

+ - * /

Addition, subtraction, multiplication, division

Here

R Basics

^

Exponentiation

Here

R Basics

<-

Assignment

Here

R Basics

class(var)

Determine the data type of var

Here

R Basics

c()

Create an atomic vector with multiple elements

Here

R Basics

length(x)

Determine the length of the atomic vector x

Here

R Basics

sum(x)

Sum the values of the atomic vector x

Here

R Basics

mean(x)

Determine the mean of the atomic vector x

Here

R Basics

min(x) & max(x)

Determine the minimum and maximum (respectively) of the atomic vector x

Here
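
The R Basics commands above can be tried together in one short session. This is a minimal sketch; the vector x and its values are made up for illustration.

```r
# Hypothetical example values, not taken from any dataset
x <- c(4, 8, 15, 16, 23, 42)     # build an atomic vector with c()

x_class <- class(x)              # data type of x
x_len   <- length(x)             # number of elements
x_sum   <- sum(x)                # total of all elements
x_mean  <- mean(x)               # arithmetic mean
x_range <- c(min(x), max(x))     # smallest and largest values

cube  <- 2^3                     # exponentiation
ratio <- (x_sum - 8) / 4 * 2     # parentheses first, then / and * left to right

print(x_mean)
```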

Data Frames

install.packages("packageName")

Install a new package called packageName

Here

Data Frames

library(packageName)

Load a pre-installed package called packageName

Here

Data Frames

readr::read_csv(file)

Read in data from a CSV file (file)

Here

Data Frames

readxl::read_excel(file)

Read in data from an Excel file (file)

Here

Data Frames

nrow(df)

Count the number of rows of the df dataframe

Here

Data Frames

ncol(df)

Count the number of columns of the df dataframe

Here

Data Frames

dim(df)

Determine the dimensions of the df dataframe

Here

Data Frames

head(df) & tail(df)

Output the first and last few rows (respectively) of the df dataframe

Here

Data Frames

str(df)

Output the structure of the df dataframe

Here

Data Frames

df$var

Return an atomic vector of the var column from the df dataframe

Here

Data Frames

readr::parse_number(x)

Parse the character vector x into a numeric vector

Here

Data Frames

readr::parse_date(x)

Parse the character vector x into a date vector

Here

Data Frames

dplyr::arrange(df, var1, var2, var3, ...)

Sort the dataframe df by the columns var1, var2, var3, etc.

Here

Data Frames

dplyr::filter(df, cond1, cond2, cond3, ...)

Filter the dataframe df, keeping rows that satisfy all of the logical conditions cond1, cond2, cond3, etc.

Here

Data Frames

dplyr::select(df, var1, var2, var3, ...)

Select columns var1, var2, var3, etc. from the dataframe df

Here
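
The inspection and row/column operations above can be sketched on a small dataframe. This assumes the dplyr package (part of the tidyverse) is installed; the dataframe df and its columns are hypothetical.

```r
library(dplyr)

# Hypothetical data, built inline so the example needs no input file
df <- data.frame(
  name  = c("a", "b", "c", "d"),
  score = c(90, 72, 85, 60),
  year  = c(2021, 2020, 2021, 2020),
  stringsAsFactors = FALSE
)

n_rows <- nrow(df)     # number of rows
n_cols <- ncol(df)     # number of columns
shape  <- dim(df)      # rows and columns together
scores <- df$score     # one column as an atomic vector

# Keep rows that satisfy every condition
recent_high <- filter(df, score > 70, year == 2021)

# Sort by year, then by descending score within each year
sorted <- arrange(df, year, desc(score))

# Keep only the listed columns
slim <- select(df, name, score)

str(df)
head(sorted)
```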

Exploring Data

median(x)

Determine the median of the atomic vector x

Here

Exploring Data

quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))

Determine the specified quantiles (in probs) of the atomic vector x

Here

Exploring Data

sd(x) & var(x)

Calculate the standard deviation and variance (respectively) of the atomic vector x

Here

Exploring Data

table(x) & prop.table(table(x))

Tabulate the count and proportion (respectively) of the atomic vector x

Here

Exploring Data

hist(x)

Create a histogram of the atomic vector x

Here

Exploring Data

boxplot(x)

Create a boxplot of the atomic vector x

Here

Exploring Data

boxplot(y ~ x)

Create a side-by-side boxplot of the atomic vector y over the values of x

Here

Exploring Data

plot(x, y)

Create a scatter plot of x and y

Here

Exploring Data

barplot(x)

Create a barplot of the atomic vector x

Here

Exploring Data

pie(table(x))

Create a pie chart of the atomic vector x

Here
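
The numeric summaries in this section can be checked by hand on a small vector. The sketch below uses made-up values for x and g and leaves the plotting commands out so that it runs without a graphics device.

```r
# Hypothetical example values
x <- c(2, 4, 4, 5, 7, 9, 11)
g <- c("a", "b", "a", "a")

x_median <- median(x)                                   # middle value
x_quant  <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))  # five-number summary
x_var    <- var(x)                                      # sample variance (divides by n - 1)
x_sd     <- sd(x)                                       # square root of the variance

counts <- table(g)                # count of each distinct value
props  <- prop.table(table(g))    # the same counts as proportions

print(x_quant)
print(props)
```

Passing x to hist(x) or boxplot(x) afterwards draws the corresponding plot of the same values.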

Wrangling & Visualization with the tidyverse

%>%

Chain multiple tidyverse operations together

Here

Wrangling & Visualization with the tidyverse

dplyr::summarise(df, stat1 = ..., stat2 = ..., ...)

Calculate summary statistics (stat1, stat2, etc.) from the dataframe df

Here

Wrangling & Visualization with the tidyverse

dplyr::group_by(df, var)

Group the dataframe df by the var variable so that subsequent tidyverse operations are applied per group

Here
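
The pipe, group_by, and summarise are typically used together. A minimal sketch, assuming the dplyr package is installed; the dataframe and its column names are hypothetical.

```r
library(dplyr)

# Hypothetical data: two groups with a numeric measurement
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6),
  stringsAsFactors = FALSE
)

# Each step receives the result of the previous one as its first argument
by_group <- df %>%
  group_by(group) %>%
  summarise(n = n(), avg = mean(value))

print(by_group)
```

Without the pipe, the same result would be written as nested calls, summarise(group_by(df, group), ...), which becomes harder to read as steps accumulate.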

Statistical Inference

t.test(x, mu = 0)

Conduct a one-sample t-test of means

Here

Statistical Inference

binom.test(x, n, p = 0.5)

Conduct a one-sample exact binomial test of a proportion, with x successes out of n trials

Here

Statistical Inference

t.test(y ~ x)

Conduct a two-sample t-test of means, where y contains the outcome variable and x indicates group membership

Here

Statistical Inference

prop.test(x = c(x1, x2), n = c(n1, n2))

Conduct a two-sample proportions test

Here
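
The four tests above all return an object of class "htest" whose components (p.value, estimate, conf.int) can be extracted by name. The following sketch uses simulated data and made-up counts purely for illustration.

```r
set.seed(1)  # make the simulated data reproducible

# One-sample t-test: is the mean of x different from 0?
x <- rnorm(30, mean = 0.5)
one_sample <- t.test(x, mu = 0)

# Two-sample t-test: y is the outcome, g marks group membership
# (t.test uses the Welch unequal-variance version by default)
d <- data.frame(
  y = c(rnorm(20, mean = 0), rnorm(20, mean = 1)),
  g = rep(c("ctrl", "treat"), each = 20)
)
two_sample <- t.test(y ~ g, data = d)

# One-sample proportion: 60 successes in 100 trials against p = 0.5
one_prop <- binom.test(x = 60, n = 100, p = 0.5)

# Two-sample proportions: 45/100 versus 60/100
two_prop <- prop.test(x = c(45, 60), n = c(100, 100))

print(one_sample$p.value)
print(two_prop$estimate)
```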

Causal Inference

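pwr::pwr.t.test(d = d, power = power)

Perform a power calculation that solves for the required sample size per group, given the effect size (d) and the desired power of the test (power); the arguments must be named because n is the first positional argument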

Here
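
As a cross-check that does not need the pwr package, base R's stats::power.t.test can solve the same kind of problem. This is a swapped-in function, not the one listed above; with sd = 1 its delta argument plays the role of the standardized effect size d. The values 0.5 and 0.8 are illustrative assumptions.

```r
# Two-sample, two-sided t-test at the default 5% significance level
res <- power.t.test(delta = 0.5, sd = 1, power = 0.8)

# Round up, since a fractional participant is not possible
n_per_group <- ceiling(res$n)

print(n_per_group)
```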

Linear Regression

cor(x, y)

Calculate the correlation between the atomic vectors x and y

Here

Linear Regression

cor.test(x, y)

Test whether the correlation between the atomic vectors x and y differs from zero, and report it with a 95% confidence interval

Here

Linear Regression

fit <- lm(y ~ x1 + x2 + ..., data = df)

Create a linear regression model called fit based on the y, x1, x2, etc. variables from the dataframe df

Here

Linear Regression

confint(fit)

Calculate 95% confidence intervals for the regression coefficients in the model fit

Here

Linear Regression

summary(fit)

Summarize the regression model fit

Here

Linear Regression

predict(fit, newData)

Apply the regression model fit to new observations stored in newData

Here
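
The regression workflow above (fit, inspect, predict) can be sketched end to end with R's built-in mtcars dataset. The choice of mpg as the outcome and wt and hp as predictors is an assumption made for illustration, as are the values in newData.

```r
# Correlation between weight and fuel economy
r <- cor(mtcars$wt, mtcars$mpg)

# Fit a linear model of mpg on weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

ci   <- confint(fit)    # 95% confidence intervals, one row per coefficient
smry <- summary(fit)    # coefficients, standard errors, R-squared, etc.

# New observations must contain every predictor used in the model
newData <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred <- predict(fit, newData)

print(smry$r.squared)
print(pred)
```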

Logistic Regression

fit <- glm(y ~ x1 + x2 + ..., data = df, family = "binomial")

Create a logistic regression model called fit based on the y, x1, x2, etc. variables from the dataframe df

Here

Logistic Regression

predict(fit, newData, type = "response")

Apply the logistic regression model fit to new observations stored in newData, returning predicted probabilities

Here

Logistic Regression

MLmetrics::LogLoss(y_pred, y_true)

Calculate the log loss of a model whose predictions are stored in y_pred based on the true values stored in y_true

Here
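
A minimal logistic regression sketch using the built-in mtcars dataset, with the transmission indicator am as the binary outcome (an illustrative choice). The log loss is computed by hand with the usual binary cross-entropy formula, which is assumed here to agree with what MLmetrics::LogLoss reports.

```r
# Fit a logistic regression of transmission type on weight
fit <- glm(am ~ wt, data = mtcars, family = "binomial")

# type = "response" returns probabilities rather than log-odds
newData <- data.frame(wt = c(2.0, 3.0, 4.0))
p_new <- predict(fit, newData, type = "response")

# Predicted probabilities for the training rows
y_true <- mtcars$am
y_pred <- predict(fit, type = "response")

# Binary log loss: lower is better
log_loss <- -mean(y_true * log(y_pred) + (1 - y_true) * log(1 - y_pred))

print(p_new)
print(log_loss)
```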

Tree Models

fit <- rpart::rpart(y ~ x1 + x2 + ..., data = df)

Create a decision tree model called fit based on the y, x1, x2, etc. variables from the dataframe df

Here

Tree Models

rpart.plot::rpart.plot(fit)

Visualize the decision tree model fit

Here
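
A small decision tree sketch, assuming the rpart package is available (it ships with most R installations as a recommended package). The built-in iris dataset and the Species outcome are illustrative choices; the plotting step is omitted so the example needs no extra package.

```r
library(rpart)

# Because Species is a factor, rpart fits a classification tree
fit <- rpart(Species ~ ., data = iris)

# Predicted class for each training row
pred <- predict(fit, iris, type = "class")

print(fit)   # text view of the splits
```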

Model Evaluation

caTools::sample.split(df$y)

Create a logical vector that randomly splits the rows of the dataframe df into training and test sets while preserving the class ratios of the outcome variable y

Here

Model Evaluation

MLmetrics::Accuracy(y_pred, y_true)

Calculate the accuracy of a model whose predictions are stored in y_pred based on the true values stored in y_true

Here

Model Evaluation

MLmetrics::ConfusionMatrix(y_pred, y_true)

Calculate the confusion matrix of a model whose predictions are stored in y_pred based on the true values stored in y_true

Here

Model Evaluation

MLmetrics::AUC(y_pred, y_true)

Calculate the area under the curve (AUC) of a model whose predictions are stored in y_pred based on the true values stored in y_true

Here
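
To make the evaluation metrics concrete, the sketch below computes a train/test split, accuracy, and a confusion matrix using base R only. These are swapped-in equivalents, not the caTools or MLmetrics functions themselves: the split uses plain random sampling (not stratified by the outcome), and the label vectors are made up.

```r
set.seed(42)  # make the random split reproducible

# Simple 70/30 random split of the built-in mtcars dataset
n         <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train     <- mtcars[train_idx, ]
test      <- mtcars[-train_idx, ]

# Hypothetical true labels and predicted labels
y_true <- c(1, 0, 1, 1, 0, 1, 0, 0)
y_pred <- c(1, 0, 0, 1, 0, 1, 1, 0)

# Accuracy: share of predictions that match the truth
accuracy <- mean(y_pred == y_true)

# Confusion matrix: rows are predictions, columns are true values
conf_mat <- table(Predicted = y_pred, Actual = y_true)

print(accuracy)
print(conf_mat)
```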