R Tutorial

R is a programming language and environment for statistical computing and graphics. It is widely used among statisticians and data miners.

Key Concepts

  • R was created by Ross Ihaka and Robert Gentleman in 1993.
  • R is open-source and free to use.
  • R is widely used for statistical analysis and data visualization.
  • It has a massive ecosystem of packages available via CRAN.
R Example
"Hello World!"

R Introduction

R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques and is highly extensible.

Why Use R?

  • Statistical Analysis: R has extensive support for statistical computing including linear and nonlinear modelling, time-series analysis, classification, and clustering.
  • Data Visualization: R produces publication-quality plots with just a few lines of code, including mathematical symbols and formulae.
  • Cross-Platform: R runs on Windows, macOS, and Linux.
  • Huge Community: Thousands of contributed packages available on CRAN.

R is particularly popular in academia and research. If you work in data science, bioinformatics, or statistical research, R is well worth learning. Its strengths include vectorized operations, which keep data-analysis code concise and avoid many explicit loops, and strong plotting capabilities through packages like ggplot2.

R Get Started

To start using R, you need to install R and optionally RStudio, a popular IDE for R.

Installation Steps

  • Download R from CRAN.
  • Download RStudio (recommended IDE) from posit.co.
  • Verify installation by typing R --version in your terminal.

Best Practices

  • Use RStudio for a much better development experience with syntax highlighting, autocompletion, and built-in plot viewer.
  • Install packages using install.packages("package_name").

R Syntax

R uses <- as the primary assignment operator. You can also use =, but <- is the R convention.

Key Concepts

  • R is case-sensitive.
  • The assignment operator is <- (read as "gets").
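  • Use print() to output values. At the top level of a script or in the console, typing just the variable name also prints it; inside a loop, an explicit print() is needed.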
  • R uses # for comments.
Syntax Example
name <- "John"
age <- 25
print(name)
print(paste("Age:", age))

Common Mistakes

  • Using = instead of <- for assignment. At the top level = works, but it is not idiomatic R, and inside a function call it names an argument instead of creating a variable.
  • Forgetting that R is 1-indexed (not 0-indexed like Python or C).
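
Both mistakes above can be seen in a short sketch. The function name assignment_demo and the variable names are made up for illustration; the example only uses base R.

```r
# Hypothetical helper that shows how `=` and `<-` differ inside a call.
assignment_demo <- function() {
  # `=` inside a call only names the argument; no variable x is created.
  m1 <- mean(x = c(1, 2, 3))
  x_after_equals <- exists("x", envir = environment(), inherits = FALSE)

  # `<-` inside a call assigns first, so x now exists in this function.
  m2 <- mean(x <- c(4, 5, 6))
  x_after_arrow <- exists("x", envir = environment(), inherits = FALSE)

  c(x_after_equals = x_after_equals, x_after_arrow = x_after_arrow)
}

result <- assignment_demo()
result   # x_after_equals is FALSE, x_after_arrow is TRUE

# Indexing starts at 1; index 0 returns an empty vector instead of an element.
letters_vec <- c("a", "b", "c")
first <- letters_vec[1]   # "a"
zero  <- letters_vec[0]   # character(0)
```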

R Comments

Comments in R start with the # symbol. R does not have multi-line comments, but you can place a # before each line.

Comment Example
# This is a comment
"Hello World!" # This is also a comment

# This is a multi-line comment
# written across several lines

R Variables

R does not have a command for declaring a variable. A variable is created the moment you first assign a value to it.

Key Concepts

  • Use <- to assign values.
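  • Variable names must start with a letter, or with a period that is not followed by a digit, and can contain letters, digits, periods, and underscores. Reserved words such as if, TRUE, and function cannot be used as names.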
  • R is case-sensitive: myVar and myvar are different.
  • You can concatenate text using paste().
Variable Example
name <- "John"
age <- 40

# Concatenate
paste("My name is", name, "and I am", age)
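
The naming rules can be checked with a small sketch. Comparing a name with the output of make.names() is an illustrative idiom chosen here, not an official validator, and the candidate names are invented for the example.

```r
# Case sensitivity: these are two separate variables.
myVar <- "upper"
myvar <- "lower"
same_value <- identical(myVar, myvar)   # FALSE

# make.names() rewrites anything that is not a syntactic name,
# so a name is treated as valid here when it comes back unchanged.
candidates <- c("my_var.2", ".hidden", "2nd", "_tmp")
is_valid <- make.names(candidates) == candidates
is_valid   # TRUE TRUE FALSE FALSE
```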

R Data Types

In R, variables do not need to be declared with a type. The type is inferred from the value assigned.

Basic Data Types

  • Numeric: 10.5 — decimal values.
  • Integer: 10L — whole numbers (use L suffix).
  • Character: "Hello" — text strings.
  • Logical: TRUE / FALSE — boolean values.
  • Complex: 3+5i — complex numbers.
Data Type Example
x <- 10.5    # numeric
y <- 10L     # integer
z <- "Hello" # character
w <- TRUE    # logical

class(x)  # "numeric"
class(y)  # "integer"
class(z)  # "character"
class(w)  # "logical"
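
As a small sketch of how types can be inspected and converted, the following uses only base R functions; the variable names are chosen for illustration.

```r
x <- 10.5
y <- 10L

# typeof() reports the internal storage type.
type_x <- typeof(x)   # "double"
type_y <- typeof(y)   # "integer"

# Integers also count as numeric.
y_is_numeric <- is.numeric(y)   # TRUE

# Explicit conversions between types.
x_as_int <- as.integer(x)       # 10 (the fractional part is dropped)
y_as_chr <- as.character(y)     # "10"
parsed   <- as.numeric("3.14")  # 3.14

# Complex numbers have their own class.
complex_class <- class(3+5i)    # "complex"
```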

R Operators

Operators are used to perform operations on variables and values.

Operator Types

  • Arithmetic: +, -, *, /, ^ (exponentiation), %% (modulus), %/% (integer division).
  • Assignment: <-, =, <<-
  • Comparison: ==, !=, >, <, >=, <=
  • Logical: & and | (element-wise), && and || (single values, short-circuiting), !
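
A short sketch of the operators listed above; the values 17 and 5 are arbitrary. It also shows && which, unlike the element-wise &, works on single values.

```r
a <- 17
b <- 5

# Arithmetic
total     <- a + b     # 22
ratio     <- a / b     # 3.4
squared   <- a ^ 2     # 289
remainder <- a %% b    # 2
quotient  <- a %/% b   # 3

# Comparison
a_is_larger <- a > b   # TRUE
are_equal   <- a == b  # FALSE

# Logical: & works element by element, && on single values
elementwise <- c(TRUE, FALSE) & c(TRUE, TRUE)   # TRUE FALSE
single      <- (a > 10) && (b > 10)             # FALSE
negated     <- !a_is_larger                     # FALSE
```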

R Strings

Strings in R are defined with either single or double quotes. R provides several functions for string manipulation.

Useful String Functions

  • nchar() — number of characters.
  • grepl() — check if a pattern exists in a string.
  • paste() — concatenate strings.
  • substr() — extract a substring.
  • toupper() / tolower() — case conversion.
String Example
str <- "Hello World!"
nchar(str)           # 12
toupper(str)         # "HELLO WORLD!"
grepl("World", str)  # TRUE
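
The remaining functions from the list can be sketched the same way. paste0() is not mentioned above; it is a base R variant of paste() that joins without a separator.

```r
str <- "Hello World!"

first_word <- substr(str, 1, 5)   # "Hello" (positions 1 to 5, inclusive)
lower      <- tolower(str)        # "hello world!"

# paste() joins with a space by default; sep changes the separator.
spaced <- paste("Hello", "R")             # "Hello R"
dashed <- paste("a", "b", sep = "-")      # "a-b"

# paste0() joins with no separator at all.
file_name <- paste0("file", 1, ".csv")    # "file1.csv"
```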

R If...Else

R supports the usual conditional statements: if, else if, and else.

If...Else Example
a <- 200
b <- 33

if (b > a) {
  print("b is greater than a")
} else if (a == b) {
  print("a and b are equal")
} else {
  print("a is greater than b")
}

R Loops

Loops execute a block of code repeatedly, either while a specified condition remains true or once for each element of a sequence.

Loop Types

  • while — loops while a condition is true.
  • for — iterates over a sequence.
  • repeat — loops until a break is hit.
  • next — skips the current iteration (like continue).
  • break — exits the loop.
Loop Examples
# While loop
i <- 1
while (i < 6) {
  print(i)
  i <- i + 1
}

# For loop
fruits <- list("apple", "banana", "cherry")
for (x in fruits) {
  print(x)
}
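
The repeat, break, and next statements are not covered by the examples above. A minimal sketch, with variable names chosen for illustration:

```r
# repeat runs until break is reached.
count <- 0
repeat {
  count <- count + 1
  if (count == 3) {
    break
  }
}
count   # 3

# next skips the rest of the current iteration.
odds <- c()
for (n in 1:6) {
  if (n %% 2 == 0) {
    next   # skip even numbers
  }
  odds <- c(odds, n)
}
odds   # 1 3 5
```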

R Functions

A function is a block of code that runs only when it is called. You create a function with the function keyword and usually assign it to a name with <-.

Function Example
my_function <- function(fname) {
  paste("Hello", fname)
}

my_function("Peter")  # "Hello Peter"
my_function("Lois")   # "Hello Lois"
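
Functions can also take default argument values and return a result explicitly. The names greet and rectangle_area below are hypothetical and only serve the illustration.

```r
# The value of the last expression is returned automatically.
greet <- function(fname, greeting = "Hello") {
  paste(greeting, fname)
}

default_greeting <- greet("Peter")                  # "Hello Peter"
custom_greeting  <- greet("Lois", greeting = "Hi")  # "Hi Lois"

# return() makes the returned value explicit.
rectangle_area <- function(width, height) {
  return(width * height)
}

area <- rectangle_area(3, 4)   # 12
```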

R Vectors

A vector is the simplest data structure in R. It is a sequence of data elements of the same basic type, created using the c() function.

Key Concepts

  • Vectors contain elements of the same type.
  • Created with the combine function: c().
  • R is 1-indexed (first element is at position 1, not 0).
  • You can sort, access, and modify vector elements.
Vector Example
fruits <- c("apple", "banana", "cherry")
numbers <- c(1, 5, 3, 6, 2)

fruits[1]          # "apple"
sort(numbers)      # 1 2 3 5 6
length(fruits)     # 3
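
Vectors support element-wise arithmetic, filtering, and in-place modification. The following sketch reuses the same sample values; the result names are chosen for illustration.

```r
fruits  <- c("apple", "banana", "cherry")
numbers <- c(1, 5, 3, 6, 2)

# Arithmetic is applied to every element at once (vectorized).
doubled <- numbers * 2            # 2 10 6 12 4

# A logical condition inside [] keeps only the matching elements.
large <- numbers[numbers > 2]     # 5 3 6

# Assigning to a position modifies that element.
numbers[2] <- 50                  # numbers is now 1 50 3 6 2

# A negative index drops that position.
without_first <- fruits[-1]       # "banana" "cherry"

# Mixed types are coerced to one common type, here character.
mixed <- c(1, "a", TRUE)          # "1" "a" "TRUE"
```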

R Lists

A list in R can contain many different data types inside it — numbers, strings, vectors, and even another list.

List Example
thislist <- list("apple", "banana", "cherry")

# Access elements
thislist[[1]]   # "apple"

# Change element
thislist[[1]] <- "blackcurrant"

# List length
length(thislist)  # 3
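
To illustrate a list that mixes types, here is a sketch with named elements. The list person and its fields are invented for the example.

```r
# A list holding a string, a number, and a numeric vector.
person <- list(name = "John", age = 40, scores = c(90, 85))

# $ and [[ ]] return the element itself.
person_name  <- person$name              # "John"
second_score <- person[["scores"]][2]    # 85

# Single brackets return a smaller list, not the element.
single_class <- class(person["name"])    # "list"
double_class <- class(person[["name"]])  # "character"

# Assigning to a new name appends an element.
person$city <- "Oslo"
n_elements <- length(person)             # 4
```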

R Matrices

A matrix is a two-dimensional data set with columns and rows. It is created using the matrix() function.

Matrix Example
thismatrix <- matrix(c(1,2,3,4,5,6), nrow = 3, ncol = 2)
thismatrix

# Access element at row 1, column 2
thismatrix[1, 2]  # 4
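
matrix() fills column by column unless told otherwise, which is why row 1, column 2 holds 4 in the example above. A sketch comparing this with byrow = TRUE:

```r
# Default: values fill down each column in turn.
by_col <- matrix(1:6, nrow = 3, ncol = 2)

# byrow = TRUE fills across each row instead.
by_row <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)

col_fill_value <- by_col[1, 2]   # 4
row_fill_value <- by_row[1, 2]   # 2

# Leaving an index empty selects the whole row or column.
second_row <- by_col[2, ]        # 2 5

# dim() returns rows and columns; t() transposes.
shape            <- dim(by_col)      # 3 2
transposed_shape <- dim(t(by_col))   # 2 3
```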

R Data Frames

Data Frames are data displayed in a table format, similar to a spreadsheet or SQL table. They are the most commonly used data structure in R for data analysis.

Key Concepts

  • Columns can be of different types (numeric, character, logical, etc.).
  • Created using the data.frame() function.
  • Use summary() to get a statistical summary.
  • Access columns with $ operator.
Data Frame Example
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

Data_Frame
summary(Data_Frame)
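
Column access with $ and row filtering can be sketched with the same data. The Intensity column is a made-up derived value, added only to show how a new column is created.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# $ extracts a column as a vector.
pulse_values  <- Data_Frame$Pulse            # 100 150 120
mean_duration <- mean(Data_Frame$Duration)   # 45

# A logical condition in the row position keeps matching rows.
high_pulse <- Data_Frame[Data_Frame$Pulse > 110, ]   # Stamina and Other rows

# Assigning to a new name adds a column.
Data_Frame$Intensity <- Data_Frame$Pulse / Data_Frame$Duration

n_rows <- nrow(Data_Frame)   # 3
n_cols <- ncol(Data_Frame)   # 4
```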

R Factors

Factors represent categorical data, such as gender (Male/Female) or music genre (Rock, Pop, Jazz). Each distinct category is stored as a level.

Factor Example
music <- c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz")
music_factor <- factor(music)

print(music_factor)
print(levels(music_factor))
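
A factor stores each value as an integer code pointing into its sorted levels. The sketch below counts the categories with table(), a base R function not mentioned above.

```r
music <- c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz")
music_factor <- factor(music)

# Levels are the distinct categories, sorted alphabetically.
genre_levels <- levels(music_factor)   # "Classic" "Jazz" "Pop" "Rock"
n_genres     <- nlevels(music_factor)  # 4

# table() counts how often each level occurs.
genre_counts <- table(music_factor)    # Classic 2, Jazz 3, Pop 1, Rock 2

# Each value is stored as the position of its level.
codes <- as.integer(music_factor)      # 2 4 1 1 3 2 4 2
```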

R Statistics

R is built for statistics. It has a comprehensive set of functions for performing statistical analysis right out of the box.

Statistical Functions

  • mean() — average value.
  • median() — middle value.
  • min() / max() — smallest / largest value.
  • sd() — standard deviation.
  • var() — variance.
  • cor() — correlation.
  • summary() — comprehensive summary.
Statistics Example
numbers <- c(10, 20, 30, 40, 50)

mean(numbers)    # 30
median(numbers)  # 30
sd(numbers)      # 15.81
max(numbers)     # 50
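
The functions not shown above behave the same way. In this sketch the second vector, doubled_plus_one, is constructed as an exact linear function of the first so that the correlation is known in advance.

```r
numbers <- c(10, 20, 30, 40, 50)

variance <- var(numbers)   # 250 (sample variance, the square of sd())
smallest <- min(numbers)   # 10

# cor() needs two vectors of equal length.
doubled_plus_one <- 2 * numbers + 1
correlation <- cor(numbers, doubled_plus_one)   # 1 (perfectly linear)

# summary() reports minimum, quartiles, median, mean, and maximum.
overview <- summary(numbers)
overview
```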

R Plotting

The plot() function is used to draw points (markers) in a diagram. R's base graphics are powerful and flexible.

Plot Types

  • plot() — scatter plots and line charts.
  • barplot() — bar charts.
  • pie() — pie charts.
  • hist() — histograms.
  • boxplot() — box plots.
Plot Example
# Single point at x = 1, y = 3
plot(1, 3)

# Multiple points
plot(c(1,2,3,4,5), c(3,7,8,9,12))

# Line plot
plot(1:10, type="l")
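
The other plot types listed above follow the same pattern. This sketch assumes a non-interactive run and therefore opens a null PDF device with pdf(NULL); in RStudio or an interactive console those device lines can be dropped. The sample values and labels are invented.

```r
# Assumption: draw to a null device so the script runs without a display.
pdf(NULL)

scores <- c(3, 7, 8, 9, 12)
groups <- c("A", "B", "C", "D", "E")

# Bar chart with one bar per value.
barplot(scores, names.arg = groups, main = "Scores by group")

# Histogram; hist() also returns the bin counts it drew.
h <- hist(scores, main = "Distribution of scores", xlab = "Score")

# Box plot of the same values.
boxplot(scores, main = "Score spread")

# Pie chart with labels.
pie(scores, labels = groups, main = "Share by group")

# Close the device without printing its status.
invisible(dev.off())
```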

R GGPlot2

ggplot2 is one of the most popular data visualization packages in R. It is based on the Grammar of Graphics, which builds a plot in layers from data, aesthetic mappings, and geometric objects.

Key Concepts

  • Data: The dataset to visualize.
  • Aesthetics (aes()): Mapping of variables to visual properties (x, y, color, size).
  • Geometries (geom_*()): The type of plot — points, lines, bars, etc.
  • Facets: Creating multi-panel plots.
GGPlot2 Example
library(ggplot2)

# Create a scatter plot
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  labs(title="Car Weight vs MPG",
       x="Weight", y="Miles per Gallon")

Best Practices

  • Install ggplot2 with install.packages("ggplot2").
  • Always label your axes and give your plot a title.
  • Use themes like theme_minimal() or theme_bw() for clean, professional plots.
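
Facets and themes can be combined with the earlier scatter plot. This is a sketch that assumes ggplot2 is already installed; the choice of cyl for color and gear for the panels is arbitrary.

```r
library(ggplot2)

# Scatter plot colored by cylinder count, one panel per gear count.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  facet_wrap(~ gear) +
  labs(title = "Car Weight vs MPG by Gears",
       x = "Weight", y = "Miles per Gallon", color = "Cylinders") +
  theme_minimal()

# Printing the object draws the plot.
p
```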
