Key Concepts
- R was created by Ross Ihaka and Robert Gentleman in 1993.
- R is open-source and free to use.
- R is widely regarded as a leading tool for statistical analysis and data visualization.
- R has a large ecosystem of packages available via CRAN.
R is a programming language and environment for statistical computing and graphics. It is widely used among statisticians and data miners.
"Hello World!"
R provides a wide variety of statistical and graphical techniques and is highly extensible.
R is particularly popular in academia and research, and it is a valuable skill if you work in data science, bioinformatics, or statistical research. Its strengths include vectorized operations, which keep data analysis code concise and fast, and strong plotting capabilities through libraries like ggplot2.
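As a small illustration of what "vectorized" means here, the sketch below applies arithmetic to a whole vector at once and compares the result with an explicit loop. The names prices and with_tax and the 20% rate are made up for the example.

```r
# Hypothetical data: three prices and an assumed 20% tax rate.
prices <- c(10, 20, 30)

# Vectorized: the multiplication is applied to every element at once.
with_tax <- prices * 1.2

# Equivalent explicit loop, shown only for comparison.
with_tax_loop <- numeric(length(prices))
for (i in seq_along(prices)) {
  with_tax_loop[i] <- prices[i] * 1.2
}

print(with_tax)
print(all.equal(with_tax, with_tax_loop))  # TRUE
```

Both versions give the same numbers; the vectorized form is simply shorter and avoids the bookkeeping of the loop.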
To start using R, you need to install R and optionally RStudio, a popular IDE for R.
- Install packages with install.packages("package_name").
R uses <- as the primary assignment operator. You can also use =, but <- is the R convention.
- Use <- (read as "gets") for assignment.
- Use print() or just type the variable name to output values.
- Use # for comments.
name <- "John"
age <- 25
print(name)
print(paste("Age:", age))
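One context where = and <- behave differently is inside a function call. The sketch below uses a hypothetical helper, count_items(), defined only for this illustration, to show that = names an argument while <- creates a variable as a side effect.

```r
# Hypothetical helper, defined only for this illustration.
count_items <- function(items) length(items)

# Here = names the argument; no variable called `items` is created.
count_items(items = c("a", "b", "c"))

# Here <- assigns `kept` in the calling environment and passes the value along.
count_items(kept <- c("a", "b", "c"))
print(kept)  # "a" "b" "c"
```

This is one reason the convention is to use <- for assignment and reserve = for naming arguments.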
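A common pitfall is using = instead of <-. While = works for assignment, it is not idiomatic R and behaves differently in some contexts, such as inside a function call, where = names an argument rather than assigning a variable.
Comments in R start with the # symbol. R does not have multi-line comments, but you can place a # before each line.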
# This is a comment
"Hello World!" # This is also a comment
# This is a multi-line comment
# written across several lines
R does not have a command for declaring a variable. A variable is created the moment you first assign a value to it.
- Use <- to assign values.
- Variable names are case-sensitive: myVar and myvar are different.
- Concatenate strings with paste().
name <- "John"
age <- 40
# Concatenate
paste("My name is", name, "and I am", age)
In R, variables do not need to be declared with a type. The type is inferred from the value assigned.
- numeric: 10.5 (decimal values).
- integer: 10L (whole numbers; use the L suffix).
- character: "Hello" (text strings).
- logical: TRUE / FALSE (boolean values).
- complex: 3+5i (complex numbers).
x <- 10.5 # numeric
y <- 10L # integer
z <- "Hello" # character
w <- TRUE # logical
class(x) # "numeric"
class(y) # "integer"
class(z) # "character"
class(w) # "logical"
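Because types are inferred, it often helps to check or convert them explicitly. A minimal sketch using base R's is.*() and as.*() functions; the values are arbitrary.

```r
v <- 3+5i
class(v)              # "complex"

# is.*() functions test a type; as.*() functions convert between types.
is.integer(10L)       # TRUE
is.integer(10)        # FALSE: a plain 10 is numeric (double)
as.integer("42")      # 42
as.numeric("10.5")    # 10.5
as.character(TRUE)    # "TRUE"
```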
Operators are used to perform operations on variables and values.
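A short sketch of arithmetic, comparison, and logical operators in use, with two made-up values a and b:

```r
a <- 10
b <- 3

a + b      # 13
a ^ 2      # 100
a %% b     # 1 (remainder)
a %/% b    # 3 (integer division)

a > b               # TRUE
a == b              # FALSE
(a > 5) & (b > 5)   # FALSE
(a > 5) | (b > 5)   # TRUE
!(a > b)            # FALSE
```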
- Arithmetic: +, -, *, /, ^ (exponentiation), %% (modulus), %/% (integer division).
- Assignment: <-, =, <<-
- Comparison: ==, !=, >, <, >=, <=
- Logical: &, |, !
Strings in R are defined with either single or double quotes. R provides several functions for string manipulation.
- nchar(): number of characters.
- grepl(): check if a pattern exists in a string.
- paste(): concatenate strings.
- substr(): extract a substring.
- toupper() / tolower(): case conversion.
str <- "Hello World!"
nchar(str) # 12
toupper(str) # "HELLO WORLD!"
grepl("World", str) # TRUE
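The remaining string functions from the list can be sketched the same way. The variable is named greeting here (an arbitrary choice) so that it does not mask base R's str() function.

```r
greeting <- "Hello World!"

substr(greeting, 1, 5)             # "Hello"
tolower(greeting)                  # "hello world!"
paste(greeting, "Welcome to R.")   # "Hello World! Welcome to R."
paste("a", "b", sep = "-")         # "a-b"
grepl("world", greeting)           # FALSE: matching is case-sensitive by default
```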
R supports the usual conditional statements: if, else if, and else.
a <- 200
b <- 33
if (b > a) {
  print("b is greater than a")
} else if (a == b) {
  print("a and b are equal")
} else {
  print("a is greater than b")
}
Loops execute a block of code repeatedly, either while a specified condition holds or once for each element of a sequence.
- while: loops while a condition is true.
- for: iterates over a sequence.
- repeat: loops until a break is hit.
- next: skips the current iteration (like continue).
- break: exits the loop.
# While loop
i <- 1
while (i < 6) {
  print(i)
  i <- i + 1
}
# For loop
fruits <- list("apple", "banana", "cherry")
for (x in fruits) {
  print(x)
}
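The next, break, and repeat keywords are not used in the loops above. A minimal sketch of how they might look; the variables seen and count exist only for this example.

```r
# next: skip one iteration of a for loop.
seen <- c()
for (i in 1:5) {
  if (i == 3) {
    next              # skip 3 and continue with 4
  }
  seen <- c(seen, i)
}
print(seen)           # 1 2 4 5

# repeat: loop until break is reached.
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) {
    break             # leave the loop once count reaches 3
  }
}
print(count)          # 3
```

A repeat loop has no condition of its own, so it needs a break somewhere in its body to terminate.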
A function is a block of code which only runs when it is called. You create a function using the function() keyword.
my_function <- function(fname) {
  paste("Hello", fname)
}
my_function("Peter") # "Hello Peter"
my_function("Lois") # "Hello Lois"
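Functions can also take default arguments and return values explicitly. The sketch below uses two hypothetical functions, greet() and square(), that are not part of the original examples.

```r
# Hypothetical function with a default argument and an explicit return().
greet <- function(fname, greeting = "Hello") {
  message_text <- paste(greeting, fname)
  return(message_text)
}

greet("Peter")                    # "Hello Peter"
greet("Lois", greeting = "Hi")    # "Hi Lois"

# Without return(), a function returns the value of its last expression.
square <- function(x) {
  x^2
}
square(4)                         # 16
```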
A vector is the simplest data structure in R. It is a sequence of data elements of the same basic type, created using the c() function.
- Create a vector with c().
fruits <- c("apple", "banana", "cherry")
numbers <- c(1, 5, 3, 6, 2)
fruits[1] # "apple"
sort(numbers) # 1 2 3 5 6
length(fruits) # 3
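A few more ways to index and transform a vector, sketched with the same numbers vector. Note that indexing in R starts at 1.

```r
numbers <- c(1, 5, 3, 6, 2)

numbers[2]                        # 5: indexing starts at 1, not 0
numbers[c(1, 3)]                  # 1 3
numbers[-1]                       # 5 3 6 2: a negative index drops that element
numbers[numbers > 2]              # 5 3 6: logical filtering
numbers * 2                       # 2 10 6 12 4: arithmetic applies element-wise
sort(numbers, decreasing = TRUE)  # 6 5 3 2 1
```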
A list in R can contain many different data types inside it — numbers, strings, vectors, and even another list.
thislist <- list("apple", "banana", "cherry")
# Access elements
thislist[[1]] # "apple"
# Change element
thislist[[1]] <- "blackcurrant"
# List length
length(thislist) # 3
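To illustrate mixed types and a few more list operations, here is a sketch with a hypothetical named list, person, followed by appending to and removing from thislist.

```r
# A hypothetical named list mixing a string, a number, and a vector.
person <- list(name = "John", age = 40, hobbies = c("chess", "running"))

person$name            # "John"
person[["age"]]        # 40
person$hobbies[2]      # "running"

# Append and remove elements.
thislist <- list("apple", "banana", "cherry")
thislist <- append(thislist, "orange")
length(thislist)       # 4
thislist[[2]] <- NULL  # removes "banana"
length(thislist)       # 3
```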
A matrix is a two-dimensional data set with columns and rows. It is created using the matrix() function.
thismatrix <- matrix(c(1,2,3,4,5,6), nrow = 3, ncol = 2)
thismatrix
# Access element at row 1, column 2
thismatrix[1, 2] # 4
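The element at row 1, column 2 is 4 because matrix() fills column by column by default. A short sketch of that behavior, along with row and column access; byrow_matrix is a name chosen for this example.

```r
thismatrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)

dim(thismatrix)     # 3 2
thismatrix[2, ]     # 2 5: the whole second row
thismatrix[, 1]     # 1 2 3: the whole first column

# byrow = TRUE fills row by row instead.
byrow_matrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)
byrow_matrix[1, 2]  # 2

t(thismatrix)       # transpose: 2 rows, 3 columns
```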
Data Frames are data displayed in a table format, similar to a spreadsheet or SQL table. They are the most commonly used data structure in R for data analysis.
- Create a data frame with the data.frame() function.
- Use summary() to get a statistical summary.
- Access columns with the $ operator.
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame
summary(Data_Frame)
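Column access with $ and simple row filtering can be sketched on the same data frame. The name high_pulse and the threshold of 110 are arbitrary choices for the example; the printed Training values assume R 4.0 or later, where strings are not converted to factors by default.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

Data_Frame$Pulse            # 100 150 120
mean(Data_Frame$Duration)   # 45
nrow(Data_Frame)            # 3
ncol(Data_Frame)            # 3

# Keep only the rows where Pulse is above 110.
high_pulse <- Data_Frame[Data_Frame$Pulse > 110, ]
high_pulse$Training         # "Stamina" "Other"
```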
Factors are used to categorize data. Examples of categories are gender (Male/Female) and music genre (Rock, Pop, Jazz).
music <- c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz")
music_factor <- factor(music)
print(music_factor)
print(levels(music_factor))
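Once data is stored as a factor, counting and reordering the categories is straightforward. A sketch using the same music vector; ordered_music and its level order are made up for illustration.

```r
music <- c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz")
music_factor <- factor(music)

levels(music_factor)    # "Classic" "Jazz" "Pop" "Rock" (sorted alphabetically)
nlevels(music_factor)   # 4
table(music_factor)     # Classic 2, Jazz 3, Pop 1, Rock 2

# Levels can also be given explicitly to control their order.
ordered_music <- factor(music, levels = c("Rock", "Pop", "Jazz", "Classic"))
levels(ordered_music)   # "Rock" "Pop" "Jazz" "Classic"
```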
R is built for statistics. It has a comprehensive set of functions for performing statistical analysis right out of the box.
- mean(): average value.
- median(): middle value.
- min() / max(): smallest / largest value.
- sd(): standard deviation.
- var(): variance.
- cor(): correlation.
- summary(): comprehensive summary.
numbers <- c(10, 20, 30, 40, 50)
mean(numbers) # 30
median(numbers) # 30
sd(numbers) # 15.81
max(numbers) # 50
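The other functions in the list work the same way. In this sketch, hours and scores are made-up variables with a perfectly linear relationship, so their correlation is 1.

```r
numbers <- c(10, 20, 30, 40, 50)

var(numbers)        # 250
min(numbers)        # 10
summary(numbers)    # Min, 1st Qu., Median, Mean, 3rd Qu., Max

# Correlation between two made-up variables with a perfect linear relationship.
hours  <- c(1, 2, 3, 4, 5)
scores <- 2 * hours + 1
cor(hours, scores)  # 1
```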
The plot() function is used to draw points (markers) in a diagram. R's base graphics are powerful and flexible.
- plot(): scatter plots and line charts.
- barplot(): bar charts.
- pie(): pie charts.
- hist(): histograms.
- boxplot(): box plots.
# Scatter plot
plot(1, 3)
# Multiple points
plot(c(1,2,3,4,5), c(3,7,8,9,12))
# Line plot
plot(1:10, type="l")
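The other base plotting functions can be sketched with the built-in mtcars dataset. The titles and axis labels here are arbitrary choices for the example.

```r
# Histogram of fuel efficiency.
hist(mtcars$mpg, main = "Distribution of MPG", xlab = "Miles per Gallon")

# Bar chart of how many cars have 4, 6, or 8 cylinders.
cyl_counts <- table(mtcars$cyl)
bar_positions <- barplot(cyl_counts, main = "Cars by Cylinder Count",
                         xlab = "Cylinders", ylab = "Number of cars")

# Box plot of MPG grouped by cylinder count.
boxplot(mpg ~ cyl, data = mtcars, main = "MPG by Cylinders")

# Pie chart of the same counts.
pie(cyl_counts, main = "Share of Cars by Cylinders")
```

Each call draws a new plot, replacing the previous one on the active graphics device.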
ggplot2 is one of the most popular data visualization packages in R. It is based on the Grammar of Graphics, which describes a plot as a combination of data, aesthetic mappings, and geometric layers.
- Aesthetics (aes()): mapping of variables to visual properties (x, y, color, size).
- Geometries (geom_*()): the type of plot, such as points, lines, or bars.
library(ggplot2)
# Create a scatter plot
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Car Weight vs MPG",
       x = "Weight", y = "Miles per Gallon")
- Install ggplot2 with install.packages("ggplot2").
- Use theme_minimal() or theme_bw() for clean, professional plots.
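A sketch of how a color aesthetic and a theme might be added to the scatter plot. It assumes ggplot2 is already installed; the point size and legend title are arbitrary choices.

```r
library(ggplot2)  # assumes ggplot2 is already installed

# Color points by cylinder count and apply a built-in theme.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "Car Weight vs MPG",
       x = "Weight", y = "Miles per Gallon", color = "Cylinders") +
  theme_minimal()

print(p)  # draws the plot
```

Storing the plot in a variable such as p makes it easy to add further layers or swap the theme later.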