---
title: "R programming basics"
author: "Stuart Lee"
output:
xaringan::moon_reader:
nature:
highlightStyle: github
highlightLines: true
countIncrementalSlides: false
---
```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
```
## Goat, Goat, Car
In this lab we are going to create a simulation of the Monty Hall game.
In order to complete our task we'll need to learn some more programming techniques!
```{r, out.width = "500px", echo=FALSE, fig.align='center'}
knitr::include_graphics("https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Monty_open_door.svg/1280px-Monty_open_door.svg.png")
```
If you've never heard of this game you can play it here:
http://www.shodor.org/interactivate/activities/SimpleMontyHall/
Image credit: [[Wikimedia Commons](https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Monty_open_door.svg/1280px-Monty_open_door.svg.png)]
---
class: center, middle
# Activity Time!
Let's go through exerciser 1 of the tutorial. Remember to record your
answers in your "mylab2.Rmd" file!
---
## Everything in R is an object
Let's go through some R programming basics.
> “To understand computations in R, two slogans are helpful:
>
* Everything that exists is an object.
* Everything that happens is a function call."
> John Chambers
Remember that objects are assigned a value using the "<-" operator.
```{r objects}
x <- 10 * 2
# here the object x has value 20
x
# 100 samples from a uniform distribution stored in object `mydata`
mydata <- runif(100)
head(mydata)
```
---
## What are functions?
Remember that functions take an input(s) and return an output.
*Example*: a simple function - what do you think it does?
```{r functions1}
f <- function(x) x + 1
# what is the value of f(10)?
```
*Example*: a function can have multiple inputs (called arguments)
```{r}
hello <- function(name, age) {
paste("Hello my name is", name, "and I am", age, "years old.")
}
hello("Harry Potter, boy wizard,", 11)
```
*Example*: or no inputs at all!
```{r}
random_number <- function() runif(1)
random_number()
```
---
# Why do we write functions?
- We use functions because they allow us to reduce the amount of repetition in our
code (and hopefully reduce the chance of errors!).
- If you find yourself writing the same bit of code over and over with only
minor modifications, it's probably time to write a function!
---
# What do functions consist of?
Functions in R consist of the following:
- arguments: these are the names of the inputs to the function,
- body: the code inside the function (this is what does the work).
*Example*: what are the arguments of the functions on the previous slide?
---
# Calling functions
- To call a function (perform a computation)
simply type it's name and enclose it's arguments in brackets.
- Arguments in a function are first matched by their name then their
position in the function definition.
---
# Examples
```{r call-fun}
mydata <- c(10, 11, 1, 5)
mean(mydata)
rnorm(5, mean = 5) # is the same as rnorm(n = 5, mean = 5)
sample(mydata, 1) # is the same as sample(x = mydata, size = 1)
```
---
# Writing functions
We have already seen a few examples of functions but how can write our own?
Functions are defined as follows
```{r, eval = FALSE}
function_name <- function(args) {
do_stuff
}
```
This has several components:
- the function name (here creatively titled `function_name`)
- a call to `function` (this lets R know we are writing a function)
- the arguments of the function
- a left brace `{`, to enclose the code contained in the function
- the code inside the function that does all the heavy lifting
- a right brace `}` to let R know we have finished defining our function.
---
# Naming objects and functions
- Naming things is hard!
- Remember names in R are case sensitive and cannot
start with a number or a special character (such as '*' or '_').
- Try to be consistent in your naming.
---
class: center, middle
# Activity time!
Exercise from [R4DS](http://r4ds.had.co.nz/functions.html)
> Practice turning the following code snippets into functions. Think about what each function does. What would you call it? How many arguments does it need? Can you rewrite it to be more expressive or less duplicative?
```{r, eval = FALSE}
x / sum(x, na.rm = TRUE)
sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
```
---
# Control statements - if
An `if` statement let's you control when a bit of your code is
executed based on a logical condition.
```{r, eval = FALSE}
if (condition) {
# code that is run when condition is TRUE
} else {
# code that is run when condition is FALSE
}
```
Note that the condition must evaluate to either TRUE or FALSE.
---
# Control statements - multiple condtions
You can put multiple conditions together using `else if`
```{r, eval = FALSE}
if (condition) {
# stuff that happens when condtion is TRUE
} else if (something_else) {
# stuff that happens when something_else is TRUE
} else {
# some other stuff
}
```
---
# Example - weather
```{r}
melb_temp <- function(temp) {
if (temp < 11) {
"cold"
} else if (temp >= 15 & temp < 25) {
"okay"
} else {
"hot"
}
}
```
What does the function `melb_temp` do?
---
## Vectors
- Vectors in R are either **atomic** or **lists**.
- Atomic vectors contain elements that are all the same type while lists can contain anything.
For example here are some atomic vectors,
```{r}
c("A", "B", "C") # character
runif(3) # a numeric (double) vector
c(1L, 2L, 3L) # a numeric (integer vector)
c(TRUE, TRUE, FALSE) # a logical vector
```
---
# Vectors have a length and type
The number of elements in a vector can be found using `length`
```{r}
length(c(1,2,3))
```
The type of a vector can be found using `typeof`
```{r}
typeof(c(1,2,3))
# the is. family can be used to test the type of a vector
is.numeric(c(1,2,3))
```
---
# Creating vectors
We can combine objects into a vector using `c`
```{r}
x <- 1
y <-"A"
c(x,y)
```
We can create a sequence of numbers from a to b using `a:b`
```{r}
1:10
# alternatively
seq_len(10)
```
---
# Subsetting vectors
We can subset vectors using `[` and a position. We can also replace the
elements of a vector using `<-`.
```{r}
x <- 1:10
x[1]
x[1] <- 3
x[[1]] # alternative syntax for getting first element
x[[1]] <- 1
```
---
# Subsetting vectors
We can also subset using another vector of positions!
```{r}
x[c(1,2,4)] # we can subset using another vector
x[-1] # all but the first element
x[length(x)] # the last element
```
---
# Subsetting vectors
We can also subset using a logical statement.
```{r}
x <- c(10, 3, NA, 5, 8, 1, NA)
x > 5
is.na(x)
x[x>5]
x[!is.na(x)]
```
---
class: middle
# Activity time!
Using the `x` defined below, write code to do the following:
- create a new vector `y` from `x` with missing values removed
- subset `x` to only have elements greater than 10 and with no missing values
- replace the missing values in `x` to have value -1
```{r}
x <- c(10, 3, NA, 5, 8, 1, NA, 14, 15, NA)
```
---
# Iteration
- As with functions, iteration can help us reduce the amount of repetition in
our code.
- Iteration can be used to perform the same operation over and over which
is necessary when performing simulations.
---
# for loops
A `for` loop can be used to iterate over the elements of a vector and
perform some computation. There are three components to creating a for loop:
- an object of sufficient size to store the output of the loop.
- a sequence to iterate over.
- a body that contains the computation.
Example: sampling from normal distribution with different means
```{r}
mu <- c(10, 20, 50, 100, 1000)
# output where we will store our results, here we create
# an empty numeric vector of length of mu vector
output <- numeric(length(mu))
for (i in seq_along(mu) ) { # sequence to iterate over, here the columns
# body of the loop, what are we doing?
output[[i]] <- rnorm(1, mean = mu[[i]])
}
output
```
---
# Comments on the for loop pattern
* output: we create an empty vector that has the same type as the
result of our computation in the body, in the above example a "numeric" vector.
* sequence: `i in seq_along(means)`, this is what we we're looping over.
Each run of the loop will create an object `i` and assign it to a different value in `seq_along(means)`.
* body: here we assigning the `i` th element of output to have a random sample
from normal distribution with mean equal to the `i`th element of the `mu` vector.
---
# Learning more
A lot of the material I've used here is inspired by the book
R for data science by Garett Grolemund and Hadley Wickham
You can read the book for free online at:
http://r4ds.had.co.nz/
---
class: center, middle
# Activity time!
You know have all the necessary material to complete the exercises in
lab 2. Happy coding!