Object Oriented Programming

Statistical Computing, 36-350

Friday July 26, 2019

On Wednesday: Fitting Models

fill this in

Part I

Object Oriented Programming ‘Theory’/ Foundation

What we’ve been doing so far (Functional Programming)

Object Oriented Programming (OOP)

OOP continued:

Why learn OOP in R?

OOP in R

There are many class structures in R.

You can read more about all the class structure options in the Object Oriented part of in Hadley Wickham’s Advanced R book.

Today’s Lecture: OOP in R (S3)

Part II

Motivation: Coin Toss

Our Starting example: the coin flip

Let’s start with a functional and simulation-based view of a coin flip.

# coin object
coin <- c("heads", "tails")

We can image we could toss the coin for some simulation event:

sample(coin, size = 1)
## [1] "heads"

Maybe more complicated:

sample(coin, size = 5, replace = TRUE)
## [1] "tails" "heads" "heads" "heads" "tails"

Functionalize

In our simulation lecture we saw the benefit of using functions to encapsulate things we’d like to do multiple times

toss <- function(coin, times = 1) {
  sample(coin, size = times, replace = TRUE)
}

toss(coin, times = 1)
## [1] "tails"

Note how this “abstracts” away some of what is going on (desirable)

Useful additions

Typical probability problems that have to do with coin tossing, require to compute the total proportion of "heads" or "tails":

# five tosses
five <- toss(coin, times = 5)

# proportion of heads
sum(five == "heads") / 5
## [1] 0.6

It is also customary to compute the relative frequencies of "heads" or "tails" in a series of tosses:

# relative frequencies of heads
cumsum(five == "heads") / 1:length(five)
## [1] 1.0000000 0.5000000 0.6666667 0.5000000 0.6000000

Or even to visualize this:

library(tidyverse)
set.seed(5938)
hundreds <- toss(coin, times = 500)
head_freqs <- cumsum(hundreds == "heads") / 1:500

my_data_vis <- data.frame(index = 1:length(hundreds),
                          flips = hundreds, 
                          head_freqs = head_freqs)
ggplot(my_data_vis) + 
  geom_line(aes(x = index, y = head_freqs)) +
  labs(y = "Head Frequency")

Overview:

So far we have written code in R that

Part III

Object Oriented Programming

Motivating object structure

Take a look at these 2 experiments with coins (notice we basically did a quick “copy & paste”).

# random seed
set.seed(534)

# five tosses
five <- toss(coin, times = 5)

# prop of heads in five
sum(five == "heads") / length(five)
## [1] 0.6

The second experiment involves tossing a coin six times and computing the proportion of heads:

# six tosses
six <- toss(coin, times = 6)

# prop of heads in six
sum(six == "heads") / length(five)
## [1] 0.8

Let’s make a class

To make a class, we should really be doing 3 things

  1. create a constructor (way to create a new element of the class)
  2. create methods to apply to the class
  3. create a validator to check to see that an element of the class follows the desired structure

Constructing a class (S3)

S3 objects are usually built on top of lists, or atomic vectors with attributes. You can also turn functions into S3 objects.

To make an object an instance of a class, you just take an existing base object and set the "class" attribute.

# object coin
coin1 <- structure(c("heads", "tails"), 
                   class = "coin")

# object coin
coin2 <- c("heads", "tails")
class(coin2) <- "coin" 

You can also determine if an object inherits from a specific class using inherits()

inherits(coin2, "coin")
## [1] TRUE

Making a method: let’s flip our coin:

A coin could have a function flip():

flip <- function(coin, times = 1) {
  sample(coin, size = times, replace = TRUE)
}

flip(coin1)
## [1] "tails"

1 issue with this function is that it will “flip” anything - even things that aren’t coins:

flip(c('tic', 'tac', 'toe'))
## [1] "tic"

Making a method: Only flipping coins:

We could add a stop() condition that checks if the argument coin is of the right class:

flip <- function(coin, times = 1) {
  if (class(coin) != "coin") {
    stop("\nflip() requires an object 'coin'")
  }
  sample(coin, size = times, replace = TRUE)
}

# ok
flip(coin1)
## [1] "heads"
# bad coin
flip(c('tic', 'tac', 'toe'))
## Error in flip(c("tic", "tac", "toe")): 
## flip() requires an object 'coin'

Making a method: the OOP way

# print method
print
## function (x, ...) 
## UseMethod("print")
## <bytecode: 0x7f9f5fc56330>
## <environment: namespace:base>
# plot method
plot
## function (x, y, ...) 
## UseMethod("plot")
## <bytecode: 0x7f9f5ba174d8>
## <environment: namespace:graphics>
# methods for objects "lm"
methods(class = "lm")
##  [1] add1           alias          anova          case.names    
##  [5] coerce         confint        cooks.distance deviance      
##  [9] dfbeta         dfbetas        drop1          dummy.coef    
## [13] effects        extractAIC     family         formula       
## [17] fortify        hatvalues      influence      initialize    
## [21] kappa          labels         logLik         model.frame   
## [25] model.matrix   nobs           plot           predict       
## [29] print          proj           qqnorm         qr            
## [33] residuals      rstandard      rstudent       show          
## [37] simulate       slotsFromS3    summary        variable.names
## [41] vcov          
## see '?methods' for accessing help and source code
# methods for objects "matrix"
methods(class = "matrix")
##  [1] anyDuplicated as_tibble     as.data.frame as.raster     as.tbl_cube  
##  [6] boxplot       coerce        determinant   duplicated    edit         
## [11] head          initialize    isSymmetric   Math          Math2        
## [16] Ops           relist        subset        summary       tail         
## [21] unique       
## see '?methods' for accessing help and source code

Making a Method: So…, only flipping coins (the better way):

flip method

When implementing new methods, you begin by creating a generic method with the function UseMethod():

flip <- function(x, ...) UseMethod("flip")

A generic method alone is not very useful. You need to create specific cases for the generic function. In our example, we only have one class "coin", we follow the naming scheme: “method_name.class_name”

flip.coin <- function(x, times = 1) {
  sample(x, size = times, replace = TRUE)
}

Example:

# good
flip(coin1)
## [1] "heads"
# bad (no flip() method for regular vectors)
flip(c('tic', 'tac', 'toe'))
## Error in UseMethod("flip"): no applicable method for 'flip' applied to an object of class "character"

Constructing a class: which option should I use?

Let’s review our class "coin". The way we defined a coin object was like this:

# object coin
coin1 <- c("heads", "tails")
class(coin1) <- "coin" 

# bad coin
ttt <- c('tic', 'tac', 'toe')
class(ttt) <- "coin"

flip(ttt) # now flips :/
## [1] "tic"

Constructing a class: create a function to wrap around element

Constructor

For convenience purposes, we can define a class constructor function to initialize a "coin" object:

coin <- function(object = c("heads", "tails")) {
  class(object) <- "coin"
  object
}

# default coin
coin()
## [1] "heads" "tails"
## attr(,"class")
## [1] "coin"
# another coin
coin(c("h", "t"))
## [1] "h" "t"
## attr(,"class")
## [1] "coin"

Validation

ttt <- coin(c("tick", "tac", "toe")) # still undesirable

For now, we could just write our coin as:

coin <- function(object = c("heads", "tails")) {
  if (length(object) != 2) {
    stop("\n'object' must be of length 2")
  }
  class(object) <- "coin"
  object
}

standard <- coin()
standard
## [1] "heads" "tails"
## attr(,"class")
## [1] "coin"
ttt <- coin(c("tick", "tac", "toe"))
## Error in coin(c("tick", "tac", "toe")): 
## 'object' must be of length 2

but more will be need in the future.

Part IV

OOP attributes, standard methods

Attributes: Biased coins anyone?

The sample function allows for different probabilities in sampling, why not allow our coins to be biased?

coin <- function(object = c("heads", "tails"), prob = c(0.5, 0.5)) {
  if (length(object) != 2) {
    stop("\n'object' must be of length 2")
  }
  attr(object, "prob") <- prob
  class(object) <- "coin"
  object
}

coin()
## [1] "heads" "tails"
## attr(,"prob")
## [1] 0.5 0.5
## attr(,"class")
## [1] "coin"
pols <- factor(sample(c("R","D"),size = 6, replace = TRUE)) 

print(pols)
## [1] R R R D D R
## Levels: D R
print(unclass(pols))
## [1] 2 2 2 1 1 2
## attr(,"levels")
## [1] "D" "R"
#^ unclass removes the class structure that makes print pretty

Validation 2.0: just a reminder to validate things…

check_prob <- function(prob) {
  if (!is.numeric(prob)) {
    stop("\n'prob' must be a numeric vector")
  }
  if (length(prob) != 2 | !is.numeric(prob)) {
    stop("\n'prob' must be a numeric vector of length 2")
  }
  if (any(prob < 0) | any(prob > 1)) {
    stop("\n'prob' values must be between 0 and 1")
  }
  if (sum(prob) != 1) {
    stop("\nelements in 'prob' must add up to 1")
  }
  TRUE
}

Our updated coin class:

coin <- function(object = c("heads", "tails"), prob = c(0.5, 0.5)) {
  if (length(object) != 2) {
    stop("\n'object' must be of length 2")
  }
  check_prob(prob)
  attr(object, "prob") <- prob
  class(object) <- "coin"
  object
}

flip.coin <- function(x, times = 1) {
  sample(x, prob = attr(x, "prob"),
         size = times, replace = TRUE)
}

coin1 <- coin(prob = c(.2, .8))
flip(coin1, times = 10)
##  [1] "tails" "heads" "tails" "tails" "tails" "heads" "tails" "tails"
##  [9] "tails" "tails"

Bad coins

coin_a <- coin(prob = c(.7, .7))
## Error in check_prob(prob): 
## elements in 'prob' must add up to 1
coin_b <- coin(c('tic', 'tac', 'toe'))
## Error in coin(c("tic", "tac", "toe")): 
## 'object' must be of length 2

Inhertance:

Some objects can be multiple classes. For example we saw before that glm objects actually had multiple classes.

glm(I(cut %in% c("Fair", "Good")) ~. , 
    data = diamonds, family = "binomial") %>% class
## [1] "glm" "lm"

(E.g. there is no plot.glm but there is a plot.lm function (in your console type methods(plot) and you’ll see no plot.glm).)

methods(plot)
##  [1] plot,ANY-method        plot,color-method      plot.acf*             
##  [4] plot.ACF*              plot.augPred*          plot.compareFits*     
##  [7] plot.data.frame*       plot.decomposed.ts*    plot.default          
## [10] plot.dendrogram*       plot.density*          plot.ecdf             
## [13] plot.factor*           plot.formula*          plot.function         
## [16] plot.ggplot*           plot.gls*              plot.gtable*          
## [19] plot.hcl_palettes*     plot.hclust*           plot.histogram*       
## [22] plot.HoltWinters*      plot.intervals.lmList* plot.isoreg*          
## [25] plot.lm*               plot.lme*              plot.lmList*          
## [28] plot.medpolish*        plot.mlm*              plot.nffGroupedData*  
## [31] plot.nfnGroupedData*   plot.nls*              plot.nmGroupedData*   
## [34] plot.pdMat*            plot.ppr*              plot.prcomp*          
## [37] plot.princomp*         plot.profile.nls*      plot.R6*              
## [40] plot.ranef.lme*        plot.ranef.lmList*     plot.raster*          
## [43] plot.shingle*          plot.simulate.lme*     plot.spec*            
## [46] plot.stepfun           plot.stl*              plot.table*           
## [49] plot.trellis*          plot.ts                plot.tskernel*        
## [52] plot.TukeyHSD*         plot.Variogram*       
## see '?methods' for accessing help and source code

Inhertance: Coins

Some coins are special (people collect them), let’s make a special coin class:

rare_coin <- function(name, year, ...){
  object <- coin(...)
  attr(object, "name") <- name
  attr(object, "year") <- year
  class(object) <- c("rare_coin", "coin")
  object
}
my_penny <- rare_coin(name = "Lincoln penny", year =  1972)
class(my_penny)
## [1] "rare_coin" "coin"
flip(my_penny, 5)
## [1] "heads" "tails" "tails" "tails" "tails"

Standard generics (standard methods)

We already have a flip method for our class, but there are lots of common functions that we apply to may objects

print.coin <- function(coin){
  cat(paste0("Coin: ", coin[1], "/", coin[2], "\n"))
  prob <- attr(coin, "prob")
  cat(paste0("  Prob: ", prob[1], "/", prob[2], "\n"))
}

print.rare_coin <- function(coin){
  cat(paste0("Rare coin: ", attr(coin, "name"),
      ", ", attr(coin, "year"), "\n"))
  print.coin(coin)
}

print(coin1)
## Coin: heads/tails
##   Prob: 0.2/0.8
print(my_penny)
## Rare coin: Lincoln penny, 1972
## Coin: heads/tails
##   Prob: 0.5/0.5

Summary

Pictorially…

and for homework

#install.packages(devtools)
devtools::install_github("benjaminleroy/coin")