Last week: Object oriented programming

Part I

Debugging basics

Bug!

The original name for glitches and unexpected defects: dates back to at least Edison in 1876, but better story from Grace Hopper in 1947:

(From Wikipedia)

Debugging: what and why?

Debugging is the process of locating, understanding, and removing bugs from your code

Why should we care to learn about this?

Debugging: how?

Debugging is (largely) a process of differential diagnosis. Stages of debugging:

  1. Reproduce the error: can you make the bug reappear?
  2. Characterize the error: what can you see that is going wrong?
  3. Localize the error: where in the code does the mistake originate?
  4. Modify the code: did you eliminate the error? Did you add new ones?

Reproduce the bug

Step 0: make it happen again
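
When a failure depends on random inputs, fixing the seed is one way to make it happen again on demand. A minimal sketch, assuming a hypothetical function noisy_log() that is not part of the lecture code:

```r
# Hypothetical example: fails whenever the random draw is negative
noisy_log <- function() {
  x <- rnorm(1)
  if (x < 0) stop("cannot take log of negative draw: ", x)
  log(x)
}

# Fix the seed so that every run sees the same draw
set.seed(1)
first <- tryCatch(noisy_log(), error = function(e) conditionMessage(e))

# Same seed again: the outcome (value or error message) repeats exactly
set.seed(1)
second <- tryCatch(noisy_log(), error = function(e) conditionMessage(e))

identical(first, second)
```

Once the outcome is the same on every run, you can change one thing at a time and trust that any difference comes from your change.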

Characterize the bug

Step 1: figure out if it’s a pervasive/big problem

Localize the bug

Step 2: find out exactly where things are going wrong

Localizing can be easy or hard

Sometimes error messages are easier to decode, sometimes they’re harder; this can make locating the bug easier or harder

f <- function(a) g(5 * a)
g <- function(b) h(b - 1)
h <- function(c) {
  c <- log(-c)
  if (c > 2) {
    return(c^2)
  } else {
    return(c^3)
  }
}

f(-5)
## [1] 10.61519
f(5)
## Warning in log(-c): NaNs produced
## Error in if (c > 2) {: missing value where TRUE/FALSE needed

What do you mean, we have a missing value? c definitely exists, right?

traceback()

Calling traceback() after an error traces back through all the function calls leading to the error


If you run f(5) in the console, then call traceback(), you’ll see:

> traceback()
3: h(b - 1) at #1
2: g(5 * a) at #1
1: f(5)

Reading from the bottom up, we can see that f() calls g(), which calls h(), and this last function is the one throwing the error.

Why? It turns out that taking the log of a negative number returns NaN (with a warning). Comparing NaN to 2 then gives NA, and if() needs a condition that is TRUE or FALSE, not NA.
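
The chain of events behind that error can be checked one step at a time. A small sketch, using 24 because that is the value h() receives when f(5) is called; the variable names x and msg are only for illustration:

```r
# Step 1: the log of a negative number (the warning is silenced here)
x <- suppressWarnings(log(-24))
is.nan(x)      # the result is NaN

# Step 2: comparing NaN with a number does not give TRUE or FALSE
is.na(x > 2)   # the comparison is NA

# Step 3: if() cannot branch on NA, which is the error we saw
msg <- tryCatch(
  if (x > 2) "big" else "small",
  error = function(e) conditionMessage(e)
)
msg
```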

Part II

Debugging tools

cat(), print()

Most primitive strategy: manually call cat() or print() at various points, to print out the state of variables, to help you localize the error
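
As a rough illustration of this strategy, here is h() from Part I with two cat() calls added. The wording of the printed messages is made up for this sketch, and try() is used only so that a script keeps running after the error:

```r
h <- function(c) {
  cat("entering h(): c =", c, "\n")    # state on entry
  c <- log(-c)
  cat("after log(-c): c =", c, "\n")   # state just before the if()
  if (c > 2) {
    return(c^2)
  } else {
    return(c^3)
  }
}

# The printed trail shows c turning into NaN before the error is raised
res <- try(h(24), silent = TRUE)
```

The last line printed before the failure tells you the error happens after the log() call, which narrows the search to the if() statement.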

This is the “stone knives and bear skins” approach to debugging; it is still very popular among some people (actual quote from stackoverflow):

I’ve been a software developer for over twenty years … I’ve never had a problem I could not debug using some careful thought, and well-placed debugging print statements. Many people say that my techniques are primitive, and using a real debugger in an IDE is much better. Yet from my observation, IDE users don’t appear to debug faster or more successfully than I can, using my stone knives and bear skins.

Specialized tools for debugging

R provides you with many debugging tools. Why should we use them, and move past our handy cat() or print() statements?

Let’s see what our primitive hunter found on stackoverflow, after receiving a bunch of comments in response to his quote:

Sweet! … Very illuminating. Debuggers can help me do ad hoc inspection or alteration of variables, code, or any other aspect of the runtime environment, whereas manual debugging requires me to stop, edit, and re-execute.

browser()

One of the simplest but most powerful built-in debugging tools: browser(). Place a call to browser() at any point in your function that you want to debug. As in:

my_fun <- function(arg1, arg2, arg3) {
  # Some initial code 
  browser()
  # Some final code
}

Then redefine the function in the console, and run it. Once execution gets to the line with browser(), you’ll enter an interactive debug mode
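
A filled-in version of the template above might look like the sketch below. The body of my_fun() is invented for illustration, and the interactive() guard is an added assumption so that the same file can also be run as a script without pausing:

```r
my_fun <- function(arg1, arg2, arg3) {
  total <- arg1 + arg2            # some initial code (made up for this sketch)
  if (interactive()) browser()    # pause here only when run from the console
  total * arg3                    # some final code (made up for this sketch)
}

out <- my_fun(1, 2, 3)
```

Run from the console, the call stops at the browser() line, where you can type total to inspect its value before continuing.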

Things to do while browsing

While in the interactive debug mode granted to you by browser(), you can type any normal R code into the console, to be executed within the function environment, so you can, e.g., investigate the values of variables defined in the function

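You can also type:

  - “n” to execute the next line
  - “s” to step into the next function call
  - “f” to finish the current loop or function
  - “c” to continue execution normally
  - “Q” to quit the browser and return to the console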

(To print any variables named n, s, f, c, or Q, defined in the function environment, use print(n), print(s), etc.)

Browsing in RStudio

In RStudio, buttons in the “Console” panel do the same thing as “n”, “s”, “f”, “c”, and “Q”. You can see the locally defined variables in the “Environment” panel, and the traceback in the “Traceback” panel