Name:
Andrew ID:
Collaborated with:

On this homework, you may collaborate with your classmates, but you must list their names above and submit your own homework as a knitted HTML file on Canvas by 10pm next Tuesday (July 24th).

Survey (5 points)

Fill out the midsemester survey. Then write the word that completes the phrase “Ben is ____”, which appears in the survey's last question.

Huber loss function (10 points)

Recall from lab the Huber loss function (or just Huber function, for short) with cutoff \(a\), defined as:

\[ \psi_a(x) = \begin{cases} x^2 & \text{if $|x| \leq a$} \\ 2a|x| - a^2 & \text{if $|x| > a$} \end{cases} \]

This function is quadratic on the interval \([-a,a]\) and linear outside of it. It transitions from quadratic to linear “smoothly”, and looks like this (when \(a=1\)):
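One way to draw such a plot is sketched below. The helper name `psi_demo` and the grid `x_grid` are assumptions made for illustration, not part of the assignment; the function body simply transcribes the definition above. The dashed lines mark the cutoffs \(x = \pm a\), where both pieces equal \(a^2\) and both have slope \(\pm 2a\), which is why the transition is smooth.

```r
# Hypothetical helper: a direct transcription of the piecewise definition.
psi_demo <- function(x, a = 1) {
  ifelse(abs(x) <= a, x^2, 2 * a * abs(x) - a^2)
}

# Evaluate on a grid that extends past the cutoff on both sides.
x_grid <- seq(-2, 2, length.out = 201)

# Draw the curve and mark the cutoffs at -1 and 1.
plot(x_grid, psi_demo(x_grid), type = "l",
     xlab = "x", ylab = "Huber function (a = 1)")
abline(v = c(-1, 1), lty = 2)
```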

Exploring function environments (10 points)

huber <- function(x, a = 1) {
  x_squared = x^2
  ifelse(abs(x) <= a, x_squared, 2 * a * abs(x) - a^2)
}
huber_sloppy <- function(x) {
  ifelse(abs(x) <= a, x^2, 2 * a * abs(x) - a^2)
}
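As a minimal sketch of the scoping behavior at play here: R looks up a free variable such as `a` in the environment where the function was defined, not in the function's own arguments. The copy below is renamed `huber_sloppy_demo` (a hypothetical name) so it does not overwrite the definition above, and the sketch assumes the function is defined and called in the same environment in which `a` is assigned, such as the top level of an R session.

```r
# Hypothetical copy of the sloppy definition: `a` is a free variable here.
huber_sloppy_demo <- function(x) {
  ifelse(abs(x) <= a, x^2, 2 * a * abs(x) - a^2)
}

# The result depends on whatever `a` currently is in the defining environment.
a <- 1
res_a1 <- huber_sloppy_demo(3)  # |3| > 1, so 2 * 1 * 3 - 1^2 = 5

a <- 2
res_a2 <- huber_sloppy_demo(3)  # |3| > 2, so 2 * 2 * 3 - 2^2 = 8
```

The same call returns different values because nothing inside the function fixes `a`. If `a` were not defined anywhere, the call would fail with an "object 'a' not found" error.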

Prostate cancer data, revisited (with dplyr) (17 points)

library(tidyverse)

Below we read in a data frame pros_df containing measurements on men with prostate cancer, as seen in previous labs. As before, use dplyr and pipes to answer the questions that follow on pros_df.

pros_df <- 
  read.table(paste0("https://raw.githubusercontent.com/benjaminleroy/",
                    "36-350-summer-data/master/Week1/pros.dat"))
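As a reminder of the dplyr-and-pipes style, here is a small sketch on a toy data frame rather than on pros_df itself. The data frame `toy_pros`, its values, and the column names `age`, `svi`, and `lpsa` are assumptions made up for illustration; check `names(pros_df)` before adapting it.

```r
library(dplyr)  # also loaded by library(tidyverse)

# Hypothetical toy data, standing in for a few rows of a data frame like pros_df.
toy_pros <- data.frame(
  age  = c(50, 58, 74, 58),
  svi  = c(0, 0, 1, 1),
  lpsa = c(-0.43, -0.16, 1.27, 2.0)
)

# Keep the rows with age above 55, then average lpsa within each svi group.
toy_summary <- toy_pros %>%
  filter(age > 55) %>%
  group_by(svi) %>%
  summarize(mean_lpsa = mean(lpsa))
```

Each verb takes a data frame and returns a data frame, so the steps chain left to right without intermediate variables.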

Small Assignments with tidyr (17 points)

shark_attacks <- 
  read.csv(paste0("https://raw.githubusercontent.com/benjaminleroy/",
                  "36-350-summer-data/master/Week2/shark-attacks-clean.csv"),
           stringsAsFactors = TRUE)

Merging with tidyverse (13 points)

Split-Apply-Combine