Name:
Andrew ID:
Collaborated with:

On this homework, you can collaborate with your classmates, but you must identify their names above, and you must submit your own homework as a knitted HTML file on Canvas by next Monday, August 5, at 10pm.

Introduction to percolation

In this homework, we’ll be coding a series of functions to investigate percolation via simulation. We’ll spend this section discussing the problem and setup. You can read related literature on this topic at Wikipedia, but we’ll be working with a simplified setting for this homework. As a word of caution, this is a coding-heavy homework (as opposed to a statistical one), so be prepared to spend a lot of time debugging and testing.

Here’s the idea. Imagine you have a square board (10 by 10 squares) like the left board shown below. This board consists of white “open” squares and black “blocked” squares. We are interested in knowing (abstractly) whether, if we “pour water” from the top of the matrix, the water “leaks” from the bottom of the matrix. This is demonstrated in the right board. (Don’t worry, we’ll explain all the necessary specifics later.) You can think of this as the following: we indefinitely keep pouring water into each white square in the top row at the same time, and water runs through the board by spreading to any adjacent white square (left, right, top, and bottom). We keep pouring water until all the squares that can possibly be flooded are flooded. These are the blue squares shown below in the right board. Once we’re done, we see if the water reached any of the open squares in the bottom row of the board (i.e., are there any blue squares in the bottom row?). If so, we say the board “percolates”. Otherwise, it does not percolate. In the figure shown below, the displayed board percolates (left is the original board, and right is the percolated board).

However, not all boards percolate. Consider the next board, shown below. Once again, we show the initial board on the left, and we start pouring water, resulting in the board on the right.

Now imagine we had a way to randomly generate these boards. We would like to see if there are certain types of randomly-generated boards that are more likely to percolate. We’ll formalize this task in this homework.

So what are the goals of this homework? We will be developing a package with version control and a GitHub connection. This package will encapsulate a set of functions related to a new object, called board. While developing our package, we will be writing the following functions: generate_board_mat() will generate a random board (matrix version) with white “open” squares and black “blocked” squares. is_valid() will check whether or not a given board matrix is correctly formatted. plot.board() will plot the board, similar to the four boards shown above. The last two functions are the main challenges of this homework: percolate.board(), which determines whether or not a board percolates, and read_boards(), which reads in a text file that specifies many boards. Note that plot.board() and percolate.board() will be methods of the class.

We will not give you explicit guidance on how to debug and test throughout most of this homework, but the concepts and tools you’ve learned in lecture will certainly be beneficial as you write and try out your code.

Note: The grading of this homework will depend mainly on whether or not you pass the test cases provided. Your implementation of percolate.board() and read_boards() might look quite different from a classmate’s, but be sure that your code passes the tests if you want full credit!

Making a package and using version control

You will learn about version control and connecting to GitHub during Wednesday’s lecture, and about making packages later in the week, but you should start this assignment early in the week. I therefore provide 3 ways to approach this homework (depending on your experience and willingness to read ahead in future lectures). I recommend that you try to at least start with Approach 2 if you have some knowledge of GitHub.

Approach 1 (Base Approach):

Start writing all your code in this homework .Rmd file as usual. As long as the code you write is well documented, you can transfer it to a package with version control at the very end (this is less preferred in terms of using version control, but is fine for now). At the very end of the assignment there is guidance on how to create your package with version control. Note: the “Approach 2/3” comments throughout this document can also aid in this endeavor.

In this approach - it is very important that all functions and tests you write deal well with the “local” vs “global” paradigm - that is, don’t assume more things are in the global environment than you need.

Approach 2 (Github start):

Using the guide in Wednesday’s lecture, create a new public GitHub repository (please call it percolate) and create a project in RStudio that links to it. Additionally, make folders R/ and tests/testthat/ in this new directory. We will put your functions and class definitions in .R files in the R/ folder and your tests in .R files in the tests/testthat/ folder. I provide a small section at the end of the homework on how you should move from “Approach 2” to “Approach 3”.

Approach 3 (package start):

Similar to Approach 2, create a new public GitHub repository (please call it percolate) and create a project in RStudio that links to it. Using usethis::create_package("percolate"), create a package in this directory. Make sure that when you call this function in your console, you are in the folder that contains the folder percolate.

To navigate in the console use `setwd()` and `getwd()` (which sets the directory you're in and returns which directory you're currently in)

You will be using usethis to set up your package and devtools to check your tests, document your functions and more. Please also see closing remarks at the bottom of the file.

Generating and plotting boards

In the first section, we will be creating our new class board as well as helper functions and a plot function for our object. We will write generate_board_mat(), and create a class object that can either take in a matrix or generate the board with certain parameters. In this section we will also create the function is_valid() and the method plot.board(). Unsurprisingly, we can represent these boards as square, numeric matrices with dimension n by n. These matrices will only have values 0 (for black “blocked” squares), 1 (for white “open and dry” squares), and 2 (for blue “open and flooded” squares).

Approach 2: you’ll see the names of files to create / update. Just run the code in these files in your console to work with it interactively.

1a. Before we define our class let’s define a helper function to generate boards. Write the function generate_board_mat() that takes arguments n (a positive integer denoting the size of the board) and p (a number between 0 and 1 that denotes the fraction of the n^2 squares that are blocked). This function should return an n by n matrix with values 0 or 1. The specific locations of the floor(p*n^2) blocked squares are chosen uniformly at random. Be sure to write lines to check that the input arguments are valid using assert_that(). Set the default values to be n=5 and p=0.25. Print the matrix for generate_board_mat() (using the default parameters) and generate_board_mat(n = 8, p = 0.75). Document the function using Roxygen2 (control + alt + shift + R).
```
library(assertthat)
library(testthat)
```
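As a rough illustration of the sampling step described in 1a, here is a minimal sketch. The name generate_board_mat_sketch is hypothetical, and base-R stopifnot() is used as a stand-in for assert_that() so the snippet runs without extra packages; your own version should use assert_that() as requested.

```r
# Sketch only: base-R checks stand in for assertthat::assert_that()
generate_board_mat_sketch <- function(n = 5, p = 0.25) {
  # n must be a single positive whole number; p a single number in [0, 1]
  stopifnot(is.numeric(n), length(n) == 1, n > 0, n == floor(n))
  stopifnot(is.numeric(p), length(p) == 1, p >= 0, p <= 1)

  n_blocked <- floor(p * n^2)
  cells <- rep(1, n^2)                      # start with every square open
  # sample.int() draws distinct positions uniformly at random
  cells[sample.int(n^2, n_blocked)] <- 0    # block exactly n_blocked squares
  matrix(cells, nrow = n, ncol = n)
}

set.seed(1)
m <- generate_board_mat_sketch(n = 4, p = 0.5)
sum(m == 0)  # floor(0.5 * 16) = 8 blocked squares
```

Sampling positions without replacement is one way to guarantee exactly floor(p*n^2) blocked squares, rather than blocking each square independently with probability p.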
Approach 3: To make sure you have a tests/testthat folder, run the commands in the console below while you’re in your package:
```
usethis::use_testthat()
# and
usethis::use_package("assertthat")
```
Approach 2/3:

Write this function generate_board_mat in a .R file called utils.R in the R folder. Put the suggested examples in the examples section (make sure to run them after you run your function in the console). For approach 3 you can create the file using usethis::use_r("utils"), and after you’re done writing the function use devtools::document() to document the function. Then use devtools::load_all() to put generate_board_mat into your workspace (you should use this function whenever you update a function or documentation, but I won’t keep mentioning it).

Finally, add the utils.R file (and potentially other files associated with it) and make a commit.
1b. Now, using the test_that() function, write the following tests: 1) ensure that when using generate_board_mat() (default parameters), the output is a 5 by 5 numeric matrix containing only 0 and 1, 2) ensure the same but for a different value of n, 3) ensure that using p = 0 gives a board with all 1’s, 4) ensure that using p = 1 gives a board with all 0’s, 5) ensure that the function throws an error when n = c(1,2) or n = "asdf" or n = 5.4 or n = -5.

Approaches 2/3:

Write these tests in a .R file called test-utils.R in the tests/testthat/ folder. Run your tests and make sure they pass. For approach 3, use usethis::use_test("utils.R") to create the test-utils.R file in tests/testthat/; once the tests are written, save the file and run devtools::test() to see that all your tests have passed.

Finally, add the test-utils.R file (and potentially other files associated with it) and make a commit.
1c. One last helper function before our class definition… Now we will write the is_valid() function (please document). This function should take in a matrix mat as input. It should check that mat is a square matrix that contains values only 0, 1, or 2 using assert_that(). Then, it should return TRUE. Hence, is_valid() will always throw an error or return TRUE. Print out the result of is_valid(generate_board_mat()) and is_valid(generate_board_mat(n = 1)) (which should both return TRUE). Then, write 3 tests using test_that() to ensure that is_valid() will throw an error for inputs mat that are not valid. Each of your 3 tests should be testing for a different reason for invalidity.

In your documentation please put #' @import assertthat at the bottom of the documentation. This will make it such that you can use all functions from assertthat in your package without having to do assertthat::assert_that every time. (See the challenge at the end of this assignment for a better approach - aka actually doing assertthat::assert_that every time).

Approaches 2/3:

Place this function into utils.R and the associated tests in test-utils.R. For approach 3, create documentation and check your tests using the correct devtools functions. Remember to do devtools::load_all() when done.

Finally, (sensing a pattern), add updated files and make a commit.
1d. Finally! Let’s make a function that returns an object with class board (specifically, but have the object also inherit the matrix class functionality, i.e., have board be a subclass of matrix). This function should either take in a matrix (default = NULL) or take in an n and p value (same defaults as generate_board_mat()). Note: this means I want the parameters to be mat = NULL, n = 5, p = .25. If a matrix (mat) is provided (i.e., non-NULL), then use that as the matrix and create an object of class board. Additionally, add attributes n and p to the object (empirical values for n and p if mat is provided, or the generating parameters if not). Write a test to check that if you input a mat it returns a board with the same structure (use expect_equivalent and unclass here), and that the empirical n and p values are correct. Also check that if you put in an incorrect matrix structure (use the same matrices as in 1c) it errors. And finally, test some examples of inputting n and p. Document your function.
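To illustrate what “a subclass of matrix” means for an S3 object, here is a minimal sketch. The name new_board_sketch is hypothetical and the sketch only handles the case where a matrix is supplied; it is not the full constructor requested above.

```r
# Sketch: wrap a 0/1/2 matrix in an S3 class that still behaves like a matrix
new_board_sketch <- function(mat) {
  stopifnot(is.matrix(mat), nrow(mat) == ncol(mat), all(mat %in% c(0, 1, 2)))
  structure(mat,
            class = c("board", "matrix"),  # "board" first, "matrix" as fallback
            n = nrow(mat),
            p = mean(mat == 0))            # empirical fraction of blocked squares
}

b <- new_board_sketch(matrix(c(0, 1, 1, 1), 2, 2))
inherits(b, "board")   # TRUE
attr(b, "p")           # 0.25, since 1 of 4 squares is blocked
b[1, 2]                # matrix indexing still works
```

Listing "matrix" after "board" in the class vector means that any method not defined for board falls back to the matrix behavior.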

Approaches 2/3: Place this function/class definition into board.R and the associated tests in test-board.R. Continue following similar procedures for devtools calls (approach 3) and commits (everyone).
1d. Lastly, we will write the plot.board() method, which also takes in a board x. This function should check that x is valid using the is_valid() function prior to plotting (just in case someone is trying to trick us into plotting a fake board). Using ggplot (remember back to my_volcano), plot the board so the resulting plot is a square with no axis labels or legend (+ theme(legend.position = "none")), has a title stating the size of the board, and has a black square for each 0 entry of x, a white square for each 1 entry of x, and a light blue square for each 2 entry of x. Additionally make the theme + theme_void(). (Hint: It might be desirable to define n <- attr(x, "n") before you do anything else. Also, please use the color lightblue3 when making the image for the light blue squares. You’ll find the + scale_fill_manual() function quite useful; use a named vector for the values in this function.)

As always: document, document, document.

In your documentation please put #' @import ggplot2 (and additional lines for any other packages you use in this function: e.g. tidyr) at the bottom of the documentation. I won’t remind you to do this again, but you’ll need to do so.

For example, we provide a specific board below, board_example. The desired plot you should produce when running plot(board_example) is also shown below. Plot your output for plot(board_example). We also provide board_example2-4 that you should check (but no need to put in documentation or this document).

Challenge: make the axis proportional like our examples and have the title centered.
```
board_example <- board(matrix(c(0,1,1,1,0,
                            0,1,1,0,1,
                            0,0,1,0,0,
                            0,0,0,2,2,
                            2,2,2,2,0), 5, 5)) 
#^note the matrix will be the transpose of what you see here (due to byrow = F)

board_example2 <- board(matrix(c(0,1,1,1,0,
                             0,1,1,0,1,
                             0,0,1,0,0,
                             0,0,0,1,1,
                             1,1,1,1,0), 5, 5))

board_example3 <- board(matrix(c(0,2,2,2,0,
                             0,2,2,0,2,
                             0,0,2,0,0,
                             0,0,0,2,2,
                             2,2,2,2,0), 5, 5))

board_example4 <- board(    # board of size 10 (correctly percolated)
  matrix(c(2, 0, 0, 2, 0, 0, 2, 2, 0, 0,
       2, 2, 2, 0, 0, 2, 2, 0, 0, 1,
       0, 2, 0, 0, 0, 0, 2, 0, 0, 0,
       0, 2, 2, 2, 0, 0, 0, 0, 0, 0,
       0, 2, 0, 0, 0, 0, 1, 0, 0, 0,
       2, 2, 2, 0, 1, 0, 0, 1, 1, 1,
       0, 2, 2, 0, 1, 1, 0, 0, 1, 0,
       1, 0, 2, 0, 0, 0, 0, 1, 1, 0,
       1, 0, 0, 1, 1, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 0, 1, 0, 1, 0, 0),
     10, 10))
```
Approaches 2/3:

Place this function/class definition into visualize.R (tests are hard for visualizations, so we’ll ignore them for now, but do put your board_examples in the documentation). Continue following similar procedures for devtools calls (approach 3) and commits (everyone). Approach 3: since you’re also using a new package, add ggplot2 to the used packages (see how you added assertthat at the beginning of this problem).

1e. Challenge/Extra: Add a parameter to plot.board() called grid, which is a boolean. If grid = FALSE, nothing additional happens. However, if grid = TRUE, draw dashed grid lines onto the plot as well, so each square of the board is outlined. This is a fairly small update, but you’ll need to tinker around with it (Hint: look at geom_tile). The desired plot when running plot.board(board_example, grid = TRUE) is shown below. Plot your output for plot.board(board_example, grid = TRUE).

Approach 2/3: update your function, documentation, etc and follow standard git/ devtools approach.

1f. Before you go any further, write a few sentences on why it was useful (or why you think it will be useful) to have the board’s superclass be matrix. (If you have the general idea, you can write this after you do the next question.) If you are struggling to think about why it’s useful, try doing t(board_example) and board_example[1,2].

Percolating the board

We will now write the percolate.board() method that takes as input start_board (a board) and outputs a list with two entries: result_board, the resulting matrix after “water has been poured”, and result, a boolean on whether or not start_board percolates. The function should use the is_valid() function to check that start_board is a valid board first, and then use a separate call to assert_that() to ensure start_board contains only 0’s and 1’s.

When computing result_board, any open & dry squares (i.e., value of 1) in the top row of start_board are changed to open & flooded (i.e., changed to a value of 2). In addition, any open & dry square that is adjacent to any open & flooded square becomes open & flooded (via left, right, top and bottom). Once no open & dry square can become open & flooded, your algorithm is done computing result_board. When done, result_board should have the same exact pattern of blocked squares (i.e., value of 0) as start_board. To determine whether or not start_board percolates, check to see if there are any open & flooded squares along the bottom row of result_board.

Important: This is not a course about algorithms. You do not need to write an efficient algorithm, but it needs to be correct. Feel free to use any strategy for this algorithm, even if it’s computationally inefficient.
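The flooding rule above can be sketched as a simple repeat-until-nothing-changes loop. This is one possible strategy among many, written for a plain 0/1 matrix; the name flood_sketch is hypothetical, and the validity checks and class handling required of percolate.board() are left out.

```r
# Sketch: repeatedly flood open squares that touch a flooded neighbour.
# Works on a plain square 0/1 matrix; deliberately simple rather than fast.
flood_sketch <- function(mat) {
  n <- nrow(mat)
  mat[1, mat[1, ] == 1] <- 2            # pour water into the open top-row squares
  repeat {
    flooded <- mat == 2
    # neighbour[i, j] is TRUE when a square above, below, left or right is flooded
    neighbour <- matrix(FALSE, n, n)
    if (n > 1) {
      neighbour[-1, ] <- neighbour[-1, ] | flooded[-n, ]   # flooded square above
      neighbour[-n, ] <- neighbour[-n, ] | flooded[-1, ]   # flooded square below
      neighbour[, -1] <- neighbour[, -1] | flooded[, -n]   # flooded square to the left
      neighbour[, -n] <- neighbour[, -n] | flooded[, -1]   # flooded square to the right
    }
    newly <- mat == 1 & neighbour
    if (!any(newly)) break              # nothing left to flood
    mat[newly] <- 2
  }
  list(result_board = mat, result = any(mat[n, ] == 2))
}

m <- matrix(c(1, 0, 1,
              1, 0, 0,
              1, 1, 1), 3, 3, byrow = TRUE)
out <- flood_sketch(m)
out$result        # TRUE: water runs down the left column to the bottom row
out$result_board
```

Each pass floods only the squares directly next to already-flooded ones, so the loop may run many times on a large board. That is acceptable here, since correctness matters more than speed.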

2a. Write the method percolate.board() according to the specifications above. That is, we will be calling the function percolate on a board object. For consistency, have percolate take in x and ... as parameters (and percolate.board take in x).
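For reference, the generic-plus-method pattern in S3 looks like the sketch below. The names describe and toy are hypothetical and chosen only for illustration; the same shape would apply to percolate and percolate.board.

```r
# Sketch: an S3 generic plus a method for a hypothetical class "toy"
describe <- function(x, ...) UseMethod("describe")

describe.toy <- function(x, ...) {
  paste("toy object of length", length(x))
}

obj <- structure(1:3, class = "toy")
describe(obj)  # dispatches to describe.toy()
```

Calling the generic on an object looks up a method named after the object’s class, which is why calling percolate() on a board ends up in percolate.board().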

We provide 3 example matrices in mat_example_list below (before continuing, create a board_example_list using lapply). Display the boolean results when applying percolate() on each board. (You should not print the matrices themselves.) Then, using plot(), plot 6 boards (2 rows and 3 columns using grid.arrange()), where the top row shows the initial boards in board_example_list (from left to right) and the bottom row shows the resulting boards after pouring water. The first two boards should percolate, whereas the last board should not.

Approach 2/3: please make a new .R file in the R/ folder (name it percolate.R). Follow the structure seen in problem 1 in terms of devtools and GitHub. Make sure the requested examples are in the examples section of the documentation.
```
mat_example_list <- list(matrix(c(1,1,1,1,0, 
                              0,0,0,1,0, 
                              1,1,1,1,0, 
                              0,1,0,0,0, 
                              0,1,1,1,1), 5, 5), 
                    matrix(c(1,1,1,1,0,
                             0,0,0,1,0, 
                             0,1,1,1,0, 
                             0,1,0,0,0, 
                             0,1,1,1,1), 5, 5),
                    matrix(c(1,1,1,1,0,
                             0,0,0,1,0, 
                             0,1,1,0,0, 
                             0,1,0,0,0, 
                             0,1,1,1,1), 5, 5))
```
2b. Now, write four tests to ensure percolate() behaves properly. Let my_board be a 10 by 10 matrix. The first test should set the input my_board as a matrix with all open sites (i.e, all 1’s). The second test should set my_board as a matrix with all blocked sites (i.e, all 0’s). The third test should set my_board as any valid matrix but have all the squares on the top row be blocked. The last set should set my_board as any valid matrix but have all the squares on the bottom row be blocked. You want to ensure that percolate() outputs a list (containing result_board and result) in each of the four tests, gives the correct result_board (to the best of your knowledge), and only the first matrix percolates.

Approach 2/3: Reminder: this will need a new test file.
2c. Fortunately, since this is a class where the TAs can write their own version of percolate.board(), we can provide you test cases and their corresponding correct answers. These are provided in https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolate_test.Rdata, which loads in objects board_list and result_list. Run the following code, which runs your version of percolate.board() on 50 different boards (be sure to remove eval = F in the code chunk). This test_that() function makes sure that your output matches exactly the output our TAs provided. For this question, your code should run and report nothing (as there should be no errors). If there are errors, go back to your percolate.board() function in Q2a and debug your code.

Approaches 2/3: put this test in the correct file (and do the appropriate next steps).
```
test_that("percolate.board() works with all the test cases",{
  load(url("https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolate_test.Rdata"))

  your_result_list <- lapply(board_list, percolate)

  bool_vec <- sapply(1:length(result_list), function(x){
    your_board <- your_result_list[[x]]$result_board
    result_board <- result_list[[x]]$result_board

    identical(your_board, result_board) * 
      (your_result_list[[x]]$result == result_list[[x]]$result)
  })

  expect_true(all(bool_vec))
})
```
You may (optionally) add additional test cases to this question (Q2c) to test specific boards that we provided in board_list or boards of your own. This is to your own benefit to do so since you want to have some sort of guarantee that as you “fix” your code, old bugs that have been “dealt with” do not “reappear”.

Running experiments

3a. Hopefully now, your percolate.board() function is bug-free! The time percolate.board() takes to complete scales with n and p, the size of the board and the fraction of blocked squares. We can time our algorithm using the system.time() function. For example, if we wanted to know how fast R can perform matrix multiplication, we can compute the following.
```
n <- 500
system.time(matrix(rnorm(n^2), n, n) %*% matrix(rnorm(n^2), n, n))
```
```
##    user  system elapsed 
##   0.101   0.003   0.103
```
The elapsed number is the total time that has passed. We can extract this number in the following way. (This number might be slightly different from above since we’re running two separate instances of matrix multiplication.)
```
system.time(matrix(rnorm(n^2), n, n) %*% matrix(rnorm(n^2), n, n))[3]
```
```
## elapsed 
##   0.098
```
For p = 0.4, run 10 trials for each value of n from 10 to 100 with spacing of 10. For each trial, generate a random board using board(...). Then, record the elapsed time to compute whether or not the board percolates using percolate(). That is, there will be a total of 10*10 = 100 times percolate() will be called. After all the trials are completed, compute the average elapsed time for each n. (Hint: Writing this set of simulations as an sapply() might make your life easier.) (Note: This simulation might take a while to run, so it’s highly encouraged to fully debug your code beforehand and use the coding practices you learned in Week 5 when doing this.)

Then, make a line plot of elapsed time versus n^2. Label your plot appropriately. After inspecting your graph, does it seem like your algorithm’s elapsed time (often called the complexity of the algorithm) is linear in n^2? (For students more comfortable with CS theory: you can try other values of n and change your x-axis to other terms such as n^2*log(n) if needed.)
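The timing loop described above can be organized as in the following sketch. Sorting a random vector is used as a placeholder task, and time_one, n_values, n_trials and timing_df are hypothetical names; in your own code the timed expression would be the call to percolate(), with the board generated outside the timer.

```r
# Sketch: average elapsed time of a stand-in task over several trials per n
time_one <- function(n) {
  x <- runif(n^2)                                   # set-up work stays outside the timer
  as.numeric(system.time(sort(x))["elapsed"])       # keep only the elapsed seconds
}

n_values <- seq(10, 50, by = 10)
n_trials <- 3
avg_elapsed <- sapply(n_values, function(n) {
  mean(replicate(n_trials, time_one(n)))            # average over the trials for this n
})

# One row per n, ready for plotting elapsed time against n^2
timing_df <- data.frame(n_sq = n_values^2, avg_elapsed = avg_elapsed)
```

Keeping the results in a data frame with an n_sq column makes the later line plot and regression line straightforward to add.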

Approach 2/3: save your homework file in whatever directory you’d like.

Approach 2: remove the eval = F in the following code and replace the ... with the path to your project (use an absolute path to be safe). Make sure this directory string ends with a “/”. This should load all your functions in (I’m assuming you haven’t been putting your code in 2 places).
```
source("https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/load_my_files_function.R")
load_my_files(...)
```
Approach 3: remove the eval = F in the following code and replace the ... with the path to your project (use an absolute path to be safe).
```
devtools::load_all(path = ...)
```
3b. Add a red linear regression line to the plot above (regressing elapsed time onto n^2). (no se band.)
Challenge. More likely than not, the algorithm you implemented was not as efficient as it could have been. For starters, most “simple” implementations of percolate.board() would scale worse than n^2 (meaning the plot you created in Q3a would still look strongly convex despite plotting elapsed time versus n^2). Change your implementation of percolate.board() so that (empirically) it looks like your code scales linearly with n^2, and then run percolate(board(n = 1000, p = 0.4)) and print out the boolean on whether or not the board percolates. (Warning: if you don’t have an “efficient” implementation, this could take a really long time to run. I wouldn’t try this unless you are confident in your implementation.) Of course, your new implementation still needs to pass all the tests in Q2c.

This challenge problem is for the CS-savvy students. The trick is to use efficient data structures that allow faster computation compared to the matrix. If you want a hint on what to code, you can read about the union/find data structure in this paper for inspiration. Alternatively, you might find the igraph R package to be useful…

Approach 2/3: make sure to reload your functions after you make the update.
3c. We now want to see how the fraction of blocked sites p affects whether or not a board is likely to percolate. For n = 5, n = 10 and n = 25, run 20 trials of each value of p from 0 to 1 with spacing of 0.05 (i.e, 21 different values of p). For each trial, generate a board using board(...) and determine if it percolates using percolate(). Among the 20 trials for each setting of n and p, record the percentage of boards that percolated.

That is, there will be a total of 20*21*3 = 1260 times percolate() will be called.

Then plot 3 curves (one for each setting of n), all on the same plot, of the percentage of boards that percolated versus the value of p (using only a single data.frame; think back to dplyr and tidyr). The curves should be a black line for n = 5, red for n = 10 and blue for n = 25. Add appropriate axis labels and a legend.

After inspecting your graph, do you notice a “phase transition” effect where a small change in p results in a large change in the probability of percolation? Roughly what value of p do you estimate this value to be, and how do you think it relates to n?

Reading in boards

In this final section, we will write read_boards(), a function that takes in a filepath to a text file and outputs a list of matrices that the text file represents. We describe the specific format of how read_boards() should work. Let file denote the text file that contains our boards. Each board in file is separated by exactly ----, four dashes. There must be ---- before and after each board. If not, read_boards() should error, stating that the file is not properly formatted. This is the only way read_boards() should throw an error. Hence, the length of the returned list is equal to the number of ---- lines in file minus one.

The first non-empty line strictly between the two ---- lines must be a positive integer. This number represents n, the size of the board. The next n lines afterwards then denote the precise pattern of the n by n board, row-wise. Each line represents a row of the board and should have exactly n visible characters, each either * or a period (.). The * character represents a blocked square, while the . character represents an open square. If the lines between the two ---- lines do not meet these specifications, then return an NA (for an incorrectly specified board) instead of an n by n matrix for this particular board.

Other than storing information pertinent to boards, file should not contain any other visible characters, but file might have extra spaces or extra line breaks.

We provide a small example of what to expect. The file at https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_example.txt provides 3 boards, the first two of which are correctly specified and the last of which is not. The raw text of this file looks like this:

```
----
4
* . . *
. . * *
. * . .
. . . .
----
5
. . . . .
* . . . *
. . . . *
. * . . *
* * * * .
----
4
. . . .
* . . *
. . a *
. * . .
----
```

The output of `read_boards("https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_example.txt")` should then be the following:


```
## [[1]]
##      [,1] [,2] [,3] [,4]
## [1,]    0    1    1    0
## [2,]    1    1    0    0
## [3,]    1    0    1    1
## [4,]    1    1    1    1
## attr(,"class")
## [1] "board"  "matrix"
## attr(,"p")
## [1] 0.3125
## attr(,"n")
## [1] 4
## 
## [[2]]
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    1    1    1    1
## [2,]    0    1    1    1    0
## [3,]    1    1    1    1    0
## [4,]    1    0    1    1    0
## [5,]    0    0    0    0    1
## attr(,"class")
## [1] "board"  "matrix"
## attr(,"p")
## [1] 0.36
## attr(,"n")
## [1] 5
## 
## [[3]]
## [1] NA
```
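As one small building block toward this format, the sketch below converts a single row line into 0/1 values, returning NA when the row is malformed. The helper name parse_row_sketch is hypothetical, and it covers only one row, not the ---- separators or the size line.

```r
# Sketch: turn one board row such as "* . . *" into 0/1 values
parse_row_sketch <- function(line, n) {
  chars <- strsplit(gsub("\\s+", "", line), "")[[1]]  # drop spaces, split into characters
  if (length(chars) != n || !all(chars %in% c("*", "."))) {
    return(NA)                                        # malformed row
  }
  ifelse(chars == "*", 0, 1)                          # "*" is blocked, "." is open
}

parse_row_sketch("* . . *", 4)   # 0 1 1 0
parse_row_sketch(". . a *", 4)   # NA
```

Writing small helpers like this and combining them is in the spirit of the modular design suggested in 4a.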

**Approaches 2/3:** put this function in the `utils.R` file (and tests in the correct location as well) and load in your functions before doing problem **4d**.

4a. Write the read_boards() function based on the specifications above. Similar to how you did HW1, you will use the readLines() function to read in a file. We highly encourage you to write additional functions to help read_boards(), as you will make your life easier if you modularize your function (i.e., write simple functions that are easy to understand, and other functions that piece together these simpler functions as opposed to writing one massive function that does everything itself.) Demonstrate your function works by printing the output to read_boards("https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_example.txt").
4b. Using the read_boards() function, load in the 50 boards represented in the text file https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_test.txt. These are the same boards you looked at in Q2c. Write a test_that() function to ensure that after reading in these 50 boards, you get the same boards as board_list located in https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolate_test.Rdata. Your test should use the identical() function. (Hint: You might run into some problems when you use the identical() function. If you’re stuck, you might be interested in attributes(), which lists all of an object’s attributes.)
4c. We provide 6 test files, https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_test1.txt through https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_test6.txt. (You can look at each of these 6 files in your internet browser.) Each of these test files contains a misspecification. Write 6 tests using test_that(), each using one of these 6 files, to show that read_boards() properly returns a list containing one NA when used on any of these files. (Note: Most likely, as in the other questions, you’ll need to go back to change your implementation of read_boards() to accommodate these errors.)
4d. As the last question, let’s end with a pretty graphic that uses all our functions. Read the board (50 by 50) from https://raw.githubusercontent.com/benjaminleroy/36-350-summer-data/master/Week5/percolation_write_large.txt using read_boards(). Then, percolate the board using percolate.board(). Finally, plot the board before percolation and after percolation (similar to the examples in the Introduction) using plot.board(). (Hint: You’ll need to use grid.arrange(...,ncol = 2).)

Approach 1: Making a package and using version control

Hi there. Congratulations on completing the coding for this assignment! We now need to convert the code in this file into a package. First, please save this file as “hw_5-transfer.Rmd” so that if you accidentally delete things or things aren’t working, you can come back to your previous work.

Now, we first need to create a GitHub repository (see the notes from Wednesday’s lecture for a refresher). Create a new public GitHub repository, call it percolate, and create a project in RStudio that links to it.

Following Friday’s lecture, we want to make a package in this project. Use usethis::create_package("percolate") to create a package in this directory. Make sure that, when you call this function in your console, you are in the folder that contains the folder percolate.

To navigate in the console, use setwd() and getwd() (which set the directory you’re in and return the directory you’re currently in, respectively).
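A minimal sketch of moving around with these two functions is shown below. Here tempdir() is only a stand-in for the folder that contains your package folder; substitute your own path.

```r
start_dir <- getwd()        # remember where we started
print(start_dir)

# tempdir() is a placeholder for the folder that contains `percolate`
setwd(tempdir())
now_dir <- getwd()          # confirm where we are now
print(now_dir)

setwd(start_dir)            # return to the starting directory
```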

To set up the package dependencies we’ll be needing today run the following:

usethis::use_testthat()
usethis::use_package("assertthat")
usethis::use_package("ggplot2")

Add all the new files you just created and commit them with a good commit message, e.g. “first commit - initializing package”.

Now, we’ll be transferring tests and functions from this document to your package directory. Use the usethis::use_r command we learned in Friday’s lecture to create new R function files and use the usethis::use_test command to create new test files (note that usethis::use_test("banana") will create a test-banana.R file).

utils.R and test-utils.R

Run

usethis::use_r("utils")

and store the function code you wrote in 1a and 1c there. In your console (with the directory as your package directory), run devtools::document(), then run devtools::load_all() and look at ?generate_board_mat to see that the documentation is done correctly. Put all examples I request you run into the #' @examples section of your documentation.
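For reference, a roxygen2 documentation block might look roughly like the sketch below. The function count_open() is a hypothetical helper invented for illustration, and the convention that 1 marks an open square is an assumption; adapt the tags to the functions you actually wrote.

```r
#' Count the open squares on a board (hypothetical helper, for illustration)
#'
#' @param mat a numeric matrix; entries equal to 1 are treated as open
#'   squares (an assumption made for this sketch)
#'
#' @return the number of entries of \code{mat} equal to 1
#'
#' @examples
#' count_open(matrix(c(0, 1, 1, 1), nrow = 2))
#'
#' @export
count_open <- function(mat) {
  sum(mat == 1)
}

n_open <- count_open(matrix(c(0, 1, 1, 1), nrow = 2))
print(n_open)
```

After devtools::document(), a block like this is turned into a help page under man/, which is what ?count_open would display once the package is loaded.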

Run

usethis::use_test("utils")

and store the tests of the functions in utils.R (found in 1b, 1c) there. Run devtools::test(). If this fails, examine where it fails and correct it. After the tests are running, add all these new files and commit the files you updated.

*To make your package work correctly, at the bottom of your documentation for is_valid put
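A test file has roughly the shape sketched below. The helper count_open() is hypothetical and is defined inline only so the sketch runs on its own; in your package the function would live in R/utils.R, and the test file would contain just the test_that() calls.

```r
library(testthat)

# Hypothetical helper standing in for a function defined in R/utils.R.
count_open <- function(mat) sum(mat == 1)

test_that("count_open counts entries equal to 1", {
  mat <- matrix(c(0, 1, 1, 1), nrow = 2)
  expect_equal(count_open(mat), 3)
  expect_equal(count_open(matrix(0, nrow = 2, ncol = 2)), 0)
})
```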

#' @import assertthat

and rerun devtools::document() and devtools::load_all().

board.R and test-board.R

Create a new R file named board.R (the same way you created utils.R) and add the function/class structure defined in 1c. Add the associated tests to a test file (named test-board.R).

Create documentation for your function with devtools::document(), load your package with devtools::load_all(), and make sure your tests still pass with devtools::test() (update them if they do not).

Add the updated/new files and commit them.

plot.board and visualize.R

Put your plot.board function into the visualize.R file (again, you can use usethis::use_r("visualize") to create/reopen this file). Commit your new code after you’ve documented your package and checked tests (just in case).

percolate.board and percolate.R

Put percolate.board into a new file percolate.R, and the associated tests in an associated test file. Make sure the tests work and the functions are documented. Add the updated/new files and commit them.

read_board

Put all functions and tests associated with problem 4 into your utils.R and test-utils.R files.

Add updated files and commit them (after documenting and testing your functions).

Congrats! Now your code is a package with version control!

doing problem 3

Go to the start of problem 3 and follow the directions for Approach 3 (but instead run devtools::install_github("user_name/percolate") in your console and then call library(percolate) at the beginning of this file; see the Piazza post). Remove the code above problem 3, except for problem 1f (feel free to remove the text that I wrote as well). Make sure your homework file still knits.

Finally, jump down to the final section of this document labeled Approach 3 (two sections below).

Approach 2

Hi there. Congratulations on completing the coding of this assignment while using version control! We now need to convert the code in your directory into a package. Before we go any further, please make sure you commit all files that you’ve changed and want a snapshot of (it will be impossible to correct errors that occur when transferring to a package if you do not).

To navigate in the console, use setwd() and getwd() (which set the directory you’re in and return the directory you’re currently in, respectively).

You’ll probably notice that the function create_package notices a lot of structure you already have. Make sure you only overwrite things that don’t affect the functionality of your code (for example, you should feel free to overwrite the .Rproj file, but not any files in R and tests/testthat).

To set up the package dependencies we’ll be needing today run the following:

usethis::use_testthat()
usethis::use_package("assertthat")
usethis::use_package("ggplot2")

Now, run devtools::document() to document all your functions. Check that they are well documented by running devtools::load_all() and then doing things like ?board (you can also look in the created folder man/ and see that all the functions you expect to be documented are there).

Now, run devtools::test() and make sure all your tests pass. If they do not, go back and correct/update them.

Congrats! Now your code is a package!

doing problem 3

Finally, jump down to the final section of this document labeled Approach 3 (the next section).

Approach 3 / Everyone

Now that you’ve done all the coding and committed all the files you need for the functions, we need to do some cleanup and push the package to GitHub.

First, open the “DESCRIPTION” file in the main percolate folder. Update this with useful information.

Now run usethis::use_package_doc(). This adds a small documentation file for the package itself, for people who look up your package by name (e.g. with ?percolate).

Challenge: Although you’ve used #' @import ggplot2 to have your package put a line in NAMESPACE that says import(ggplot2), it is better form to identify the namespace a function comes from by writing package_name::function_name. Remove the #' @import ggplot2 lines from your documentation and the import(ggplot2)-type lines from the NAMESPACE file. Now run devtools::check(). You’ll see (at minimum) that we need to state which namespace each function comes from (specifically ggplot2, assertthat and testthat). Do so by writing things like ggplot2::ggplot(ggplot2::aes(...)) + ggplot2::geom_tile(...). Additionally, if you use a pipe anywhere, run the usethis::use_pipe() command. You may also need to add additional packages to your dependencies, and more. (Feel free to chat with me after the homework is turned in if you’d like help with this.)
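As a rough sketch of what namespace-qualified plotting code can look like, consider the function below. The function name plot_grid_sketch() and the data-frame column names are made up for illustration; this is not the plot.board() from the homework. It assumes ggplot2 is installed.

```r
# Hypothetical plotting helper: every ggplot2 function is called with an
# explicit `ggplot2::` prefix, so no `@import ggplot2` is needed.
plot_grid_sketch <- function(mat) {
  # Flatten the matrix into one row per cell (column names are made up).
  df <- data.frame(
    row   = as.vector(row(mat)),
    col   = as.vector(col(mat)),
    value = factor(as.vector(mat))
  )
  ggplot2::ggplot(df, ggplot2::aes(x = col, y = row, fill = value)) +
    ggplot2::geom_tile() +
    ggplot2::scale_y_reverse()   # put matrix row 1 at the top
}

p <- plot_grid_sketch(matrix(c(0, 1, 1, 0), nrow = 2))
```

Note that the + operator needs no prefix: it dispatches to ggplot2’s method once the package namespace is loaded by the first ggplot2:: call.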

What should I submit:

(You may wish to save this file under a different name before making this update).

Please remove all text and code except: 1f, 3a, 3c and 4d (including the devtools::load_all(path = ...) or, if you’ve installed from GitHub, the library(percolate) call at the beginning of the Rmd file).

To finish off the assignment, go to this Google form and provide your name and GitHub repository link (in the requested format).

Homework 5: Debugging and Testing

Statistical Computing, 36-350

Monday July 29, 2019