[SOLVED] R: instrument function to capture all assignments

Issue

Given a regular R function f, I’d like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.

For example:

f <- function(x, y) {
  z <- x + y
  df <- data.frame(z=z)
  df
}

# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
  e <- new.env()
  altered <- function(...) {
    parent <- parent.frame()
    e <- something...(func, environment(), parent, etc., etc.)
    result <- func(...)
    saveRDS(as.list(e), 'foo.rds')
    result
  }
  environment(func) <- e
  altered
}

f_debug <- capturing(f)

I’m not sure whether the gap in my knowledge here is large or small. Does anyone have a solution?

Solution

What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.

capturing <- function(func, ...) {
    cc <- as.call(c(quote(utils::dump.frames), list(...)))
    cc <- call("on.exit", cc, add = TRUE)
    body(func) <- call("{", cc, body(func))
    func
}

capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:

  • dumpto, a character string giving the name to be used for the "dump.frames" object
  • to.file, a logical flag: if FALSE (the default), the "dump.frames" object is assigned in the global environment; if TRUE, it is written with save to paste0(dumpto, ".rda") in the current working directory
  • include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
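
To make the injection and the to.file = FALSE case concrete, here is a minimal sketch. It repeats capturing so that it runs on its own; the name "f_dump" is just a placeholder picked for this illustration, and the code assumes a fresh session in which nothing else is called f_dump.

```r
# capturing() as defined in this answer
capturing <- function(func, ...) {
    cc <- as.call(c(quote(utils::dump.frames), list(...)))
    cc <- call("on.exit", cc, add = TRUE)
    body(func) <- call("{", cc, body(func))
    func
}

f <- function(x, y) {
    z <- x + y
    z + 1
}

# "f_dump" is a hypothetical name chosen for this sketch
g <- capturing(f, dumpto = "f_dump", to.file = FALSE)

# The first statement of the new body is the injected on.exit() call
injected <- body(g)[[2L]]
print(injected)

res <- g(1, 2)

# With to.file = FALSE the dump is assigned in the global environment
dumped <- get("f_dump", envir = globalenv())

# The last element of the dump is the frame of the call to g()
frame <- dumped[[length(dumped)]]
vars <- mget(ls(frame), envir = frame)
str(vars)
```

Taking the last element rather than a fixed index keeps the sketch working when g is called from inside other functions, since the dump then also holds the frames of the callers.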

A quick example, which you should try yourself:

tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)

f <- function(x, y) {
    z <- x + y
    z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
    d <- g(a, b)
    d + 1
}
h12 <- h(1, 2)

load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
## 
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
## 
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"

ls(zzz[[1L]])
## [1] "a" "b"

ls(zzz[[2L]])
## [1] "x" "y" "z"

utils::debugger(zzz)
## Message:  Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
## 
## Enter an environment number, or 0 to exit  
## Selection: 2
## Browsing in the environment with call:
##    #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q

setwd(cwd)
unlink(tmp, recursive = TRUE)

See ?browser if you are unfamiliar with R’s environment browser.
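
If what you ultimately want is the plain list in an RDS file that the question asks for, you can pull it out of the dumped frame yourself. The following is only a sketch under a few assumptions: the dump is kept in memory (to.file = FALSE), "f_dump" is a placeholder name, and the RDS file goes to a temporary path rather than 'foo.rds'. Note that the list also contains the arguments x and y, since they live in the same frame as z and df.

```r
# capturing() as defined in this answer
capturing <- function(func, ...) {
    cc <- as.call(c(quote(utils::dump.frames), list(...)))
    cc <- call("on.exit", cc, add = TRUE)
    body(func) <- call("{", cc, body(func))
    func
}

# f from the question
f <- function(x, y) {
    z <- x + y
    df <- data.frame(z = z)
    df
}

# "f_dump" is a hypothetical name chosen for this sketch
f_debug <- capturing(f, dumpto = "f_dump")
result <- f_debug(1, 2)

# Fetch the dump from the global environment and take the frame of f_debug()
dumped <- get("f_dump", envir = globalenv())
frame <- dumped[[length(dumped)]]

# Collect every variable in that frame (arguments included) into a list
locals <- mget(ls(frame), envir = frame)

# Write the list to an RDS file and read it back
rds <- tempfile(fileext = ".rds")
saveRDS(locals, rds)
restored <- readRDS(rds)
str(restored)
```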

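My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. Otherwise, with the default add = FALSE, such a call replaces every exit handler registered so far, including the injected dump.frames call, and nothing is dumped. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.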
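
A small sketch of that limitation, assuming a fresh session. The functions overwrites and appends and the dump names are made up for this illustration; the only difference between the two functions is the add argument.

```r
# capturing() as defined in this answer
capturing <- function(func, ...) {
    cc <- as.call(c(quote(utils::dump.frames), list(...)))
    cc <- call("on.exit", cc, add = TRUE)
    body(func) <- call("{", cc, body(func))
    func
}

# Hypothetical function whose own on.exit() uses the default add = FALSE,
# which replaces the injected dump.frames() handler
overwrites <- function(x) {
    on.exit(invisible(NULL))
    y <- 2 * x
    y
}

# Same function, but its on.exit() appends to the existing handlers
appends <- function(x) {
    on.exit(invisible(NULL), add = TRUE)
    y <- 2 * x
    y
}

r1 <- capturing(overwrites, dumpto = "dump_overwrites")(1)
r2 <- capturing(appends, dumpto = "dump_appends")(1)

# Only the second call leaves a dump behind in the global environment
had_overwrite <- exists("dump_overwrites", envir = globalenv(), inherits = FALSE)
had_append <- exists("dump_appends", envir = globalenv(), inherits = FALSE)
print(c(overwrites = had_overwrite, appends = had_append))
```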

Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.

Answered By – Mikael Jagan

Answer Checked By – Mildred Charles (BugsFixing Admin)
