# [SOLVED] Issue with user-defined function for descriptive statistics from imputed data

## Issue

I am trying to write a function that calculates the mean and SD for a variable from a multiply imputed dataset (a `mids` object). The code works fine outside of a function (as shown in the two examples below), but produces wrong results when placed inside one: the function keeps returning the results for `bmi` even when I ask for `chl`.

Any insight into this issue is appreciated. Eventually I would like this function to be able to calculate means and SDs for multiple variables at once (i.e., `bmi` and `chl`) but that is likely a separate question.

``````
library(mice, warn.conflicts = FALSE)
data(nhanes)
imp <- mice(nhanes, m = 3, print = FALSE, seed = 123)

# workflow that i want to automate
# from here: https://bookdown.org/mwheymans/bookmi/data-analysis-after-multiple-imputation.html
# example 1 - bmi
impdat <- mice::complete(imp, action = "long", include = FALSE)
pool_mean <- with(impdat, by(impdat, .imp, function(x) c(mean(x$bmi), sd(x$bmi))))
result <- (Reduce("+", pool_mean)/length(pool_mean))
print(result)
#> [1] 27.117333  3.980506
rm(impdat, pool_mean, result)

# example 2 - chl
impdat <- mice::complete(imp, action = "long", include = FALSE)
pool_mean <- with(impdat, by(impdat, .imp, function(x) c(mean(x$chl), sd(x$chl))))
result <- (Reduce("+", pool_mean)/length(pool_mean))
print(result)
#> [1] 195.10667  39.95247
rm(impdat, pool_mean, result)

# automating the workflow
automate <- function(a, b) {
  impdat <- mice::complete(a, action = "long", include = FALSE)
  pool_mean <- with(impdat, by(impdat, .imp, function(x) c(mean(x$b), sd(x$b))))
  result <- (Reduce("+", pool_mean)/length(pool_mean))
  print(result)
}

automate(a=imp, b=bmi) # looks correct ... ?
#> [1] 27.117333  3.980506
automate(a=imp, b=chl) # no, it isn't
#> [1] 27.117333  3.980506
``````

## Solution

Two and a half problems here:

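1. `b = bmi` refers to an object `bmi`, which does not exist in our global environment. We can use `deparse(substitute(b))` to capture the argument as a character string instead of evaluating it.
2. Accessor function `$`, see `?Extract`: "Both `[[` and `$` select a single element of the list. The main difference is that `$` does not allow computed indices". So `x$b` looks for a column literally named `b`, and because `$` partially matches names, it silently picks up `bmi` every time. Using `x[[b]]` with the captured string fixes this.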
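
``````
automate <- function(a, b) {
  # capture the unevaluated argument as a string, e.g. bmi -> "bmi"
  b <- deparse(substitute(b))
  impdat <- mice::complete(a, action = "long", include = FALSE)
  # mean and SD within each imputed dataset; `[[` accepts a computed index
  pool_mean <- with(impdat, by(impdat, .imp, function(x) c(mean(x[[b]]), sd(x[[b]]))))
  # average the per-imputation estimates
  Reduce("+", pool_mean) / length(pool_mean)
}
automate(a = imp, b = bmi)
#> [1] 27.117333  3.980506
automate(a = imp, b = chl)
#> [1] 195.10667  39.95247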
``````
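
The second point is easy to reproduce without `mice`. Here is a minimal sketch on a made-up data frame (`df`, `f_dollar` and `f_bracket` are hypothetical names for illustration, not part of the original code). It shows `$` with a literal `b` silently partial-matching the `bmi` column, while `[[` uses the string captured by `deparse(substitute())`:

```r
# Toy data frame; the values are made up for illustration only
df <- data.frame(bmi = c(20, 25, 30), chl = c(150, 200, 250))

# `$` never evaluates `b`: it looks for a column called "b" and,
# through partial matching, finds `bmi` regardless of what is passed
f_dollar <- function(data, b) mean(data$b)

# `[[` evaluates its index, so the captured column name is used
f_bracket <- function(data, b) {
  b <- deparse(substitute(b))
  mean(data[[b]])
}

f_dollar(df, chl)   # mean of bmi, not chl
#> [1] 25
f_bracket(df, chl)  # mean of chl
#> [1] 200
```

Because `b` is never evaluated inside `f_dollar()`, R does not even complain that no object called `chl` exists, which is why the original function failed silently.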

To do this for several variables at once, we can pass the variable names as strings and loop over them:

``````
automate_list <- function(a, ...) {
  vars <- c(...)  # variable names, passed as character strings
  impdat <- mice::complete(a, action = "long", include = FALSE)
  lapply(vars, function(x) {
    # mean and SD of column x within each imputation, then averaged
    pool_mean <- with(impdat, by(impdat, .imp, function(y) c(mean(y[[x]]), sd(y[[x]]))))
    Reduce("+", pool_mean) / length(pool_mean)
  }) |>
    setNames(vars)
}

automate_list(imp, "bmi", "chl")
#> $bmi
#> [1] 27.117333  3.980506
#>
#> $chl
#> [1] 195.10667  39.95247
``````
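
If a compact table is preferred over a list, the same per-imputation averaging can be sketched with `vapply()`. The example below is self-contained and only an illustration: `impdat` is a hand-made stand-in for the output of `mice::complete(imp, action = "long")` with made-up values, and `pool_desc` is a hypothetical helper name.

```r
# Stand-in for mice::complete(imp, action = "long"): two "imputations"
# of three rows each, with made-up values
impdat <- data.frame(
  .imp = rep(1:2, each = 3),
  bmi  = c(20, 25, 30, 22, 25, 28),
  chl  = c(150, 200, 250, 180, 200, 220)
)

# Hypothetical helper: mean and SD per imputation, averaged across imputations
pool_desc <- function(data, vars) {
  vapply(vars, function(v) {
    # split the column by imputation number and summarise each piece
    per_imp <- lapply(split(data[[v]], data$.imp),
                      function(x) c(mean = mean(x), sd = sd(x)))
    Reduce(`+`, per_imp) / length(per_imp)
  }, c(mean = 0, sd = 0))
}

res <- pool_desc(impdat, c("bmi", "chl"))
res
```

The result is a numeric matrix with one column per variable and the rows `mean` and `sd`. With the toy data above, `res["mean", "bmi"]` is 25 and `res["sd", "chl"]` is 35.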