Issue
I have the following function:
estimate = function(df, y_true) {
  R = nrow(df)
  y_estimated = apply(df, 2, mean)
  ((sqrt( (y_estimated - y_true)^2 / R)) / y_true) * 100
}
df = iris[1:10,2:4]
y_true = c(3, 1, 0.4)
estimate(df = df, y_true = y_true)
user:bird provided this function and it works great. However, I also need to find the means by group. If we change the data frame to df = iris[, 2:5], how do I find the mean of each column by Species to use in the function? I figured something like this would work, but no luck:
estimate = function(df, y_true, group) {
R = nrow(df)
y_estimated = df %>% group_by(group) %>% apply(df, 2, mean)
((sqrt( (y_estimated - y_true)^2 / R)) / y_true) * 100
}
df = iris[2:5]
y_true = c(3, 1, 0.4)
group=df$Species
estimate(df = df, y_true = y_true, group=group)
Using colMeans also did not work.
This is an extension of this post which explains the purpose of each variable.
Solution
Rather than modifying your function, you can keep it as-is and apply it group-wise to your data. If you use group_by and then group_modify, the function you pass to group_modify receives the data frame subset to the rows of one specific group. By default the grouping column (Species here) is dropped from that subset, so only the numeric columns reach apply(df, 2, mean).
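To see what group_modify hands to the function, here is a small check that is not part of the original answer. It assumes dplyr is installed; the names seen, n_rows and cols are made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

df = iris[2:5]

# For each Species, report how many rows the per-group data frame (.x) has
# and which columns it contains.
seen = df %>%
  group_by(Species) %>%
  group_modify(~ data.frame(
    n_rows = nrow(.x),
    cols   = paste(names(.x), collapse = ", ")
  ))

print(seen)
```

Each group should show 50 rows and only the three numeric columns, which is why the original estimate function can be applied to each group unchanged.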
estimate = function(df, y_true) {
  R = nrow(df)
  y_estimated = apply(df, 2, mean)
  ((sqrt( (y_estimated - y_true)^2 / R)) / y_true) * 100
}
df = iris[2:5]
y_true = c(3, 1, 0.4)
library(dplyr, warn.conflicts = FALSE)
df %>%
  group_by(Species) %>%
  group_modify(~ as.data.frame.list(estimate(., y_true)))
#> # A tibble: 3 × 4
#> # Groups:   Species [3]
#>   Species    Sepal.Width Petal.Length Petal.Width
#>   <fct>            <dbl>        <dbl>       <dbl>
#> 1 setosa           2.02          6.53        5.44
#> 2 versicolor       1.08         46.1        32.7
#> 3 virginica        0.123        64.4        57.5
Created on 2022-02-24 by the reprex package (v2.0.1)
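If you would rather not load dplyr, the same group-wise idea can probably be sketched in base R with split and sapply. This is an alternative added here, not part of the original answer, and by_species and result are names chosen for illustration.

```r
estimate = function(df, y_true) {
  R = nrow(df)
  y_estimated = apply(df, 2, mean)
  ((sqrt((y_estimated - y_true)^2 / R)) / y_true) * 100
}

df = iris[2:5]
y_true = c(3, 1, 0.4)

# Split the numeric columns into one data frame per Species.
by_species = split(df[names(df) != "Species"], df$Species)

# Apply estimate() to each piece; sapply returns one column per group,
# so transpose to get one row per Species.
result = t(sapply(by_species, estimate, y_true = y_true))

print(round(result, 3))
```

The result is a plain numeric matrix with Species as row names rather than a tibble, and its values should agree with the group_modify output.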
Answered By – IceCreamToucan
Answer Checked By – Mary Flores (BugsFixing Volunteer)