Issue
I'm trying to write a function inside mutate() + across() that converts to a factor any variable with five or fewer unique values (or any arbitrary number), with the idea of later using those factors for grouping. I think the logic of the function is correct, but I'm getting an "incorrect number of dimensions" error (the message below is in Spanish). For the sake of simplicity, I'm using the mtcars dataset.
mtcars %>%
  mutate(across(1:ncol(.),
                function(x) {
                  if_else(length(unique(x[,i]))<=5,
                          as.factor(x),
                          x)}
  ))
Error: Problem with `mutate()` input `..1`.
i `..1 = across(...)`.
x número incorreto de dimensiones
Run `rlang::last_error()` to see where the error occurred.
Any help or advice will be much appreciated.
Solution
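Here we need if/else rather than ifelse()/if_else(). Those functions are vectorised and expect the condition and both branches to line up element by element, whereas length(unique(x)) <= 5 returns a single logical value (length 1), so they cannot return the whole column. The "incorrect number of dimensions" error itself comes from x[, i]: inside across() each column is passed to the function as a plain vector, so it cannot be indexed with two subscripts (and i is not defined anywhere). Also, with dplyr we can use the select helpers, i.e. everything(), to select all the columns: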
library(dplyr)
out <- mtcars %>%
  mutate(across(everything(),
                function(x) {
                  if (length(unique(x)) <= 5)
                    as.factor(x) else
                    x
                }))
-output
> str(out)
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : Factor w/ 2 levels "0","1": 1 1 2 2 1 2 1 2 2 2 ...
$ am : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
$ gear: Factor w/ 3 levels "3","4","5": 2 2 2 1 1 1 1 2 2 2 ...
$ carb: num 4 4 1 1 2 1 4 2 2 4 ...
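To make the length issue concrete, here is a minimal base R sketch. The vector x is made-up toy data standing in for a column such as cyl, and the variable names are only illustrative. It compares what ifelse() and if/else return when the condition is a single logical value.

```r
# Toy column with three distinct values (illustrative data, not from mtcars)
x <- c(4, 6, 8, 6, 4)

# The condition is one TRUE/FALSE for the whole column, not one per element
cond <- length(unique(x)) <= 5
length(cond)

# ifelse() shapes its result like the condition, so only one element comes
# back, and the factor attributes are dropped along the way
vectorised <- ifelse(cond, as.factor(x), x)
length(vectorised)
is.factor(vectorised)

# if / else evaluates the condition once and returns the chosen branch whole
scalar <- if (cond) as.factor(x) else x
length(scalar)
is.factor(scalar)
```

This is why a plain if/else is the right tool when the decision is made once per column rather than once per element.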
In addition, the lambda function can be written more concisely with the ~ shorthand and n_distinct():
mtcars %>%
  mutate(across(everything(),
                ~ if (n_distinct(.x) <= 5) as.factor(.x) else .x))
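Since the question asks for an arbitrary threshold and plans to group by the resulting factors, the same idea could be wrapped in a small function. The sketch below is one possible variant, not the answer's own code: factor_if_few and max_levels are hypothetical names, and it moves the test into the column selection with where() instead of using if/else inside the lambda.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): convert every column with at most
# `max_levels` distinct values to a factor and leave the others untouched
factor_if_few <- function(data, max_levels = 5) {
  data %>%
    mutate(across(where(~ n_distinct(.x) <= max_levels), as.factor))
}

out <- factor_if_few(mtcars, max_levels = 3)

# Names of the columns that became factors
factor_cols <- names(out)[sapply(out, is.factor)]
factor_cols

# The new factors can then drive a grouped summary
grouped <- out %>%
  group_by(cyl, am) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop")
grouped
```

Selecting with where() means the conversion function only ever sees the columns that qualify, so no branching is needed inside it.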
Answered By – akrun
Answer Checked By – Gilberto Lyons (BugsFixing Admin)