[SOLVED] How to mask a string based on a pattern string of the same length

Issue

I have the following pair of strings:

core_string     <- "AFFVQTCRE"
mask_string     <- "*KKKKKKKK"

What I want to do is mask core_string with mask_string.
Wherever mask_string has a *, keep the character at that position in core_string;
otherwise, replace it with the character from mask_string.

So the desired result is:

   AKKKKKKKK

Another example:

core_string     <- "AFFVQTCRE"
mask_string     <- "*KKKK*KKK"
 #     result       AKKKKTKKK

The length of both strings is always the same.
How can I do that with R?

Solution

Here’s a helper function that will do just that:

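apply_mask <- function(x, mask) {
  # Split each string into single characters and process the pairs in parallel
  unlist(Map(function(z, m) {
    # Where the mask has "*", take the character from the core string
    m[m == "*"] <- z[m == "*"]
    paste(m, collapse = "")
  }, strsplit(x, ""), strsplit(mask, "")))
}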

Basically, you split both strings into characters, replace each "*" in the mask with the character at the same position in the core string, and then paste the characters back together.
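To make those three steps concrete, here is a rough walk-through of what happens inside the function for a single pair of strings. The variable names keep and result are only for illustration and are not part of the answer's code.

```r
# Walk through the body of apply_mask for one pair of strings.
x    <- "AFFVQTCRE"
mask <- "*KKKK*KKK"

z <- strsplit(x, "")[[1]]     # "A" "F" "F" "V" "Q" "T" "C" "R" "E"
m <- strsplit(mask, "")[[1]]  # "*" "K" "K" "K" "K" "*" "K" "K" "K"

keep <- m == "*"              # TRUE at positions 1 and 6
m[keep] <- z[keep]            # copy the core characters into those positions

result <- paste(m, collapse = "")
result
# [1] "AKKKKTKKK"
```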

I used Map so that the function stays vectorized over its inputs. For example:

core_string     <- c("AFFVQTCRE", "ABCDEFGHI")
mask_string     <- "*KKKK*KKK"

apply_mask(core_string, mask_string)
# [1] "AKKKKTKKK" "AKKKKFKKK"
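The question says both strings always have the same length, and the function relies on that. If you would rather have that assumption checked, one possible sketch is a thin wrapper that stops early on a mismatch. The name apply_mask_checked is hypothetical and not part of the original answer; apply_mask is repeated only so the block runs on its own.

```r
# apply_mask as defined in the answer, repeated so this block is self-contained.
apply_mask <- function(x, mask) {
  unlist(Map(function(z, m) {
    m[m == "*"] <- z[m == "*"]
    paste(m, collapse = "")
  }, strsplit(x, ""), strsplit(mask, "")))
}

# Hypothetical wrapper (an assumption, not from the answer): fail early when
# a core string and its mask differ in length.
apply_mask_checked <- function(x, mask) {
  stopifnot(is.character(x), is.character(mask),
            all(nchar(x) == nchar(mask)))
  apply_mask(x, mask)
}

ok <- apply_mask_checked(c("AFFVQTCRE", "ABCDEFGHI"), "*KKKK*KKK")
ok
# [1] "AKKKKTKKK" "AKKKKFKKK"

# A mask of a different length raises an error, caught here for demonstration.
bad <- tryCatch(apply_mask_checked("AFFVQTCRE", "*KK"),
                error = function(e) "error")
```

The single mask is still recycled across both core strings, just as in the example above; only the per-string length comparison is new.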

Answered By – MrFlick

Answer Checked By – Mildred Charles (BugsFixing Admin)
