## Issue

I have a data set that I am trying to use to generate a different data set in R. The data set has many columns, but the three relevant columns for generating the new data set are "Reach", "Results", and "DV". Reach and Results are numeric. DV is binary (0 or 1). In the original data set, all rows have DV = 0.

For each row of the original data set, I want to replicate that row "Reach" times, where "Reach" is that row's value in the Reach column.

Then, within this new set of rows, I want to change DV from 0 to 1 in "Results" of them (using the Results value from the original row).

For example, in row 33 of the original data set, Reach = 1004, Results = 45, and DV = 0. The new data set should contain row 33 replicated 1004 times, with DV changed from 0 to 1 in 45 of those rows.

The code I wrote for the task works, but it takes 10+ hours to run because the file is so large. Any ideas for how to simplify this code so it runs more quickly?

```
empty_new.video <- new.video[FALSE, ]
for (i in 1:nrow(new.video)) {
  n.times <- new.video[i, 'Reach']  # determine number of times to repeat rows
  if (n.times > 0) {
    for (j in 1:n.times) {
      empty_new.video[nrow(empty_new.video) + 1, ] <- new.video[i, ]
    }
  }
  dv.times <- new.video[i, 'Results']  # creating dependent variable
  if (dv.times > 0) {
    for (k in 1:dv.times) {
      empty_new.video[nrow(empty_new.video) - n.times + k, 'DV'] <- 1
    }
  }
}
```

## Solution

Rather than doing everything in one loop, you could define a simple function that handles a single row and check its result:

```
dd <- data.frame(Reach = c(5, 3), Results = c(4, 1), DV = c(0, 0))
#   Reach Results DV
# 1     5       4  0
# 2     3       1  0

f <- function(data) {
  nr <- data$Reach    # number of copies of this row
  nd <- data$Results  # number of copies that get DV = 1
  data <- data[rep_len(1L, nr), ]
  data$DV <- rep(0:1, c(nr - nd, nd))  # nr - nd zeros, then nd ones
  rownames(data) <- NULL
  data
}

f(dd[1, ])
```

Then apply it to every row:

```
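# Split by row position; splitting by rownames(dd) would order the pieces
# as text ("1", "10", "2", ...) once there are ten or more rows.
res <- lapply(split(dd, seq_len(nrow(dd))), f)
do.call('rbind', res)
#     Reach Results DV
# 1.1     5       4  0
# 1.2     5       4  1
# 1.3     5       4  1
# 1.4     5       4  1
# 1.5     5       4  1
# 2.1     3       1  0
# 2.2     3       1  0
# 2.3     3       1  1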
```
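Before running this on the full file, it may be worth confirming that the expansion has the expected shape. The following is only a sketch on the same toy data (it repeats `dd` and `f` so that it runs on its own): it checks that there is one output row per unit of `Reach` and that each original row contributes exactly `Results` rows with `DV = 1`. The names `out` and `ones` are introduced here for illustration.

```r
# Toy data and per-row expander, repeated so this check is self-contained.
dd <- data.frame(Reach = c(5, 3), Results = c(4, 1), DV = c(0, 0))
f <- function(data) {
  nr <- data$Reach
  nd <- data$Results
  data <- data[rep_len(1L, nr), ]
  data$DV <- rep(0:1, c(nr - nd, nd))
  rownames(data) <- NULL
  data
}
out <- do.call("rbind", lapply(split(dd, seq_len(nrow(dd))), f))

# One output row per unit of Reach.
stopifnot(nrow(out) == sum(dd$Reach))

# Exactly Results rows flagged with DV = 1 for each original row.
ones <- tapply(out$DV, rep(seq_len(nrow(dd)), dd$Reach), sum)
stopifnot(all(ones == dd$Results))
```

`stopifnot` stays silent when every condition holds and raises an error otherwise.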

But really, all you are doing is creating a vector of row indices and a vector of 0/1 values for `DV`, and you can do both with `rep`:

```
ii <- rep(seq_len(nrow(dd)), dd$Reach)                # row i repeated Reach[i] times
jj <- c(t(cbind(dd$Reach - dd$Results, dd$Results)))  # per row: number of 0s, then number of 1s
dv <- rep(rep(0:1, nrow(dd)), jj)
within(dd[ii, ], {
  DV <- dv
})
#     Reach Results DV
# 1       5       4  0
# 1.1     5       4  1
# 1.2     5       4  1
# 1.3     5       4  1
# 1.4     5       4  1
# 2       5       4  0
# 2.1     3       1  0
# 2.2     3       1  1
```
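For use on the real data, the same idea could be wrapped in a small function. What follows is a minimal sketch, not part of the original answer: `expand_rows` is a hypothetical name, and it assumes that `Reach` and `Results` are non-negative whole numbers with `Results` never larger than `Reach`. It computes `DV` with base R's `sequence()` instead of the nested `rep`, and rows with `Reach = 0` simply drop out.

```r
# expand_rows() is a hypothetical helper name used for illustration only.
expand_rows <- function(data) {
  reach   <- data$Reach
  results <- data$Results
  # Assumption: counts are non-negative and Results never exceeds Reach.
  stopifnot(all(reach >= 0), all(results >= 0), all(results <= reach))

  idx <- rep(seq_len(nrow(data)), reach)  # row i repeated Reach[i] times
  out <- data[idx, , drop = FALSE]

  # Position of each copy within its group: 1, 2, ..., Reach[i].
  pos <- sequence(reach)
  # Flag the last Results[i] copies in each group with DV = 1.
  out$DV <- as.integer(pos > (reach - results)[idx])

  rownames(out) <- NULL
  out
}

# Toy data with a zero-Reach row, which contributes no output rows.
dd <- data.frame(Reach = c(5, 0, 3), Results = c(4, 0, 1), DV = 0)
expanded <- expand_rows(dd)
```

This flags the last `Results` copies of each row, as the answer's code does. The loop in the question flags the first copies instead; if that ordering matters, `pos <= results[idx]` would give it. Because nothing here grows a data frame one row at a time, the cost should scale with the size of the output rather than with repeated copying.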

Answered By – rawr
