## Issue

I need to fit a GAM to the variable "Life_expectancy" using three predictors: "Adult_Mortality", "HIV_AIDS" and "Schooling". To tune the model, I need to find the best combination of degrees of freedom for each variable.

To do that I need to nest three for loops, one for i, one for j and one for k, so that every combination of degrees of freedom is tried, and run the following command inside the innermost loop:

```
gam.fit <- gam(Life_expectancy ~ + s(Adult_Mortality, df = i) + s(HIV_AIDS, df = j) + s(Schooling, df = k), data=train)
```

for each combination of i,j,k and calculate the test error each time. In the end, choose the model with the lowest test error. I tried doing this with this code:

```
test.err <- rep(0, 8)
for (i in 3:10) {
for (j in 3:10) {
for (k in 3:10) {
gam.fit <- gam(Life_expectancy ~ + s(Adult_Mortality, df = i) +
s(HIV_AIDS, df = j) +
s(Schooling, df = k),
data=train)
gam.pred <- predict(gam.fit, test)
test.err[i-2] <- mean((test$Life_expectancy - gam.pred)^2)
}}}
```

but this only yields 8 test errors, one for each value of i from 3 to 10. How can I record the test error for every combination of i, j and k?

## Solution

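Store the errors in a three-dimensional array indexed by all three loop variables, rather than in a vector indexed only by i. The code can be modified to: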

```
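# Assumption: gam() comes from the 'gam' package, whose s() takes a df argument.
library(gam)

# One cell per (i, j, k) combination instead of one per i.
test.err <- array(0, c(8, 8, 8))
for (i in 3:10) {
  for (j in 3:10) {
    for (k in 3:10) {
      gam.fit <- gam(Life_expectancy ~ s(Adult_Mortality, df = i) +
                       s(HIV_AIDS, df = j) +
                       s(Schooling, df = k),
                     data = train)
      gam.pred <- predict(gam.fit, test)
      # Subtract 2 because df starts at 3 while array indices start at 1.
      test.err[i - 2, j - 2, k - 2] <- mean((test$Life_expectancy - gam.pred)^2)
    }
  }
}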
```
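
Once the array is filled, the combination with the lowest test error still has to be looked up. A minimal sketch of one way to do that, using random numbers as a stand-in for the real `test.err` array (the names `best.idx` and `best.df` are made up for illustration):

```r
# Stand-in for the 8 x 8 x 8 array filled by the loop (hypothetical values).
set.seed(1)
test.err <- array(runif(8^3), c(8, 8, 8))

# Array indices of the smallest test error; take the first row in case of ties.
best.idx <- which(test.err == min(test.err), arr.ind = TRUE)[1, ]

# Indices run 1..8 while the degrees of freedom run 3..10, so add 2.
best.df <- setNames(best.idx + 2, c("Adult_Mortality", "HIV_AIDS", "Schooling"))
best.df
```

With the real array in place of the random one, `best.df` holds the values of i, j and k that gave the smallest test error.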

A couple of notes about the method:

- You haven’t said which `gam` function you’ve used; there are functions in the packages `gam` and `mgcv`, and probably others. The latter can estimate appropriate degrees of freedom based on the training set.
- You seem to be estimating degrees of freedom based on the fit to the test dataset, which to some extent goes against the idea of having separate training and test datasets.
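
To illustrate the first note, here is a rough sketch of letting `mgcv` choose the smoothness itself. The data frame is simulated purely as a placeholder, since the real data isn't shown, and the choice of `method = "REML"` is an assumption rather than something from the question. Note that in `mgcv`, `s()` takes a basis dimension `k` rather than `df`; leaving it out uses the default.

```r
library(mgcv)  # recommended package, shipped with standard R installations

# Simulated stand-in for the training data (hypothetical values).
set.seed(42)
n <- 200
train <- data.frame(
  Adult_Mortality = runif(n, 50, 500),
  HIV_AIDS        = runif(n, 0, 20),
  Schooling       = runif(n, 2, 18)
)
train$Life_expectancy <- 80 - 0.03 * train$Adult_Mortality -
  3 * log1p(train$HIV_AIDS) + 0.5 * train$Schooling + rnorm(n)

# No df argument: the smoothness of each term is estimated from the training data.
gam.fit <- gam(Life_expectancy ~ s(Adult_Mortality) + s(HIV_AIDS) + s(Schooling),
               data = train, method = "REML")

# Effective degrees of freedom estimated for each smooth term.
summary(gam.fit)$edf
```

Because the smoothness is chosen from the training set alone, the test set can then be kept for a single final estimate of prediction error.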

Answered By – Miff

Answer Checked By – Terry (BugsFixing Volunteer)