## Issue

I am learning how to write a maximum likelihood implementation in `Julia`, and currently I am following this material (highly recommended, btw!).

So the thing is, I do not fully understand what a **closure** is in Julia, nor when I should actually use one. Even after reading the official documentation, the concept still remains a bit obscure to me.

For instance, in the tutorial I mentioned, the author defines the log-likelihood function as:

```
function log_likelihood(X, y, β)
    ll = 0.0
    @inbounds for i in eachindex(y)
        zᵢ = dot(X[i, :], β)
        c = -log1pexp(-zᵢ)  # Conceptually equivalent to log(1 / (1 + exp(-zᵢ))) == -log(1 + exp(-zᵢ))
        ll += y[i] * c + (1 - y[i]) * (-zᵢ + c)  # Conceptually equivalent to log(exp(-zᵢ) / (1 + exp(-zᵢ)))
    end
    ll
end
```
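Note that `log1pexp` is not in Base Julia; it comes from a package (e.g. LogExpFunctions.jl). If you want to follow along without that dependency, a stdlib-only stand-in could look like this (a sketch; the name `stable_log1pexp` is mine, not from the tutorial):

```julia
# Numerically stable log(1 + exp(z)) using only Base.
# For large positive z, exp(z) would overflow, so split on the sign of z:
# log(1 + exp(z)) == z + log(1 + exp(-z)) for z > 0.
stable_log1pexp(z) = z > 0 ? z + log1p(exp(-z)) : log1p(exp(z))

stable_log1pexp(0.0)     # log(2) ≈ 0.6931471805599453
stable_log1pexp(1000.0)  # ≈ 1000.0 — naive log(1 + exp(1000.0)) would return Inf
```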

However, later he claims that:

> The log-likelihood as we’ve written is a function of both the data and the parameters, but mathematically it should only depend on the parameters. In addition to that mathematical reason for creating a new function, we want a function only of the parameters because the optimization algorithms in Optim assume the inputs have that property. To achieve both goals, we’ll construct a closure that partially applies the log-likelihood function for us and negates it to give us the negative log-likelihood we want to minimize.

```
# Creating the closure
make_closures(X, y) = β -> -log_likelihood(X, y, β)
nll = make_closures(X, y)
# Define Initial Values equal to zero
β₀ = zeros(2 + 1)
# Ignite the optimization routine using `nll`
res = optimize(nll, β₀, LBFGS(), autodiff=:forward)
```
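The essential trick here is only the partial application; it needs nothing from Optim. A stripped-down sketch of the same pattern with a toy objective (the names `make_obj` and `obj` are mine, not from the tutorial):

```julia
# `make_obj` returns a closure over `data`; the returned function takes
# only the parameter θ, just as `nll` above takes only β.
make_obj(data) = θ -> sum((x - θ)^2 for x in data)

obj = make_obj([1.0, 2.0, 3.0])
obj(2.0)  # 2.0 — the data is baked in; only θ is passed at call time
```

Any single-argument optimizer can now be handed `obj` directly, without knowing anything about the data it closes over.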

From the paragraph, I understand that we **NEED** to use it because that is how `Optim`’s algorithms work, but I still don’t get what a **closure** is in a broader sense. I would be more than grateful if someone could shed some light on this. Thank you very much.

## Solution

In the context you ask about, you can think of a closure as a function that references variables defined in its outer scope (for other cases see the answer by @phipsgabler). Here is a minimal example:

```
julia> using Statistics, Roots # for `mean` and `find_zero`

julia> function est_mean(x)
           function fun(m)
               return m - mean(x)
           end
           val = find_zero(fun, 0.0)
           @show val, mean(x)
           return fun # explicitly return the inner function to inspect it
       end
est_mean (generic function with 1 method)
julia> x = rand(10)
10-element Vector{Float64}:
0.6699650145575134
0.8208379672036165
0.4299946498764684
0.1321653923513042
0.5552854476018734
0.8729613266067378
0.5423030870674236
0.15751882823315777
0.4227087678654101
0.8594042895489912
julia> fun = est_mean(x)
(val, mean(x)) = (0.5463144770912497, 0.5463144770912497)
fun (generic function with 1 method)
julia> dump(fun)
fun (function of type var"#fun#3"{Vector{Float64}})
x: Array{Float64}((10,)) [0.6699650145575134, 0.8208379672036165, 0.4299946498764684, 0.1321653923513042, 0.5552854476018734, 0.8729613266067378, 0.5423030870674236, 0.15751882823315777, 0.4227087678654101, 0.8594042895489912]
julia> fun.x
10-element Vector{Float64}:
0.6699650145575134
0.8208379672036165
0.4299946498764684
0.1321653923513042
0.5552854476018734
0.8729613266067378
0.5423030870674236
0.15751882823315777
0.4227087678654101
0.8594042895489912
julia> fun(10)
9.453685522908751
```

As you can see, `fun` holds a reference to the `x` variable from the outer scope (in this case the scope introduced by the `est_mean` function). Moreover, I have shown that you can even retrieve this value from outside of `fun` as its field (this is typically not recommended, but I show it to prove that `fun` indeed stores a reference to the object `x` defined in its outer scope; it needs to store this reference because the variable `x` is used inside the body of the `fun` function).
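You can reproduce this field-capture behaviour with an even smaller, dependency-free closure (a sketch; the name `make_adder` is mine):

```julia
make_adder(a) = b -> a + b  # the anonymous function captures `a`

add2 = make_adder(2)
add2(3)  # 5
add2.a   # 2 — the captured value, stored as a field of the closure object
```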

In the context of estimation, as you have noted, this is useful because `find_zero` in my case requires the function to take only one argument (the `m` variable), while you want the return value to depend both on the passed `m` and on `x`.

What is important is that once `x` is captured in the `fun` closure, it does not have to be in the current scope. For instance, when I call `fun(10)`, the code executes correctly although we are outside of the scope of the function `est_mean`. This is not a problem because the `fun` function has captured the `x` variable.
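A classic illustration that a capture outlives its defining scope is a counter (a common closure idiom, not from the original answer):

```julia
function make_counter()
    n = 0                  # local to make_counter
    return () -> (n += 1)  # the closure captures (and mutates) `n`
end

counter = make_counter()
counter()  # 1
counter()  # 2
counter()  # 3 — `n` lives on even though make_counter has long returned
```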

Let me give one more example:

```
julia> function gen()
x = []
return v -> push!(x, v)
end
gen (generic function with 1 method)
julia> fun2 = gen()
#4 (generic function with 1 method)
julia> fun2.x
Any[]
julia> fun2(1)
1-element Vector{Any}:
1
julia> fun2.x
1-element Vector{Any}:
1
julia> fun2(100)
2-element Vector{Any}:
1
100
julia> fun2.x
2-element Vector{Any}:
1
100
```

Here you see that the `x` variable defined within the `gen` function is captured by the anonymous function `v -> push!(x, v)` that I bind to the `fun2` variable. Later, when you call `fun2`, the object bound to the `x` variable gets updated (and can be referenced) although it was defined in the scope of `gen`. Although we left the `gen` scope, the object bound to `x` outlives that scope because it is captured by the anonymous function we defined.
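As a side note, for the common "fix one argument" case Base Julia ships ready-made closure types, `Base.Fix1` and `Base.Fix2` (the tie-in to this answer is mine; `diff_from_mean` is a made-up example function):

```julia
# Base.Fix1(f, x) behaves like y -> f(x, y), as a named closure type.
diff_from_mean(x, m) = m - sum(x) / length(x)

fun = Base.Fix1(diff_from_mean, [1.0, 2.0, 3.0])  # fixes the data argument
fun(2.0)  # 0.0 — the mean of the data is 2.0
fun.x     # the captured data, again stored as a field
```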

If something is unclear please comment.

Answered By – Bogumił Kamiński