Overview
- Closure is an important feature of Python which protect the variables defined in different places against poisoning each other.
- An common usage of closure is defining an function within another function. The variables are thus defined in different scope, some covering others. So the tricks and traps with closure is talking about how can we make it convenient by using this features and avoid faults that could be caused by it.
- Closure could directly use variables defined outside of it and within the scope surrounding it. When it looks for an variable, it starts searching from local variables. When there is no local variables with the same name existed, it extends the searching space to the closure surrounding it.
- Using variables and assigning new values to variables are different. When it tries assign an value to a variable, it starts from local variables as well but it would create a local copy of it if the target doesn’t exist.
We will walk through these issues with some examples.
Closure could use variables defined outside the scope
To understand what a closure is and how variables exist among different scopes, let’s take a look at this example.
def sort_priority(values, group):
def helper(x):
if x in group:
return (0, x)
return (1, x)
values.sort(key=helper)
def main():
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sort_priority(numbers, group)
print(numbers)
if __name__ == "__main__":
main()
This function is very simple. It sorts an array and checks another list at the same time, where the numbers in the list are of higher priority and will be sorted in the first order. The expected result would be as follows.
[2, 3, 5, 7, 1, 4, 6, 8]
This function could work because helper
is able to use the value of group
. We don’t need to bother passing group
into helper
even it is defined outside of it because the whole function helper
function is defined within the environment of sort_priority
where group
is defined.
Problem occurs when closure assign values to variables outside of it
Say if we want this function to return a flag indicating whether it finds anything in group
that shall be given high priority. We can simply add another boolean variable found
.
def sort_priority2(values, group):
found = False
def helper(x):
if x in group:
found = True
return (0, x)
return (1, x)
values.sort(key=helper)
return found
def main():
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
found = sort_priority2(numbers, group)
print("found: %s" % found))
print(numbers)
if __name__ == "__main__":
main()
If you run this script yourself, you will see something interesting:
found: False
[2, 3, 5, 7, 1, 4, 6, 8]
The value of found
is wrong whereas the numbers
is correctly sorted! The reason for this is the rule #2 we mentioned earlier. When helper
assign True
to found
, it starts looking for found locally. Certainly, there is no found
there so it created a local value found
which is different from the found
defined in sort_priority2
(you can examine this by looking at their address).
Access data in a closure
nonlocal
keyword
In Python 3.x, there is an convenient keyword nonlocal
which works in an complementary way with global
, declaring that when you looks for found
, go to upper level closure. So you can do something like this and it should yield the results as expected.
def sort_priority3(values, group):
found = False
def helper(x):
nonlocal found
if x in group:
found = True
return (0, x)
return (1, x)
values.sort(key=helper)
return found
def main():
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
found = sort_priority3(numbers, group)
print("found: %s" % found))
print(numbers)
if __name__ == "__main__":
main()
Encapsulate helper
One tip for Python users is that avoiding using nonlocal
as you can or only use them in small simple functions as the side-effects are hard really hard to trace.
In this case, one smart way to do this is encapsulating helper
.
class Sorter(object):
def __init__(self, group):
self.group = group
self.found = False
def __call__(self, x):
if x in self.group:
self.found = True
return (0, x)
return (1, x)
def main():
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sorter = Sorter(group)
numbers.sort(key=sorter)
print(sorter.found)
print(numbers)
if __name__ == "__main__":
main()