If you're closely following the Python tag on StackOverflow, you'll notice that the same question comes up at least once a week. The question goes like this:
Why does this code, when run, result in the following error:
There are a few variations on this question, with the same core hiding underneath. Here's one:
Running the append statement successfully appends 5 to the list. However, substitute a += statement for it, and it raises UnboundLocalError, although at first sight it should accomplish the same.
Although this exact question is answered in Python's official FAQ (right here), I decided to write this article with the intent of giving a deeper explanation. It will start with a basic FAQ-level answer, which should satisfy anyone who only wants to know how to "solve the damn problem and move on". Then, I will dive deeper, looking at the formal definition of Python to understand what's going on. Finally, I'll take a look at what happens behind the scenes in the implementation of CPython to cause this behavior.
The simple answer
As mentioned above, this problem is covered in the Python FAQ. For completeness, I want to explain it here as well, quoting the FAQ when necessary.
Let's take the first code snippet again:
So where does the exception come from? Quoting the FAQ:
This is because when you make an assignment to a variable in a scope, that variable becomes local to that scope and shadows any similarly named variable in the outer scope.
But an augmented assignment with += is similar to a plain assignment of the sum, so it should first read the variable, perform the addition and then assign the result back to the same variable. As mentioned in the quote above, Python considers the variable local to the function, so we have a problem: a variable is read (referenced) before it's been assigned. Python raises the UnboundLocalError exception in this case.
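To make this concrete, here is a minimal sketch of the failing pattern. The names counter and increment are my own assumptions for illustration, not necessarily the ones from the original snippet.

```python
counter = 10  # module-level (global) variable; hypothetical name


def increment():
    # The augmented assignment makes `counter` local to increment(),
    # so the read performed by += happens before any local binding exists.
    counter += 1


try:
    increment()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name)  # UnboundLocalError
```

Note that the global counter is left untouched, since the function never gets as far as the assignment.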
So what do we do about this? The solution is very simple - Python has the global statement just for this purpose:
This runs without any errors and prints the updated value. The global statement tells Python that inside the function, the name refers to the global variable, even if it's assigned there.
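A minimal sketch of the global fix, again with the assumed names counter and increment:

```python
counter = 10  # hypothetical global variable


def increment():
    global counter  # bind to the module-level name instead of creating a local
    counter += 1


increment()
print(counter)  # 11
```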
Actually, there is another variation on the question, for which the answer is a bit different. Consider this code:
This kind of code may come up if you're into closures and other techniques that use Python's lexical scoping rules. The error this generates is the familiar UnboundLocalError. However, applying the "global fix":
Doesn't help - another error is generated: a NameError. Python is right here - after all, there's no global variable with that name, there's only the one in the enclosing function. It may not be local to the inner function, but it's not global either. So what can you do in this situation? If you're using Python 3, you have the nonlocal keyword. Replacing global by nonlocal in the last snippet makes everything work as expected. nonlocal is a new statement in Python 3, and there is no equivalent in Python 2.
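Here is a small Python 3 sketch of the nonlocal variant. The names make_counter, bump and count are assumptions chosen for illustration.

```python
def make_counter():
    count = 0  # lives in make_counter's scope: neither global nor local to bump()

    def bump():
        nonlocal count  # Python 3 only: rebind the enclosing function's variable
        count += 1
        return count

    return bump


bump = make_counter()
first = bump()
second = bump()
print(first, second)  # 1 2
```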
The formal answer
Assignments in Python are used to bind names to values and to modify attributes or items of mutable objects. I could find two places in the Python (2.x) documentation where it's defined how an assignment to a local variable works.
One is section 6.2 "Assignment statements" in the Simple Statements chapter of the language reference:
Assignment of an object to a single target is recursively defined as follows. If the target is an identifier (name):
- If the name does not occur in a global statement in the current code block: the name is bound to the object in the current local namespace.
- Otherwise: the name is bound to the object in the current global namespace.
Another is section 4.1 "Naming and binding" of the Execution model chapter:
If a name is bound in a block, it is a local variable of that block.
When a name is used in a code block, it is resolved using the nearest enclosing scope. [...] If the name refers to a local variable that has not been bound, a UnboundLocalError exception is raised.
This is all clear, but still, another small doubt remains. All these rules apply to assignments of the form name = value, which clearly bind the name to the value. But the code snippets we're having a problem with here have the += assignment. Shouldn't that just modify the bound value, without re-binding it?
Well, no. += and its cousins (-=, *=, etc.) are what Python calls "augmented assignment statements" [emphasis mine]:
An augmented assignment evaluates the target (which, unlike normal assignment statements, cannot be an unpacking) and the expression list, performs the binary operation specific to the type of assignment on the two operands, and assigns the result to the original target. The target is only evaluated once.
An augmented assignment expression like x += 1 can be rewritten as x = x + 1 to achieve a similar, but not exactly equal effect. In the augmented version, x is only evaluated once. Also, when possible, the actual operation is performed in-place, meaning that rather than creating a new object and assigning that to the target, the old object is modified instead.
With the exception of assigning to tuples and multiple targets in a single statement, the assignment done by augmented assignment statements is handled the same way as normal assignments. Similarly, with the exception of the possible in-place behavior, the binary operation performed by augmented assignment is the same as the normal binary operations.
So when earlier I said that += is similar to a plain assignment of the sum, I wasn't telling the whole truth, but it was accurate with respect to binding. Apart from possible optimization, += counts exactly as = when binding is considered. If you think carefully about it, it's unavoidable, because some types Python works with are immutable. Consider strings, for example:
The first line binds the name to the value "abc". The second line doesn't modify the value "abc" to be "abcdef". Strings are immutable in Python. Rather, it creates the new value "abcdef" somewhere in memory, and re-binds the name to it. This can be seen clearly when examining the object ID of the name's value before and after the +=:
Note that some types in Python are mutable. For example, lists can actually be modified in-place:
The ID didn't change after +=, because the object referenced was just modified in place. Still, Python re-bound the name to the same object.
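The contrast between the two cases can be sketched as follows. The variable names are assumptions of mine. I keep a second reference to each original object so that identity can be compared directly with the is operator.

```python
text = "abc"
original_text = text            # keep the old string object alive
text += "def"                   # str is immutable: a new object is created and rebound
print(text is original_text)    # False
print(original_text)            # abc

items = [1, 2, 3]
original_items = items          # second reference to the same list
items += [4]                    # list is mutable: extended in place, then rebound
print(items is original_items)  # True
print(original_items)           # [1, 2, 3, 4]
```

In both cases the name on the left of += is re-bound; the only difference is whether it ends up pointing at a new object or at the same one.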
The "too much information" answer
This section is of interest only to those curious about the implementation internals of Python itself.
One of the stages in the compilation of Python into bytecode is building the symbol table. An important goal of building the symbol table is for Python to be able to mark the scope of variables it encounters - which variables are local to functions, which are global, which are free (lexically bound) and so on.
When the symbol table code sees a variable is assigned in a function, it marks it as local. Note that it doesn't matter if the assignment was done before usage, after usage, or maybe not actually executed due to a condition in code like this:
We can use the symtable module to examine the symbol table information gathered on some Python code during compilation:
So we see that the variable was marked as local in the function. Marking variables as local turns out to be important for optimization in the bytecode, since the compiler can generate a special instruction for it that's very fast to execute. There's an excellent article here explaining this topic in depth; I'll just focus on the outcome.
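A minimal sketch of such an inspection with the standard symtable module follows. The source string and the names foo, flag and x are assumptions made for illustration; the assignment sits in a branch that may never run, yet the name is still classified as local.

```python
import symtable

# Hypothetical source: x is assigned only inside a conditional branch.
source = """
def foo(flag):
    if flag:
        x = 1
    return x
"""

module_table = symtable.symtable(source, "<example>", "exec")
foo_table = module_table.get_children()[0]  # symbol table of foo's body
x_symbol = foo_table.lookup("x")

print(foo_table.get_name(), x_symbol.is_local())  # foo True
```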
The compiler function that handles variable name references queries the symbol table for the scope of the name in order to generate the correct opcode. For our variable, this returns a bitfield with the local flag in it. Having seen the local flag, the compiler generates a LOAD_FAST. We can see this in the disassembly of the function:
The first block of instructions shows what the first statement was compiled to. You will note that already here (before it's actually assigned), LOAD_FAST is used to retrieve the value of the variable.
This LOAD_FAST is the instruction that will cause the UnboundLocalError exception to be raised at runtime, because it is actually executed before any STORE_FAST is done for the variable. The gory details are in CPython's bytecode interpreter code:
Ignoring the macro-fu for the moment, what this basically says is that once LOAD_FAST is seen, the value of the variable is obtained from an indexed array of objects. If no STORE_FAST was done before, this value is still NULL, the branch is not taken and the UnboundLocalError exception is raised.
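To see the opcodes for yourself, the dis module can list the instructions that touch the variable. This is only a sketch: foo and x are assumed names, and exact opcode names vary between CPython versions (some releases emit a variant such as LOAD_FAST_CHECK), so I deliberately show no expected output.

```python
import dis


def foo():
    x += 1  # x is local because it is assigned here; never actually called


# Collect the names of all instructions whose argument is the variable x.
instructions = list(dis.get_instructions(foo))
x_ops = [ins.opname for ins in instructions if ins.argval == "x"]
print(x_ops)

# The compiler also records x among the function's local variable names.
print(foo.__code__.co_varnames)
```

The point to look for is that x is loaded with a fast-local instruction rather than a global lookup, even though no local value exists yet.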
You may wonder why Python waits until runtime to raise this exception, instead of detecting it in the compiler. The reason is this code:
Suppose the condition is a call to a function that returns True, possibly due to some user input. In this case, the assignment inside the conditional binds the variable locally, so the reference to it in the += statement is no longer unbound. This code will then run without exceptions. Of course, if the function actually turns out to return False, the exception will be raised. Python has no way to resolve this at compile time, so the error detection is postponed to runtime.
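Both outcomes can be reproduced with a small sketch. Here a plain parameter, bind_first, stands in for the user-dependent condition; that name, like foo and x, is an assumption for illustration.

```python
def foo(bind_first):
    if bind_first:
        x = 1  # binds x locally only when the branch is taken
    x += 1     # reads x: fine if bound above, UnboundLocalError otherwise
    return x


result_when_bound = foo(True)
print(result_when_bound)  # 2

try:
    foo(False)
except UnboundLocalError:
    failed_when_unbound = True
else:
    failed_when_unbound = False
print(failed_when_unbound)  # True
```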
Guest post from Toptal Engineering Blog.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components or services. Python supports modules and packages, thereby encouraging program modularity and code reuse.
About this article
Python’s simple, easy-to-learn syntax can mislead Python developers – especially those who are newer to the language – into missing some of its subtleties and underestimating the power of the diverse Python language.
With that in mind, this article presents a “top 10” list of somewhat subtle, harder-to-catch mistakes that can bite even some more advanced Python developers in the rear.
(Note: This article is intended for a more advanced audience than Common Mistakes of Python Programmers, which is geared more toward those who are newer to the language.)
Common Mistake #1: Misusing expressions as defaults for function arguments
Python allows you to specify that a function argument is optional by providing a default value for it. While this is a great feature of the language, it can lead to some confusion when the default value is mutable. For example, consider this Python function definition:
A common mistake is to think that the optional argument will be set to the specified default expression each time the function is called without supplying a value for the optional argument. In the above code, for example, one might expect that calling the function repeatedly (i.e., without specifying the optional argument) would always return the same single-element list, since the assumption would be that each time the function is called (without the optional argument specified) the argument is set to [] (i.e., a new empty list).
But let’s look at what actually happens when you do this:
Huh? Why did it keep appending the value to an existing list each time the function was called, rather than creating a new list each time?
The more advanced Python programming answer is that the default value for a function argument is only evaluated once, at the time that the function is defined. Thus, the argument is initialized to its default (i.e., an empty list) only when the function is first defined, but then calls to the function (i.e., without the optional argument specified) will continue to use the same list to which the argument was originally initialized.
FYI, a common workaround for this is as follows:
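A minimal sketch of both the pitfall and the None-sentinel workaround might look like this. The names append_shared and append_fresh are hypothetical, not taken from the original listing.

```python
def append_shared(item, bucket=[]):
    # Pitfall: the default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Workaround: use None as a sentinel and build a new list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = list(append_shared("a"))   # copy, to snapshot the state per call
shared_second = list(append_shared("b"))
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")

print(shared_first, shared_second)  # ['a'] ['a', 'b']
print(fresh_first, fresh_second)    # ['a'] ['b']
```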
Common Mistake #2: Using class variables incorrectly
Consider the following example:
Yup, again as expected.
What the $%#!&?? We only changed the attribute on the base class. Why did it change on a subclass too?
In Python, class variables are internally handled as dictionaries and follow what is often referred to as Method Resolution Order (MRO). So in the above code, since the attribute is not found in the subclass, it will be looked up in its base classes (only one in the above example, although Python supports multiple inheritance). In other words, the subclass doesn’t have its own attribute, independent of the base class. Thus, references to the attribute through the subclass are in fact references to the base class attribute. This causes a Python problem unless it’s handled properly. Learn more about class attributes in Python.
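The lookup behavior can be sketched as follows. The class names Base, Left and Right are our own assumptions for illustration.

```python
class Base:
    x = 1


class Left(Base):
    pass


class Right(Base):
    pass


Left.x = 2  # creates an attribute in Left's own namespace
Base.x = 3  # Right has no x of its own, so it sees the new Base value

values = (Base.x, Left.x, Right.x)
print(values)               # (3, 2, 3)
print("x" in vars(Left))    # True
print("x" in vars(Right))   # False
```

Inspecting vars() on each class shows which ones actually own the attribute and which merely inherit it.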
Common Mistake #3: Specifying parameters incorrectly for an exception block
Suppose you have the following code:
The problem here is that the except statement does not take a list of exceptions specified in this manner. Rather, in Python 2.x, the syntax except Exception, e is used to bind the exception to the optional second parameter specified (in this case e), in order to make it available for further inspection. As a result, in the above code, the second exception type listed is not being caught by the except statement; rather, the caught exception instead ends up being bound to a name that matches that second exception type.
The proper way to catch multiple exceptions in an except statement is to specify the first parameter as a tuple containing all exceptions to be caught. Also, for maximum portability, use the as keyword, since that syntax is supported by both Python 2 and Python 3:
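As a minimal sketch of that form, with the hypothetical helper parse_and_index standing in for whatever code may raise either exception:

```python
def parse_and_index(values, raw_index):
    try:
        return values[int(raw_index)]
    except (ValueError, IndexError) as exc:  # tuple of types, bound with `as`
        return type(exc).__name__


results = [parse_and_index([10, 20, 30], raw) for raw in ("1", "abc", "7")]
print(results)  # [20, 'ValueError', 'IndexError']
```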
Common Mistake #4: Misunderstanding Python scope rules
Python scope resolution is based on what is known as the LEGB rule, which is shorthand for Local, Enclosing, Global, Built-in. Seems straightforward enough, right? Well, actually, there are some subtleties to the way this works in Python, which brings us to the common more advanced Python programming problem below. Consider the following:
What’s the problem?
The above error occurs because, when you make an assignment to a variable in a scope, that variable is automatically considered by Python to be local to that scope and shadows any similarly named variable in any outer scope.
Many are thereby surprised to get an UnboundLocalError in previously working code when it is modified by adding an assignment statement somewhere in the body of a function. (You can read more about this here.)
It is particularly common for this to trip up developers when using lists. Consider the following example:
Huh? Why did one function bomb while the other ran fine?
The answer is the same as in the prior example problem, but is admittedly more subtle. The function that ran fine is not making an assignment to the list, whereas the one that bombed is. Remembering that a += statement is really just shorthand for an assignment of the form name = name + value, we see that we are attempting to assign a value to the list name (therefore presumed by Python to be in the local scope). However, the value we are looking to assign is based on the list itself (again, now presumed to be in the local scope), which has not yet been defined. Boom.
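A sketch of the two cases side by side follows. The names lst, append_item and extend_item are assumptions for illustration, not necessarily those of the original example.

```python
lst = [1, 2, 3]


def append_item():
    lst.append(5)  # method call on the global list: no assignment, no new local


def extend_item():
    lst += [5]     # assignment: lst becomes local to extend_item()


append_item()

try:
    extend_item()
except UnboundLocalError:
    extend_failed = True
else:
    extend_failed = False

print(lst, extend_failed)  # [1, 2, 3, 5] True
```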
Common Mistake #5: Modifying a list while iterating over it
The problem with the following code should be fairly obvious:
Deleting an item from a list or array while iterating over it is a Python problem that is well known to any experienced software developer. But while the example above may be fairly obvious, even advanced developers can be unintentionally bitten by this in code that is much more complex.
Fortunately, Python incorporates a number of elegant programming paradigms which, when used properly, can result in significantly simplified and streamlined code. A side benefit of this is that simpler code is less likely to be bitten by the accidental-deletion-of-a-list-item-while-iterating-over-it bug. One such paradigm is that of list comprehensions. Moreover, list comprehensions are particularly useful for avoiding this specific problem, as shown by this alternate implementation of the above code which works perfectly:
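A minimal sketch of both approaches, assuming a hypothetical is_odd predicate and a list of the integers 0 through 9:

```python
def is_odd(n):
    return n % 2 == 1


# Buggy approach: the list shrinks while the index range stays fixed.
broken = list(range(10))
loop_failed = False
try:
    for i in range(len(broken)):
        if is_odd(broken[i]):
            del broken[i]
except IndexError:
    loop_failed = True
print(loop_failed)  # True

# List comprehension: build the filtered list, then replace the contents in place.
numbers = list(range(10))
numbers[:] = [n for n in numbers if not is_odd(n)]
print(numbers)  # [0, 2, 4, 6, 8]
```

Assigning to the slice numbers[:] keeps the same list object, so any other references to it see the filtered contents as well.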
Common Mistake #6: Confusing how Python binds variables in closures
Consider the following example:
You might expect the following output:
But you actually get:
This happens due to Python’s late binding behavior which says that the values of variables used in closures are looked up at the time the inner function is called. So in the above code, whenever any of the returned functions are called, the value of the loop variable is looked up in the surrounding scope at the time it is called (and by then, the loop has completed, so the loop variable has already been assigned its final value of 4).
The solution to this common Python problem is a bit of a hack:
Voilà! We are taking advantage of default arguments here to generate anonymous functions in order to achieve the desired behavior. Some would call this elegant. Some would call it subtle. Some hate it. But if you’re a Python developer, it’s important to understand in any case.
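Both behaviors can be sketched side by side. The function names late_multipliers and early_multipliers are hypothetical; the loop runs over range(5), so the final value of the loop variable is 4.

```python
def late_multipliers():
    # Each lambda looks up i when it is called, after the loop has finished.
    return [lambda x: i * x for i in range(5)]


def early_multipliers():
    # The default argument i=i captures the current value at definition time.
    return [lambda x, i=i: i * x for i in range(5)]


late = [multiplier(2) for multiplier in late_multipliers()]
early = [multiplier(2) for multiplier in early_multipliers()]

print(late)   # [8, 8, 8, 8, 8]
print(early)  # [0, 2, 4, 6, 8]
```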
Common Mistake #7: Creating circular module dependencies
Let’s say you have two files, each of which imports the other, as follows:
And in the second file:
First, let’s try importing the first module:
Worked just fine. Perhaps that surprises you. After all, we do have a circular import here which presumably should be a problem, shouldn’t it?
The answer is that the mere presence of a circular import is not in and of itself a problem in Python. If a module has already been imported, Python is smart enough not to try to re-import it. However, depending on the point at which each module is attempting to access functions or variables defined in the other, you may indeed run into problems.
So returning to our example, when we imported the first module, it had no problem importing the second, since the second module does not require anything from the first to be defined at the time it is imported. The only reference in the second module to the first is a single function call. But that call sits inside a function body, and nothing in either module invokes that function. So life is good.
But what happens if we attempt to import the second module (without having previously imported the first, that is):
Uh-oh. That’s not good! The problem here is that, in the process of importing the second module, it attempts to import the first, which in turn calls a function that attempts to access a name defined in the second module. But that name has not yet been defined. Hence the exception.
At least one solution to this is quite trivial. Simply modify the second module to import the first from within the function that uses it:
Now when we import it, everything is fine:
Common Mistake #8: Name clashing with Python Standard Library modules
One of the beauties of Python is the wealth of library modules that it comes with “out of the box”. But as a result, if you’re not consciously avoiding it, it’s not that difficult to run into a name clash between the name of one of your modules and a module with the same name in the standard library that ships with Python (for example, you might have a module in your code named after a standard library module, which would then conflict with it).
This can lead to gnarly problems, such as importing another library which in turn tries to import the Python Standard Library version of a module but, since you have a module with the same name, the other package mistakenly imports your version instead of the one within the Python Standard Library. This is where bad Python errors happen.
Care should therefore be exercised to avoid using the same names as those in the Python Standard Library modules. It’s way easier for you to change the name of a module within your package than it is to file a Python Enhancement Proposal (PEP) to request a name change upstream and to try and get that approved.
Common Mistake #9: Failing to address differences between Python 2 and Python 3
Consider the following file:
On Python 2, this runs fine:
But now let’s give it a whirl on Python 3:
What has just happened here? The “problem” is that, in Python 3, the exception object is not accessible beyond the scope of the except block. (The reason for this is that, otherwise, it would keep a reference cycle with the stack frame in memory until the garbage collector runs and purges the references from memory. More technical detail about this is available here.)
One way to avoid this issue is to maintain a reference to the exception object outside the scope of the except block so that it remains accessible. Here’s a version of the previous example that uses this technique, thereby yielding code that is both Python 2 and Python 3 friendly:
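A minimal sketch of the technique, written for Python 3, might look like this. The names risky and saved are hypothetical; the second try block is only there to demonstrate that the name bound by the except clause is gone afterwards.

```python
def risky():
    raise KeyError("missing")


saved = None
try:
    risky()
except KeyError as e:
    saved = e  # keep a reference under a different name

# In Python 3 the name `e` is unbound once the except block ends.
try:
    e
except NameError:
    e_still_bound = False
else:
    e_still_bound = True

print(type(saved).__name__, e_still_bound)  # KeyError False
```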
Running this on Py3k:
(Incidentally, our Python Hiring Guide discusses a number of other important differences to be aware of when migrating code from Python 2 to Python 3.)
Common Mistake #10: Misusing the __del__ method
Let’s say you had this in a file:
And you then tried to do this from another file:
You’d get an ugly exception.
Why? Because, as reported here, when the interpreter shuts down, the module’s global variables are all set to None. As a result, in the above example, at the point that __del__ is invoked, the global name it relies on has already been set to None.
A solution to this somewhat more advanced Python programming problem would be to use atexit.register() instead. That way, when your program is finished executing (when exiting normally, that is), your registered handlers are kicked off before the interpreter is shut down.
With that understanding, a fix for the above code might then look something like this:
This implementation provides a clean and reliable way of calling any needed cleanup functionality upon normal program termination. Obviously, it’s up to the registered cleanup function to decide what to do with the object passed to it, but you get the idea.
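The general shape of this approach can be sketched as follows. Everything named here (cleanup, Resource, the cleaned list and the handle value) is an assumption for illustration; a real cleanup function would release an actual resource rather than record a string.

```python
import atexit

cleaned = []  # records handles that were cleaned up; stands in for real teardown


def cleanup(handle):
    cleaned.append(handle)


class Resource:
    def __init__(self, handle):
        self.handle = handle
        # Register the cleanup at construction time instead of relying on __del__.
        atexit.register(cleanup, self.handle)


resource = Resource("db-connection")
print(cleaned)  # [] (the handler runs only at normal interpreter exit)
```

Because atexit.register accepts extra positional arguments and passes them to the handler, the cleanup function receives the handle directly and does not depend on module globals still being intact.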
Python is a powerful and flexible language with many mechanisms and paradigms that can greatly improve productivity. As with any software tool or language, though, having a limited understanding or appreciation of its capabilities can sometimes be more of an impediment than a benefit, leaving one in the proverbial state of “knowing enough to be dangerous”.
Familiarizing oneself with the key nuances of Python, such as (but by no means limited to) the moderately advanced programming problems raised in this article, will help optimize use of the language while avoiding some of its more common errors.
You might also want to check out our Insider’s Guide to Python Interviewing for suggestions on interview questions that can help identify Python experts.
We hope you’ve found the pointers in this article helpful and welcome your feedback.