Most of the Python programs I have written since I started learning the language are very simple scripts. I still consider myself a beginner, but I have started writing more complex code, and it is usually brittle and buggy.
I have a GitHub repo of scripts that automate common tasks on a computer, and one of them is a program that creates a backup of all pictures in a particular folder. This program walks up and down the directory tree in search of picture files and contains a couple of nested loops and if statements that run for each file or folder found in the directory. While writing the program, I used a LOT of print statements to tell me what the program was doing, which folder or file it was currently checking and whether or not it managed to copy any pictures to the backup location. This worked out OK until I ran the program in a folder with a lot of files. Suddenly there was far too much printed output to go through.
The second problem I faced was that the program would crash whenever I tried to back up files to a nonexistent folder, and it would also overwrite files many times over while it was running. It took me a while (and many print statements) to figure this out. Eventually I did fix the problem, and now the script works the way I want it to. After that experience, I figured that there must be an easier way to debug programs and catch problems as early as possible. In this post, I will cover some tools and techniques you can use to find bugs quickly and solve them with less effort.
Assertions
Assertions provide a way for you to check that your code isn’t doing something it shouldn’t be doing. Here’s an example of an assert statement:
    # Create the lowercase letters of the alphabet
    alphabet = list(map(chr, range(97, 123)))
    assert len(alphabet) == 26, "The English alphabet has only 26 letters."

The assert statement consists of a condition under test and a string that is displayed when the condition is False. Assertions are useful for:
- Checking parameter types.
- Checking "can't happen" situations like the one in the code sample above.
- Checking the state of invariants (things that aren't supposed to change).
Asserts help you run sanity checks on your code and make you aware of problems as soon as possible. The best part is that they can be turned off easily by calling python with the "-O" flag. Because they can be switched off, don't rely on assertions for checks that must always run, such as validating user input. Testing code thoroughly through unit tests should not be neglected either. I will cover unit testing in a future blog post.
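Here is a minimal sketch of a parameter check and an invariant check in one function. The function name split_pictures and the list of extensions are made up for illustration; they are not taken from my backup script.

```python
# Hypothetical list of extensions that count as pictures.
PICTURE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def split_pictures(filenames):
    """Split filenames into (pictures, others) based on their extension."""
    # Parameter check: fail early if the caller passes the wrong type.
    assert isinstance(filenames, list), "filenames must be a list"

    pictures = [n for n in filenames if n.lower().endswith(PICTURE_EXTENSIONS)]
    others = [n for n in filenames if not n.lower().endswith(PICTURE_EXTENSIONS)]

    # Invariant: every file ends up in exactly one of the two lists.
    assert len(pictures) + len(others) == len(filenames), "a file was lost or duplicated"
    return pictures, others


print(split_pictures(["a.JPG", "notes.txt", "b.png"]))
# (['a.JPG', 'b.png'], ['notes.txt'])
```

Calling split_pictures("a.jpg") with a plain string trips the first assert and raises an AssertionError with the message given, unless Python was started with the "-O" flag.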
Logging
print statements can be combined with dir() and placed in the source code to debug programs whenever they are run. In my opinion, the biggest problem with this approach is that once you're done debugging, you have to go through all your code to remove the print statements. Doing this introduces the risk that you might delete print statements that were used for non-logging purposes. The logging module makes it easy to create a record of what your program is doing. It also allows you to create log files that are written to whenever your program is running or when a certain condition occurs.
To use logging, first import the module and then configure it with logging.basicConfig().

    import logging

    # Configure the root logger
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s - %(levelname)s - %(message)s')


    def squares(foo):
        if foo is None:
            logging.warning("No value passed")
            return  # Nothing to iterate over
        for item in foo:
            if item < 0:
                logging.warning("A negative encountered: %r. Skipping", item)
                continue
            logging.info('Processing %r', item)
            print(item ** 2)

Running the code above in the interpreter will produce the following:

    >>> numbers = [-5, 0, 1, 2, 3, 4, 5]
    >>> squares(numbers)
    2017-02-02 12:47:46,185 - WARNING - A negative encountered: -5. Skipping
    2017-02-02 12:47:46,210 - INFO - Processing 0
    0
    2017-02-02 12:47:46,234 - INFO - Processing 1
    1
    2017-02-02 12:47:46,252 - INFO - Processing 2
    4
    2017-02-02 12:47:46,270 - INFO - Processing 3
    9
    2017-02-02 12:47:46,286 - INFO - Processing 4
    16
    2017-02-02 12:47:46,301 - INFO - Processing 5
    25

Log messages can be categorised by importance using log levels. There are five standard log levels:
- DEBUG: This is the lowest level, used to record small details.
- INFO: This is used to log general information on events occurring in your program.
- WARNING: Used to indicate a potential problem.
- ERROR: Record an error that causes a failure.
- CRITICAL: Record fatal errors. This is the highest log level.
In the example above, I used logging.info() for most messages and logging.warning() whenever a negative number is encountered.
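The level you configure acts as a threshold: messages below it are dropped. Here is a small sketch of that behaviour. The logger name "level_demo" is just a name I picked for this example, and I use a separate logger so the demo does not interfere with the squares example.

```python
import logging
import sys

# "level_demo" is a hypothetical name chosen for this sketch.
logger = logging.getLogger("level_demo")
logger.propagate = False  # keep these messages away from the root logger

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
logger.addHandler(handler)

# Only WARNING and above will be emitted.
logger.setLevel(logging.WARNING)

logger.debug("small detail")          # dropped
logger.info("general information")    # dropped
logger.warning("potential problem")   # WARNING - potential problem
logger.error("something failed")      # ERROR - something failed
logger.critical("fatal error")        # CRITICAL - fatal error
```

Changing the setLevel() call to logging.DEBUG makes all five messages appear.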
To disable logging completely, call logging.disable(logging.CRITICAL) before the logging calls you want to silence. See the example below:

    import logging

    # Configure the root logger
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s - %(levelname)s - %(message)s')
    # Disable all messages up to and including the CRITICAL level.
    logging.disable(logging.CRITICAL)


    def squares(foo):
        if foo is None:
            logging.warning("No value passed")
            return  # Nothing to iterate over
        for item in foo:
            if item < 0:
                logging.warning("A negative encountered: %r. Skipping", item)
                continue
            logging.info('Processing %r', item)
            print(item ** 2)

Running that will produce this:

    >>> squares(numbers)
    0
    1
    4
    9
    16
    25
    >>>

So, it's possible to log messages at any level you choose, or to disable logging altogether, using a single line of code. This is much better than commenting out print statements.
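I mentioned log files earlier, so here is a rough sketch of writing messages to a file instead of the screen. The logger name "file_demo", the file name backup.log and the messages are all invented for the example, and the file goes into a temporary folder so nothing important gets overwritten.

```python
import logging
import os
import tempfile

# Hypothetical log file, created inside a fresh temporary folder.
log_path = os.path.join(tempfile.mkdtemp(), "backup.log")

# "file_demo" is a made-up logger name for this sketch.
logger = logging.getLogger("file_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # don't also send messages to the root logger

file_handler = logging.FileHandler(log_path)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
logger.addHandler(file_handler)

logger.info("Backup started")
logger.warning("Destination folder is missing")

# Detach and close the handler so the file is flushed to disk.
logger.removeHandler(file_handler)
file_handler.close()

# Read the file back to see what was recorded.
with open(log_path) as log_file:
    contents = log_file.read()
print(contents)
```

For a simple script, passing a filename argument to logging.basicConfig() should achieve the same thing with less setup.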
The Python Debugger
I saved the best one for last. The Python Debugger, or pdb, allows you to interactively control the execution of your program one line at a time. It gives you the power to check and change the value of variables in real time, isolate parts of the program you want to test and more. This blog post here provides an excellent introduction to pdb.
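To give a feel for it, here is a small sketch of how a breakpoint can be dropped into a function with pdb.set_trace(). The function name squares_list and its debug flag are my own invention for this example; the flag is only there so the code runs normally unless you ask for the debugger.

```python
import pdb


def squares_list(numbers, debug=False):
    """Return the square of each number, optionally pausing in pdb."""
    results = []
    for item in numbers:
        if debug:
            # Execution stops here and pdb shows an interactive prompt.
            pdb.set_trace()
        results.append(item ** 2)
    return results


print(squares_list([1, 2, 3]))
# [1, 4, 9]

# To step through the loop interactively, call:
#     squares_list([1, 2, 3], debug=True)
```

At the pdb prompt, n runs the next line, s steps into a function call, p item prints the value of item, c continues until the next breakpoint and q quits the debugger.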
The End
That's all for now. There's lots I didn't mention, like exception handling, unit testing and using version control to keep your code free of bugs. I don't know enough about these topics right now to write about them, but I will do so as soon as I have something worth sharing. Please comment on my blog post and let me know if I missed anything or if any of the information here is incorrect.