- assert with messages
- Test runners
- Separating test files
- When to use what
- Learn more
So far, you’ve been writing your programs and running them over and over until it works in a constant cycle of change-run-repeat.
Which works fine until:
- You have hundreds of scenarios and inputs
- The program takes too long to run
- You accidentally break your program and don’t even realize it until it’s too late - trillions lost, rockets failing, people dying.
- You get annoyed
Programmers are smart and lazy - a powerful combination - so they automated testing.
Testing is a complex topic with entire professions and teams dedicated to it, but even knowing the basics goes quite far. It may feel unnatural at first, but as time goes, that will change. Also, considering the meager standards of testing companies and open source projects, just knowing this information will immediately make you a much better programmer.
The basis of testing is generally comparing what a value should be with what the program returns. One way we can do this is via
>>> sum([1, 2]) # this is a builtin function 3 >>> assert sum([1, 2]) == 3 >>> # if there is no output, that means the calculation was correct >>> assert sum([1, 2]) == 4 Traceback (most recent call last): File "<stdin>", line 1, in <module> AssertionError
The idea of an
assert is for the computer to check some conditions and complain if things aren’t right. Look at the code above and try to figure out what happened on your own.
Try to think of
assert like a special if statement. Just like if statements, you give
asserts some condition that resolves to a boolean (true or false), and based on that, it acts differently. When the condition is
true, it moves on, but if it’s false, it gives us an
Why not use if conditions?
(Actual student question)
I did mention above that these are like a particular type of if condition. So it is possible to get something similar with if conditions. The reasons not to do this are:
The syntax gets clunky - additional indentation and lots of ifs floating around.
assertsyntax lets you use test runners, which will show you helpful messages like the number of tests passed or failed. These messages help when the number of tests gets much more considerable.
assert with messages
Just giving a condition on its own is often not enough. When things go wrong, the program will just spit out
AssertionError. And when we have multiple test cases, the vagueness of that statement can start becoming an issue. We can include a message in the
assert statement to make it easier on ourselves.
assert <condition>, "this message is only shown when the assert fails."
An example with a more realistic file:
# sum_two_numbers.py def sum(a, b): return a assert sum(0, 0) == 0, "0 + 0 should be 0" assert sum(3, 0) == 3, "3 + 0 should be 3" assert sum(0, 5) == 5, "0 + 5 should be 5" print("All tests passed")
And when the program fails, it displays
AssertionError: 0 + 5 should be 5, so now we know which test case the program is going wrong on.
That’s more helpful.
Test-Driven Development (TDD) is a development pattern where you write the tests for the code before you write the code itself, then you keep working until your tests pass. TDD isn’t helpful in every single scenario, but it will often help you develop faster and understand any bugs that come up better.
This pattern is split up into:
Red - the code is not working. Some or all tests are failing. Many test programs show the actual color red.
Green - the code is now working, and all tests pass. Likewise, some test programs show green output.
Refactor - now that you are confident that your existing code works and that you have tests to ensure that it keeps working, you can add on improvements confidently.
This workflow is typical with many industry and open source projects, and this mindset can help with your projects. The order doesn’t have to be too strict since it may be difficult to write tests before code, but other times, writing tests first will be much easier.
Let’s walk through an example. This example will give you an outline of both general problem-solving and tests.
Suppose that I am tasked with writing a function that multiplies all the numbers in a list together. It should return
None if the list is empty (similar to
nil from other languages). Assume all numbers are integers.
The first step of problem-solving is to understand the problem itself.
Before we think about the answer, we need to understand what the results should be.
This plays well with the TDD approach. For this, we can break up the problem
into test cases, which are some inputs and conditions with their expected results, all verified by
For each requirement, we can derive a series of test cases:
- Multiplies all numbers of the list
- A simple basic case -
[1, 1, 1]should give
- Even giving one zero will give the final result as zero -
[0, 4253, -342]should return
- the function should support negative numbers properly -
[-1, 4, 20]should return
- Two negatives cancel each other out -
[-1, -4, 20]should return
- Any list with one item should return itself -
- A simple basic case -
None: (these types of requirements are called edge cases - we can decide how to deal with unexpected results)
- The basic case -
- The basic case -
Notice how I haven’t written any code or even thought about the code yet. Even then, I understood my problem better, and some approaches on how I can solve the problem might have even popped up in my head by this point.
Time to put this in code.
# multiply_list.py def multiply(list): return 0 # Placeholder def test_multiply(): assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1" assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero" assert multiply([-1, 4, 20]) == -80, "Negative numbers should work" assert multiply() == 42, "Single number should return itself" assert multiply() is None, "Empty list should give None" # Note: For checking None we use `is` over == test_multiply() print("All tests passed")
Take a moment to look at how we took our written test cases into the code. Each different point became its own
assert. While this doesn’t guarantee that our program works, we can know that it supports at least these cases. Generally, when you want to make sure a particular set of requirements are fit, you can write specific test cases around those.
Now let’s run our program:
$ python3 multiply_list.py Traceback (most recent call last): File "multiply_list.py", line 13, in <module> test_multiply() File "multiply_list.py", line 6, in test_multiply assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1" AssertionError: Simple basic case should give 1
Now this gives us a starting point. We now know that the case with the input
[1, 1, 1] isn’t giving us the expected output of
1. When you write good test cases, it becomes easier to look at a problem piece by piece, as the different tests break up a large problem into multiple more minor problems.
This stage is RED - not working, as our tests aren’t passing.
Now let’s try to write the
multiply(list) function for the normal cases:
def multiply(list): product = 1 for i in range(len(list)): product = product * list[i] return product
When we run the test cases:
Traceback (most recent call last): File "multiply_list.py", line 20, in <module> test_multiply() File "multiply_list.py", line 17, in test_multiply assert multiply() is None, "Empty list should give None" AssertionError: Empty list should give None
Since a different test is failing now, we know that we’ve passed the previous one. Now, let’s fix the empty list case:
def multiply(list): if len(list) == 0: return None product = 1 for i in range(len(list)): product = product * list[i] return product
The output of test cases:
All tests passed
This stage is GREEN - working - tests passing.
Now, what if we want to change it. Maybe we want to improve the code quality or change around some stuff. Then, we need to add or change whatever we need features and run the test code again and again. This refactor step will let us ensure that we don’t break anything.
For example, let’s say I decide to use a for loop to traverse the list instead of using
range() and indexes.
def multiply(list): if len(list) == 0: return None product = 1 for item in list: product = product * item return product
When I run this code, I get
All tests passed
This means I know the code that I wrote works and didn’t break anything. Because I took the extra time to write tests at the beginning, I can spend less time testing each case. I can work faster this way.
Testing alongside programming isn’t possible all the time, but this type of development often sets apart the great developers from the good ones.
asserts are nice, but they’re also quite limited. Instead of
asserts, we can use test runners.
Some nice features of test runners:
- Supporting many Empty fails, usually the failure of one case crashes the program
- Counting of fail and pass tests
- Easier syntax for more complex features.
- Actual red and green color output for visual cues.
In practice, you will use one of these, either one you set up yourself or one that comes built-in with something like
I have picked pytest for its simplicity, but many others exist. The way I’ll approach it is pretty indicative of how testing is done in general, and you can follow the same approach with other test runners and programming languages.
$ pip install pytest
Interestingly it supports the
multiply_list.py file we wrote earlier as is
without changing anything. How it works is that it looks for any function that
test, like ours:
def test_multiply(): assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1" assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero" assert multiply([-1, 4, 20]) == -80, "Negative numbers should work" assert multiply() == 42, "Single number should return itself" assert multiply() is None, "Empty list should give None" # Note: For checking None we use `is` over ==
And it runs like so.
$ pytest multiply_list.py ======================================= test session starts ================================= platform linux -- Python 3.6.7, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 rootdir: /home/harsh183/Experiments/saloni-teaching collected 1 item multiply_list.py . [100%] =====================================1 passed in 0.02 seconds ===============================
The last line is the color output (green here). Try running it yourself to see.
Even better, it doesn’t run your main program itself, which means that you can
keep all your
input() functions in your program without them interfering with your testing output. To demonstrate this, I added this to the end of the program:
print("pytest actually even ignores all the normal print output") print("so you can blend in your normal program and tests in the same file")
This only appears when I run the program normally with
$ python3 multiply_list.py. Output:
pytest actually even ignores all the normal print output so you can blend in your normal program and tests in the same file
This means that testing doesn’t have to interrupt your workflow or the structure of your program at all. All you have to provide is a test function, and pytest does the rest.
In summary, all we need to do for pytest is create a test function starting with
test_solution, and pytest will automatically find it.
We can still use regular
asserts to interact with pytest.
When we run the tests with pytest, we run it like
$ pytest program_name.py.
You might also encounter other types of
assertIn, etc. These just have easier syntax in some cases (often, programmers will use the term “syntax sugar”). You can just use simple
asserts for the most part, but you can use others if you’re comfortable with them. Frequently, they are parts of different runners, so it will vary what you can use.
assert should be there in pretty much everything, so we’ve only covered that.
Separating test files
So far, we’ve been writing all our code in one file. This is okay for small programs and exercises, but typically, a program is kept in separate files. This is relatively easy to do, and some frameworks will even do it for you. If you have the time, I highly suggest doing this for anything that’s more than 10-15 lines of code.
Usually, the testing file will be called either
tests.py. If there are many tests, then there might be multiple files in a folder called
See this tutorial for a basic overview of imports.
When to use what
The rule of thumb is to use a test runner for almost every case since it’s typically neater and better for larger projects and working with teams. Sometimes when it’s just scripts or small hacky experiments, you can just use
assert statements within the same file. For sanity checks, that’s fine
as long as you remove them later. That said, adding tests can go a long way, and your future self will thank you.
One more round of red-green-refactor to give you more practice with the workflow.
Find the mode in a list of integers.
- The mode is the item that occurs the most in a given list. For example, in
[1, 1, 2, 3], the mode is
- For an empty list, I want you to return
pytest and the same red-green workflow I showed earlier in this unit. It should look very similar to the structure I used. In general, this is a good approach for problem-solving. The more you practice, the more you’ll improve!
Great and quite more extensive guide on Python testing Highly recommended
Given-when-then - How to approach writing tests in a given X when Y happens Z should happen. Highly recommended
How not to test - Dangers of overtesting and understanding how testing everything isn’t the best approach