Testing 101 (with pytest)

assert
- Why not use if conditions?
assert with messages
Red-green-refactor
- Red
- Green
- Refactor
Test runners
- pytest
- Fancier asserts
Separating test files
When to use what
Exercises
Learn more

So far, you’ve been writing your programs and running them over and over until they work, in a constant cycle of change-run-repeat.

Which works fine until:

You have hundreds of scenarios and inputs
The program takes too long to run
You accidentally break your program and don’t even realize it until it’s too late - trillions lost, rockets failing, people dying.
You get annoyed

Programmers are smart and lazy - a powerful combination - so they automated testing.

Testing is a complex topic with entire professions and teams dedicated to it, but even knowing the basics goes a long way. It may feel unnatural at first, but that will change with practice. Also, given how little testing many companies and open source projects actually do, just knowing this material will make you a much better programmer.

assert

The basis of testing is generally comparing what a value should be with what the program returns. One way we can do this is via assert.

>>> sum([1, 2]) # this is a builtin function
3
>>> assert sum([1, 2]) == 3
>>> # if there is no output, that means the calculation was correct
>>> assert sum([1, 2]) == 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

The idea of an assert is for the computer to check some conditions and complain if things aren’t right. Look at the code above and try to figure out what happened on your own.

assert <condition>

Try to think of assert like a special if statement. Just like if statements, you give asserts some condition that resolves to a boolean (true or false), and based on that, it acts differently. When the condition is true, it moves on, but if it’s false, it gives us an AssertionError.

Why not use if conditions?

(Actual student question)

I did mention above that these are like a particular type of if condition. So it is possible to get something similar with if conditions. The reasons not to do this are:

The syntax gets clunky - additional indentation and lots of ifs floating around.
The assert syntax lets you use test runners, which will show you helpful messages like the number of tests passed or failed. These messages help when the number of tests grows much larger.
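To make the comparison concrete, here is a small sketch of the same check written both ways. The function names check_with_if and check_with_assert are made up for this illustration.

```python
def check_with_if(value, expected):
    # The if-based version: an explicit condition plus a manual raise
    if value != expected:
        raise AssertionError("values differ")

def check_with_assert(value, expected):
    # The assert version: one line, same error when the condition is false
    assert value == expected, "values differ"

# Both calls pass silently because sum([1, 2]) really is 3
check_with_if(sum([1, 2]), 3)
check_with_assert(sum([1, 2]), 3)
```

Both versions complain in the same way when the values differ, but the assert version needs no extra indentation.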

assert with messages

Just giving a condition on its own is often not enough. When things go wrong, the program will just spit out AssertionError. And when we have multiple test cases, the vagueness of that statement can start becoming an issue. We can include a message in the assert statement to make it easier on ourselves.

assert <condition>, "this message is only shown when the assert fails."

An example with a more realistic file:

# sum_two_numbers.py
def sum(a, b):
    return a  # Deliberate bug: b is ignored, so the last assert fails

assert sum(0, 0) == 0, "0 + 0 should be 0"
assert sum(3, 0) == 3, "3 + 0 should be 3" 
assert sum(0, 5) == 5, "0 + 5 should be 5"
print("All tests passed")

When the program runs, the third assert fails and it displays AssertionError: 0 + 5 should be 5, so now we know which test case the program is going wrong on. That’s more helpful.
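The function in the example returns only its first argument, which is why the 0 + 5 case fails. A corrected version might look like the sketch below. The function is renamed to add and the file name is changed (both hypothetical) so that it does not hide Python's built-in sum.

```python
# add_two_numbers.py (hypothetical file name)
def add(a, b):
    # Use both arguments this time
    return a + b

assert add(0, 0) == 0, "0 + 0 should be 0"
assert add(3, 0) == 3, "3 + 0 should be 3"
assert add(0, 5) == 5, "0 + 5 should be 5"
print("All tests passed")
```

With the fix in place, none of the messages are shown and the final print runs.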

Red-green-refactor

Test-Driven Development (TDD) is a development pattern where you write the tests for the code before you write the code itself, then you keep working until your tests pass. TDD isn’t helpful in every single scenario, but it will often help you develop faster and better understand any bugs that come up.

This pattern is split up into:

Red - the code is not working. Some or all tests are failing. Many test programs show the actual color red.
Green - the code is now working, and all tests pass. Likewise, some test programs show green output.
Refactor - now that you are confident that your existing code works and that you have tests to ensure that it keeps working, you can add on improvements confidently.

This workflow is typical in many industry and open source projects, and this mindset can help with your own projects. The order doesn’t have to be strict: sometimes it is difficult to write tests before the code, while other times writing tests first is much easier.

Let’s walk through an example. This example will give you an outline of both general problem-solving and tests.

Suppose that I am tasked with writing a function that multiplies all the numbers in a list together. It should return None if the list is empty (similar to null or nil from other languages). Assume all numbers are integers.

Red

The first step of problem-solving is to understand the problem itself. Before we think about the answer, we need to understand what the results should be. This plays well with the TDD approach. For this, we can break up the problem into test cases, which are some inputs and conditions with their expected results, all verified by asserts.

For each requirement, we can derive a series of test cases:

Multiplies all numbers of the list
- A simple basic case - [1, 1, 1] should give 1
- Even giving one zero will give the final result as zero - [0, 4253, -342] should return 0
- The function should support negative numbers properly - [-1, 4, 20] should return -80
- Two negatives cancel each other out - [-1, -4, 20] should return 80
- Any list with one item should return itself - [42] should return 42
Returning None for an empty list: (these types of requirements are called edge cases - we can decide how to deal with unexpected inputs)
- The basic case - [] should give None

Notice how I haven’t written any code or even thought about the code yet. Even so, I now understand my problem better, and some approaches to solving it may have already popped into my head.

Time to put this in code.

# multiply_list.py
def multiply(list):
    return 0 # Placeholder

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None" 
    # Note: For checking None we use `is` over ==

test_multiply()
print("All tests passed")

Take a moment to look at how we turned our written test cases into code. Almost every point became its own assert (only the two-negatives case is not in the code yet). While this doesn’t guarantee that our program works, we know that it supports at least these cases. Generally, when you want to make sure a particular set of requirements is met, you can write specific test cases around them.

Now let’s run our program:

$ python3 multiply_list.py
Traceback (most recent call last):
  File "multiply_list.py", line 13, in <module>
    test_multiply()
  File "multiply_list.py", line 6, in test_multiply
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
AssertionError: Simple basic case should give 1

Now this gives us a starting point. We now know that the case with the input [1, 1, 1] isn’t giving us the expected output of 1. When you write good test cases, it becomes easier to look at a problem piece by piece, as the different tests break up a large problem into several smaller problems.

This stage is RED - not working, as our tests aren’t passing.

Green

Now let’s try to write the multiply(list) function for the normal cases:

def multiply(list):
    product = 1
    for i in range(len(list)):
        product = product * list[i]

    return product

When we run the test cases:

Traceback (most recent call last):
  File "multiply_list.py", line 20, in <module>
    test_multiply()
  File "multiply_list.py", line 17, in test_multiply
    assert multiply([]) is None, "Empty list should give None"
AssertionError: Empty list should give None

Since a different test is failing now, we know that we’ve passed the previous one. Now, let’s fix the empty list case:

def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for i in range(len(list)):
        product = product * list[i]

    return product

The output of test cases:

All tests passed

This stage is GREEN - working - tests passing.

Refactor

Now, what if we want to change it? Maybe we want to improve the code quality or rearrange some things. We make whatever changes or additions we need and run the tests again after each one. Rerunning the tests during this refactor step lets us ensure that we don’t break anything.

For example, let’s say I decide to loop over the items of the list directly instead of using range() and indexes.

def multiply(list):
    if len(list) == 0:
        return None

    product = 1
    for item in list:
        product = product * item

    return product

When I run this code, I get:

All tests passed

This means I know the code that I wrote works and didn’t break anything. Because I took the extra time to write tests at the beginning, I can spend less time testing each case. I can work faster this way.
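The same safety net covers bigger refactors too. As a sketch of another possible refactor (not one the walkthrough above performs), the loop could be replaced by math.prod from the standard library, which is available from Python 3.8 onward. The parameter is renamed to numbers here so that it does not hide the built-in list.

```python
import math

# multiply_list.py, refactored once more (sketch)
def multiply(numbers):
    if len(numbers) == 0:
        return None

    # math.prod multiplies all the items together (Python 3.8+)
    return math.prod(numbers)

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None"

test_multiply()
print("All tests passed")
```

The test function is unchanged, which is the point: only the implementation moved, and the tests confirm that the behavior stayed the same.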

Testing alongside programming isn’t possible all the time, but this type of development often sets apart the great developers from the good ones.

Test runners

Raw asserts are nice, but they’re also quite limited. Instead of running asserts by hand, we can use test runners.

Some nice features of test runners:

Supporting many failures in one run - with raw asserts, the failure of one case usually crashes the program
Counting failed and passed tests
Easier syntax for more complex features.
Actual red and green color output for visual cues.

In practice, you will use one of these, either one you set up yourself or one that comes built-in with something like Django or Flask.

I have picked pytest for its simplicity, but many others exist. The way I’ll approach it is pretty indicative of how testing is done in general, and you can follow the same approach with other test runners and programming languages.

pytest

$ pip install pytest

Interestingly, it supports the multiply_list.py file we wrote earlier as is, without changing anything. It works by looking for any function whose name starts with test, like ours:

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"
    assert multiply([42]) == 42, "Single number should return itself"
    assert multiply([]) is None, "Empty list should give None"
    # Note: For checking None we use `is` over ==

And it runs like so.

$ pytest multiply_list.py
======================================= test session starts =================================
platform linux -- Python 3.6.7, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
rootdir: /home/harsh183/Experiments/saloni-teaching
collected 1 item

multiply_list.py .                                                                   [100%]

=====================================1 passed in 0.02 seconds ===============================

The last line is shown in color (green here). Try running it yourself to see.

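Even better, pytest captures your program’s normal output, so the print() calls in your program don’t interfere with the testing output. Note that pytest still imports the file to find the tests, so top-level code does run, and a top-level input() call would fail because pytest does not pass keyboard input through by default. To demonstrate the output capturing, I added this to the end of the program: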

print("pytest actually even ignores all the normal print output")
print("so you can blend in your normal program and tests in the same file")

This only appears when I run the program normally with $ python3 multiply_list.py. Output:

pytest actually even ignores all the normal print output
so you can blend in your normal program and tests in the same file

This means that testing doesn’t have to interrupt your workflow or the structure of your program at all. All you have to provide is a test function, and pytest does the rest.
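One detail worth knowing: pytest imports the file to find the test functions, so any top-level statements are executed during that import. A common Python idiom for keeping the "normal program" part out of the test run is the __name__ check sketched below. The main function and its contents are hypothetical, and the parameter is named numbers so that it does not hide the built-in list.

```python
# multiply_list.py (one possible layout, not the only one)
def multiply(numbers):
    if len(numbers) == 0:
        return None

    product = 1
    for item in numbers:
        product = product * item

    return product

def test_multiply():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"
    assert multiply([]) is None, "Empty list should give None"

def main():
    # Hypothetical "normal program" part
    print("2 * 3 * 4 =", multiply([2, 3, 4]))

# True when run as python3 multiply_list.py, False when pytest imports the file
if __name__ == "__main__":
    main()
```

With this layout, python3 multiply_list.py runs main(), while pytest multiply_list.py only defines the functions and then calls test_multiply.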

In summary, all we need to do for pytest is create a test function starting with test, e.g., test_multiply, test_solution, and pytest will automatically find it.

We can still use regular asserts to interact with pytest.

When we run the tests with pytest, we run it like $ pytest program_name.py.
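A single test function stops at its first failing assert, and pytest counts it as one test. To get a separate pass or fail for every case, each case can live in its own test function. The sketch below does that for the multiply example, including the two-negatives case from the written list of test cases. The individual function names are made up for this illustration.

```python
# multiply_list.py with one test function per case (sketch)
def multiply(numbers):
    if len(numbers) == 0:
        return None

    product = 1
    for item in numbers:
        product = product * item

    return product

def test_basic_case():
    assert multiply([1, 1, 1]) == 1, "Simple basic case should give 1"

def test_zero():
    assert multiply([0, 4253, -342]) == 0, "Inclusion of zeros should give zero"

def test_negative_numbers():
    assert multiply([-1, 4, 20]) == -80, "Negative numbers should work"

def test_two_negatives():
    assert multiply([-1, -4, 20]) == 80, "Two negatives should cancel out"

def test_single_number():
    assert multiply([42]) == 42, "Single number should return itself"

def test_empty_list():
    assert multiply([]) is None, "Empty list should give None"
```

Running pytest on this file should report six collected tests instead of one, and a bug that breaks only one case shows up as exactly one failure.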

Fancier asserts

You might also encounter other types of asserts, like assertEqual, assertFalse, assertIn, etc. These just have easier syntax in some cases (often, programmers will use the term “syntax sugar”). You can just use simple asserts for the most part, but you can use others if you’re comfortable with them. These particular names come from Python’s built-in unittest module; other runners offer their own, so what you can use will vary. assert should be there in pretty much everything, so we’ve only covered that.
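For reference, here is a minimal sketch of what those method-style asserts look like in the standard library's unittest module. The class and method names are made up for this illustration, and the built-in sum is used so that the example stands on its own.

```python
import unittest

class TestBuiltinSum(unittest.TestCase):
    def test_sum(self):
        # Same idea as: assert sum([1, 2]) == 3
        self.assertEqual(sum([1, 2]), 3)

    def test_membership(self):
        # Same idea as: assert 3 in [1, 2, 3]
        self.assertIn(3, [1, 2, 3])

    def test_wrong_total(self):
        # Same idea as: assert not (sum([1, 2]) == 4)
        self.assertFalse(sum([1, 2]) == 4)

# Collect the three tests above and run them with unittest's own runner
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBuiltinSum)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

pytest can also run test classes written this way, so the two styles can live in the same project.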

Separating test files

So far, we’ve been writing all our code in one file. This is okay for small programs and exercises, but typically, the tests are kept in separate files from the program. This is relatively easy to do, and some frameworks will even do it for you. If you have the time, I highly suggest doing this for anything that’s more than 10-15 lines of code.

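Usually, the testing file will be called either test.py or tests.py. If there are many tests, then there might be multiple files in a folder called test/ or tests/. With pytest, files named test_*.py or *_test.py are found automatically when you run pytest without a file name, so a name like test_multiply_list.py is a safe choice.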

See this tutorial for a basic overview of imports.
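As a rough sketch of the split, the block below shows the contents of two files one after the other. The test file name test_multiply_list.py is an assumption, and the import line is left as a comment only so that the sketch also runs as a single file.

```python
# ---- multiply_list.py ----
def multiply(numbers):
    if len(numbers) == 0:
        return None

    product = 1
    for item in numbers:
        product = product * item

    return product

# ---- test_multiply_list.py (hypothetical name, same folder) ----
# from multiply_list import multiply  # needed once the files are really split

def test_single_number():
    assert multiply([42]) == 42, "Single number should return itself"

def test_empty_list():
    assert multiply([]) is None, "Empty list should give None"
```

Once the files are split and the import line is uncommented, the tests can be run with pytest test_multiply_list.py.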

When to use what

The rule of thumb is to use a test runner for almost every case since it’s typically neater and better for larger projects and working with teams. Sometimes when it’s just scripts or small hacky experiments, you can just use assert statements within the same file. For sanity checks, that’s fine as long as you remove them later. That said, adding tests can go a long way, and your future self will thank you.

Exercises

One more round of red-green-refactor to give you more practice with the workflow.

Find the mode in a list of integers.

The mode is the item that occurs the most in a given list. For example, in [1, 1, 2, 3], the mode is 1.
For an empty list, I want you to return None

Use pytest and the same red-green workflow I showed earlier in this unit. It should look very similar to the structure I used. In general, this is a good approach for problem-solving. The more you practice, the more you’ll improve!

Learn more

A great and much more extensive guide on Python testing. Highly recommended
Given-when-then - How to approach writing tests in the form: given X, when Y happens, Z should happen. Highly recommended
How not to test - Dangers of overtesting and understanding how testing everything isn’t the best approach
Why most unit testing is waste
Another good list of resources
