Software contribution and development guide

Testing

As the software grows, it becomes increasingly hard to check manually that changes did not accidentally alter the behaviour of the code. You want to be able to make changes to your code fearlessly, at any time, without causing problems for your users.

Getting started with pytest

Follow the package structure discussed in detail at making your package installable, specifically that of a separate tests directory. This primer assumes you have read that and set up a virtual environment, a canary test, a pytest.ini and tests/conftest.py.

# activate your virtual environment
source bin/activate
pip install pytest
# install your own package in editable mode
pip install -e .
# pytest picks up default arguments from pytest.ini
pytest
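A minimal pytest.ini could look like the sketch below; the values are assumptions to adjust to your own layout, not requirements.

```ini
[pytest]
; directory to collect tests from (assumed layout with a separate tests/ dir)
testpaths = tests
; default command line arguments, e.g. quiet output
addopts = -q
```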

A good practice is to integrate pytest with your editor so the tests can be run with a keystroke; running the tests often (ideally all the time) helps you catch errors early. If your editor supports running the tests on each save (e.g. PyCharm), consider activating that feature. This also encourages you to keep your tests fast, as nobody wants to sit around waiting for tests to run.

Naming tests

Test names should show intent: not what we do or how we do it (that should be clear from the implementation), but why we do it. The whole code base should read like a news article, answering the questions

  1. what
  2. where
  3. when
  4. who
  5. why
  6. how

Some of these are answered in the implementation, others in the tests, and in the worst case only in the documentation.

A test name (the test function name) should complement the implementation and the implementation of the test itself. By explaining the intent, it becomes possible to triangulate where an error is. If the test implementation and the test name agree, the error is probably in the implementation. If the implementation and the test name agree, the error is probably in the way we test it, and so on.
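To make the contrast concrete, here is a hedged sketch: clamp_zenith and both test names are hypothetical, invented only to show a mechanical name next to an intent-revealing one.

```python
import math

# Hypothetical function under test (illustration only): clamp a zenith
# angle so it never exceeds the horizon at pi/2 radians.
def clamp_zenith(ze):
    return min(ze, math.pi / 2)

# Mechanical name: says what the assertion does, not why it matters.
def test_clamp_returns_pi_over_2_for_large_input():
    assert clamp_zenith(2.0) == math.pi / 2

# Intent-revealing name: states the rule the test protects, so a failure
# immediately tells you which behaviour was broken.
def test_zenith_angles_beyond_the_horizon_are_clamped_to_the_horizon():
    assert clamp_zenith(2.0) == math.pi / 2
```

Both tests assert the same thing; only the second one lets a reader triangulate an error from the name alone.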

Tests as feature requests

When you need a library to have and continue to support a critical feature, provide a regression test (see below) for the library developers to implement and keep green. This way, it is less risky for you to update the library version, and it is easier for the library developers to know how their software is actually used.

Organising

Organise your tests by the concepts your software addresses rather than by implementation details, such as strictly testing function by function. This makes it easier to understand what your code is supposed to do, rather than how it does it.

Specification by example

Sometimes it is easier to explain a concept by writing examples of input versus output. You might not yet be able to name or phrase the generic rule, but you can give enough examples of input versus expected output. This is good. Write parametrised tests to verify each example without repeating yourself by writing a new test for every example. NumPy arrays, masks and assert_allclose are your friends.

import numpy as np
import numpy.testing as npt
from numpy import pi

# Camera comes from the package under test
def test_stars_with_ze_45_deg_or_less_are_within_90_deg_pinhole_fov_towards_zenith():
    cam = Camera()
    cam.verticalFocalLength = .5
    cam.horizontalFocalLength = .5
    ze = np.array([
        0,
        0.0001,
        # side note: check where your edge cases are
        pi / 4 - .000001,
        pi / 4,
        pi / 4 + .000001,
        pi / 2,
        2])
    mask = [1, 1, 1, 1, 0, 0, 0]
    az = np.zeros(ze.shape)
    npt.assert_array_equal(cam.within_field_of_view(az, ze), mask)
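The same examples can also be expressed with pytest's parametrize decorator, which reports each case as a separate test. The sketch below replaces the Camera with a hypothetical within_pinhole_fov helper so it stands alone; the boundary rule (ze at most pi/4) is the one from the example above.

```python
import pytest
from numpy import pi

# Hypothetical stand-in for the Camera field-of-view check: a 90-degree
# pinhole pointed at zenith accepts zenith angles up to pi / 4.
def within_pinhole_fov(ze):
    return ze <= pi / 4

@pytest.mark.parametrize("ze, expected", [
    (0, True),
    (0.0001, True),
    (pi / 4 - 1e-6, True),
    (pi / 4, True),           # edge case: exactly on the boundary
    (pi / 4 + 1e-6, False),
    (pi / 2, False),
    (2, False),
])
def test_star_visibility_per_zenith_angle(ze, expected):
    assert within_pinhole_fov(ze) == expected
```

Each tuple becomes its own test, so a failing boundary case is reported by its parameters instead of hiding inside one array comparison.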

Yield to reduce repetition

Parametrised tests are useful for reducing repetition. When the assertions are the same or similar in all tests and only the setup differs, the assertions can be moved into the fixture by letting the fixture yield, as below.

import numpy as np
import numpy.testing as npt
import pytest

# Camera, pinhole and inv_pinhole come from the package under test
class TestProjectionTransferEdgeCases:
    @pytest.fixture(autouse=True)
    def transfer_and_its_inverse_with_edge_cases(self):
        # code to run before each test
        self.camera = Camera()
        shape = (1024, 1024)
        self.shape = shape
        # pixel coordinates tracing the edges of the image
        edges = [
            np.full((shape[0],), 0, dtype=np.int64),
            np.arange(shape[1], dtype=np.int64),
            np.full((shape[0],), shape[1] - 1, dtype=np.int64),
            np.arange(shape[1], dtype=np.int64)[::-1],
        ]
        self.u = np.hstack(edges)
        self.v = np.hstack(edges)

        # each test function runs here, at the yield
        yield

        # common assertions
        npt.assert_allclose(self.u_prime, self.u, atol=1e-12)
        npt.assert_allclose(self.v_prime, self.v, atol=1e-12)


    def test_pinhole_model(self):
        self.camera.optical_transfer = pinhole
        self.camera.inverse_optical_transfer = inv_pinhole
        az, ze = self.camera.inv_model(self.u, self.v, self.shape)
        self.u_prime, self.v_prime = self.camera.model(az, ze, self.shape)

    # other tests like the one above, with similar setup

Regression testing

Tests that prevent accidental changes are called regression tests and usually sit at a fairly high level in the code. They should test concepts that are important in your software, a “use case” if you will, or a “feature request” that somebody posted; something that should still work between released versions. Value these tests highly; treat them as first-class members of your product, as important as the implementation code.

They are also a vital part of the documentation of your software: they are the best proof that your code does what it says on the tin, as well as verified usage examples for you and your user base.

Write the tests against functions close to where the user would interact with your library/software, i.e. against primarily high level functions.

If you have a proven working piece of software, taking note of its inputs and outputs and turning them into tests can be useful for refactoring, even if you do not know the precise inner workings. It means you can still make changes and have a “truth” baseline to compare to between changes.
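One way to build such a “truth” baseline is a golden-master test: record the outputs of the proven implementation once, then assert against those recorded values after every change. A minimal sketch, where legacy_analysis and the recorded baseline are hypothetical:

```python
import numpy as np
import numpy.testing as npt

# Hypothetical legacy function we want to refactor safely: a running mean.
def legacy_analysis(signal):
    return np.cumsum(signal) / (1 + np.arange(len(signal)))

def test_refactored_analysis_matches_recorded_legacy_output():
    signal = np.array([1.0, 2.0, 3.0, 4.0])
    # Baseline recorded from the proven implementation before refactoring.
    baseline = np.array([1.0, 1.5, 2.0, 2.5])
    npt.assert_allclose(legacy_analysis(signal), baseline, atol=1e-12)
```

As long as this test stays green, internal rewrites of legacy_analysis are safe with respect to the recorded behaviour, even if its inner workings were never fully understood.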

# a method of a test class whose fixture provides self.camera;
# allsky2 comes from the package under test
def test_calculates_sky_object_az_ze_to_relative_positions_in_image_using_optical_transfer_function(self):
    self.camera.optical_transfer = allsky2
    angles = np.array([
        10 / 180 * pi,
        20 / 180 * pi,
        # side note: zero is great input in many test cases
        0,
        5 / 180 * pi])
    az = angles
    ze = angles

    u, v = self.camera.project_directions(az, ze)

    # expected values taken from the matlab/scilab implementation
    npt.assert_allclose(u, np.array(
        [0.491814, 0.488797, 0.472377, 0.468228]), atol=1e-6)
    npt.assert_allclose(v, np.array(
        [0.505829, 0.501863, 0.475483, 0.432745]), atol=1e-6)

Test-driven design (or development) (TDD)

Tests can help you design your software to be testable. If you end up having problems writing or coming up with an automated test, it might be a sign that the design does not allow for testing. By writing the test first, it forces you to think about how to interact with the implementation in an easy way. Your test is the first actual user of the code/function. If it is hard to write the first test, consider changing the function signature or abstraction level to make it easier to test.

By writing the test first, you avoid a lot of false positives, where the test you wrote doesn’t actually test what you think it does. A test that starts out passing is hard to verify. Try changing the implementation somewhat and verify that your test fails for the right reasons.

Pay attention to the code smell of needing a lot of setup code to run simple tests: such tests are no longer simple. Aim to write tests with tiny input and tiny output to verify. This is easier in a functional programming paradigm than in an object-oriented one.

Thought process:

  1. “I wish there was a function that did X in a way that is easy to interact with and verify”.
  2. Create do_X(simple_argument, possibly_language_primitive) # return float.
  3. Write a test that passes some simple arguments.
  4. Implement that (and only that) as simply as you can. Do not write features that don’t have tests.
  5. Refactor!
  6. Repeat from (3).
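Applied to the hypothetical do_X above, one turn of the cycle might look like this (what X actually does here, doubling to a float, is an arbitrary assumption for illustration):

```python
# Step 3: the test comes first and defines the desired interaction.
def test_do_X_scales_a_plain_number_to_a_float():
    assert do_X(2) == 4.0

# Step 4: the simplest implementation that makes the test pass;
# nothing more than the test demands.
def do_X(simple_argument):
    return float(simple_argument * 2)
```

From here you refactor (step 5) with the test as a safety net, then write the next test (back to step 3).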

Glossary

Unit test
Unit tests are (usually) automated tests written to ensure that the smallest units of the code meet their design and behave as intended. There can be multiple tests for the same unit. E.g. one test for a Cartesian-to-spherical-coordinates function could check that it returns a zero azimuth angle for a vector parallel to the z-axis.
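That unit test could look like the sketch below; cartesian_to_spherical is a hypothetical signature, assumed to return azimuth and zenith in radians.

```python
import math

# Hypothetical implementation: azimuth measured in the x-y plane,
# zenith measured from the positive z-axis.
def cartesian_to_spherical(x, y, z):
    azimuth = math.atan2(y, x)
    zenith = math.atan2(math.hypot(x, y), z)
    return azimuth, zenith

def test_vector_parallel_to_z_axis_has_zero_azimuth():
    azimuth, _ = cartesian_to_spherical(0.0, 0.0, 1.0)
    assert azimuth == 0.0
```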
Functional test
A large “black box” test of the entire code base or a large part of it. Such a test usually involves tens, if not hundreds, of different functions but only checks a single test case. E.g. a functional test for meteor analysis software could be: given a small test set of raw voltage radar data, can the analysis software find the meteor in the raw data, and is it in the correct location? This test does not check whether, e.g., the coordinate transformations do what they are intended to. A coordinate transformation that rotates clockwise instead of counterclockwise would put the meteor location in the wrong place, but the functional test would not indicate where the error is. A suite of unit tests would. However, unit tests cannot discover whether the analysis is assembled correctly, hence both are useful.