Software contribution and development guide

Documentation

If someone asks you a question about the code or opens an issue because they can’t understand how it works: update and improve the documentation while you answer the question or resolve the issue. This makes the documentation self-correct as a function of how well-used it is.

Avoiding comments

Code is like humor. When you have to explain it, it’s bad.

– Cory House

Comments tend to drift away from the code they describe. When code is changed or moved, comments often stay behind and only add to the reader's confusion. Instead of comments, use clear variable names and functions, as in the examples below.

Renaming variables

Renaming variables to be self-explanatory removes the need for a comment. If this results in verbose function calls or illegible math expressions, rename them again locally in the function:

# there could have been a #horizontal focal length comment here...
fx = self.horizontal_focal_length
fy = self.vertical_focal_length
alpha = self.pinhole_deviation
u, v = self.optical_transfer(phi, theta, fx, fy, alpha)

Extract functions

Write functions such that inline comments can be avoided. If you find yourself putting whitespace around a few lines of code and writing a comment above as a reminder of why they belong together, you need a function instead:

# convert to spherical coordinates
e_x = sin(phi) * sin(theta)
e_y = cos(phi) * sin(theta)
e_z = cos(theta)

This allows the higher levels of code to read with much more intent than mathematical detail, even if the math itself is “simple”.

e_x, e_y, e_z = to_spherical(phi, theta)
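
A minimal sketch of what the extracted function could look like. It assumes the standard-library math module supplies sin and cos (the original snippet does not show its imports) and simply wraps the three lines from above:

```python
from math import cos, sin


def to_spherical(phi, theta):
    '''Return the components (e_x, e_y, e_z) of the unit vector given by
    the angles `phi` and `theta`, both in radians.
    '''
    # Same three expressions as the inline version, now behind one name
    e_x = sin(phi) * sin(theta)
    e_y = cos(phi) * sin(theta)
    e_z = cos(theta)
    return e_x, e_y, e_z
```

The comment that used to sit above the three lines is now carried by the function name and its docstring, so it moves with the code when the code moves.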

Use version control

Instead of writing a comment on why a particular thing was changed, note the “why” in the version control commit message. This allows the reader to see the change in context with an explanation, while keeping the current solution clean.

# use of matlib.repmat is discouraged according to https://numpy.org/doc/stable/user/numpy-for-matlab-users.html
# R2 = repmat(R2, 3, 1)
R2 = tile(R2, (3, 1)) 

Commenting out code like this is also discouraged. Put the explanation in the version control commit message instead:

git commit -m "refactor: use tile() instead of repmat()" -m "ref: https://numpy.org/doc/stable/user/numpy-for-matlab-users.html"

Then use git blame, or the built-in history/blame functionality in your IDE, editor or GitLab, whenever you question why a particular change was made.

Python

The general guideline is that a function or class shall be completely usable for its purpose from only its docstring, i.e. the multi-line comment at the start of the function. This approach also gives a good benchmark for function granularity (scope of function).

Use NumPy-style docstrings, as they are more human-readable than the standard Sphinx-style docstrings. A good docstring should consist of four parts:

Below is an example of a NumPy-style docstring for a Python function. This documentation style relies on the numpydoc package available on PyPI.

def generalized_match_function(parameters, transmitted_sig, received_sig, radar):
    '''Calculates the generalized match function by shifting the transmitted
    waveform according to a given phase offset model, dependent on the input
    parameters in `parameters`, and convolving with the received waveform [1]_.

    Parameters
    ----------
    parameters : numpy.ndarray or list
        Phase offset model parameters, must contain three elements.
    transmitted_sig : numpy.ndarray
        The transmitted waveform.
    received_sig : numpy.ndarray
        The received waveform.
    radar : this_package.Radar
        Radar instance containing the radar specific information.

    Returns
    -------
    float
        Negative of the generalized match function absolute value.

    References
    ----------
    .. [1] Markkanen, J., Lehtinen, M., Landgraf, M., 2005.
        Real-time space debris monitoring with EISCAT,
        https://doi.org/10.1016/j.asr.2005.03.038
    '''

Remember the whitespace around ` : `; otherwise the parameter list will not render correctly.

When listing arguments as above, one can either put a blank line between each argument or not (as is done above). This is a tradeoff between more compact documentation and a more readable list. With blank lines, a function with ten input arguments might not even fit on a screen while not being much more readable at all. On the other hand, a function with only four arguments, each having a long description, might be much more readable while sacrificing only three extra lines. Hence, this is decided on a case-by-case basis with the focus on readability.

When specifying additional information about the input arguments, such as the shape or ndim of numpy ndarrays, it is recommended to document it first in the argument description, inside code-quotes. This makes it unambiguous which argument configurations the code expects. One can also use variables inside these definitions to denote free parameters. In the example below, a shape=(3,1) array and a shape=(3,n) array are expected, and the fact that n can be any integer is implicit.

    Parameters
    ----------
    camera_position : numpy.ndarray
        `shape=(3,1)` array for camera position (x,y,z) in the local 
        Cartesian coordinate system in km
    point : numpy.ndarray
        `shape=(3,n)` array of point coordinates (x,y,z) in the local 
        Cartesian coordinate system in km, rows are coordinates and 
        columns are different points.
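
As an illustration of how such shape annotations can carry through to the implementation, here is a minimal sketch. The function name `distance_to_camera` and its return value are hypothetical and only serve to show the convention; the two parameters reuse the descriptions above.

```python
import numpy as np


def distance_to_camera(camera_position, point):
    '''Calculates the Euclidean distance from the camera to each point.

    Parameters
    ----------
    camera_position : numpy.ndarray
        `shape=(3,1)` array for camera position (x,y,z) in the local
        Cartesian coordinate system in km.
    point : numpy.ndarray
        `shape=(3,n)` array of point coordinates (x,y,z) in the local
        Cartesian coordinate system in km, rows are coordinates and
        columns are different points.

    Returns
    -------
    numpy.ndarray
        `shape=(n,)` array of distances in km.
    '''
    # (3,n) - (3,1) broadcasts the camera position over all n columns
    return np.linalg.norm(point - camera_position, axis=0)
```

Because the docstring states the shapes up front, a reader can see that the subtraction relies on broadcasting without having to trace the caller.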

Sphinx setup

We provide a reference configuration of Sphinx, in a separate repository, including IRF extensions and templates. It is highly recommended to use it as-is to begin with, as Sphinx can be tricky to get started with. As you get more experience, feel free to change settings to suit your needs.

See separate instructions on IRF-sphinx.

Other useful additions and notes are:

Remember, when typing math inside a notebook cell (from pandoc documentation):

Anything between two $ characters will be treated as TeX math. The opening $ must have a non-space character immediately to its right, while the closing $ must have a non-space character immediately to its left, and must not be followed immediately by a digit.

C/C++/FORTRAN

For C/C++/FORTRAN we recommend using Doxygen for documenting libraries and large API interfaces. For smaller projects, a simple collection of md or rst files that can be compiled into a static website through e.g. pandoc, together with in-code comments, is sufficient as documentation. Doxygen can also compile down to man pages, which is useful when building e.g. a command-line interface (CLI). Note that pandoc can also compile to man pages; if these pages are fairly short, a simple markdown file is sufficient. A CLI should also have a -h help argument with proper argument documentation available. In Python this is achieved with argparse.
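
For the Python case, a minimal argparse sketch could look as follows. The program name and the two arguments are hypothetical and only illustrate how the help text is attached to each argument; argparse adds the -h/--help option by itself.

```python
import argparse


def build_parser():
    # Hypothetical CLI: program name and arguments are illustrative only
    parser = argparse.ArgumentParser(
        prog="match-filter",
        description="Process a recorded waveform file.",
    )
    parser.add_argument(
        "input",
        help="path to the file containing the received waveform",
    )
    parser.add_argument(
        "-n", "--samples",
        type=int,
        default=1024,
        help="number of samples to process (default: %(default)s)",
    )
    return parser
```

Calling `build_parser().parse_args()` in the entry point of the script then parses the command line, and running the script with -h prints the description together with the help string of every argument.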

Coming soon: example setup of a doxygen documentation for a C/C++ project

Sphinx and Doxygen

If your Python repository contains a significant amount of C/C++ code that needs to be documented alongside the Python code, breathe is the way to go. Following the instructions in its documentation will make Doxygen-style in-code documentation appear in your Sphinx-compiled output.
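
As a rough sketch, the Sphinx side of such a setup is a few lines in conf.py. The project name `my_c_library` and the XML path are hypothetical, and Doxygen must be configured with GENERATE_XML = YES so that there is XML output for breathe to read:

```python
# conf.py (fragment); "my_c_library" and the XML path are hypothetical
extensions = [
    "sphinx.ext.autodoc",  # pulls in the Python docstrings
    "breathe",             # bridges Doxygen XML into Sphinx
]

# Map a project name to the directory where Doxygen wrote its XML output
breathe_projects = {"my_c_library": "../doxygen/xml"}
breathe_default_project = "my_c_library"
```

With this in place, breathe directives such as doxygenfunction can be used in the rst sources to pull in individual C/C++ entities.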

Rust

Rust comes with its own built-in documentation compiling system called rustdoc.

To learn more, see the official documentation.

Referring here

If you want contributors and collaborators to be aware of the guidelines here, please add this to your README.md or an equivalent section:

# Contributing

Please refer to the style and contribution guidelines documented in the 
[IRF Software Contribution Guide](https://danielk.developer.irf.se/software_contribution_guide/). 
Generally external code-contributions are made through a "Fork-and-pull" 
workflow, while internal contributions follow the branching strategy outlined 
in the contribution guide.

Also remember to add an equivalent section in the webpage documentation of the code if one exists.