Writing code for Humans to read and Machines to run
This handout is tied to the following lecture
5. Writing code for Humans to read and Machines to run
Coding principles
Clean code
Like everything, “Clean code” has its pros and cons. I put the concept in quotes because, as far as I know, the term was coined with the publication of Robert C. Martin’s book “Clean Code: A Handbook of Agile Software Craftsmanship”. The book has been mentioned a lot and was praised for its push towards more readable code, which I agree with, and for principles such as SOLID and DRY (see below). But as with any set of rules, there should be exceptions and there are downsides. One of the major downsides is performance; for a nice walk-through of this issue, see the talk below:
Molly Rocket - “Clean” Code, Horrible Performance
SOLID
SOLID is an acronym for the following principles:
- Single responsibility
- Open–closed
- Liskov substitution
- Interface segregation
- Dependency inversion
Awesome-coding - SOLID Principles Explained
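As a small illustration of one of these principles, here is a minimal sketch of dependency inversion in Python. All names (`Storage`, `InMemoryStorage`, `Report`) are hypothetical and made up for this handout; the only point is that the high-level code depends on an abstraction rather than on a concrete class.

```python
from typing import Protocol


class Storage(Protocol):
    """Abstraction the high-level code depends on (hypothetical name)."""

    def save(self, name: str, text: str) -> None: ...


class InMemoryStorage:
    """One concrete implementation; a file or database version could replace it."""

    def __init__(self) -> None:
        self.saved: dict[str, str] = {}

    def save(self, name: str, text: str) -> None:
        self.saved[name] = text


class Report:
    """High-level code: it only knows about the Storage interface."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def publish(self, title: str, lines: list[str]) -> None:
        self.storage.save(title, "\n".join(lines))


storage = InMemoryStorage()
Report(storage).publish("results", ["run 1: ok", "run 2: ok"])
print(storage.saved["results"])
```

Because `Report` never mentions `InMemoryStorage`, it can be tested with the in-memory version and later run with another storage class without being edited.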
DRY
The Don’t repeat yourself, or DRY, principle is exactly what it sounds like. It is better to extract repeated code into a function, procedure, or class instead of copy-pasting the same code into 20 places. This makes it easy to test that code, to edit it, and to get an overview of what it does. However, this principle should not be taken as a holy quest to abstract all code so that no code that is even remotely similar is ever repeated. Rather, it should be kept in mind and applied in moderation where needed.
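A minimal sketch of what this can look like, using made-up data and a hypothetical helper called `center`: the "before" part repeats the same steps for each data set, the "after" part extracts them into one function.

```python
# Before: the same steps are copy-pasted for every data set.
temperatures = [21.5, 19.0, 23.5]
t_mean = sum(temperatures) / len(temperatures)
t_centered = [t - t_mean for t in temperatures]

pressures = [1012.0, 1008.0, 1016.0]
p_mean = sum(pressures) / len(pressures)
p_centered = [p - p_mean for p in pressures]


# After: the repeated logic lives in one function that can be tested once.
def center(values: list[float]) -> list[float]:
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# The function reproduces both copy-pasted results.
assert center(temperatures) == t_centered
assert center(pressures) == p_centered
```

With only two call sites the gain is small, which is the moderation point above: the extraction pays off once the same steps show up in many places or need to change.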
Documentation building
There are a few tools that build documentation specifically for Python projects. However, all you really need to build documentation is a good static website generator! For a fairly complete list of these, see e.g. myles/awesome-static-generators.
There are also tools to simply convert structured text to HTML, such as pandoc, that can be used for essentially the same purpose, but you would have to set up multi-file and templates yourself.
Notably, sphinx has specialized tools for making Python documentation, but it can currently get very complicated very fast. Hence my current recommendation is mkdocs, as it’s very simple to set up and configure and uses markdown by default.
docstrings
Having standardized docstring styles is probably a good idea as you will be encouraged to be consistent and you can use community tools for e.g. parsing and rendering your docstrings.
Some standard docstring formats are:
- numpydoc (recommended)
- sphinx.autodoc
There are also tools that help you write docstrings, such as vim-doge, which automatically generates a well-formatted docstring according to a configured style that you then just have to fill out.
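To give an idea of what a numpydoc-style docstring looks like, here is a minimal sketch. The function `moving_average` is a hypothetical example made up for this handout; the section names (Parameters, Returns, Raises) follow the numpydoc convention.

```python
def moving_average(values, window):
    """Compute the simple moving average of a sequence.

    Parameters
    ----------
    values : list of float
        Input samples, in order.
    window : int
        Number of consecutive samples to average. Must be at least 1.

    Returns
    -------
    list of float
        One average per full window, so the result has
        ``len(values) - window + 1`` entries.

    Raises
    ------
    ValueError
        If `window` is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


print(moving_average([1.0, 2.0, 3.0, 4.0], window=2))
```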
Rules and guidelines
There are several styles and guidelines for writing code, on all the levels that were mentioned in the lecture. The best known one within Python is PEP 8. PEP stands for Python Enhancement Proposal, and this particular proposal was originally made in 2001. PEP 8, together with proposals such as the one for docstrings, makes up the rules that most linters follow. Other static code analyzers, such as mypy, turn styles into rules; in mypy’s case, that type hints (according to PEP 484) must be adhered to.
However, there are other sets of rules that might be good to read through just to get an idea of how enforcing these rules can change the code you produce. One such document I can recommend reading through is NASA’s The Power of Ten – Rules for Developing Safety Critical Code.
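As a minimal sketch of what type hints look like in practice, the hypothetical function `mean` below is annotated according to PEP 484. The function name and the example call are assumptions made for illustration only.

```python
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


print(mean([1.0, 2.0, 3.0]))

# Python itself does not enforce the hints at runtime. A static checker
# such as mypy would be expected to report the commented-out call below,
# because a str is passed where list[float] is declared.
# mean("123")
```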
Resources for making readable projects
- Semantic Versioning - how to handle your versions
- Keep a Changelog - “This project aims to be a better changelog convention.”
- Git tags - How to tag in git
Complexity
Another static code analyzer for Python is radon. It measures
- raw metrics: SLOC, comment lines, blank lines, &c.
- Cyclomatic Complexity (i.e. McCabe’s Complexity)
- Halstead metrics (all of them)
- the Maintainability Index (a Visual Studio metric)
One of the metrics mentioned above is cyclomatic complexity. It basically counts how many linearly independent paths there are through your code. It might pop up if you use a language server (LSP) that measures it, and a high value usually means that you have a lot of branching and code-calling going on. Sometimes this is just a necessary evil, but often it can be reduced to make the calling structure of your code simpler.
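To make the idea concrete, here is a rough sketch that counts decision points with the standard library `ast` module. It is an approximation written for this handout, not radon’s actual algorithm, and the names `rough_complexity`, `BRANCHY`, and `TABLE` are hypothetical. It compares a branch-heavy function with a table-driven version of the same logic.

```python
import ast

# Node types that open an extra path through the code. This is a rough
# approximation for illustration, not radon's exact algorithm.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def rough_complexity(source: str) -> int:
    """Return 1 + the number of decision points found in `source`."""
    count = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths
            count += len(node.values) - 1
    return count


BRANCHY = '''
def unit_factor(unit):
    if unit == "m":
        return 1.0
    elif unit == "km":
        return 1000.0
    elif unit == "cm":
        return 0.01
    else:
        raise ValueError(unit)
'''

TABLE = '''
FACTORS = {"m": 1.0, "km": 1000.0, "cm": 0.01}

def unit_factor(unit):
    if unit not in FACTORS:
        raise ValueError(unit)
    return FACTORS[unit]
'''

print(rough_complexity(BRANCHY))  # 4: one base path plus three if/elif tests
print(rough_complexity(TABLE))    # 2: one base path plus a single if
```

Replacing the `if`/`elif` chain with a dictionary lookup keeps the behaviour but removes two of the branches, which is the kind of reduction the metric is meant to point you towards.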
Examples
Writing good examples is a bit of an art and an exercise in knowing who your user base is. I would recommend some simple guidelines to start with:
- Solve a problem in your example
- Make it non-trivial so that the reader gains understanding of the code
- Make it simple enough that it’s easy to comprehend
- Make it easy to run and modify so one can play around with it
- Make sure your examples are up to date
I remember reading a post in The Developer Advocacy Handbook called Write excellent code examples a while back, and it might spark some ideas for you as well.
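One possible way to keep small examples up to date is to put them in docstrings and let the standard library `doctest` module run them. The sketch below assumes a hypothetical function `clamp`; the examples in its docstring are executed and compared against the output shown.

```python
import doctest


def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high].

    Examples
    --------
    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


# Extract the examples from the docstring and run them. The globals are
# passed explicitly so this works no matter how the file is executed.
test = doctest.DocTestParser().get_doctest(
    clamp.__doc__, {"clamp": clamp}, "clamp", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, "failed out of", results.attempted)
```

For a whole file, running `python -m doctest yourfile.py` from the command line does the same job; if the function changes and the examples no longer match, the failures are printed.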
Homework assignments
- Make documentation for your code
  - Make a docs folder inside your course repository and build a small wiki-page of your notes.
  - Compile the docs to html using mkdocs.
- Write docstrings
  - Write docstrings for your last homework, both module level and function level.
- Advanced documentation
  - If you want a more complicated example, use the mkdocstrings package to compile an API page of the package you made earlier in the course.
  - If you don’t already have a favourite format, use numpydoc. You can also try the material theme or just stick with the standard mkdocs theme.
Recommended watching
CodeAesthetic - Naming Things in Code
CodeAesthetic - Don’t Write Comments