To quote LaunchDarkly directly:
A “branching strategy” refers to the strategy a software development team employs when writing, merging, and shipping code in the context of a version control system like Git. Software developers working as a team on the same codebase must share their changes with each other. But how can they do this efficiently while avoiding malfunctions in their application? The goal of any branching strategy is to solve that problem and to enable teams to work together on the same source code without trampling on each other. A branching strategy defines how a team uses branches to achieve this level of concurrent development.
Other than this short introduction, we assume the reader is familiar with version control using git and with the common problems that arise when multiple people develop the same code-base.
There are several strategies for managing the branching and merging of code development. Some of the most common are:
These strategies overlap and diverge at different points. However, one important thing to note is that most of them were developed for maintaining and extending huge, already proven code-bases, where a feature branch touches perhaps less than 1% of the actual code.
In research code, however, there is often an initial sprint of iterative prototyping that is as chaotic as it needs to be in order to figure out the vital aspects. The questions answered by this sprint are usually: what the software is actually supposed to do, what the challenges are, what bottlenecks can be expected, which axes should be extensible and which should be scalable, what interfaces and data flow patterns should be used, etc.
After this phase comes the crucial part where a proper, coherent branching strategy is desirable: refactoring, testing, setting up CI/CD, and the slow and steady climb towards a deployment-ready version. This climb is often tied to science goals that one tries to achieve with the software once it is ready to deploy. During the climb, the code is substantially rewritten: a feature branch usually touches multiple parts of the code-base and modifies more than 20% of it, interfaces change rapidly, and behaviour is unpredictable for long periods of time.
Once the science goals have been achieved and a “stable version” has been produced, this first deployment is not the end of major development. From this point on, following semantic versioning becomes highly important, so that interfaces do not change unless the version changes: the code is now in use, so bug fixes must be delivered and compatibility maintained. Follow-up and validation research projects also tend to materialize, aiming to extend the reach of the software or to apply it to a novel situation, and these often again require large changes to the code base.
To summarize: the branching strategy needs to account for this kind of development cycle while minimizing the chaos of merge conflicts and allowing for “ease of life” features like CI/CD and automatic testing.
No matter the details, the point is to stay up to date with everyone else, so that you make many small conflict resolutions instead of a few large ones.
We base the git branching strategy on the GitLab flow strategy, which is a flavor of trunk-based development.
Usually the release branch is main and the trunk branch
is develop. However, there are projects that do not warrant
a release branch. In that case, the trunk will be main and
there will be no develop branch.
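As a rough sketch of the two-branch layout described above, the following script builds a throwaway repository in which main is the release branch and develop is the trunk. The temporary directory, the commit messages, the demo identity, and the choice to release by merging develop into main with --no-ff are all assumptions made for illustration, not something this guide prescribes.

```shell
# Illustrative sketch only: runs in a temporary directory, never in a real project.
set -eu
layout_dir=$(mktemp -d)
cd "$layout_dir"
git init -q .
# Name the initial branch "main" regardless of the local git default.
git symbolic-ref HEAD refs/heads/main
# Hypothetical identity so that commits work in a clean environment.
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

# develop is the trunk: day-to-day work and feature merges land here.
git checkout -q -b develop
git commit -q --allow-empty -m "Work merged into the trunk"

# main is the release branch: it only moves when a release is made (assumed here).
git checkout -q main
git merge -q --no-ff -m "Release: merge develop into main" develop

# Show the resulting branches.
git branch --list
```

For a project that does not warrant a release branch, the develop steps would be dropped and all work would land on main directly.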
The important takeaways of working in this flow are that you should:
At regular intervals, check out the trunk, git pull, and merge the
trunk into your local branch (or rebase your local branch onto it).
Similarly, at regular intervals, merge your branch into the trunk and git push.
Example workflow:
git checkout my_feature_branch
# stay up to date
git fetch
# it is ok to rebase your local branch, putting everyone else's commits before any of your new ones...
git rebase origin/main
# ... unless you have pushed that branch, in which case you should merge instead
git merge origin/main
# -- resolve conflicts --
# -- develop new stuff --
# at end of day (or more often), given that your feature branch builds and passes tests
git checkout main
git pull
git merge my_feature_branch
# -- resolve conflicts --
git push
Below is an example git branching diagram of a typical situation where two people simultaneously develop the same code. Let us start with the simple case of a repository that does not need a release branch and where development is fairly segregated.
The main idea in practice is to minimize the severity of merge conflicts through frequent trunk merges and communication. The following illustrates a merge conflict:
The same situation but with long-lived feature branches:
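To make the notion of a merge conflict concrete, here is a minimal sketch that provokes one in a throwaway repository. The file name settings.txt, the branch name feature_a, the demo identity, and the two "people" are hypothetical and exist only for this illustration.

```shell
# Illustrative sketch only: runs in a temporary directory, never in a real project.
set -eu
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
# Name the initial branch "main" regardless of the local git default.
git symbolic-ref HEAD refs/heads/main
# Hypothetical identity so that commits work in a clean environment.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Shared starting point on the trunk.
echo "speed = 1" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# Person A changes the line on a feature branch.
git checkout -q -b feature_a
echo "speed = 2" > settings.txt
git commit -q -am "A: change speed"

# Person B changes the same line on the trunk in the meantime.
git checkout -q main
echo "speed = 3" > settings.txt
git commit -q -am "B: change speed"

# The merge cannot be completed automatically: both sides edited the same line.
if git merge feature_a >/dev/null 2>&1; then
    merge_status="clean"
else
    merge_status="conflict"
fi
echo "merge result: $merge_status"

# List the files that still need manual resolution.
git diff --name-only --diff-filter=U
```

The script reports a conflict in settings.txt, which now contains conflict markers around both versions of the line. The longer the two branches go without merging, the more such lines accumulate, which is the argument for frequent trunk merges.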
You can also use language features to keep work-in-progress code in the trunk that is not yet connected to anything else. This gives your collaborators an early heads-up on what is to come without breaking any builds or interactions while the feature is unfinished.
Examples are functions and related tests that are not called by
anything else, an HTML page with unfinished documentation that is
not yet linked from the main docs, or a new variant of a function,
mostly copy-pasted from somewhere else and renamed
calculator_v2 or the like.
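A minimal sketch of this idea, written as a shell script only to stay in the same language as the commands above: the existing calculator function remains the one that is actually used, while calculator_v2 sits next to it in the trunk without being called by anything yet. The bodies of both functions are hypothetical and serve only as an illustration.

```shell
# Existing function that the rest of the code depends on.
calculator() {
    echo $(( $1 + $2 ))
}

# Work in progress: lives in the trunk but is not called by anything yet.
# Hypothetical new behaviour: sum any number of arguments.
calculator_v2() {
    total=0
    for value in "$@"; do
        total=$(( total + value ))
    done
    echo "$total"
}

# The released entry point still uses the old function only.
main_result=$(calculator 2 3)
echo "$main_result"
```

Because nothing calls calculator_v2, merging it early cannot change the behaviour of existing code, while collaborators can already read, review, and test it.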
If you really want to understand how git works, here are a few good resources to get started: