SemVer Branching
Semantic Versioning, or SemVer, is a popular way to identify versions of software. It is mandated by some package managers, such as NPM, and widely used in projects whose build tools do not mandate it. In some cases, even though the versioning scheme used by the package manager may not be fully compatible with SemVer, teams opt to use a common subset of both to follow SemVer in spirit. It just works.
It only makes sense to have a branching strategy that is specifically designed to work with SemVer. This article is a description of such a strategy. I use it in the software projects I manage, and I’m pretty content with it.
Introduction to SemVer
In SemVer, a version number consists of three numeric components separated by dots, like this: 2.7.18. Incrementing each component carries a certain meaning. The rightmost component is called the patch number. It is incremented when we do not alter the specified behavior of the software but rather address a defect in it. The middle component is called the minor number, and it is incremented when the changes made to the software change its specified behavior in a backward-compatible way. For example, a net addition of new functionality qualifies, while a removal does not. Finally, the leftmost component is called the major number, and it is incremented when a backward-incompatible change is made to the software.
You can read the specification for the precise definitions and rules governing the scheme. The specification talks about changes to the public API, and it makes sense. In my projects I tend to extend the notion of changes to the API to all interfaces, not just “programming” ones, since often the software product is not a reusable component but rather an application.
Branch Tips
In SemVer, there is a notion of pre-release versions. They are denoted by having a hyphen followed by a string of dot-separated identifiers after the patch number, for example 2.7.18-beta.2. The interpretation of such versions is that they are produced in preparation of the version before the hyphen but are not quite done yet.
We need to define a special kind of pre-release version, which we’ll use to identify the tips of the branches. These are moving targets, snapshots, and they are produced in preparation of some future version. Maven has a convention of using the -SNAPSHOT suffix for such versions. I will follow this convention in this article, especially since I’m not aware of any other established conventions for such versions.
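To illustrate how pre-release and snapshot versions relate to the versions they prepare, here is a minimal sketch of a sort key that follows the SemVer precedence rules. The function name precedence_key is my own, and the sketch assumes well-formed version strings without build metadata (no "+..." part).

```python
def precedence_key(version):
    """Return a sort key that orders versions by SemVer precedence.

    Assumption: the input is well formed and carries no build metadata.
    """
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not pre:
        # A normal version outranks every pre-release with the same core.
        return (major, minor, patch, 1, ())
    identifiers = tuple(
        # Numeric identifiers compare as numbers and rank below
        # alphanumeric ones, which compare as plain strings.
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in pre.split(".")
    )
    return (major, minor, patch, 0, identifiers)


versions = ["2.7.19", "2.7.19-SNAPSHOT", "2.7.18", "2.7.18-beta.2"]
ordered = sorted(versions, key=precedence_key)
```

With this key, 2.7.19-SNAPSHOT sorts after 2.7.18 but before 2.7.19, which matches the reading of a snapshot as work in progress toward the version before the hyphen.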
These version numbers are to be stored in configuration files of the build system, such as pom.xml for Maven, package.json for NPM and so on. They define how the resulting artifacts will be named: the artifacts will have the version number embedded in their names. The build can also arrange for the version numbers to be included in binaries and be accessible at runtime.
Long-lived Branches
A SemVer version has three numbers, and each one can be incremented semi-independently: incrementing the patch number affects nothing else, incrementing the minor number resets the patch number to zero, and incrementing the major number resets both the minor and the patch numbers to zero.
Most General Case
Let’s look at the most general case first. Suppose the highest (by SemVer’s definition) version number in our project is 2.7.18. Then it has three possible “next” versions: the next major version, 3.0.0, the next minor version, 2.8.0, and the next patch version, 2.7.19.
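The arithmetic behind the three "next" versions can be sketched in a few lines. The function name next_versions is hypothetical, and the sketch assumes a plain X.Y.Z input without a pre-release suffix.

```python
def next_versions(version):
    """Return the next major, minor, and patch versions after `version`.

    Assumption: `version` is a plain "X.Y.Z" string.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    return {
        # A major increment resets the minor and patch numbers.
        "major": f"{major + 1}.0.0",
        # A minor increment resets the patch number.
        "minor": f"{major}.{minor + 1}.0",
        # A patch increment leaves the other numbers alone.
        "patch": f"{major}.{minor}.{patch + 1}",
    }


candidates = next_versions("2.7.18")
```

For 2.7.18 this yields 3.0.0, 2.8.0, and 2.7.19, the three versions named above.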
To be able to schedule and implement features targeting these versions, we need to have three active long-lived branches. “Long-lived” means that their lifetime is longer than the lifetime of a single feature implementation. They are used to integrate multiple features and support the production of several versions in the most general case.
The version numbers stored at the tips of these branches will then be 3.0.0-SNAPSHOT, 2.8.0-SNAPSHOT, and 2.7.19-SNAPSHOT, respectively. You can interpret them as “work in progress to get the mentioned version out.” Here master tracks 3.0.0-SNAPSHOT. Branch releng/2 is created from master and tracks 2.8.0-SNAPSHOT. Finally, branch releng/2.7 is created from releng/2 and tracks 2.7.19-SNAPSHOT. The version tag for the highest version, v2.7.18, is created on the releng/2.7 branch.
Figure 1 above illustrates the tagged version and the branches as described. The branches grow from left to right. Significant commits are represented as circles. Tags are shown in boldface below the commits, while the version numbers in the build system configuration files are shown above the commits. The branch names are shown in boldface to the right of the depicted branches.
On Naming
The naming conventions given for the tags and branches are just suggestions. You don’t really have to follow them exactly, but you must have a system. For example, you can use main instead of master and release/* instead of releng/*. You will be fine as long as you designate specific names and prefixes to represent long-lived branches and version tags. This will become important when you implement branch and tag protection rules in your version control system.
Greenfield Project
Before we can get version 2.7.18 out, we need to produce version 1.0.0 first. When you just start a new project, all you need is the master branch. You grow the branch, sometimes for a prolonged time, until you are ready to produce version 1.0.0. At this point, the master branch is tracking 1.0.0-SNAPSHOT. First, from master you create a branch named releng/1, which continues to track 1.0.0-SNAPSHOT, then change the version number in master to 2.0.0-SNAPSHOT. From this point, the work that targets version 2.0.0 can start on the master branch.
Next, from the releng/1 branch you create a branch named releng/1.0, which still tracks 1.0.0-SNAPSHOT, then change the version number in the releng/1 branch to 1.1.0-SNAPSHOT. This enables you to start implementing features targeting version 1.1.0.
After that, you change the version number in the releng/1.0 branch to 1.0.0 and tag this commit with v1.0.0. Congratulations, this is your first version of the project!
Because you may still need to produce patch versions from the releng/1.0 branch, you change the version number in this branch to 1.0.1-SNAPSHOT.
Figure 2 represents the situation after the first version is produced and the project is now ready for work on versions 2.0.0, 1.1.0, and 1.0.1. Please note how the version numbers change in the branches.
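The sequence of steps in this section can be traced with a small in-memory model. This is only a sketch for following the version numbers: the function first_release and the dictionary layout are my own illustration and do not touch a real repository.

```python
def first_release(branches, tags):
    """Apply the first-release steps to an in-memory model.

    `branches` maps a branch name to the version stored at its tip;
    `tags` is a list of version tags. Both are modified in place.
    """
    # releng/1 starts as a copy of master, then master moves on to 2.0.0.
    branches["releng/1"] = branches["master"]
    branches["master"] = "2.0.0-SNAPSHOT"
    # releng/1.0 starts as a copy of releng/1, then releng/1 moves on to 1.1.0.
    branches["releng/1.0"] = branches["releng/1"]
    branches["releng/1"] = "1.1.0-SNAPSHOT"
    # Drop the snapshot suffix, tag that commit, then open the next patch version.
    branches["releng/1.0"] = "1.0.0"
    tags.append("v" + branches["releng/1.0"])
    branches["releng/1.0"] = "1.0.1-SNAPSHOT"


branches = {"master": "1.0.0-SNAPSHOT"}
tags = []
first_release(branches, tags)
```

After the call, the model holds the three tip versions 2.0.0-SNAPSHOT, 1.1.0-SNAPSHOT, and 1.0.1-SNAPSHOT, plus the single tag v1.0.0.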
Supporting Multiple Versions
Because we can have multiple major branches (that is, branches that follow releng/X pattern), it is possible to provide support for multiple major versions at the same time. This need is often overlooked by teams developing in-house applications, when they have only one production environment and only one “production” version needs to be maintained. However, the world is more complex than that. If you maintain an open source project or produce software commercially, it is very likely that you would need to support several versions.
Generally speaking, by keeping releng/X branch in the repository, you declare your willingness to keep releasing minor versions for the X.0.0 major version: X.1.0, X.2.0, X.3.0, etc. The SemVer specification describes what kind of changes can be included in these versions.
Likewise, it is possible to have multiple minor branches (branches following releng/X.Y pattern). By keeping releng/X.Y branch in the repository, you declare your willingness to release patches for the X.Y.0 minor version: X.Y.1, X.Y.2, X.Y.3, and so on. These patches would normally include critical bug fixes and security updates as per the SemVer specification. Once you end support, delete all unnecessary releng/* branches.
Figure 3 shows a project that still supports the 1.Y.Z and 2.Y.Z versions, with 1.2.0 and 2.1.0 as the scheduled minor versions, respectively. It also supports the 1.1.Z versions, with 1.1.4 as the scheduled patch version, and the 2.0.Z versions, with 2.0.2 as the scheduled patch version. Of course, feature work can continue on master, targeting 3.0.0. Support for the 1.0.Z versions ended with the release of 1.0.7, and the releng/1.0 branch is now deleted.
Regression Avoidance
Have another look at Figure 3. See how the branches are ordered bottom to top, from releng/1.1 to master? Let’s assume that we found a bug in the version 1.1.3. There is a good reason to believe that the same bug may exist in all long-lived branches above releng/1.1. In this branching strategy we fix the bug in the lowest relevant minor branch, in this case in releng/1.1. This makes perfect sense: there you have the most stable code readily available for time-sensitive work.
To prevent regression, the patch branch with the fix must be merged to any other long-lived branch above it in the order presented in Figure 3. That is releng/1.1 to releng/1, then releng/1 to releng/2.0, then releng/2.0 to releng/2, and finally releng/2 to master.
I call such merges regression avoidance merges, even though they are not strictly related to bug fixes. Any change made to a lower branch needs to be propagated up until it reaches the master. This shows the real cost of supporting multiple versions: the more versions you support, the more merges you’ll need to perform for changes made to the supported versions.
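The merge order described above can be derived from the branch names alone. Below is a minimal sketch, assuming the master and releng/* naming used in this article; the helper names branch_order and merge_chain are hypothetical.

```python
import re


def branch_order(branch):
    """Sort key placing long-lived branches from lowest to highest."""
    if branch == "master":
        return (float("inf"), float("inf"))
    match = re.fullmatch(r"releng/(\d+)(?:\.(\d+))?", branch)
    if match is None:
        raise ValueError(f"not a long-lived branch: {branch}")
    major = int(match.group(1))
    # releng/X sits above every releng/X.Y of the same major version.
    minor = float("inf") if match.group(2) is None else int(match.group(2))
    return (major, minor)


def merge_chain(branches, fixed_in):
    """List the (source, target) merges that carry a change up to master."""
    ordered = sorted(branches, key=branch_order)
    upward = ordered[ordered.index(fixed_in):]
    return list(zip(upward, upward[1:]))


supported = ["master", "releng/2", "releng/1", "releng/2.0", "releng/1.1"]
chain = merge_chain(supported, "releng/1.1")
```

For the branches of Figure 3, a fix made in releng/1.1 yields four merges, one per supported branch above it, which makes the cost of each additional supported version easy to see.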
Merges After Version Creation
Above I wrote that the version number is stored in the build system configuration files. It follows that you need to perform a regression avoidance merge right after you create a new version, because creating a version produces two new commits: one that sets the configuration to the version number being created, and another that sets it to the next snapshot version.
There are two points worth mentioning about these merges. One, you will face merge conflicts, because the version number gets updated in both branches. It helps if you can limit the number of times the version number is mentioned in the configuration files. (Alas, for a multimodule Maven project, for example, this may be a bit tricky.) Consider this to be a public service you perform for the next person who will need to make a change.
Two, the version update is the only change made in the commits, and you will not accept this change during the merge. Therefore, the merge commit will have no changes.
Short-lived Branches
This is an optional part of the strategy. You only need to worry about short-lived branches if you want (or are forced) to use pull requests. The way it works, for every feature you work on, you create a short-lived feature branch from the appropriate long-lived branch and then work on it.
Once the work is done, you arrange for the changes to be merged back to the same long-lived branch. I say “arrange” because the merge is normally done by submitting a pull request, which is then taken through code review, and either approved and merged or rejected. The branch is deleted after the merge.
Short-lived branches must follow a pattern distinct from long-lived branches. This is necessary for the branch protection rules to work correctly. For example, use feature/N, where N is some kind of feature identifier or headline. Some teams have conventions to use different prefixes for different types of short-lived branches. I find it unnecessary, because essentially, what happens to a feature branch is the same regardless of its type.
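To show why distinct patterns matter, here is a small sketch that classifies a branch by its name. The patterns mirror the names suggested in this article; the function names and the kind labels are my own, and a real branch protection rule would be configured in the version control system rather than in code like this.

```python
import re

# Patterns are assumptions based on the naming suggested in this article.
BRANCH_PATTERNS = {
    "master": re.compile(r"master"),
    "major": re.compile(r"releng/\d+"),       # releng/X
    "minor": re.compile(r"releng/\d+\.\d+"),  # releng/X.Y
    "feature": re.compile(r"feature/.+"),     # feature/N
}


def classify(branch):
    """Return the kind of the branch, or None if it matches no pattern."""
    for kind, pattern in BRANCH_PATTERNS.items():
        if pattern.fullmatch(branch):
            return kind
    return None


def is_long_lived(branch):
    """Long-lived branches are the ones that need protection rules."""
    return classify(branch) in ("master", "major", "minor")
```

Because every pattern is matched against the whole name, a branch can belong to at most one kind, so a rule that protects the long-lived branches never applies to a feature branch by accident.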
Do Not Publish!
You must not publish artifacts built from short-lived branches to any shared repository. You must not deploy them to any shared testing environment. The version numbers in these branches are inherited from the long-lived branches upon creation, and the work in these branches is incomplete. Moreover, it is very likely that at least occasionally the team will have multiple feature branches created from the same long-lived branch (like feature/two and feature/three in Figure 5 above), and therefore sharing the same version number.
Now, imagine they all get published to the same shared repository or deployed to the same shared environment. You will experience incomplete features and flip-flopping, when a feature appears and disappears seemingly at random. Publish to shared repositories from long-lived branches only, where the code is complete and integrated.
Synchronize Often
Once you start working on a feature branch, you stop seeing updates made to the source long-lived branch. That branch gets updated when other feature branches merge into it. It is in your best interest to keep your feature branch in sync with the source long-lived branch. This reduces the severity of merge conflicts: you address them more often, but in much smaller chunks. The risk of making mistakes when merging is greatly reduced.
Additionally, if you merge from the source branch just before submitting the pull request, you can use a fast-forward merge for the pull request. This will prevent conflicts in the pull request merge. In Figure 5 above, the merge from feature/two to releng/2 can and should be a fast-forward merge.
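The condition for a fast-forward merge can be stated precisely: the tip of the target branch must be an ancestor of the tip of the source branch. Here is a toy sketch of that check over a hand-written commit graph. The commit names and helper functions are hypothetical; Git can answer the same question with git merge-base --is-ancestor.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` or equals it."""
    pending, seen = [commit], set()
    while pending:
        current = pending.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(parents.get(current, []))
    return False


def can_fast_forward(parents, target_tip, source_tip):
    """A fast-forward works only if the target tip is already in the source."""
    return is_ancestor(parents, target_tip, source_tip)


# Toy history: the feature branch forks at A and adds F1, while the
# long-lived branch moves on to B. M merges the long-lived branch back
# into the feature branch.
parents = {"A": [], "B": ["A"], "F1": ["A"], "M": ["F1", "B"]}

before_sync = can_fast_forward(parents, "B", "F1")
after_sync = can_fast_forward(parents, "B", "M")
```

Before the synchronization merge the check fails, because B is not part of the feature branch history. After it, the feature tip M contains B, so the long-lived branch can simply move forward to M.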
Rules
Below is the summary of the rules of the branching strategy. X, Y, and Z are placeholders for the major, minor, and patch version numbers.
- There are three types of long-lived branches:
  - master: tracks changes scheduled for the next major version;
  - releng/X: tracks changes scheduled for the next minor version based on the X.0.0 major version;
  - releng/X.Y: tracks changes scheduled for the next patch version based on the X.Y.0 minor version.
- Optionally, there can be short-lived branches to track work on features and pull request reviews, approvals, and merges. They follow the feature/N pattern, where N is some kind of feature identifier.
- Version tags are created on releng/X.Y branches, and have the vX.Y.Z format.
- Tips of all branches have their version numbers in X.Y.Z-SNAPSHOT format.
- Tagged versions have their version numbers in X.Y.Z format.
Origins
The principal idea of using multiple release branches to track multiple releases came from The Release Engineering of FreeBSD 4.4 by Murray Stokely. This may look like a natural way of using branches today, but back in the day it was common to apply tags to the main line of development. These were the early 2000s, the time when CVS still ruled the world. Branches were exotic and expensive.
Once SemVer became popular, the strategy got updated to three long-lived branches, one per degree of freedom in the version number. Before that, it used just two.
Short-lived branches are a later bolt-on addition to the strategy, c. 2018. This was a response to my employer mandating branch protection, pull requests and code reviews. The strategy basically follows GitHub Flow, with the stipulation that a feature branch can be based on any long-lived branch, not just master or main.
Conclusion
Here are the key takeaways from the article.
- There are three degrees of freedom in SemVer version numbers. This calls for three types of long-lived branches to be able to move to the next major, minor, and patch versions independently.
- The lifetime of long-lived branches defines how long we are willing to support the corresponding versions.
- The long-lived branches have a natural order, from the lowest supported releng/X.Y branch to the master. Changes are scheduled for implementation in the lowest relevant branch and then propagated up using regression avoidance merges.
- Version tags are applied only to releng/X.Y branches.
- Short-lived feature branches are optional and don’t change much about how the strategy works. Use them only when you need them.