Releasing Software

“Release” is such an overused word. Even when we limit our scope to software engineering, it still means many different things to different people. This causes confusion and misunderstanding, so I find it beneficial to define in the most precise terms what we do when we release software.

Source of Confusion

In the context of software development, depending on who you ask and when, the word “release” may signify the following distinct meanings:

The act of making a software product available to the customer. It may also mean the copy of the product that was created as a result of such an act.
The act of installing a software product on the target systems, as well as the copy of the product running in production. While this meaning is most often seen in organizations that do in-house software development, it sometimes leaks into the wider world: see, for example, what Helm calls the installed copies of charts.
The act of enabling a certain subset of the product’s functionality for the end user. This is a relatively new meaning that you can observe in publications on DevOps. It is related to the concept of feature toggles, a technique specifically designed to launch certain features of the product after the installation of the binaries that implement them.

There may be more meanings, less common ones or ones I have not heard yet. Still, it should be obvious that the meanings above represent completely different concepts and, more importantly, are associated with different events happening at different times.

The product can only be installed once its binaries have been made available, and a feature toggle can only be activated in running software. Using “release” interchangeably when talking about these three separate events can only lead to confusion. What is more aggravating is that such use is completely unnecessary: one can always say “software installation” or “enabling the feature” and be understood perfectly well.

Terminology

I don’t expect everyone all of a sudden to start using the definitions below, even though it would reduce the amount of chaos in the world. For the purposes of the articles I write, however, it is essential that we agree on the terms.

Assumptions

I take for granted that software products may have both the source code and the binary representations and that we can control the source code. In trivial cases, the binary representation can be the same as the source code, for example, for interpreted languages.

I expect the source code to live in a version control system. To make things easier without losing much generality, I assume it to be Git, and I will use the terminology of Git. You will see though that nothing described below should really depend on the choice of the VCS.

I expect the binary representation to be packaged for publishing or delivery, that is to say, that multiple files may be bundled together in a single artifact for ease of handling. Such artifacts are either published in software repositories, made available for downloads, or delivered on physical media.

Definitions

These are the definitions that we’ll be using going forward.

to build — to run a process of conversion of the source code representation of a software product to the form suitable for the installation of the product;
build — a result of building a software product;
snapshot — the tip of a long-lived branch in VCS as well as a binary representation obtained from the source code by building it; every time a new commit is added to the branch a new snapshot is produced;
version — a snapshot of a software product with a human-readable unique persistent identifier assigned to it (known as version ID or version number);
to make accessible – to publish at a download location accessible by the customer, such as a binary repository, or to deliver physical media with the artifacts;
to release — to make a version of software irrevocably accessible by the customer;
release — a released version of a software product, also, an act that produces a release.

A few finer points here. A build process when executed on the same source code multiple times is not guaranteed to produce exactly the same bit-by-bit binary representation, hence “a result”. Any build will do, but once selected it becomes the canonical binary representation.

I consider Git hashes too inconvenient to be called “human-readable”. A version ID should be something shorter, easier to remember, and carrying some semantic meaning.

A customer is not necessarily the end user, nor necessarily a paying customer. Simply put, it is someone with an interest in applying the software in some way, for example, installing it on a computer system. When I say “irrevocable” I want to stress that once software is released you cannot unrelease it. It is a one-way operation.

Not Releases

From the definitions above it should be clear that I want to limit the meaning of “release” just to the first meaning in the original list. It is consistent with other uses of the word when it comes to intellectual property: a music album release, a movie release, and so on.

What about the others? The suggestion is to make a conscious effort to stop using the word in connection to these meanings. English has plenty of words that can describe what is going on there and the idea is just to stick to those.

For example, when a software version is installed in a target environment, let’s try to be careful not to call this process “release” and instead call it “installation”, “deployment”, or even “change”, if you happen to be ITIL-minded.

Likewise, when someone enables a release feature toggle in an installed version, let’s just say “enabling the feature” and leave it at that.

Release Preparation

With the terminology out of the way, let’s have a look at what happens during the preparation of a software release. In a nutshell, we start with a snapshot, the tip of a branch, and ensure that it gets a version number assigned. Then we use the version number to build and publish the release artifacts. If physical delivery is needed, the artifacts can be placed on some media and delivered to the customer.

Pre-release Checks

Once a software product is released, the release is more or less carved in stone. Because of that, you will get exactly one chance to get things right. It follows that the more you can validate before cutting the release, the higher the chance of success.

The first step is to ensure that the code at the tip of the branch does not depend on any unreleased software, be that the libraries it uses or the build system configuration or plugins. These must be remediated if found.
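
A check like this is easy to automate. The following is a minimal Python sketch, assuming that the dependencies have already been extracted into a name-to-version mapping and that unreleased versions are recognizable by a pre-release suffix after a hyphen (such as -SNAPSHOT). The function name and the sample data are hypothetical.

```python
from __future__ import annotations


def unreleased_dependencies(dependencies: dict[str, str]) -> list[str]:
    """Return the names of dependencies whose version has a pre-release suffix."""
    flagged = []
    for name, version in dependencies.items():
        # Drop build metadata after "+": it may itself contain hyphens.
        core = version.split("+", 1)[0]
        # Assumption: a hyphen in the remaining part marks an unreleased version.
        if "-" in core:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical dependency list, for illustration only.
deps = {
    "libfoo": "1.2.0",
    "libbar": "3.0.0-SNAPSHOT",
    "build-plugin": "0.9.1-rc.1",
}
print(unreleased_dependencies(deps))
```

A non-empty result means the release preparation should stop until the flagged dependencies are replaced with released versions.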

Next, it makes sense to perform a trial build. This will confirm that all dependencies are resolved correctly, the code is buildable, and all automated tests pass. In addition to building the code, the trial build must execute all activities the final release build would run: package the source code, generate documentation, etc.

Once everything looks in order, we can proceed to creating the version.

Version Creation

A good number of build systems provide a way to identify the version in their configuration files. For example, the project.version element in pom.xml for Maven, VersionPrefix and VersionSuffix for MSBuild, the version property in package.json for NPM, and so on.

Some build systems have no way to specify the version. One notable example of this would be Go. Even when the build system does not cooperate, it makes sense to come up with a convention that allows storing the version number in a file and to have that file managed by the VCS. The reason to have the version number stored with the code (as opposed to injecting it from an external source, such as a VCS tag) is simple: the build needs to work even when we do not have access to the code repository, when all we have is a tarball with the code. Going forward, we’ll take it for granted that the version number is stored in the source code and the build knows how to access the value.
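
As an illustration of such a convention, here is a minimal Python sketch. It assumes a plain-text file named VERSION at the root of the source tree; both the file name and the read_version helper are hypothetical, not something any build system prescribes.

```python
import tempfile
from pathlib import Path

VERSION_FILE = "VERSION"  # hypothetical convention, not a standard


def read_version(source_root: Path) -> str:
    """Read the version number stored alongside the source code."""
    text = (source_root / VERSION_FILE).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{VERSION_FILE} is empty")
    return text


# Demo on a throwaway directory standing in for an unpacked source tarball.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / VERSION_FILE).write_text("2.4.1-SNAPSHOT\n", encoding="utf-8")
    print(read_version(root))
```

Because the helper only needs the file system, it works the same way in a Git checkout and in an unpacked tarball.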

If we look at the tip of any branch, we expect to find a snapshot version number there. This is consistent with our definition of snapshot above. The snapshot version number is what Semantic Versioning calls a pre-release version, which is denoted by a hyphen followed by a suffix. Semantic Versioning restricts only the syntax of that suffix, not its meaning. Personally, for consistency, I try to follow the Maven convention if I can help it: for snapshot versions the suffix after the hyphen is SNAPSHOT.

Either way, in order to produce the version from a snapshot, we need to strip the hyphen with the suffix from the version number. The actual progression of the steps follows:

Edit the version number to strip the pre-release suffix. For example, 2.4.1-SNAPSHOT becomes just 2.4.1.
Commit but do not push.
Perform a full build of the code in the repo as if it were the final build of the version’s artifacts. Make sure the artifacts are neither published nor installed to the local repository; they must remain only in the build area. If any issues are discovered at this stage, the process can be aborted by resetting the branch to the previous commit.
Tag the tip of the branch. Using the example version above, the tag will be v2.4.1. You may want to sign the tag. Do not push the tag yet.
Update the version number to start tracking the next snapshot. For example, 2.4.1 becomes 2.4.2-SNAPSHOT.
Push the branch and the tag. Note that up until this final push the process is abortable and can be reverted to the state where we started.
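
The version arithmetic in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: versions have the MAJOR.MINOR.PATCH form, the next snapshot bumps the patch number (as in the example), and the commit for the next snapshot is implied by the steps rather than spelled out. The function names and commit messages are hypothetical, and release_plan only lists the actions without executing anything.

```python
from __future__ import annotations

import re

SNAPSHOT_SUFFIX = "-SNAPSHOT"
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def release_version(snapshot: str) -> str:
    """Strip the pre-release suffix: 2.4.1-SNAPSHOT becomes 2.4.1."""
    if not snapshot.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"not a snapshot version: {snapshot}")
    version = snapshot[: -len(SNAPSHOT_SUFFIX)]
    if not _VERSION.match(version):
        raise ValueError(f"unexpected version format: {version}")
    return version


def next_snapshot(version: str) -> str:
    """Start tracking the next snapshot: 2.4.1 becomes 2.4.2-SNAPSHOT."""
    match = _VERSION.match(version)
    if not match:
        raise ValueError(f"unexpected version format: {version}")
    major, minor, patch = (int(part) for part in match.groups())
    # Assumption: the next snapshot increments the patch number.
    return f"{major}.{minor}.{patch + 1}{SNAPSHOT_SUFFIX}"


def release_plan(snapshot: str, branch: str) -> list[str]:
    """List the actions in order; nothing is executed here."""
    version = release_version(snapshot)
    following = next_snapshot(version)
    tag = f"v{version}"
    return [
        f"set version to {version}",
        f'git commit -am "Version {version}"',
        "run the full build locally, publish nothing",
        f'git tag -a {tag} -m "Version {version}"',
        f"set version to {following}",
        f'git commit -am "Start {following}"',
        f"git push origin {branch} {tag}",
    ]


for step in release_plan("2.4.1-SNAPSHOT", "release/2.4"):
    print(step)
```

Keeping the push as the very last action preserves the property that everything before it can still be undone locally.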

It is highly advised to make version tags immutable and not removable in your central VCS repository. This will enforce that once the tag is pushed, the version number is burned, i.e. it cannot be reused should there be any issues with the version. A new version number would be needed in this case.

The steps outlined above can and should be automated. In fact, some build systems may already come with plugins performing the sequence or something very similar to it. For example, see Maven Release Plugin.

Artifact Publication

Technically speaking, the source code tag is all that matters when we talk about a version. It is customary, however, to prepare the source code and binary artifacts for delivery to the customer.

The key point when creating the artifacts is to always check out the tagged code into a new location before running the final build (the above-mentioned Maven Release Plugin does exactly that). The final build will run all the tests, package and publish the artifacts to the repository.

The artifacts produced by the build must include their version number in the name when published to the repository. For standard components, such as JAR, NuGet, NPM packages, etc., the build system usually takes care of that. However, if you do something custom, for example, packaging the whole application as a tarball, you will need to take this into consideration. It is also a good idea to bake the version number into the binaries so that it could be accessed at runtime.
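
For the custom tarball case, a minimal Python sketch might look like the following. It assumes an archive name of the form name-version.tar.gz and records the version in a file called BUILD-VERSION inside the archive so that it can be read at runtime. The file name, the package_tarball helper, and the sample application name are all hypothetical.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def package_tarball(name: str, version: str, source_dir: Path, out_dir: Path) -> Path:
    """Bundle source_dir into <name>-<version>.tar.gz with the version baked in."""
    base = f"{name}-{version}"
    archive = out_dir / f"{base}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Everything is placed under a versioned top-level directory.
        tar.add(source_dir, arcname=base)
        # Bake the version into the bundle (assumes source_dir has no such file).
        data = (version + "\n").encode("utf-8")
        info = tarfile.TarInfo(name=f"{base}/BUILD-VERSION")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return archive


# Demo with throwaway directories; out_dir must not live inside source_dir.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    out = Path(tmp) / "out"
    src.mkdir()
    out.mkdir()
    (src / "app.txt").write_text("hello\n", encoding="utf-8")
    archive = package_tarball("myapp", "2.4.1", src, out)
    print(archive.name)
    with tarfile.open(archive) as tar:
        print(sorted(tar.getnames()))
```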

While it is possible to publish artifacts manually, automation goes a long way here. It greatly reduces the possibility of human error during this process and strengthens the connection between the tag and the published artifacts, which helps to address certain compliance concerns.

In particular, on VCS hosting platforms that support the notion of releases, such as GitHub, you can trigger the publication of the artifacts by the event associated with publishing a release. Alternatively, it may be possible to associate the build for the version with the event for pushing a tag. Your continuous integration software may also offer a Build with Parameters feature (Jenkins has it), in which case you can pass the version tag as a parameter to the build pipeline.

Post Version Creation Tasks

Once the artifacts are in the binary repository it is tempting to call it a day. Alas, we are not done yet. Certain tasks need to be performed right after the version is produced in order to keep things tidy.

Now is a good time to mark the version in your planning software as “done” (the actual term in the software is likely to be “released”, but then again, this is a slight abuse of terminology: just because we have created a version does not mean we have made it available to anyone).

Also, and it may not be entirely obvious, while preparing the version we might have made some small changes to the codebase. To ensure these don’t get lost in future versions, it is essential to perform what I call “regression avoidance merges” to all “higher” branches. For example, if we prepared version 4.2.1 from the tip of the release/4.2 branch, we need to merge it to the release/4 branch and then merge the release/4 branch to master.
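
The order of these merges can be derived mechanically from the branch name. Below is a minimal Python sketch, assuming the release/MAJOR.MINOR naming scheme from the example, that every intermediate branch actually exists, and that the trunk is called master. The merge_chain function is hypothetical and only computes the source and target pairs; it does not run any merges.

```python
from __future__ import annotations

PREFIX = "release/"


def merge_chain(branch: str, trunk: str = "master") -> list[tuple[str, str]]:
    """Return (source, target) pairs for merging a release branch upwards."""
    if not branch.startswith(PREFIX):
        raise ValueError(f"not a release branch: {branch}")
    parts = branch[len(PREFIX):].split(".")
    chain = []
    source = branch
    # Drop one version component at a time: release/4.2 -> release/4.
    while len(parts) > 1:
        parts = parts[:-1]
        target = PREFIX + ".".join(parts)
        chain.append((source, target))
        source = target
    # The last merge always goes into the trunk.
    chain.append((source, trunk))
    return chain


for source, target in merge_chain("release/4.2"):
    print(f"merge {source} into {target}")
```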

Releasing

What you have at this point is a version of the software we may call a “release candidate”. It builds and passes all automated tests, but are we comfortable enough releasing it? (Remember: irrevocable!)

Version Certification

And this is where every organization is different. Let’s just say that each one has its own checklist to certify the version as production worthy. It may contain no items (after all, all automated tests passed) or may be a mile long. It may involve the customer or may be strictly internal. I don’t judge; do what is important for you.

One thing is crucial, though. All tests, certifications, and sign-offs must be done on the specific version we produced. The version must be identified by its version number, and the same version number must be referenced in all supporting documents. Failure to do so will render all the countless tests and approvals worthless.

This does not mean that a snapshot cannot be subjected to the same checklist as the version; it is just that these activities are then done on a moving target and cannot be considered final and official. Organizations heavy on manual testing may find this arrangement problematic. The alternative is worse, though. Not being able to certify that what you tested and what you released are the same thing really looks bad. The solution is to shift left: use more automated tests during the build and less manual testing.

What happens if the version does not pass the checklist? Well, we’ve burned the version number. What we can do at this point is to document the issues and work to fix them in the same branch. Once addressed, we can attempt another release, using the next version number.

Delivery

Once certified, the version is ready to be released. Only one step is left: to make it irrevocably accessible to the customer. The details may differ. For example, we may copy the artifacts to a different repository, grant read permission on them to make them visible to the customers, or put them on media and mail them to the customers. Once this last step happens, we consider the version truly released.

This is where the process ends. I must stress: “deployment to production” is not a part of the release process. Why not?

For lack of generality: not every software product is an application that can be deployed. We also may release libraries and other components that cannot run on their own and can only be consumed by other software. “Deployment to production” simply does not apply in such situations.
For optionality: the deployment to production tends to be done at the discretion of the customer, at a time decided by the customer. In many cases we simply cannot control if and when it happens.

Conclusion

Here are the key ideas from the article.

The meaning of the word “release” is overloaded. It makes sense to limit what we understand by “release” to making a version of software irrevocably accessible to the customer.
The key activity of the release process is preparing a version. The version is fixed by assigning a version number to it, which is effectively achieved by creating an immutable tag in VCS.
Building artifacts for the version is done only using the tag.
The version certification is only done on the artifacts built from the tag.
The certified version is what we deliver to the customer and this is what we call a “release”.
“Deployment to production” is not a part of the release process.

Alexei Yashkov