Spectragate





Why open-source Git strategies are harming your closed-source project

Category : Tutorials · by Jan 12th, 2020

Note: This article was originally written in 2017. Since then continuous integration branches have become very popular, following similar ideas to the post below. If you like the ideas below, be sure to do some research and give them a try.

While source control solutions like Git have many techniques for branch management, a number of problems start to arise once they are battle tested in closed-source projects. Unfortunately these problems often only become apparent hours before you’re trying to deploy a build, and worse still, developers often consider it “just the way source control is”.

In this article I’m going to look at just one problem with source control which can have dire consequences for an entire project, then explain how I went about solving it.

The Single Problem

You’re a web developer working at a studio with 5-6 other developers. You’ve got your brand new Git repo set up and you start following a branching convention similar to most visual guides on the internet. Or perhaps you want to be fancy and you set up GitFlow so you can automatically follow best practices. Things are going great: you’re checking things in, you’re writing good commit messages, you’re creating and closing branches. You and your coworkers create new feature branches, close those features, and after a week your build release day approaches. You open a new Release branch, the code looks good and you prepare to push the build out. Your tree now looks like this:

 

Your boss then enters and says an issue has come up and it’s going to take weeks to fix. Let’s say the client changed their mind on the requirements for some new CSS, so Feature-C (which we will now call Feature-X) needs to be removed from the deployment, but the other 3 features can still go live.

You realize that you can just make a new release branch from the master branch, merge in the 3 features still going live and push that build out. You start merging in branches and it looks like this:

Git Tree - 2

 

It’s a bit messier, but on merging the Feature-D branch you run into a problem. It’s a single problem that has two unfortunate consequences:

Disaster Number 1

Looking at the 3 features still going live (Feature-A, Feature-B and Feature-D), we realize that Feature-D was actually developed after the now dead Feature-X. Because the developer of Feature-X thought the feature was complete, he merged it into the development branch, and it became the CSS that Feature-D was developed on top of. The developer of Feature-D had no idea the CSS was a brand new feature; he just used it because it was the most recent. If we remove Feature-X, Feature-D is going to look like broken CSS garbage.

We’ve painted ourselves into a corner. We’ve built new features on code that is not yet in production, which if pulled from the release impacts every other feature that was built on top of it. Having a single development branch that all developers are building from creates a single point of failure for future changes. At best your project looks like this:

 

Branch Impact 2

At worst it looks like this:

Branch Impact 2

 

This is a situation that is common in closed-source studios where newly developed features tend to get bundled up over a week or two into a single release. This is different to how open source projects on GitHub tend to work (which is where Git and GitFlow really shine), where contributors often work in isolation from each other, one feature having little to no impact on the next. After the owner of the project decides 4 contributed features are to be included in the current version, the branches are updated and reflected immediately on GitHub. Since users are building and/or running the projects locally, there’s no deployment step where changes need to be pushed out to the world in delayed, bundled packages.

The fact that GitHub (and by extension, GitFlow) doesn’t have to deal with the deployment side of projects, just the source control branches, often gets overlooked when researching branch strategies.

TLDR: Every feature that gets merged back into a development branch becomes the base for any future development, cascading with changes when one feature down the line is removed.
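To make the cascade concrete, here is a minimal sketch in Python. It is a toy model, not a Git tool: the event timeline and the function names are assumptions made up for illustration. It records which unreleased features were already sitting in the shared development branch when each new feature was branched, then lists what breaks when one of them is pulled.

```python
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, str]  # ("branch" | "merge", feature name)


def unreleased_bases(events: Iterable[Event]) -> Dict[str, Set[str]]:
    """For each feature, record which unreleased features were already
    merged into the shared development branch when it was branched."""
    merged: Set[str] = set()
    bases: Dict[str, Set[str]] = {}
    for action, feature in events:
        if action == "branch":
            bases[feature] = set(merged)  # snapshot of development right now
        elif action == "merge":
            merged.add(feature)
        else:
            raise ValueError(f"unknown action: {action}")
    return bases


def impacted_by_removal(bases: Dict[str, Set[str]], removed: str) -> List[str]:
    """Features that were branched on top of the removed feature."""
    return sorted(name for name, base in bases.items() if removed in base)


# Hypothetical timeline for the story above: Feature-X is merged
# into development before Feature-D is started.
timeline = [
    ("branch", "Feature-A"),
    ("branch", "Feature-B"),
    ("branch", "Feature-X"),
    ("merge", "Feature-X"),
    ("branch", "Feature-D"),
    ("merge", "Feature-A"),
    ("merge", "Feature-B"),
    ("merge", "Feature-D"),
]

bases = unreleased_bases(timeline)
print(impacted_by_removal(bases, "Feature-X"))  # ['Feature-D']
```

The longer a release cycle runs, the more features end up with a non-empty base, and the more of them show up in this list when something is removed.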

 

Disaster Number 2

You’re getting close to fixing this mess. Feature-D has also been removed from the build and the new release branch is done. Even though the build was meant to go live hours ago, it’s finally deploying to the server. Now it’s time to update the master and development branches by merging release into them. But as your finger hovers over the merge button, you notice something is wrong.

The release branch isn’t actually what you want people developing from, since you don’t want to completely remove Feature-X and Feature-D from the codebase. The release branch was just something you hacked together so you could deploy with the missing features. You can’t merge Release back in without wrecking everyone’s work, so instead of merging it back in and damaging the development branch, you just kind of…leave it open. A little dangling branch to remind you of these failures; maybe you add a little “dead-end 🙁” tag to it. You tell everyone to just continue working on the development branch and you’ll make a new release branch next deployment.

Git Tree - 3

But now there’s an even bigger issue: your codebase (development and master) doesn’t actually match what is in production. If someone goes to work on that problematic CSS which caused all this mess, is what they are seeing on the development branch actually what is in production? If someone in QA notices a bug on the live site, do you create a Hotfix branch from your current master/development branch or the old dead-end release branch?

TLDR: The gap between what code is on the live site and what your developers are working with gets wider as more last-minute changes can’t be rolled back into the master and development branches.
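The gap can be pictured as a simple set difference. The sketch below is a toy model with assumed branch contents: in particular, it assumes master was never updated because the dead-end release was not merged back. It reports, for each branch, what it contains that is not live and what is live that it lacks.

```python
from typing import Dict, List, Set


def drift_from_production(
    branches: Dict[str, Set[str]], production: Set[str]
) -> Dict[str, Dict[str, List[str]]]:
    """Compare each branch's features with what is actually deployed."""
    report: Dict[str, Dict[str, List[str]]] = {}
    for name, contents in branches.items():
        report[name] = {
            "not_live": sorted(contents - production),  # in branch, not deployed
            "missing": sorted(production - contents),   # deployed, not in branch
        }
    return report


# What the hacked-together release branch actually deployed.
production = {"Feature-A", "Feature-B"}

# Assumed state of the long-lived branches after the dead-end release.
branches = {
    "master": set(),  # assumption: release was never merged back
    "development": {"Feature-A", "Feature-B", "Feature-X", "Feature-D"},
}

report = drift_from_production(branches, production)
for name, diff in report.items():
    print(name, diff)
```

Neither branch comes back with two empty lists, which is exactly why it is unclear where a hotfix should be branched from.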

 

How to fix this

There is a way around this nightmare of a situation, but it requires changing how the development branch works and requires all developers to follow this system. So here’s the single concept we are going to use to avoid this.

You will never merge a feature back into the development branch. Every developer should have full trust that if they start a new feature, they are working off a snapshot of the current live website. 

Simple!

So what happens when we start and finish a feature? The new pattern is this:

  • Rule #1: New features are created from the latest development branch.
  • Rule #2: When features are completed, they will only be merged into a QA branch (based off the latest development branch).
  • Rule #3: QA will eventually be merged into a new release branch.
  • Rule #4: Once deployment is completed, the release is merged back into master and development, starting a new cycle.
  • Rule #5: Undeployed features are updated to the latest build by having the development branch merged back into them (more on this in questions below).

Sidenote: The master and development branches are always identical. The reason for having a development branch is to make it easier to hook into CI servers, which usually expect certain naming conventions. Plus it has the side benefit of being less confusing for new developers who are used to working off a development branch.
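As a rough guide, rules #1 to #4 map onto ordinary Git commands. The sketch below is a small Python helper that only builds the command strings (it does not run them). The branch names `qa`, `feature/<name>` and `release/<version>` are assumptions, as is the choice to fast-forward development to master after the release so the two stay identical; adapt both to your own conventions.

```python
from typing import List


def feature_start(feature: str) -> List[str]:
    # Rule #1: branch from the latest development branch.
    return [f"git checkout -b feature/{feature} development"]


def build_qa(features: List[str]) -> List[str]:
    # Rule #2: create (or reset) QA from development, then merge
    # the finished features into it. Nothing touches development.
    commands = ["git checkout -B qa development"]
    commands += [f"git merge --no-ff feature/{name}" for name in features]
    return commands


def cut_release(version: str) -> List[str]:
    # Rule #3: a tested QA branch becomes a new release branch.
    return [f"git checkout -b release/{version} qa"]


def finish_release(version: str) -> List[str]:
    # Rule #4: after deployment, merge the release into master, then
    # fast-forward development so both branches stay identical.
    return [
        "git checkout master",
        f"git merge --no-ff release/{version}",
        "git checkout development",
        "git merge --ff-only master",
    ]


# Hypothetical cycle with two features and a made-up version number.
cycle = (
    feature_start("Feature-A")
    + feature_start("Feature-B")
    + build_qa(["Feature-A", "Feature-B"])
    + cut_release("1.0.0")
    + finish_release("1.0.0")
)
print("\n".join(cycle))
```

Because `git checkout -B` resets the QA branch if it already exists, rebuilding QA with a different set of features is the same call with a shorter list.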

Using this approach, our branches look more like this:

Git Tree - 4

Taking the original problem into account, the story above now looks more like this:

  1. We have a stable development branch and master branch, both of which are identical to the live production build/site.
  2. A developer starts work on Feature-A. They check out the development branch and create a “feature/Feature-A” branch.
  3. The developer finishes their work, but they do not merge back into the development branch. Instead of polluting that branch, we spin up a new QA branch from development and merge our changes into that. If another developer has already created a new QA branch based off the most recent development branch we can use that instead.
  4. An email is sent to QA to run some tests on the QA branch. If the teams are small enough, send an email to your whole team so everyone knows to check the QA build when they have time.
  5. The other developers create Feature B, C and D and merge into QA.
  6. Release day! We get news Feature C is being removed. We spin up a new QA branch and merge features A, B and D into it. Once a new round of QA is complete, we merge into a release branch.
  7. Once deployed, we merge Release back into Master and Development. The old feature branches are then deleted. You can then either delete your QA branch until a new QA round starts, or you can create one off your master/development branch and just keep it open between deployments. I prefer to just delete it once a deployment is complete.
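The walkthrough above can be sketched as a toy model in Python, where a branch is nothing more than the set of features it contains. The `Repo` class and its method names are invented for illustration. The point it tries to show is that dropping Feature C only means rebuilding QA with a shorter list, and that development never contains anything that is not live.

```python
from typing import Dict, Iterable, Set


class Repo:
    """Toy model: a branch is just the set of features it contains."""

    def __init__(self) -> None:
        self.production: Set[str] = set()
        self.development: Set[str] = set()  # always mirrors production
        self.open_features: Dict[str, Set[str]] = {}  # feature -> its base

    def start_feature(self, name: str) -> None:
        # Step 2: branch from development, i.e. from the live snapshot.
        self.open_features[name] = set(self.development)

    def build_qa(self, features: Iterable[str]) -> Set[str]:
        # Steps 3 and 6: a fresh QA branch is development plus chosen features.
        return self.development | set(features)

    def deploy(self, qa: Set[str]) -> None:
        # Step 7: release goes live, then master/development catch up
        # and the shipped feature branches are deleted.
        self.production = set(qa)
        self.development = set(qa)
        for name in qa:
            self.open_features.pop(name, None)


repo = Repo()
for name in ["Feature-A", "Feature-B", "Feature-C", "Feature-D"]:
    repo.start_feature(name)

qa = repo.build_qa(["Feature-A", "Feature-B", "Feature-C", "Feature-D"])
# Release day: Feature C is pulled, so QA is rebuilt without it.
qa = repo.build_qa(["Feature-A", "Feature-B", "Feature-D"])
repo.deploy(qa)

print(sorted(repo.production))     # ['Feature-A', 'Feature-B', 'Feature-D']
print(sorted(repo.open_features))  # ['Feature-C']
```

Feature-C is still open with its work intact, and nothing that shipped was ever branched on top of it.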

 

Questions

What happens to Feature-X in this model? Isn’t it now one release behind?

As per rule #5 above, after a release is deployed a check should be done for any features that are still open. If a feature skips a release or two, every time the development branch is updated, the feature branch can be rebased back onto the latest development. Any conflicts or updates that need to be performed are now handled by the person writing the feature, not by the deployment team at the 11th hour who know nothing of the code. This gives the feature developer ample time to incorporate any conflicting changes into his design. Once his feature is complete and it’s merged back into QA (and subsequently, development and master), we can be sure he won’t overwrite anything accidentally with outdated code. To update Feature-X, our branch looks like this:

Git Tree - 5

Developers can theoretically keep updating like this forever, always sure they won’t lose work and are working off the latest snapshots of the live build/site.
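As a sketch of what that update step might look like after each release, the Python helper below builds the Git commands for every feature branch that is still open. It only produces command strings. The `feature/<name>` naming and the `strategy` parameter are assumptions: rule #5 describes merging development into the feature, while a rebase is the other common option.

```python
from typing import List


def update_open_features(open_features: List[str], strategy: str = "merge") -> List[str]:
    """Build the commands that bring open feature branches up to date
    with the freshly released development branch."""
    if strategy not in ("merge", "rebase"):
        raise ValueError(f"unknown strategy: {strategy}")
    commands: List[str] = []
    for name in open_features:
        commands.append(f"git checkout feature/{name}")
        # Conflicts surface here, in front of the feature's own developer.
        commands.append(f"git {strategy} development")
    return commands


# Feature-X skipped the release, so it gets updated afterwards.
print("\n".join(update_open_features(["Feature-X"])))
```

Rebasing rewrites the feature branch's history, so it is usually only comfortable when one developer owns the branch; merging is the safer default for shared branches.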

Is GitFlow really that bad? It seems that in GitFlow if you always work off the development branch for new features, you are always moving forward in the codebase. If a feature is removed, shouldn’t that just be considered a new feature and incorporated into the build/branches?

In theory, yes! However, this is why I say GitFlow is only perfect in a perfect world. It is tempting to think that removing a feature should just be considered a feature in itself and be developed (undeveloped?) just as any other feature would, but this is rarely possible given the time when features are most likely to be removed: hours before a deployment. If you can guarantee that anytime a feature is removed you have ample time to restructure the app/website around this removal without impacting timelines, then by all means go for it.

There is a world where this is possible, and it’s for people using Git for what it was intended for: open source projects with public commits from a huge number of different sources. These projects do tend to only move forward with their branches. If a feature is included and later needs to be removed, that in itself is considered a development task and will probably show up in patch notes. But what works for the open source GitHub world doesn’t work, in my opinion, for day to day development in studios which have a lot less control over features and have to handle deployments.

What happens if a feature is being developed that is dependent on another feature branch? Since the development branch won’t be updated until after release and we can’t build off QA, how do I get those changes?

Ideally you should do a release before starting a new feature that is based on another. Since you should only be building off the latest snapshot of production, you should only be building off another developer’s feature once it’s been committed to the latest build.

Now, this isn’t always possible. If you are working on a feature that is being built on top of unreleased changes from another developer’s feature work:

  1. Create a feature branch off their existing feature branch. You know that your work is dependent on that feature, so if theirs doesn’t go live you already know your feature isn’t going live either. You should be periodically merging the parent feature into yours to make sure you’re working off their latest changes.
  2. Less ideally, build your branches all the way up to the release branch, merge it back into development and treat it as a virtual-deployment. The drawback of this method is that if the features you just merged into development need to be removed, you will have to spin up a new feature branch to undevelop those code changes (similar to the GitFlow approach in the previous question). This was the problem we were trying to avoid, so it’s advised not to do this often.
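For the first option, the dependency is explicit, so it is easy to work out what can still ship when something is pulled. The sketch below is a toy Python model with a made-up `parents` mapping (each feature points at the feature branch it was created from, or `None` if it came straight from development). It walks up the chain and excludes anything stacked on a removed feature.

```python
from typing import Dict, List, Optional, Set


def shippable(parents: Dict[str, Optional[str]], removed: Set[str]) -> List[str]:
    """Features that can go live: not removed themselves and not
    stacked (directly or indirectly) on a removed feature."""

    def blocked(name: str) -> bool:
        seen: Set[str] = set()
        current: Optional[str] = name
        while current is not None and current not in seen:
            if current in removed:
                return True
            seen.add(current)  # guards against accidental cycles in the map
            current = parents.get(current)
        return False

    return sorted(name for name in parents if not blocked(name))


# Hypothetical setup: Feature-D was deliberately branched off Feature-X.
parents = {
    "Feature-A": None,
    "Feature-B": None,
    "Feature-X": None,
    "Feature-D": "Feature-X",
}

print(shippable(parents, removed={"Feature-X"}))  # ['Feature-A', 'Feature-B']
```

The difference from the original disaster is that this outcome is known in advance: Feature-D's developer chose to depend on Feature-X, rather than discovering it hours before deployment.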

 


That’s it. Hopefully this makes your deployments and branching strategies easier. I used this pattern at my old office for around 7 months and it solved many of the headaches we were having with deployments. Since then I’ve seen similar techniques pop up with integration branches. If you have any feedback on this approach or can think of some edge cases that I didn’t cover, please feel free to comment.
