Version Control

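Version control systems (VCS) make large software projects with large teams possible. Version control lets you record every change to your code base, see who made it and why, and go back to an earlier version when something breaks. Amazingly, Google keeps almost all of its code in a single code repository.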

Major Version Control Systems

There are four major version control systems that you will encounter:

  • CVS - version 1.0 released 1990 - Concurrent Versions System
  • SVN - created 2000 - Subversion
  • TFS - created 2005 - Team Foundation Server (Microsoft)
  • Git - created 2005 - described by Linus Torvalds as "the stupid content tracker"

Most code bases created after 2010 will use Git for code management. Microsoft shops may use TFS, and some older code bases will use SVN. Very large or very old companies may have custom or proprietary code management systems.

CVS (Concurrent Versions System) was created in the 1980s by Dick Grune, with many other contributors since. It is free to use, but the latest version was released in 2008 and you will rarely encounter it in use today.

Subversion was created by CollabNet in 2000 and is now an Apache Software Foundation project. Branches in SVN are cheap copies of the trunk (SVN's name for the master branch) that work much like Unix hard links. From the documentation, "Subversion has no internal concept of a branch—it knows only how to make copies".

Team Foundation Server is tightly integrated into the .NET development environment and Azure cloud deployment.

Git was created by Linus Torvalds out of frustration with existing version control systems. In Git, creating a branch does not copy the code base: a branch is a lightweight pointer to a commit, so only the changes made on the branch add anything new to the repository. This makes branching very fast and "cheap." Git does not use much disk space or resources to create branches.

Because of this, you can create and combine (merge) branches very rapidly.
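One way to picture why branching is cheap is a toy model in which a branch is nothing more than a name pointing at a commit. The sketch below is only an illustration: `ToyRepo`, `Commit` and the commit ids are hypothetical names, not Git's real data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """One point in history: an id, a message, and a link to its parent."""
    commit_id: str
    message: str
    parent: Optional[str] = None


@dataclass
class ToyRepo:
    """Hypothetical, simplified repository: commits plus named branch pointers."""
    commits: Dict[str, Commit] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)  # branch name -> commit id

    def commit(self, branch: str, commit_id: str, message: str) -> None:
        # Add a commit on top of the branch and move the branch pointer forward.
        parent = self.branches.get(branch)
        self.commits[commit_id] = Commit(commit_id, message, parent)
        self.branches[branch] = commit_id

    def create_branch(self, name: str, start: str) -> None:
        # Creating a branch stores one pointer; no commit or file is copied.
        self.branches[name] = self.branches[start]


repo = ToyRepo()
repo.commit("master", "c1", "Initial commit")
repo.commit("master", "c2", "Add login page")
repo.create_branch("bug_243_missing_button", "master")
repo.commit("bug_243_missing_button", "c3", "Restore missing button")

print(repo.branches)      # {'master': 'c2', 'bug_243_missing_button': 'c3'}
print(len(repo.commits))  # 3
```

Creating the branch added a single dictionary entry, and the new branch shares commits c1 and c2 with master. Only the work done on the branch (c3) takes up new space.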

Git is the most popular version control system, and most code bases either use Git already or are being migrated to it from an older system. Git can be used with .NET code bases and integrates with Visual Studio. Most of this post will focus on Git for that reason. Almost any code management tool, deployment service or IDE can integrate with Git.

Version Control Concepts

There are two main ways to work with version control to allow teams to work on different segments of the code base at the same time. One is Forking, the other is Branching. They can be used at the same time but it is good to understand the differences and strengths of each.

                               Forking        Branching
-----------------------------  -------------  -------------
Process                        Copy/Clone     Diffs Only
Dependent on Master branch?    Independent    Dependent
Control over merging           High           Low
Level of Trust                 Low            High

Forking Strategy

Forking is essentially the same as cloning or copying. A fork is independent of the original repository. Most open source projects use a forking model. You will download your own copy of the repository to work on it.

Once you have modified your fork and want the changes pulled into the master branch of the original repository, you make a pull request. Depending on the system used (BitBucket, GitHub, etc.), the manager of the repository will get a notification that you want to do this and can then accept or reject your changes.

Branching Strategy

Branching is supported in all Version Control Systems to some degree. A branch in Git or TFS does not duplicate the code base; in effect it only adds the differences between the master branch and the current branch (called a "diff"). In SVN a "branch" is a copy of a directory in the repository, so it behaves more like a fork.
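To make the term "diff" concrete, here is a small sketch using Python's standard `difflib` module. The file names and the two versions of `greet` are made up for illustration, and this shows only what a diff looks like, not how any particular VCS stores one.

```python
import difflib

# Two hypothetical versions of the same file: one on master, one on a branch.
master_version = [
    "def greet(name):",
    "    print('Hello ' + name)",
]
branch_version = [
    "def greet(name):",
    "    print(f'Hello, {name}!')",
]

# unified_diff yields the header lines followed by the changed lines with context.
diff = list(difflib.unified_diff(
    master_version,
    branch_version,
    fromfile="master/greet.py",
    tofile="bug_243/greet.py",
    lineterm="",
))
print("\n".join(diff))
```

Lines starting with "-" exist only on master, lines starting with "+" exist only on the branch, and unchanged lines are shown as context.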

Branches can also use pull requests as a merging process. As you can see, the workflows are nearly identical. The main difference is how you start your work: You will "clone" to begin a forked workflow and "branch" to begin a branched workflow.

    Forking Workflow                        Branching Workflow
    --------------------------------------  --------------------------------------
1.  Clone the master branch                 Branch off master branch
2.  Name the fork with the ticket name      Name the branch with the ticket name
    and/or a description,                   and/or a description,
    e.g. bug_243_missing_button             e.g. bug_243_missing_button
3.  Modify/add code in the fork             Modify/add code in the branch
4.  Send fork to QA. This may include       Send branch to QA. This may include
    deploying the fork to QA machines.      deploying the branch to QA machines.
5.  Send a Pull Request with your fork      Send a Pull Request with your branch
    to the code librarian                   to the code librarian
6.  On success, code is merged into the     On success, code is merged into the
    Master branch and deployed              Master branch and deployed
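The naming convention in the workflow above (ticket name plus a description) is easy to automate. Here is a minimal sketch; the `branch_name` helper and its parameters are hypothetical, and your team may prefer a different format.

```python
import re


def branch_name(kind: str, ticket: int, description: str) -> str:
    """Build a name such as 'bug_243_missing_button' from ticket details."""
    # Lowercase the description and turn every run of other characters into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"{kind.lower()}_{ticket}_{slug}"


print(branch_name("bug", 243, "Missing button"))  # bug_243_missing_button
```

Generating names this way keeps them consistent, which makes it easier to match a fork or branch to its ticket later.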

Merging Strategy

Merging is the process of combining code from branches or forks into other branches (ultimately "master" or "trunk").

This is frequently performed by a single individual under the role of the code librarian. This person will perform a final code review and check for any merge issues. Merging to various branches can also be handled by automated deployment systems such as Jenkins.
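To show what "merge issues" means, here is a simplified sketch of a three-way merge that works on whole files. It assumes each version of the code is a dictionary of file path to content, and `three_way_merge` is a hypothetical helper. Real tools such as Git merge line by line within files, so treat this as an illustration of the idea only.

```python
from typing import Dict, List, Tuple

Files = Dict[str, str]  # file path -> file content


def three_way_merge(base: Files, ours: Files, theirs: Files) -> Tuple[Files, List[str]]:
    """Merge two sets of files that share a common ancestor (base).

    Returns the merged files and the paths that need a human decision.
    """
    merged: Files = {}
    conflicts: List[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree (or neither side changed the file)
        elif o == b:
            result = t  # only "theirs" changed the file
        elif t == b:
            result = o  # only "ours" changed the file
        else:
            conflicts.append(path)  # both sides changed it differently
            result = o  # keep ours for now; a reviewer has to resolve it
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"button.html": "<button>", "style.css": "a{}"}
master = {"button.html": "<button>", "style.css": "a{color:red}"}
branch = {"button.html": "<button id='ok'>", "style.css": "a{color:blue}"}

merged, conflicts = three_way_merge(base, master, branch)
print(merged["button.html"])  # <button id='ok'>
print(conflicts)              # ['style.css']
```

The button change merges cleanly because only the branch touched that file. The stylesheet was changed on both sides, so it is flagged as a conflict, which is exactly the kind of issue the code librarian checks for.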

Deployment Strategy

At small and mid-sized companies, deployment is frequently done over FTP, either manually or with an automated script. There are advantages to this, as you don't need to have a VCS (Version Control System) client on each server.

Without version control on the production servers:

  • Less software running
  • Simpler deployment, both conceptually and in terms of speed
  • Very dependent on human interaction
  • Can be difficult to detect differences between production machines and the code base. Unused files need to be removed manually.

With version control running on the production servers:

  • Deployment can be harder for humans but easier to automate
  • Servers can be scaled with automated deployment
  • Easy to detect differences between production machines and the code. Unused files are removed automatically.
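If you deploy without version control on the servers, you can still check for differences between a production machine and the code base with a script. The sketch below compares two directory trees by file hash; `deployment_drift` and `file_hashes` are hypothetical helpers, and it assumes both trees are reachable as local paths.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_hashes(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def deployment_drift(code_base: Path, server: Path) -> Dict[str, List[str]]:
    """Report files that are missing, changed, or left over on the server."""
    expected, actual = file_hashes(code_base), file_hashes(server)
    return {
        # In the code base but not on the server.
        "missing": sorted(set(expected) - set(actual)),
        # On the server but no longer in the code base.
        "unused": sorted(set(actual) - set(expected)),
        # Present in both places but with different content.
        "changed": sorted(p for p in expected.keys() & actual.keys() if expected[p] != actual[p]),
    }
```

You would call it with something like `deployment_drift(Path("checkout"), Path("/var/www/site"))`, where both paths are placeholders for your own directories. A real check would probably also skip VCS metadata folders and any files the application generates at run time.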

Hopefully you learned something new about version control! If you have any questions or want to learn more, let me know in the comments!