Why one should use version control like Git or SVN for nearly everything

07 Oct 2021 - tsp
Last update 07 Oct 2021
Reading time 9 mins

Introduction

First off: what does “everything” mean in the context of this blog entry, what is a version control system, and who is this article for? It’s not aimed at experienced software developers who already manage their code using git or SVN; to them it will be boring and sound somewhat strange. It’s aimed at people who currently don’t use SCM (source code management) for any task. By everything I mean stuff like:

What I don’t mean are large binary files such as your media or photo collection, temporary files that can easily be regenerated later, and large databases or scraped and extracted data that can be regenerated.

What is version control? Version control systems (sometimes also called revision control, source control or source code management systems) allow one to manage collections of files, centrally or in a distributed fashion, while keeping every version of each file. Imagine you change something in your program’s source code or in your thesis and want to look at the old version later on.

Often one sees people naming their files thesis, thesis_final, thesis_finallyfinal, thesis_finallyfinal_really and so on. They then shuffle these files around on external storage devices such as external hard disks or USB flash drives, often with colliding names, and later overwrite much of their new work, fail to locate the most current version, cannot find their comments again, and so on.

Version control systems solve that problem, including the moving around on USB sticks. They usually provide a blame feature that shows who changed what and when, which helps when working in a team. They also usually allow for seamless collaboration by including merge tools: if many people modify the same file at different positions, the tools can usually merge the differences automatically (when proper, text-based file formats are used) or at least highlight merge conflicts. And you never lose any old content. So think about what you put into a repository: if everything goes right nothing will ever be deleted, and most systems do not even support deletion without major hacking around in their internal representation.
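To make the automatic merging a bit more tangible, here is a small sketch. It assumes that git is installed and uses its git merge-file helper, which performs a three-way merge on plain files and does not need a repository. The file names and their content are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory for the demonstration.
demo=$(mktemp -d)
cd "$demo"

# The common ancestor version of the file.
printf 'Introduction\nMethods\nResults\nDiscussion\nConclusion\n' > base.txt

# Alice edits the first line, Bob edits the last line.
printf 'Introduction (revised)\nMethods\nResults\nDiscussion\nConclusion\n' > alice.txt
printf 'Introduction\nMethods\nResults\nDiscussion\nConclusion and outlook\n' > bob.txt

# Three-way merge of both edits relative to the common ancestor.
# -p writes the result to stdout instead of overwriting alice.txt.
git merge-file -p alice.txt base.txt bob.txt > merged.txt
cat merged.txt
```

Because the two edits touch different lines, the merged file contains both of them. Had both people changed the same line, the tool would have inserted conflict markers instead and left the decision to a human.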

These systems have mainly been developed for software development, but the problem of revision management is as old as writing itself, and they can be applied very efficiently to all textual content. In fact, this web page is built from content kept in a source control system.

Different models and basic operations

There are two main models for source control, and only two really popular software packages, although a huge number of different tools exists.

First there are centralized version control systems. These are built around a central repository that is usually hosted on a server reachable on the network or via the internet. A typical representative is Subversion (SVN). One creates a repository on the server (automated backups should be set up there) and then checks out (copies) the required version or branch from the server using the svn tool. Changes are made locally in the working copy and then committed (copied back to the server) into the central storage. Locally one only stores the working copy in one fixed version. The main advantages of a centralized version control system are that one only checks out a given version or a given subset of the project, and that rule checking and linting of commits can be performed centrally. To use SVN one usually only needs to know three different commands:
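A minimal sketch of such a session, assuming the svn and svnadmin command line tools are installed. In day-to-day use the commands one typically needs are svn checkout, svn update and svn commit (plus svn add for new files). The repository here is a throwaway one in a temporary directory reached via a file:// URL, and all file names are made up; a real repository would live on a server.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location. On a real setup the repository lives on a
# server and is addressed by an http(s):// or svn:// URL instead of file://.
base=$(mktemp -d)
svnadmin create "$base/repo"
repo_url="file://$base/repo"

# checkout: create a local working copy of the (still empty) repository.
svn checkout -q "$repo_url" "$base/wc"
cd "$base/wc"

# Create a file and put it under version control.
printf 'Chapter 1\n' > thesis.txt
svn add -q thesis.txt

# commit: send the local changes to the central repository.
svn commit -q -m "Add first draft of chapter 1"

# update: fetch whatever has been committed to the repository in the meantime.
svn update -q

# Show the history stored in the repository.
svn log
```

The log message passed with -m is what later shows up in svn log, so it pays off to write something meaningful there.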

In addition, SVN supports locking and unlocking resources so one can negotiate who modifies which resource, but usually this is not needed. Another operation one might need is revert, which discards local changes that have not been committed yet and restores a file to the revision that was checked out. The blame utility helps to identify who made which modification.
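Revert and blame can be tried out with a small sketch like the following. It again assumes that svn and svnadmin are installed and uses a throwaway repository in a temporary directory; the file name and content are placeholders.

```shell
#!/bin/sh
set -eu

# Throwaway repository and working copy (hypothetical paths).
base=$(mktemp -d)
svnadmin create "$base/repo"
svn checkout -q "file://$base/repo" "$base/wc"
cd "$base/wc"

# Commit a first version of a file.
printf 'Chapter 1\n' > thesis.txt
svn add -q thesis.txt
svn commit -q -m "Add first draft of chapter 1"

# Make an unwanted local edit; status lists the file as modified (M).
printf 'some accidental text\n' >> thesis.txt
svn status

# revert: throw away the uncommitted edit and restore the checked-out content.
svn revert -q thesis.txt
svn status

# blame: show for every line the revision and author that last changed it.
svn blame thesis.txt
```

After the revert the second svn status prints nothing, since the working copy matches the repository again.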

Then there are distributed version control systems such as the really popular Git (note that this is not the same thing as the well known GitHub hosting service, though that is a really easy starting point for newcomers) or the less well known, older darcs. Git is able to run in distributed mode by keeping a complete local repository of its own, including all versions, but also allows one to synchronize with remote repositories as in the centralized case. This makes using git a little more cumbersome and harder to think about than using SVN, but for source code in the open source environment it is currently more popular than SVN due to its distributed nature. You can simply take the whole repository with you offline, and you always have a complete copy (which solves the backup problem if you simply clone / pull the repositories on different machines and keep them in sync).

To use git one requires at least the following commands:
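A minimal sketch of a typical session, assuming git is installed. It walks through the commands one commonly needs: clone, add, commit, push and pull, plus diff, log and blame for looking at changes. The "remote" is just a bare repository in a temporary directory standing in for a server, and the directory names laptop and desktop, the identity and the file content are placeholders.

```shell
#!/bin/sh
set -eu

base=$(mktemp -d)

# Stand-in for a remote repository; with a hosting service this would be a URL.
git init -q --bare "$base/remote.git"

# clone: copy the whole repository, including its history, to this machine.
git clone -q "$base/remote.git" "$base/laptop"
cd "$base/laptop"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.org"

# add + commit: record a new version in the local repository.
printf 'Chapter 1\n' > thesis.txt
git add thesis.txt
git commit -q -m "Add first draft of chapter 1"

# Change the file, look at the difference, and commit again.
printf 'Chapter 2\n' >> thesis.txt
git --no-pager diff
git commit -q -a -m "Add chapter 2"

# push: publish the local commits to the remote repository.
git push -q origin HEAD

# On a second machine: clone once, later use pull to fetch new commits.
git clone -q "$base/remote.git" "$base/desktop"
cd "$base/desktop"
git pull -q --ff-only

# log and blame: inspect the history and see which commit last touched each line.
git --no-pager log --oneline
git --no-pager blame thesis.txt
```

Note that commit only records changes locally; nothing leaves the machine until push is run. That is the main difference to the SVN workflow, where a commit goes straight to the server.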

The previously mentioned GitHub service is a nice external storage solution for your git repositories if they are either public or should be shared only with a small number of collaborators or a small group.

I’ve previously written a short git cheat-sheet that summarizes how to do common tasks with git. Git is really worth learning and, unlike centralized systems, it does not require one to administer a server for a central repository.

Problems solved by using these tools

Drawbacks


Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)

This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/
