Wednesday, June 5, 2013

Source Code Control

Background


I always think, we programmers are like writers and artists. We create software that could be powerful and beautiful. There is no lack of abstraction in our work either!writer

There is one difference, though. When an Author writes a book, she could revise her work so many times at her desk, but only the final outcome becomes public. Not so with our work! A programmer comes up with several revisions of his/her program before it goes to production. Even so, there may be a bug or something was missed that he/she has to revise it again.

A software could also be "backed out" to a prior version. (Even software giant like Microsoft does this once in a while. Remember when people rejected Windows Vista, their solution was to go back to XP, the prior version of Windows! I myself went back to Gnome desktop on my Ubuntu 12.04 desktop, because of some issues with Unity desktop). And of course, someone might point out that something worked in an earlier version, that you will have to dig up the sources for that version, find out what changed and try to re-incorporate that code back into the new version!! I can't imagine an artist having to go through this type of revising of his/her painting! (I agree a book may have editions, but it's always move forward for them.)

So, if you are a software developer you need to worry about keeping track of the history of the program. Not just the history, but all the files that made up a particular version! And to do this we resort to so many ingenious(?) ways. How many times, have you seen files with .bak, .sav, .org etc extensions on a developer's machines? (guilty!). Some may even organize their files in dated directories, so they can go back to a specific date (guilty as well). But when your program is made up of so many files (gone are the days when you wrote all your code in one file!), such rudimentary methods of keeping track of files won't work! Even more so, if you are working (collaborating) with other developers.

Source Code Control or Version Control


Wouldn't it be better if a Software can track these information at the source level, so it's easy to pull up any version of the software sources and rebuild it? Such a software tool exists for over 40+ years and is known variously as Source Code Control Software (SCCS), Version Control System (VCS), Revision Control System (RCS) and even Software Control Management (SCM). Dozens of tools have been developed with these acronyms in their names. Some of the most popular ones over the years are SCCS, RCS, PVCS, Perforce, Visual Source Safe (VSS), Mercury etc. And in the open source arena, we have had CVS, SVN and Git. Please refer to the wiki site for comparison of many more tools available on the market. The difference is not only in the names, but in techniques and technology involved. Newer tools support concurrency better.

The least a VCS can do is to keep track of history of files and thus offers reversibility. With various versions saved in the repository, it is only natural that we require the VCS to provide a good compare (diff) utility or at least the capability of using external diff tools.  Any type of files can be saved in the repository including binary files. Though, the diff utility is typically geared for text files.

More traditional VCS offers a centralized repository. SCCS, PVCS etc belong to this category. More modern ones offer more distributed approach. With developers from around the world working on open source projects, distributed repositories more sense. And the other characteristics that go with the territory is the concurrency. Again, modern VCS have more and more concurrent support built in. For e.g., CVS refers to Concurrent Version System.

Further Reading

Please refer to the Eric Sink's website (his book is available in PDF format on his site) for detailed discussion on the SCCS tools. This blogger recaps the history of Source Code control in a funny way! And the wiki page like always delivers a nice introduction to this topic. Eric Raymond's paper explains various generations of Version Control Systems nicely.

http://www.ericsink.com/vcbe/index.html
http://www.catb.org/esr/writings/version-control/version-control.html#why_vcs
https://code.google.com/p/pysync/wiki/VCSHistory