Thursday, August 13, 2009

playing with git

I have set up a CVS-to-git export cron job for the gLite data management and CASTOR CVS modules so that we can try git.

On RedHat derivatives you can install it from the DAG repository:

yum install git git-cvs git-svn qgit

On Ubuntu you can install the dependencies with:

sudo apt-get install git-core git-svn git-cvs qgit giggle

And then you can check out a module:

$ time git clone git://lxtank02.cern.ch/org.glite.data.srm-util-cpp
...
real 0m0.803s
user 0m0.340s
sys 0m0.044s

At this point my checkout was a bit lost among the branches,
so it needed a little help to find its way back to the true path:

$ cd org.glite.data.srm-util-cpp
$ git merge origin/origin

At this point you can start branching at will to play with the GFAL code.
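
A minimal sketch of what branching looks like, run in a throwaway repository so that nothing touches the real clone. The scratch directory, the file name, the branch name `gfal-experiment` and the committer identity are all made up for illustration; inside the cloned module only the `git checkout -b` step would be needed.

```shell
# Scratch repository standing in for the cloned module (hypothetical path).
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "hello" > README
git add README
# Identity is passed inline so the sketch does not depend on global git config.
git -c user.name="Demo" -c user.email="demo@example.org" commit -q -m "initial import"

# Create a local branch and switch to it in one step.
git checkout -q -b gfal-experiment

# Show which branch we are on now.
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "$current_branch"
```

Branches created this way are local to your clone, so experiments stay private until you decide to publish them.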

You can sync up later with CVS using
$ git pull
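
Under the hood, `git pull` fetches the new history and merges it into the current branch. Here is a small sketch with two scratch repositories, where `upstream` plays the role of the cron-exported repository on the server. All paths, file names and the committer identity are illustrative assumptions, not the real setup.

```shell
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the exported repository on the server.
git init -q upstream
cd upstream
echo "first export" > file.txt
git add file.txt
git -c user.name="Demo" -c user.email="demo@example.org" commit -q -m "first export"
cd ..

# Clone it, as you would clone the real module.
git clone -q upstream checkout

# New history arrives upstream (in real life: the next cron export).
cd upstream
echo "second export" >> file.txt
git -c user.name="Demo" -c user.email="demo@example.org" commit -q -a -m "second export"
cd ..

# Bring the clone up to date.
cd checkout
git pull -q
pulled=$(tail -n 1 file.txt)
echo "$pulled"
```

Since the clone has no local commits of its own here, the merge is a simple fast-forward.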

Once we move from CVS to SVN, committing through git would also become feasible.


To see the efficiency of the storage here is a small experiment:

$ git clone git://lxtank02.cern.ch/CASTOR2
$ du -sh CASTOR2/.git
37M CASTOR2/.git
$ cd CASTOR2; time git pull
...
real 0m0.273s
user 0m0.092s
sys 0m0.124s

$ rm -rf CASTOR2
$ cvs -d ':pserver:anonymous@isscvs.cern.ch:/local/reps/castor' co CASTOR2
$ du -sh CASTOR2
58M CASTOR2
$ cd CASTOR2; time cvs up
...
real 0m2.700s
user 0m0.156s
sys 0m0.136s

In plain words, the storage for all the versions going back to 1999 (37 MB) is smaller than a single checked-out workspace (58 MB).
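
One reason the history stays this compact is that git names file contents by their hash, so identical contents are stored once no matter how many files or revisions contain them; objects are also zlib-compressed and, once packed, delta-compressed against similar objects. A small sketch in a scratch repository (the file names are made up) shows two identical files mapping to a single stored object:

```shell
demo=$(mktemp -d)
cd "$demo"
git init -q .

# Two files with byte-identical contents.
printf 'same contents\n' > a.txt
printf 'same contents\n' > b.txt
git add a.txt b.txt

# Blob IDs as recorded in the index: both paths point at the same object.
id_a=$(git rev-parse :a.txt)
id_b=$(git rev-parse :b.txt)
echo "$id_a"
echo "$id_b"

# Number of loose objects in the object store after staging both files.
loose=$(git count-objects | cut -d' ' -f1)
echo "$loose"
```

On a real clone such as CASTOR2, `git count-objects -v` gives a similar breakdown, including the size of the packed history.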

1 comment:

  1. Bad sign for your code. Git detects duplicated files (or blocks) and stores only a single copy of them (like a hard link). When you check out your code with git checkout, the duplicates are re-created in the workspace.
