Wednesday, November 25, 2009

FTS releases

The current, 'WLCG approved' version of FTS is FTS 2.1.

gLite has already released FTS 2.2 (i.e. FTS 2.2.0). However, ATLAS
discovered some shortcomings with the checksum support while using the
Pilot service; these are fixed in the upcoming FTS 2.2.1 and
FTS 2.2.2 releases.

FTS 2.2.2 has been certified by gLite and installed on the CERN FTS
Pilot service. It will run there until the beginning of December
in order to reach the 'WLCG approved' status.

Currently we are working on FTS 2.2.3 to address a few more issues.

Have a look at the FTS patch status page for more details on older releases!

releases

My magic URL for upcoming data management releases points into Savannah.

We usually create a 'patch' (i.e. release) when we have a draft idea of what should
go into a release. For example, FTS 2.2.3 and GFAL 1.11.13 were created with all the
bugs attached that we intend to fix by the release date.

The first noteworthy state is 'Ready for certification', when the developers have
finished their work and there are already RPMs created. At this point we usually
upload the packages into our Release Candidate repository for the convenience of
early testers.

The next noteworthy state is 'Certified', when the release has passed all regression
tests and the new features seem to be working.

After this state there are a few weeks of testing (i.e. waiting to see whether any unexpected
behaviour shows up) in the pre-production testbed (PPS), and then comes the gLite release.

Thursday, November 19, 2009

Unifying LCG_Util and GFAL version numbers

A common source of confusion is which LCG_Util version requires which GFAL library version. After almost every release, somebody installed the wrong packages somewhere. Now the confusion is over: from the next release on, we will always release the two components together, under the same version number (but with different tag prefixes, of course). We will create the first such release pair this week, with version number 1.11.12-1 (the next GFAL version number). This means that there will be a gap in the LCG_Util numbering: version 1.7.8-1 will jump up to 1.11.12-1. Stay tuned.

Wednesday, November 4, 2009

Debugging tricks

When you want to debug the command-line tools of the projects, you immediately find that the commands in the build tree are in fact shell scripts. They are libtool wrapper scripts, which set up several things before calling the binaries themselves. You need to invoke gdb through libtool, in the following way:

libtool --mode=execute gdb _command_

From this point, everything should work as usual.

Next, you may run into the following trouble when debugging:
[Thread debugging using libthread_db enabled]
Error while reading shared library symbols:
Cannot find new threads: generic error
For me, it occurred on SLC5, when the code used the dynamic loading interface (dlopen, etc.). You can eliminate the problem by linking libpthread directly to your executable. For example, in Makefile.am:

lcg_del_LDADD = $(COMMON_LIBS) -lpthread

Good luck!

Tuesday, November 3, 2009

GFAL and LCG_Util test bed developments

Currently, the GFAL test suite contains integration and regression tests only. Those tests are developed and executed by the certification process. We need something more, and something more flexible: basically, we need unit/white-box tests that check the validity of the GFAL code up to the boundaries of its dependencies. We have started to create a unit test suite for both GFAL and LCG_Util, for debugging and internal validation purposes. The unit test suite requires some redesign of the code (redesign for testability). The pattern behind it is dependency injection. The code covered by unit tests will never call external library code directly (except for the standard C library functions); it will call it through replaceable function pointers. In production, the pointers point to the original functions; a white-box test, however, can replace a set of functions with dummy ones that simulate a scenario.
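To make the pattern concrete, here is a minimal sketch of the idea. It is not the actual GFAL code: the names ext_ops, delete_replica and fake_remove are made up for this illustration, the URL is a dummy string, and libc remove() merely stands in for an external library call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table of replaceable external calls (not a GFAL type). */
struct ext_ops {
    int (*remove_file)(const char *path);
};

/* Production binding: libc remove() stands in for an external library call. */
static struct ext_ops ops = { remove };

/* Code under test: it reaches the outside world only through `ops`. */
int delete_replica(const char *path)
{
    assert(ops.remove_file != NULL);
    if (path == NULL || path[0] == '\0')
        return -1;                      /* rejected before any external call */
    return ops.remove_file(path) == 0 ? 0 : -1;
}

/* White-box test support: a dummy that records calls and simulates a result. */
static int fake_calls;
static int fake_result;
static char fake_last_path[256];

static int fake_remove(const char *path)
{
    fake_calls++;
    strncpy(fake_last_path, path, sizeof fake_last_path - 1);
    fake_last_path[sizeof fake_last_path - 1] = '\0';
    return fake_result;
}

/* Runs three scenarios and returns the number of failed checks. */
int run_delete_replica_tests(void)
{
    struct ext_ops saved = ops;
    int failures = 0;

    ops.remove_file = fake_remove;      /* inject the dummy */

    /* Scenario 1: the backend reports success. */
    fake_calls = 0;
    fake_result = 0;
    if (delete_replica("srm://example/file") != 0) failures++;
    if (fake_calls != 1) failures++;
    if (strcmp(fake_last_path, "srm://example/file") != 0) failures++;

    /* Scenario 2: the backend reports an error. */
    fake_calls = 0;
    fake_result = -1;
    if (delete_replica("srm://example/file") != -1) failures++;
    if (fake_calls != 1) failures++;

    /* Scenario 3: invalid input never reaches the backend. */
    fake_calls = 0;
    if (delete_replica("") != -1) failures++;
    if (fake_calls != 0) failures++;

    ops = saved;                        /* restore the production binding */
    return failures;
}
```

A test driver would simply call run_delete_replica_tests() and expect 0. The point is that the error scenario needs no storage element at all: the dummy decides what the "external library" answers.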

We will not change the whole GFAL code at once, to avoid regressions. We change the code gradually as we solve the Savannah tasks. Every appropriate Savannah task will come with unit tests as well, and only the code affected by a task will be changed. What we will get is a "hybrid": in some cases functions will be called directly, in other cases indirectly. Ideally, we reach full unit test coverage by the time all function calls go through the same mechanism.

We will demonstrate the power of the unit test suite by solving the RFE "Extra parameter in lcg-cp for a better TURL construction".

Besides unit tests, we need a regression/integration test suite that we can control. As we cannot control the certification test bed, and it is tightly bound to the certification test environment, we are creating our own test bed, better integrated into the development environment. We do it by copying the certification tests into our source tree and adapting them to our environment; then we start adding tests that cover our own tasks and purposes.