Showing posts with label fts. Show all posts

Monday, May 31, 2010

FTS 2.2.4 released


Actually, the moment when an FTS version is released is a bit obscure, because we release/certify independently for different platforms, gLite releases, etc. So, in this blog, I will announce a version as "released" when:

- It has been certified / verified for at least one of the supported platforms
- The YUM repository is prepared, so users can install it.

FTS 2.2.4 has fulfilled the above criteria, so it is released :) You can find the release notes here:

FTS 2.2.4 release notes

The release notes have a new format: from now on we publish them in SVN Trac (instead of TWiki), and they will contain a lot more information on a single page than a list of Savannah bug titles.

Tuesday, May 11, 2010

Tuesday, April 20, 2010

FTS at MAGIC collaboration

Another FTS user: the MAGIC (Major Atmospheric Gamma-ray Imaging Cherenkov Telescope) collaboration. The MAGIC-II experiment consists of a system of two Imaging Cherenkov Telescopes located on the Canary Island of La Palma, Spain. They have been designed to study the universe and discover new gamma-ray sources in the energy range from 50 GeV to 5 TeV. These telescopes have 17 m diameter reflectors, which makes them the largest Cherenkov telescopes in the world, and will be operated in stereoscopic mode for enhanced sensitivity.

The MAGIC data center hosted by the Port d'Informació Científica (PIC) in Barcelona is migrating its services to Grid as part of an upgrade needed to deal with the increased data volume. After migrating the data to a Grid filesystem, they have ported FTS. In the last months they set up an SRM endpoint in the observation site and ported all the data transfer tools to use Grid file transfers.

Monday, March 29, 2010

FTS 2.2.3 deployment completed

On 25 March, all the Tier1 sites reported successful deployment of FTS 2.2.3. CERN (Tier0) also upgraded the service. The official LHC scientific program will start on 30 March, with 7 TeV collisions, so we were able to deliver and deploy in time. Mission completed :)

Tuesday, March 23, 2010

FTS at Belle experiment (KEK, Japan)

Nice to know that the world beyond CERN and the LHC Computing Grid also uses FTS. An example is the Belle experiment at the National Laboratory for High Energy Physics (KEK), in Japan. Belle studies the origin of CP violation phenomena, and its results led to the Nobel Prize for Makoto Kobayashi and Toshihide Maskawa in 2008.

Do you know about other applications of FTS, GFAL and LCG_Util? We would like to collect them, so if you do, please let us know :)

Wednesday, February 10, 2010

FTS 2.2.2: rolling out

FTS version 2.2.2 (SL4) has proceeded to the "rolling out" phase. The main updates:

- By default, SRM/gridFTP actions are no longer split. One can re-enable the FTS 2.2.0 behaviour by setting the FTA_TYPEDEFAULT_URLCOPY_AGENT_SRM_GRIDFTP_SPLIT Yaim configuration option. See the related Savannah task.

- More liberal checksum handling: algorithm names should follow the specifications; however, some endpoints do not follow them yet. Temporarily, we accept their conventions and write a warning to the logs. See the related Savannah task.

- Relaxed requirements on how long an SRM TURL is kept valid.
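
The lenient checksum handling can be pictured with a minimal Python sketch. It is not the FTS code: the set of canonical names, the alias table and the function name are all assumptions made up for this illustration. Non-standard spellings are accepted, but each one leaves a warning in the log.

```python
import logging

logger = logging.getLogger("checksum-sketch")

# Canonical algorithm names. This set is an assumption for the example,
# not a list taken from any specification.
CANONICAL = {"ADLER32", "MD5", "SHA1"}

# Hypothetical non-standard spellings an endpoint might use.
ALIASES = {"ADLER-32": "ADLER32", "SHA-1": "SHA1"}


def normalize_algorithm(name):
    """Return the canonical algorithm name, warning on non-standard input."""
    if name in CANONICAL:
        return name
    key = name.strip().upper()
    canonical = key if key in CANONICAL else ALIASES.get(key)
    if canonical is None:
        raise ValueError("unsupported checksum algorithm: %r" % name)
    # Accepted for now, but leave a trace so the endpoint can be fixed.
    logger.warning("non-standard checksum algorithm name %r, using %r",
                   name, canonical)
    return canonical
```

Calling `normalize_algorithm("adler-32")` returns `"ADLER32"` and logs a warning, while an unknown name raises `ValueError`.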

Thursday, February 4, 2010

Thanks for everything, Ákos!

Ákos has left the LHC Computing Grid projects. He coordinated the development and support of the EGEE/LCG grid data management systems, and has led the FTS project since 2002. As he was always short, straight and to the point, following his style, I simply should not write more than

that :)

Your comments here are always welcome!

FTS 2.2.1: rolling out

FTS version 2.2.1 has proceeded to the "rolling out" phase. This version finalizes the checksum support. The database schema has changed, so one has to follow the instructions of the Yaim script to do the upgrade. The API between the transfer-agents and transfer-url-copy has also changed, so one has to stop and drain the channels before upgrading.

FTS 2.2.3 certified

Today, the FTS 2.2.3 version has been certified. The pilot service has been updated with the new version. There have been no problems reported so far. The following changes are included:


They are the infamous "agent crash" and "proxy delegation" problems that prevented the production deployment of FTS 2.2.

Monday, January 18, 2010

The old delegation race condition is back

As it turns out, the old delegation race condition came back to haunt us again in FTS 2.2, even though the fix was released to production almost a year ago.

There is a new 'hand built' glite-data-transfer-fts v3.7.0-3 RPM that provides a quick fix for FTS 2.2 as well. Hopefully the proper glite-security-delegation-java v1.6.0 will also be included in the gLite build configuration, so that future releases will not have this problem again.

Friday, December 18, 2009

To wrap up the year, we have GFAL/lcg_util v1.11.13, FTS 2.2.4 and DPM/LFC v1.7.4.

Details are in the data management release notes.

This FTS snapshot already has a secure preview of the administrative web interface:

Wednesday, November 25, 2009

FTS releases

The current, 'WLCG approved' version of FTS is FTS 2.1.

gLite has already released FTS 2.2 (i.e. FTS 2.2.0), however ATLAS has
discovered some shortcomings with the checksum support in the
Pilot service, which are fixed in the upcoming FTS 2.2.1 and
FTS 2.2.2 releases.

FTS 2.2.2 has been certified by gLite and installed on the CERN FTS
Pilot service and will be running until the beginning of December
to reach the 'WLCG approved' status.

Currently we are working on FTS 2.2.3 to address a few more issues.

Have a look at the FTS patch status page for more details on older releases!

releases

My magic URL for upcoming data management releases points into Savannah.

We usually create a 'patch' (i.e. release) when we have a draft idea of what should
go into a release. For example FTS 2.2.3 and GFAL 1.11.13 are created with all the
bugs attached that we intend to fix by the release date.

The first noteworthy state is 'Ready for certification', when the developers have
finished their work and there are already RPMs created. At this point we usually
upload the packages into our Release Candidate repository for the convenience of
early testers.

The next noteworthy state is 'Certified', when the release has passed all regression
tests and the new features seem to be working.

After this state there are a few weeks of testing (i.e. waiting to see if there is any unexpected
behaviour) in the pre-production testbed (PPS) and then comes the gLite release.

Wednesday, November 4, 2009

Debugging tricks

When you want to debug the command-line tools of the projects, you immediately find that the commands are in fact shell scripts. They are actually libtool wrapper scripts that set up several things before calling the binaries themselves. You need to invoke gdb in the following way:

libtool --mode=execute gdb _command_

From this point, everything should work as usual.

Next, you may run into the following trouble when debugging:
[Thread debugging using libthread_db enabled]
Error while reading shared library symbols:
Cannot find new threads: generic error
For me, it occurred on SLC5, when the code used the dynamic linking library (dlopen, etc.). You can eliminate the problem by linking libpthread directly to your executable. For example:

lcg_del_LDADD = $(COMMON_LIBS) -lpthread

Good luck!

Monday, September 14, 2009

IPv6 compliance in FTS and GFAL

Several Savannah tasks targeted IPv6 compliance; they have now been resolved. The list of the related tasks:

#41844: IPv6 bug; LCG-utils client functionality immediately broken by IPv6

#41278: IPv6 bug: non compliant address in source code (hard coded IPv4: 127.0.0.1)
#41585: [FTA] IPv6 bug: non compliant name resolving function (gethostbyname_r)
#41586: [FTA] IPv6 bug: non compliant name resolving function in source code (gethostbyname_r)

See the resolution details in the comments of the individual tasks. Basically, the general solution was:

- remove dependency on the pre-compiled gSoap library
- take stdsoap2.c directly from the gSoap sources
- compile the above file with WITH_IPV6 defined

The release tags including the IPv6 compliance are:

glite-data-srm-api-c_R_1_1_0_12
glite-data-srm2-api-c_R_2_2_0_6
glite-data-transfer-cli_R_3_7_2_1
glite-data-transfer-agents_R_3_4_2_1
glite-data-gfal_R_1_11_11_1

The list also shows the affected components: they are the ones that implement SOAP communication with gSoap. The IPv6 functionality is completely encapsulated in gSoap, so we did not have to change the implementation; it was only a configuration issue.
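
To illustrate what "IPv6 compliant" name resolution means in the tasks above: gethostbyname-style calls only return IPv4 addresses, while getaddrinfo is protocol independent. The sketch below shows the idea in Python rather than in the C code of the affected components; the function name `resolve` and the port number are chosen only for the example.

```python
import socket


def resolve(host, port):
    """Resolve a host name for both IPv4 and IPv6.

    AF_UNSPEC asks getaddrinfo for every address family the resolver
    knows about, instead of hard-coding IPv4.
    Returns a list of (family, address) pairs.
    """
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [(family, sockaddr[0])
            for family, _socktype, _proto, _canonname, sockaddr in infos]
```

A client would then try the returned addresses in order, creating each socket with the matching family, instead of assuming `AF_INET` and a literal such as 127.0.0.1.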

Tuesday, September 8, 2009

FTS from Python

I was writing a Python binding for the transfer-cli functionality.

This is how far I got:


import fts

# Query the FileTransfer service: endpoint and version information
f = fts.FileTransferService()
print("# FTS using endpoint: %s" % f.endpoint())
print("# FTS service version: %s" % f.version())
print("# FTS interface version: %s" % f.interface_version())
print("# FTS schema version: %s" % f.schema_version())

# Query the ChannelManagement service and list the configured channels
c = fts.ChannelManagement()
print("# CM using endpoint: %s" % c.endpoint())
print("# CM service version: %s" % c.version())
print("# CM interface version: %s" % c.interface_version())
print("# CM schema version: %s" % c.schema_version())
print(c.channel_names())


And the output was:

# FTS using endpoint: https://lxvm0307.cern.ch:8443/glite-data-transfer-fts/services/FileTransfer
# FTS service version: 3.7.0-1
# FTS interface version: 3.7.0
# FTS schema version: 3.4.1
# CM using endpoint: https://lxvm0307.cern.ch:8443/glite-data-transfer-fts/services/ChannelManagement
# CM service version: 3.7.0-1
# CM interface version: 3.7.0
# CM schema version: 3.4.1
('ASGC-CERN', 'BNL-CERN', 'CERN-ASGC', 'CERN-BNL', 'CERN-CERN', 'CERN-FNAL', 'CERN-GRIDKA',
'CERN-IN2P3', 'CERN-INFN', 'CERN-NDGF', 'CERN-NIKHEF', 'CERN-PIC', 'CERN-RAL', 'CERN-SARA',
'CERN-TRIUMF', 'FNAL-CERN', 'GRIDKA-CERN', 'IN2P3-CERN', 'INFN-CERN', 'NDGF-CERN', 'NIKHEF-CERN',
'PIC-CERN', 'RAL-CERN', 'SARA-CERN', 'TRIUMF-CERN', 'CERN-STAR')


The minimum goal is to have submit and status checking working.

Friday, August 7, 2009

small fixes: transfer-agents and transfer-cli

There were a couple of other updates:

glite-data-transfer-agents v3.4.1-1

Really fixing #47507: SRMv2.2 as default.
This is a two-character fix, which finally made it into the release.

If you cannot wait then there are some workarounds. The original idea of adding
FTA_GLOBAL_ACTIONS_SRMVERSION="2.2"
worked only for the VO agents, so one also has to add the following lines to the Yaim config:
FTA_TYPEDEFAULT_SRMCOPY_ACTIONS_SRMVERSION="2.2"
FTA_TYPEDEFAULT_URLCOPY_ACTIONS_SRMVERSION="2.2"

... or simply submit a full SURL to FTS including the endpoint of the SRMv2 server.


glite-data-transfer-cli v3.7.1-1
  • Fixing a regression introduced in March 2008: the overwrite flag (-o) should not require an argument.
  • Updated the test suite to the latest FTS service.
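
The difference between a flag and an option that takes an argument is easy to show with Python's argparse. This is only an illustration of the intended -o behaviour, not the transfer-cli code; the program name and the two positional arguments are assumptions for the example.

```python
import argparse


def build_parser():
    """Build a parser where -o is a pure switch."""
    parser = argparse.ArgumentParser(prog="submit-sketch")
    # store_true means -o consumes no following argument, so the next
    # word on the command line is still parsed as the source.
    parser.add_argument("-o", dest="overwrite", action="store_true",
                        help="overwrite the destination if it exists")
    parser.add_argument("source")
    parser.add_argument("destination")
    return parser
```

With this parser, `-o src dst` sets `overwrite` to true and keeps `src` as the source. Had -o been declared with a required value, it would have swallowed `src` instead, which is the kind of behaviour the regression produced.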

transfer-url-copy version 3.2.1-rel2 released

The affected module is: org.glite.data.transfer-url-copy.

The changes are:
  • Warning removal
  • Partial implementation of the code review results: descriptive enums signal the actual checksum checking use case.
The new release tag is:

glite-data-transfer-url-copy_R_3_2_1_2

The functionality and the behaviour have not been changed.

See the component in the CVS.

Thursday, August 6, 2009

Checksum code review

The latest FTS development added checksum support to verify that the data has been transferred properly and that the source/destination files have not been altered. The related requirement specification can be found in the wiki:

FtsChecksums

The feature has been transferred to the package certification process.
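
As a rough picture of what such a verification does, here is a small Python sketch that computes a checksum for two local files and compares them. Adler-32 is picked only as an example algorithm, and the function names are invented for the illustration. In the real service the values come from the storage endpoints rather than from local files.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Compute the Adler-32 checksum of a file, reading it in chunks."""
    value = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    # Mask to 32 bits and print as 8 hex digits.
    return "%08x" % (value & 0xFFFFFFFF)


def same_content(source_path, destination_path):
    """Return True if both files have the same checksum."""
    return adler32_of_file(source_path) == adler32_of_file(destination_path)
```

Reading in chunks keeps the memory use constant, which matters for the multi-gigabyte files that are typical in this environment.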

Today, we had a code review with Rosa and Ákos, where we reviewed the checksum-related code. After a discussion about some fancy C++, Boost, STL features + some potential Google interview questions :), we had two findings that will lead to changes:

- The system determines the actual checksum use case and stores it in bool variables. Enums with descriptive names should be used instead.
- The asynchronous SRM operations are called synchronously, so the same send/poll pairs always go together in the code. Each pair should be merged into one function that also encapsulates the new exponential backoff functionality.
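
The first finding can be sketched in Python (the real code is C++). The use case names and the selection logic below are hypothetical and only show how one descriptive enum value replaces a combination of booleans.

```python
from enum import Enum


class ChecksumCheck(Enum):
    """Hypothetical checksum use cases; the names are illustrative only."""
    NONE = "no checksum verification"
    USER_VS_DESTINATION = "compare the user-supplied value with the destination"
    SOURCE_VS_DESTINATION = "compare the source with the destination"


def select_check(compare_requested, user_checksum):
    """Pick one use case instead of carrying several bool flags around."""
    if not compare_requested:
        return ChecksumCheck.NONE
    if user_checksum:
        return ChecksumCheck.USER_VS_DESTINATION
    return ChecksumCheck.SOURCE_VS_DESTINATION
```

Code that later branches on the use case can then compare against `ChecksumCheck.SOURCE_VS_DESTINATION` and similar values, which reads better than testing two or three flags together.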

We found no bugs and the changes above will not modify the behaviour, so we do not need a new release now.
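
The second finding, merging each send/poll pair into one function with exponential backoff, could look roughly like the following Python sketch. The `send` and `poll` callables, the default delays and the timeout are assumptions for the example, not the actual SRM client interface.

```python
import time


def call_synchronously(send, poll, initial_delay=1.0, factor=2.0,
                       max_delay=60.0, timeout=300.0, sleep=time.sleep):
    """Send an asynchronous request and poll until it completes.

    send() returns a request token. poll(token) returns None while the
    request is still pending, and the result once it has finished.
    The wait between polls grows by `factor` up to `max_delay`.
    """
    token = send()
    delay = initial_delay
    waited = 0.0
    while True:
        result = poll(token)
        if result is not None:
            return result
        if waited >= timeout:
            raise TimeoutError("request %r did not finish in %s s"
                               % (token, timeout))
        sleep(delay)
        waited += delay
        delay = min(delay * factor, max_delay)
```

Passing `sleep` as a parameter keeps the function testable without real waiting: a test can hand in a function that just records the requested delays.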