hi all,
here's a quick rundown of checksumming in GFAL and the various storage systems used throughout biggrid:
dCache
======
- source code not available
- supports (HEP-preferred?) ADLER32 checksums only
- checksum is calculated when the file is uploaded and stored in a database; this means that it is not possible to verify whether the file on disk/tape actually matches the checksum. A dCache operator can trigger a command to re-calculate the checksum
DPM
===
- supports ADLER32, MD5 and CRC32 checksums
- checksum is calculated if not present in the DPM namespace and then stored there; this means that the first time you request a checksum it can take a while (and can put quite some stress on the DPM worker nodes)
- uses a plain gridftp server with a DPM plugin to calculate the ADLER32 and CRC32 checksums.
- I browsed through the sources to check/find out some of this
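For comparison with what a storage element reports, the three checksum types listed for DPM can be computed locally. A minimal Python sketch using only the standard library (`zlib`, `hashlib`); the function name `file_checksums`, the chunk size and the zero-padded lowercase hex formatting are illustrative assumptions, not part of GFAL or DPM:

```python
import hashlib
import zlib


def file_checksums(path, chunk_size=1 << 20):
    """Return ADLER32, CRC32 and MD5 of a file as lowercase hex strings."""
    adler = 1            # Adler-32 starts at 1 by definition
    crc = 0              # CRC-32 starts at 0
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            adler = zlib.adler32(chunk, adler)
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
    return {
        "adler32": f"{adler & 0xFFFFFFFF:08x}",
        "crc32": f"{crc & 0xFFFFFFFF:08x}",
        "md5": md5.hexdigest(),
    }
```

A storage system may format the value differently (for example without leading zeros), so a comparison should normalise both sides first.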
StoRM
=====
- does not seem to support checksums in a consistent manner at all
- uses a plain gridftp server which only supports MD5 checksums yet lcg-cr .... --checksum --checksum-type md5 always fails
- it is possible to calculate the checksum of a file on the StoRM gridftp server but it is not stored anywhere for fast access: this makes it ideal for a Denial-of-Server-Storage attack :)
- is actually only SRMv2.1 compliant
- also does not support ACLs/permissions
- no new version has been released since 2007
- I browsed through the sources to check/find out some of this
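The difference between computing a checksum once and storing it (as described for DPM) and recomputing it on every request (as observed for StoRM) can be sketched as follows. This is a toy model, not DPM or StoRM code; `ChecksumStore`, `read_file` and the `computations` counter are hypothetical names:

```python
import zlib


class ChecksumStore:
    """Compute a checksum on first request, then serve the stored value."""

    def __init__(self, read_file):
        self._read_file = read_file  # callable: name -> bytes (hypothetical)
        self._stored = {}            # name -> checksum, the "namespace"
        self.computations = 0        # how often a file was actually read

    def checksum(self, name):
        if name not in self._stored:
            # Expensive path: the whole file is read and hashed once.
            self.computations += 1
            data = self._read_file(name)
            self._stored[name] = f"{zlib.adler32(data) & 0xFFFFFFFF:08x}"
        return self._stored[name]


files = {"myfile": b"Wikipedia"}
store = ChecksumStore(files.__getitem__)
first = store.checksum("myfile")
second = store.checksum("myfile")
print(first, second, store.computations)
```

Without the stored value, every request triggers a full read of the file, which is the load concern raised above.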
Hopefully I've missed a few things on the storm front, otherwise things don't look too good ...
share and enjoy,
JJK
Hello Jan Just,
Luckily (for us) there are a few mistakes in your lines about StoRM. The checksum behaviour is nevertheless exactly what I also notice. I will have to report this to the developers. Here are some corrections for the other parts:
- First a link to the StoRM homepage with up-to-date information: http://storm.forge.cnaf.infn.it/home
- SRMv2.2 compliant
- Latest release 1.4.0 (15/05/2009)
- With respect to ACLs I'm not completely sure yet what the precise capabilities are. If missing it should be fairly straightforward to implement these because StoRM already makes use of extended ACLs on the file system, and these could be made more fine grained.
- The plain GridFTP is the VDT 1.6.1 version also used in gLite 3.1.
- Checksumming support is currently being improved. Adler32 seems to be optional according to the release notes, but I will have to look into this.
Best wishes,
Fokke
Hi all,
I've written up two Wiki pages on the fun I've had with different SRM systems (dCache, DPM, StoRM, CASTOR) over the last few weeks:
Fun with access control: http://www.nikhef.nl/pub/projects/grid/gridwiki/index.php/How_to_control_acc...
Fun with checksumming: http://www.nikhef.nl/pub/projects/grid/gridwiki/index.php/Checksumming_suppo...
share and enjoy,
Jan Just
-----Original Message-----
From: Ron Trompert
Sent: Wednesday 19 August 2009 17:55
To: 'Jan Just Keijser'; ct-grid@nikhef.nl
Subject: RE: [Ct-grid] action point: Look at checksumming in GFAL
Hi Jan-Just,
Here is a comment regarding dCache:

> dCache
> - source code not available
You can get it if you ask Patrick Fuhrmann nicely.

> - support (HEP-preferred?) ADLER32 checksums only
It also supports md5.

> - checksum is calculated when the file is uploaded and stored in a
> database; this means that it is not possible to verify whether the file on disk/tape actually matches the checksum. A dCache operator can trigger a command to re-calculate the checksum
A checksum is computed when a file is received (on the file system); if there is a mismatch, the transfer fails. Hence you can be sure that, when a file transfer succeeds, the file is in a correct state on disk at that time. Whether it stays that way is another matter, but dCache can be configured so that when a file is transferred to a client, copied from pool to pool, or restored from tape, a checksum is computed and compared with the one in the database. If there is a mismatch, a CRCException is thrown. AFAIK the file transfer fails in that case, which means that you can configure dCache so that if you get the file, the file is OK; otherwise you don't get it at all. I am not 100% sure about that last statement, but I have seen pool-to-pool transfers fail because of this, so I presume it is the same for read and restore actions.
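The behaviour described above (verify on arrival, then optionally verify again on every read) can be sketched in a few lines. This is a toy in-memory model, not dCache code: `Pool`, `ChecksumMismatch` (standing in for the CRCException mentioned above) and the method names are all hypothetical.

```python
import zlib


class ChecksumMismatch(Exception):
    """Hypothetical stand-in for the CRCException mentioned above."""


def adler32_hex(data):
    return f"{zlib.adler32(data) & 0xFFFFFFFF:08x}"


class Pool:
    """Toy pool: checksum recorded on upload, verified again on read."""

    def __init__(self):
        self._files = {}      # name -> bytes "on disk"
        self._catalogue = {}  # name -> checksum recorded at upload time

    def upload(self, name, data, client_checksum=None):
        computed = adler32_hex(data)
        # Reject the transfer if the client-supplied checksum disagrees.
        if client_checksum is not None and client_checksum != computed:
            raise ChecksumMismatch(f"upload of {name} failed verification")
        self._files[name] = bytes(data)
        self._catalogue[name] = computed
        return computed

    def read(self, name):
        data = self._files[name]
        # Recompute and compare with the stored value before handing out data.
        if adler32_hex(data) != self._catalogue[name]:
            raise ChecksumMismatch(f"{name} no longer matches its checksum")
        return data
```

With the check in `read`, a client either receives data that still matches the recorded checksum or receives nothing at all.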
Cheers,
Ron
Hi Ron,
Ron Trompert wrote:
>> - support (HEP-preferred?) ADLER32 checksums only
> It also supports md5

  lcg-cr -l /grid/pvier/janjust/my-dcache-file2 -d srm.grid.sara.nl file:/user/janjust/myfile --checksum --checksum-type md5
  gsiftp://bee31.grid.sara.nl:2811//pnfs/grid.sara.nl/data/pvier/generated/2009-08-20/file73117add-a2ac-4d30-b3db-955399c5ae67: CKSM (checksum) operation not supported
  guid:20091fb4-a189-4585-ac2d-088084766149
  lcg_cr: Operation not supported

how do I upload a file using an MD5 checksum?
cheers,
Jan Just
Hi Jan-Just,
dCache supports md5 if I configure it instead of adler32.
Cheers,
Ron
Ron Trompert wrote:
> Hi Jan-Just,
> dCache supports md5 if I configure it instead of adler32.
                                        ^^^^^^^^^^
This has me completely stumped.
Am I correct to understand that if you make this change, adler32 will no longer be supported? Won't this upset other users?
How long have we been making multi-user systems? 30 years? (Please tell me I'm misunderstanding things.)
--Dennis