Hi Brian,
The performance and scalability of HDFS is very interesting.
A few questions:
- How do you mount HDFS on a system? Is FUSE the only option for a regular mount? As I understand the information on the Hadoop website, applications would either have to use the Java virtual FS bindings to the Hadoop FS or go through FUSE.
- If FUSE is the option, what would the performance penalty be?
- For grid-enabling HDFS, do you refer to the Java version of the GridFTPd or the C implementation?
thanks,
Oscar
Brian Bockelman wrote:
Hey Jeff,
I'd be happy to come and talk about Hadoop and quantify my comments. It really is an amazing, well-engineered file system. We currently have a 30TB install (built entirely from scratch space on worker-node disks). Some of the highlights:
- Resource consumption: If I push it as hard as I can (several hundred
CMSSW processes reading the same file via POSIX I/O), it uses maybe 1/3 of a CPU. I can't make its memory consumption show up in graphs.
- Performance: Doing the same test with *one file* and 2 replicas, I can
hit 80Gbps. Yahoo claims that their biggest clusters get up to 2.2Tbps. I haven't been able to make the namenode break a sweat.
- Manageability: Straightforward. The namenode is journaled, and can
checkpoint. There are only two types of nodes (namenode and datanode); both are straightforward to manage. Out of the box, there are no knobs to tweak, no optimizations to perform to get decent performance. It just works. If you lose a data node, it starts re-replicating files within about 10 seconds. [As a side note, data recovery is actually pretty straightforward. We lost the disk which holds the namenode's info this weekend, and restoring from an automatically-created checkpoint was easy.]
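For anyone who wants to see what that recovery amounts to, here is a minimal sketch on a stock Hadoop install. The command names come from the HDFS admin documentation; the exact paths and startup options vary by version, and the checkpoint location (`fs.checkpoint.dir`) is site-specific:

```shell
# Restore the namespace from the last automatically-created checkpoint.
# The -importCheckpoint startup option reads the checkpoint image from
# fs.checkpoint.dir into an empty dfs.name.dir.
bin/hadoop namenode -importCheckpoint

# Sanity-check the restored namespace and the datanode reports.
bin/hadoop dfsadmin -report
bin/hadoop fsck /
```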
If you want to grid-enable it, you reuse the Globus GridFTP server (plus bindings) and the BeStMan SRM server. I've clocked it at 8.2Gbps for WAN traffic.
For us, the best potential upside is the manageability, although it's also true that, performance-wise, it does things dCache can't. If it proves to require far less admin time, we are hoping to expand use beyond the current test bed.
Brian
On Oct 18, 2008, at 1:50 AM, Jeff Templon wrote:
Yo,
I'm talking to Brian Bockelman about other things; maybe if I can convince him to come, he can tell us about his experiences with HDFS.
Brian, now you have another reason to come ... give us a short technical seminar about HDFS? It could just be you with a piece of chalk and a blackboard, as informal as you like?
JT
Oscar Koeroo wrote:
Hi Sander,
I'm very interested in the details, opportunities and experiences with Hadoop, but I don't have a clue about any of these things.
Hadoop: http://hadoop.apache.org/core/
Hadoop FAQ:
- What is Hadoop?
Hadoop is a distributed computing platform written in Java. It incorporates features similar to those of the Google File System and of MapReduce. For some details, see HadoopMapReduce.
Oscar
Sander Klous wrote:
Hi,
What is Hadoop exactly? Reading through the specs, it doesn't look like a competitor for dCache. So, can anybody comment on the remark below: "HDFS is much better than dCache"?
Thanks, Sander
Plan:
- Gain experience w/ the Hadoop File System via a large
installation on the cluster in B240.
- Assuming it really works according to our expectations,
investigate how to best "pseudo-daemoncore-ify" the HDFS service so that it runs under the condor_master, responds to Condor administrative commands such as on, off, restart, reconfig, etc., and hopefully has similar behavior for a debug log and config settings. Note: some challenges here include the fact that the HDFS service is implemented in Java, perhaps relies on ssh for network authentication, etc.
- Enable the Condor file transfer service to stage files in/out
of HDFS, in addition to just the shadow.
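The staging step in that last bullet boils down to the stock Hadoop command-line client; a sketch of what the transfer service would invoke (all paths here are illustrative, not from any actual Condor integration):

```shell
# Stage a job's input into HDFS before the job starts...
bin/hadoop fs -put /scratch/job123/input.dat /user/condor/job123/input.dat

# ...and pull the job's output back out afterwards.
bin/hadoop fs -get /user/condor/job123/output.dat /scratch/job123/output.dat

# Verify what ended up in the job's HDFS directory.
bin/hadoop fs -ls /user/condor/job123
```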
That's really interesting.
As one datapoint: I was talking to Brian Bockelman about HDFS. He has deployed HDFS at his OSG site in Nebraska and thinks that it is great, much better than dCache.
An interesting question, once we get far enough along with this work: can the interface to HDFS be made pluggable? That is, can Condor call out to a file transfer service? I don't know if it makes sense, but we might be able to plug in other services one day.
Ct-grid mailing list Ct-grid@nikhef.nl https://mailman.nikhef.nl/cgi-bin/listinfo/ct-grid
On Oct 20, 2008, at 3:11 AM, Oscar Koeroo wrote:
Hi Brian,
The performance and scalability of HDFS is very interesting.
A few questions:
- How do you mount HDFS on a system? Is FUSE the only option for a
regular mount? As I understand the information on the Hadoop website, applications would either have to use the Java virtual FS bindings to the Hadoop FS or go through FUSE.
The only available option is FUSE.
The FUSE bindings are written on top of libhdfs, which is a C library which uses JNI to invoke its operations.
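For the curious, mounting via those FUSE bindings looks roughly like this. The fuse-dfs module lives in the Hadoop contrib tree; the wrapper script name and the namenode URI below are from a typical build and will differ per site and version:

```shell
# Build output lives under src/contrib/fuse-dfs in the Hadoop tree.
# The wrapper script sets up CLASSPATH and LD_LIBRARY_PATH so that
# libhdfs can start its embedded JVM via JNI.
./fuse_dfs_wrapper.sh dfs://namenode.example.org:9000 /mnt/hdfs

# After that, ordinary POSIX I/O works against the mount point.
ls /mnt/hdfs
cat /mnt/hdfs/user/cms/somefile.root > /dev/null
```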
- If FUSE is the option, what would the performance penalty be?
Negligible for what we do. I don't have any applications which push very hard on the metadata server (the CMS apps' use of it is trivial), however.
There is a bit of a performance penalty in terms of maximum throughput - you can probably only hit 40-60MB/s per single stream. (CMS software, again, only sucks up about 10MB/s.)
- For Grid-enabling HDFS do you refer to the Java version of the
GridFTPd or the C implementation?
C implementation.
The Globus GridFTP server is modular (using DSI, the Data Storage Interface); we use libhdfs to pass the data directly from the GridFTP server to HDFS, no FUSE involved.
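Put together, the grid-enabled path looks something like the sketch below. The `-dsi` flag is standard Globus GridFTP server usage; the DSI module name and hostnames are illustrative assumptions, not the exact Nebraska configuration:

```shell
# Server side: start the Globus GridFTP server with an HDFS DSI module
# loaded in place of the default POSIX backend.
globus-gridftp-server -dsi hdfs -p 2811

# Client side: an ordinary GridFTP transfer. Data flows straight from
# HDFS through libhdfs into the GridFTP data channel - no FUSE mount.
globus-url-copy \
    gsiftp://gridftp.example.org:2811/user/cms/file.root \
    file:///tmp/file.root
```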
Brian