Dear all,
First, let me briefly introduce myself. I am a Java developer at Logica, currently involved in a project in which we aim to bring the power of agent technology to the grid world. The main focus is improving problem handling and user feedback.
Could you please confirm the correctness of the drawing I've made for the grid?
Does it effectively visualize the working of the grid (storage)? Do you have any suggestions?
I have some more questions.
1 Which identifiers are allowed in JDL for storage references (lfn, guid, surl, turl)?
2 Is nagios present at all systems that make up the grid?
3 Can someone describe the process of scheduling and load balancing for the WMS and the CE, or point me to documentation? I am mainly interested in how and when these processes use information on the availability of the resources needed by a job.
Thanks in advance,
Eduard
------------------------------------------------
Eduard Drenth
Logica Groningen
Java / XML specialist
06-20943428
Drenth, Eduard wrote:
Could you please confirm the correctness of the drawing I've made for the grid?
The picture is largely OK, but it gets some of the details wrong:
- There is no user interaction from WMS to CE, or from SE to tape.
- There can be direct interaction between a UI and the CE.
- The interface between WMS and CE is either CondorG (for LCG-CE) or ICE (for CREAMCE).
- The CE does not talk to the WN directly but through another box, the batch system head node (a.k.a. LRMS or Local Resource Management System). The CE reaches the LRMS through either the globus-jobmanager interface (for LCG-CE) or BLAH (for CREAMCE); either one can talk to different kinds of batch systems such as Torque/Maui, Condor, Sun Grid Engine, LSF, or others.
- SRM is not a box of its own, it is an interface offered by the SE.
- There can be direct interfacing between a UI and the SE.
- The LFN, SURL and TURL are all used in conjunction, for the same data but in different contexts:
* An LFN is a symbolic name, which the LFC can resolve to one or more SURLs (for different SEs).
* The SURL identifies a file uniquely at a single SE.
* For data retrieval, ask the SE to give you a TURL for a specific SURL; the TURL can, in principle, be used only once and for a limited time.
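To make the three naming levels concrete, here is a toy Python model of the chain LFN -> SURL -> TURL. It is only a sketch and does not use any real grid client library: the catalogue contents, the host names, and the functions `resolve_lfn`, `request_turl` and `use_turl` are all made up for illustration, and a real TURL may well point at a different host than the SURL does.

```python
import time

# Hypothetical catalogue contents: one LFN with replicas at two SEs.
CATALOGUE = {
    "lfn:/grid/example-vo/data/run01.dat": [
        "srm://se1.example.org/data/example-vo/run01.dat",
        "srm://se2.example.org/data/example-vo/run01.dat",
    ],
}


def resolve_lfn(lfn):
    """Play the role of the LFC: map a symbolic LFN to its SURLs."""
    return list(CATALOGUE.get(lfn, []))


def request_turl(surl, lifetime_s=600, now=None):
    """Toy stand-in for asking an SE for a transfer URL for one SURL.

    Returns a ticket recording the TURL, its expiry time and whether it
    has been used. The gsiftp URL is derived naively from the SURL.
    """
    now = time.time() if now is None else now
    host_and_path = surl.split("://", 1)[1]
    return {
        "turl": "gsiftp://" + host_and_path,
        "expires_at": now + lifetime_s,
        "used": False,
    }


def use_turl(ticket, now=None):
    """Consume a TURL once; refuse if it was already used or has expired."""
    now = time.time() if now is None else now
    if ticket["used"] or now > ticket["expires_at"]:
        raise ValueError("TURL no longer valid; request a new one from the SE")
    ticket["used"] = True
    return ticket["turl"]


if __name__ == "__main__":
    replicas = resolve_lfn("lfn:/grid/example-vo/data/run01.dat")
    ticket = request_turl(replicas[0])
    print(use_turl(ticket))
```

The point of the model is only the direction of the lookups: the LFN is stable and location-independent, the SURL is stable but tied to one SE, and the TURL is something you obtain just before a transfer and then throw away.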
Does it effectively visualize the working of the grid (storage)? Do you have any suggestions?
Another kind of diagram (e.g. a sequence diagram) might shed more light on it.
Have you seen the grid tutorial handouts? They can be found at http://www.nikhef.nl/~dennisvd/.
I have some more questions.
1 Which identifiers are allowed in JDL for storage references (lfn, guid, surl, turl)?
I am not sure about guid and surl; that would be a fun test. As I mentioned, a turl is a temporary thing. Plain gsiftp URLs should work, though.
2 Is nagios present at all systems that make up the grid?
No, this is a site-local decision. Any well-managed site should have something like Nagios, though.
3 Can someone describe the process of scheduling and load balancing for the WMS and the CE, or point me to documentation?
See http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/index.shtml
The basics are not hard: the WMS matches the requirements in the JDL against the available CEs, and then ranks the matching CEs, preferring the one with the lowest 'estimated response time'.
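The two steps (match, then rank) can be sketched in a few lines of Python. This is a simplified illustration, not the actual WMS code: the `ComputingElement` fields, the CE names, the published values and the example requirements are all assumptions made up for the sketch, loosely modelled on the kind of attributes a CE publishes.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    """Hypothetical subset of what a CE publishes about itself."""
    name: str
    status: str
    free_slots: int
    max_wallclock_min: int
    estimated_response_time_s: int


def select_ces(ces, requirements):
    """Step 1: keep CEs that satisfy the job requirements.
    Step 2: rank them, lowest estimated response time first."""
    matching = [ce for ce in ces if requirements(ce)]
    return sorted(matching, key=lambda ce: ce.estimated_response_time_s)


# Made-up published state of four CEs.
published = [
    ComputingElement("ce-a.example.org", "Production", 40, 2880, 900),
    ComputingElement("ce-b.example.org", "Production", 0, 2880, 60),   # no free slots
    ComputingElement("ce-c.example.org", "Production", 5, 60, 30),     # queue too short
    ComputingElement("ce-d.example.org", "Production", 12, 1440, 300),
]


def job_requirements(ce):
    """Example requirements a job might state: a CE in production,
    at least one free slot, and room for a two-hour job."""
    return (
        ce.status == "Production"
        and ce.free_slots > 0
        and ce.max_wallclock_min >= 120
    )


if __name__ == "__main__":
    for ce in select_ces(published, job_requirements):
        print(ce.name, ce.estimated_response_time_s)
```

Note that ce-b and ce-c have the lowest response times but never reach the ranking step, because they fail the requirements; only what a CE publishes decides whether it is considered at all.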
I am mainly interested in how and when these processes use information on the availability of the resources needed by a job.
AFAIK, all info comes from the information system. I have not heard of sites that publish live downtime from their monitoring systems to the information system, but sites will make sure that they update the (non)availability when downtime is planned ahead (e.g. by publishing zero available worker nodes).
HTH,
Dennis