[Ct-grid] action point: Look at CIC portal concerning downtime notifications.


Jan Just Keijser

21 Aug 2009

12:44 p.m.

hi all,

the other action point from last Monday's grid-overleg (grid meeting): why are downtime notifications sent only a day before they actually occur?

The short answer is: this is exactly as specified. If you read

  https://edms.cern.ch/file/829986/0.1/EGEE-downtime-notification-procedure.pd...

which in turn refers to

  https://edms.cern.ch/file/829986/0.1/EGEE-intervention-procedures.pdf

then most downtime notifications fall into category C:

  - Send broadcast 1 day in advance
  - Broadcast targets: all ROCs, affected VOs, all production sites in the related ROC

so the actual broadcast happens only 1 day in advance!!

Of course, this makes the downtime notification procedure kinda useless for us. As far as I can see, there are 2 options:

1) create an email list infrastructure-announce@biggrid.nl to which all BigGrid downtime notifications must be sent (on penalty of death-by-tickling or something like that)

2) the GOCDB, where all site downtimes are entered, luckily has a programming interface where you can retrieve all downtimes for a particular site after a particular date: we could thus create a cron job that periodically checks the GOCDB and broadcasts any messages, as appropriate. For example, if I do

  curl --cert $X509_USER_PROXY --key $X509_USER_PROXY --CApath $X509_CERT_DIR -k \
    "https://goc.gridops.org/gocdbpi/public/?method=get_downtime&topentity=SA..."

I get

  <?xml version="1.0"?>
  <ROOT>
    <DOWNTIME ID="45205458" CLASSIFICATION="SCHEDULED">
      <HOSTNAME>srm.grid.sara.nl</HOSTNAME>
      <SEVERITY>OUTAGE</SEVERITY>
      <DESCRIPTION>
        The tape backend will be down for the whole day due to maintenance.
        srm.grid.sara.nl will be up but data on tape cannot be accessed.
      </DESCRIPTION>
      <START_DATE>1251705600</START_DATE>
      <END_DATE>1251748800</END_DATE>
      <FORMATED_START_DATE>2009-08-31 08:00</FORMATED_START_DATE>
      <FORMATED_END_DATE>2009-08-31 20:00</FORMATED_END_DATE>
    </DOWNTIME>
    <DOWNTIME ID="43305437" CLASSIFICATION="SCHEDULED">
      <HOSTNAME>ce.gina.sara.nl</HOSTNAME>
      <SEVERITY>OUTAGE</SEVERITY>
      <DESCRIPTION>
        maintenance to broken switch hardware
      </DESCRIPTION>
      <START_DATE>1251190800</START_DATE>
      <END_DATE>1251219600</END_DATE>
      <FORMATED_START_DATE>2009-08-25 09:00</FORMATED_START_DATE>
      <FORMATED_END_DATE>2009-08-25 17:00</FORMATED_END_DATE>
    </DOWNTIME>
  </ROOT>

so this API _does_ work... now we need a little XML scripting magic to turn it into a broadcast email - any takers? The upside of this approach is that we have a single entry point for downtimes, i.e. less chance of errors.
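As a rough illustration of the "XML scripting magic" asked for here, the sketch below parses a response shaped like the one quoted in this message and builds one plain-text email per downtime. It is only a sketch under assumptions: the helper names (`parse_downtimes`, `build_message`) and the sender address are hypothetical, the recipient is the list proposed in option 1, the XML is the sample from this message rather than a live query, and nothing is actually sent.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.message import EmailMessage

# Sample response copied from the message above (not fetched live).
SAMPLE = """<?xml version="1.0"?>
<ROOT>
  <DOWNTIME ID="45205458" CLASSIFICATION="SCHEDULED">
    <HOSTNAME>srm.grid.sara.nl</HOSTNAME>
    <SEVERITY>OUTAGE</SEVERITY>
    <DESCRIPTION>
      The tape backend will be down for the whole day due to maintenance.
      srm.grid.sara.nl will be up but data on tape cannot be accessed.
    </DESCRIPTION>
    <START_DATE>1251705600</START_DATE>
    <END_DATE>1251748800</END_DATE>
  </DOWNTIME>
  <DOWNTIME ID="43305437" CLASSIFICATION="SCHEDULED">
    <HOSTNAME>ce.gina.sara.nl</HOSTNAME>
    <SEVERITY>OUTAGE</SEVERITY>
    <DESCRIPTION> maintenance to broken switch hardware </DESCRIPTION>
    <START_DATE>1251190800</START_DATE>
    <END_DATE>1251219600</END_DATE>
  </DOWNTIME>
</ROOT>"""


def _utc(epoch_text):
    """Convert an epoch-seconds string to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(epoch_text), tz=timezone.utc)


def parse_downtimes(xml_text):
    """Return one dict per DOWNTIME element in the response."""
    root = ET.fromstring(xml_text)
    downtimes = []
    for node in root.findall("DOWNTIME"):
        downtimes.append({
            "id": node.get("ID"),
            "classification": node.get("CLASSIFICATION"),
            "hostname": node.findtext("HOSTNAME", "").strip(),
            "severity": node.findtext("SEVERITY", "").strip(),
            # Collapse the whitespace and line breaks inside DESCRIPTION.
            "description": " ".join(node.findtext("DESCRIPTION", "").split()),
            "start": _utc(node.findtext("START_DATE")),
            "end": _utc(node.findtext("END_DATE")),
        })
    return downtimes


def build_message(dt, sender, recipient):
    """Build (but do not send) a plain-text announcement for one downtime."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = (
        f"Downtime {dt['severity']}: {dt['hostname']} "
        f"from {dt['start']:%Y-%m-%d %H:%M} UTC"
    )
    msg.set_content(
        f"Host:           {dt['hostname']}\n"
        f"Classification: {dt['classification']}\n"
        f"Severity:       {dt['severity']}\n"
        f"Start (UTC):    {dt['start']:%Y-%m-%d %H:%M}\n"
        f"End (UTC):      {dt['end']:%Y-%m-%d %H:%M}\n\n"
        f"{dt['description']}\n"
    )
    return msg


if __name__ == "__main__":
    for downtime in parse_downtimes(SAMPLE):
        # Hypothetical sender; the recipient is the list proposed in option 1.
        print(build_message(downtime, "noreply@example.org",
                            "infrastructure-announce@biggrid.nl"))
```

A cron job could feed the output of the curl command above into `parse_downtimes` and hand each message to the local mail system; remembering which downtime IDs have already been announced would be needed to avoid repeat broadcasts.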

share and enjoy,

JJK


Ammar Benabdelkader

21 Aug 2009

2:07 p.m.


Hi Jan,

Within my group, we may take on this task, but we need to discuss it first.

Some of the questions are:

  - Is the downtime registration interface part of the API?
  - Do we need an interface where users can specify their notification preferences?
  - Does the GOCDB data model support the flexibility required by the users?
  - Which programming/scripting language should be used?
  - etc.

Regards,

Ammar,


Tom Visser

3:17 p.m.


Hi,

It seems there is a different option: https://savannah.cern.ch/support/?106465

But it does not appear to be implemented yet?

The ticket mentions the possibility of configuring certain types of resources as core services from a user/VO perspective. This could, for example, be done for all storage elements.

As stated there, this can be done using the VO ID card in the CIC operations portal (user-personalized notification perspective).

It could be done in the GOCDB as well, but that is something one might not want to do, since all VOs would then be affected by it (service delivery perspective).

Regards,

tom

Tom Visser Phone: +31617411603 Mail: tom.visser@sara.nl SARA Computing & Networking Services High Performance Computing and Visualization http://www.sara.nl


Ron Trompert

28 Aug 2009

7:18 a.m.


Have you guys seen this: https://cic.gridops.org/index.php?section=vo&page=SDnotification_v2

There you can subscribe to downtime notifications for the sites of your choosing. You can also choose between an RSS feed and email.

Indeed, the programmatic interface (as Gilles Mathieu puts it) of the GOCDB works fine too.

It is annoying that a site admin no longer has any control over when downtime notifications are sent. In WLCG there is an agreement that full-day downtimes are announced at least a week in advance. The removal of this feature makes it difficult for a site admin to fulfill this agreement. The LHC VOs use the programmatic interface to be informed about downtimes. I can ask in the weekly operations meeting whether that feature can be put back in. My preference would be for a downtime notification to be sent twice: once at a point in time selected by the submitter, and once 24 hours in advance.
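The two-notification preference described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, the submitter-selected time is modelled as a lead time before the start (defaulting to the one week mentioned for full-day downtimes), and the set of already-sent times would have to be stored somewhere between runs of a periodic job.

```python
from datetime import datetime, timedelta, timezone


def notification_times(start, early_lead=timedelta(days=7)):
    """Return the times at which a downtime should be announced.

    One announcement goes out `early_lead` before the start (the
    submitter-selected point in time) and one 24 hours before the start.
    If the early lead is 24 hours or less, only the 24-hour notice remains.
    """
    final = start - timedelta(hours=24)
    early = start - early_lead
    times = {final}
    if early < final:
        times.add(early)
    return sorted(times)


def due_notifications(start, now, already_sent, early_lead=timedelta(days=7)):
    """Return the announcement times that have passed but were not yet sent.

    Nothing is due once the downtime itself has started.
    """
    if now >= start:
        return []
    return [
        t for t in notification_times(start, early_lead)
        if t <= now and t not in already_sent
    ]


if __name__ == "__main__":
    # START_DATE of the srm.grid.sara.nl downtime quoted earlier in the thread.
    start = datetime.fromtimestamp(1251705600, tz=timezone.utc)
    now = datetime(2009, 8, 25, 12, 0, tzinfo=timezone.utc)
    for when in due_notifications(start, now, already_sent=set()):
        print("announcement due since", when.isoformat())
```

Keeping the schedule as a pure function of the start time makes it easy to run from a cron job: each run only has to compare the result against what has already been sent.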

Ron

