---------------------------------------------------------------------------------------------------------------
EGI BROADCAST TOOL : https://operations-portal.egi.eu/broadcast/send
---------------------------------------------------------------------------------------------------------------
Publication from : Francesco Fabozzi <francesco.fabozzi(a)na.infn.it>
----------------------------------------------------------------------------------------------------------------
Dear VO Managers,
the site INFN-NAPOLI-CMS is being decommissioned.
The site will be put in downtime to prevent new activities.
----------------------------------------------------------------------------------------------------------------
link to this broadcast : https://operations-portal.egi.eu/broadcast/archive/1599
----------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
EGI BROADCAST TOOL : https://operations-portal.egi.eu/broadcast/send
---------------------------------------------------------------------------------------------------------------
Publication from : Alessandro Paolini <alessandro.paolini(a)egi.eu>
----------------------------------------------------------------------------------------------------------------
======= Content ========
1) UMD releases
2) Decommission of mon.egi.eu and cloudmon.egi.eu
======================
1) On Nov 23rd two revisions, UMD 3.14.6 (SL6) and UMD 4.3.1 (SL6/CentOS7), were released:
a) UMD 3.14.6 includes lcmaps-plugins-vo-ca-ap, needed to support the IGTF IOTA profile of CAs
b) UMD 4.3.1 includes:
*** CentOS7
lcas-lcmaps-gt4-interface 0.3.0-0.3.1
lcmaps 1.6.6
lcas 1.3.19
glExec 1.2.3
glExec-WN 1.3.0
lcmaps-plugins 1.7.1
*** SL6
ARGUS 1.7 (the regular 4.3.0 release shipped only the CentOS7 version)
2) Decommission of mon.egi.eu and cloudmon.egi.eu
a) On 29 November all the cloud probes were moved to the central servers, and cloudmon.egi.eu was decommissioned on Dec 1st.
All the probes are executed using the following certificate subjects:
/DC=EU/DC=EGI/C=HR/O=Robots/O=SRCE/CN=Robot:argo-egi@cro-ngi.hr
/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-egi@grnet.gr
b) On Dec 6th 2016 the old SAM GridMon box mon.egi.eu, housing central ATP and POEM, was decommissioned. These services became obsolete when we switched to central monitoring instances in July.
- The VO SAM instances will not be affected as they are using local ATP and POEM.
- Remaining NGI SAM instances rely on central ATP and will no longer get topology updates, so this gives their administrators extra incentive to decommission them.
----------------------------------------------------------------------------------------------------------------
link to this broadcast : https://operations-portal.egi.eu/broadcast/archive/1597
----------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
EGI BROADCAST TOOL : https://operations-portal.in2p3.fr/broadcast/send
---------------------------------------------------------------------------------------------------------------
Publication from : Frederic Schaer <frederic.schaer(a)cea.fr>
----------------------------------------------------------------------------------------------------------------
Dear VOs and users,
It was found by the CMS experiment that a WN at the GRIF/IRFU site was silently corrupting files (thanks, CMS).
After investigation, it appears that a CPU on the machine was silently corrupting files while they were being compressed, but only when the compression task ran on core #8 of CPU socket #0 or on its sibling hyperthreaded core #28.
Unfortunately, this hardware issue went unnoticed because it was not caught by the various hardware and software system checks - neither Dell nor Intel diagnostic tools could find and report it.
Unfortunately, ROOT files also seem to be affected, or at least files created by the CMS software, which includes ROOT and recompiled copies of various compression tools.
It was also found that files compressed with the system "bzip2" tool were corrupted, but not files created with the system lzma or gzip tools, for instance.
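As an illustration only (this is not the procedure the GRIF/IRFU admins used), a per-core compression round-trip check could be sketched as follows. It assumes a Linux host where os.sched_setaffinity is available, uses Python's bz2 module rather than the system "bzip2" tool, and the function names are hypothetical. The logical CPU numbers seen by the operating system may not match the core numbers quoted above, and a passing run does not prove a CPU is healthy, since this kind of corruption can be intermittent.

```python
import bz2
import hashlib
import os


def roundtrip_ok(data: bytes) -> bool:
    """Compress and decompress data with bzip2 and compare SHA-256 digests."""
    expected = hashlib.sha256(data).hexdigest()
    try:
        restored = bz2.decompress(bz2.compress(data))
    except (OSError, ValueError):
        # A damaged stream usually fails the bzip2 integrity check here.
        return False
    return hashlib.sha256(restored).hexdigest() == expected


def check_cores(data: bytes, cores=None) -> dict:
    """Run the round-trip check once per logical CPU; return {cpu: passed}.

    Assumption: Linux only. Where CPU pinning is unavailable, a single
    unpinned check is run and reported under the key None.
    """
    if not hasattr(os, "sched_setaffinity"):
        return {None: roundtrip_ok(data)}
    original = os.sched_getaffinity(0)
    results = {}
    try:
        for core in sorted(cores if cores is not None else original):
            os.sched_setaffinity(0, {core})  # pin this process to one CPU
            results[core] = roundtrip_ok(data)
    finally:
        os.sched_setaffinity(0, original)  # restore the original CPU set
    return results
```

Calling check_cores(payload) with a few megabytes of representative data returns one boolean per logical CPU; any False entry would point at a CPU worth investigating further.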
Final bad news: we have no way to identify which files (your files) were produced on that machine.
We would therefore like to warn you about this problem, giving you as many details as possible.
The machine name is : wn328.datagrid.cea.fr
The ethernet MAC address of the main network interface is : 00:8C:FA:F2:93:1E
The host IPs are : 192.54.205.14 (v4) and 2001:660:3031:110:10::328/64 (v6)
The host entered production on Sep. 21 @ 9H49.
The host is running an up-to-date SL 6.8.
Of course, the host was finally taken out of production (thanks again to CMS ;) ) on November 25, 2016 at 10H01 CET, and the bad CPU should be replaced this week.
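For VOs that keep their own job accounting, the host name and dates above could be turned into a filter along the following lines. This is a minimal sketch under stated assumptions: the job record layout (a dict with "host", "start" and "end" keys holding timezone-aware datetimes) is hypothetical, the production start is assumed to be in 2016 and in local summer time (UTC+2), and neither of those two details is stated in this broadcast.

```python
from datetime import datetime, timedelta, timezone

SUSPECT_HOST = "wn328.datagrid.cea.fr"

CET = timezone(timedelta(hours=1))
CEST = timezone(timedelta(hours=2))

# Assumption: year 2016 and summer time for the production start;
# the broadcast gives only "Sep. 21 @ 9H49".
WINDOW_START = datetime(2016, 9, 21, 9, 49, tzinfo=CEST)
# Stated in the broadcast: November 25 2016, 10H01 CET.
WINDOW_END = datetime(2016, 11, 25, 10, 1, tzinfo=CET)


def is_suspect(job, start=WINDOW_START, end=WINDOW_END, host=SUSPECT_HOST):
    """Return True if the job ran on the suspect host during the window.

    `job` is a hypothetical record: {"host": str, "start": datetime,
    "end": datetime}, with timezone-aware datetimes.
    """
    if job["host"].strip().lower() != host:
        return False
    # Two intervals overlap when each one starts before the other ends.
    return job["start"] <= end and job["end"] >= start


def suspect_jobs(jobs):
    """Return the jobs whose output files deserve a closer look."""
    return [job for job in jobs if is_suspect(job)]
```

Output files of the jobs returned by suspect_jobs would then be candidates for re-validation or reprocessing.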
We would like to apologize for this unwelcome hardware failure, as we know that finding the affected files will be hard work that you would all have preferred to avoid.
Best regards
The GRIF/IRFU admins
----------------------------------------------------------------------------------------------------------------
link to this broadcast : https://operations-portal.in2p3.fr/broadcast/archive/1591
----------------------------------------------------------------------------------------------------------------