biomed-shifts:mom-shift-2015-03-23

Biomed shift takeover phone conference

Date: Mar. 23rd, 2015

Attendees:

  • Arad Alper (IsraGrid)
  • Franck Michel (I3S)

Next conference: Mar. 30th, 2015, 10h00.

Bug in CE: The bug that for months caused this alarm…

- Transfer to CREAM failed due to exception: CREAM Register returned error “MethodName=[jobRegister] Timestamp=[Tue|09 Dec 2014 16:24:28] ErrorCode=[0] Description=[system|error] FaultCause=[real|1.000000000000000E+07 in string context]”

…seems to be solved. See ticket #110636. The alarm did not appear this week.

Nagios: The Nagios box is still providing incorrect data.

Reminder of best practices:

  • Start by following up on open tickets and verifying solved tickets, before submitting new ones.
  • Before submitting a ticket verify that:
    • Another ticket does not already exist for the same problem
    • The resource is not in downtime and is in production status ⇒ you can use the brand new VAPOR portal to check this
    • The alarm can be reproduced manually
  • For CEs: ignore the alarms in the cases described here.
  • No new tickets were opened.
  • There seems to be a problem caused by the NIKHEF site now using a DigiCert certificate. There are many alarms on the site, and Arad suspects that this is the cause. Updating the ca-policy-egi-core package appears to solve the problem. Franck will check this and arrange a mail broadcast about it.

Last modified: 2016/02/05 09:20 by fmichel