Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
biomed-shifts:argo [2017/08/28 09:18] – created fmichel | biomed-shifts:argo [2017/08/29 07:38] (current) – fmichel | ||
---|---|---|---|
Line 4: | Line 4: | ||
The biomed [[https:// | The biomed [[https:// | ||
+ | |||
+ | **Probes documentation**: | ||
For monitoring results, go to Service Groups → Summary → service name (SERVICE_SRM_V2 or SERVICE_CREAM-CE), | For monitoring results, go to Service Groups → Summary → service name (SERVICE_SRM_V2 or SERVICE_CREAM-CE), | ||
Line 12: | Line 14: | ||
Clicking on the SE/CE host name gives information on the scheduled downtimes (host state information section). <color red>Only critical problems (showing in red) may lead to ticket submission</ | Clicking on the SE/CE host name gives information on the scheduled downtimes (host state information section). <color red>Only critical problems (showing in red) may lead to ticket submission</ | ||
- | A description of ARGO probes is available from [[https://tomtools.cern.ch/confluence/display/SAMDOC/grid-monitoring-probes-org.sam|the SAM wiki]]. Source code can also be found from the [[https:// | + | A [[https://argo-mon-biomed.cro-ngi.hr/nagios/cgi-bin/status.cgi? |
- | A [[https:// | + | The figure below depicts important graphical elements in ARGO referring to downtimes and comments: {{: |
- | + | ||
- | The figure below depicts important graphical elements in Nagios | + | |
===== Information for administrators ===== | ===== Information for administrators ===== | ||
- | ==== Paths and configuration ==== | + | The ARGO instance uses the following POEM profile: https:// |
- | __Topology__: | + | The topology |
+ | * http:// | ||
+ | * http:// | ||
- | The VO feed, biomed.xml, is then copied to a web server at 0h00 and used by Nagios to build the list of resources monitored. | + | **Soft/Hard states vs. max_check_attempts**: |
- | + | ||
- | Consequently, | + | |
- | + | ||
- | ==== Paths and configuration ==== | + | |
- | + | ||
- | * Documentation: | + | |
- | * Configuration: | + | |
- | * Probes path: / | + | |
- | * Actual code of probes: / | + | |
- | + | ||
- | **Soft/Hard states vs. max_check_attempts**: | + | |
* normal_check_interval 60 | * normal_check_interval 60 | ||
Line 44: | Line 35: | ||
**Passive checks**: they are initiated and performed by external applications/ | **Passive checks**: they are initiated and performed by external applications/ | ||
- | |||
- | ==== Stop/start Nagios ==== | ||
- | |||
- | As root, run: service nagios restart | ||
- | |||
- | ==== Changing the grid certificate ==== | ||
- | |||
- | When the grid certificate of the user running the tests is renewed (once a year), copy the userkey.pem and usercert.pem files to .globus, as on any UI. To do so, follow these steps: | ||
- | |||
- | 1. Copy the pem files to the gate machine grid11.lal.in2p3.fr: | ||
- | |||
- | < | ||
- | eval `ssh-agent` | ||
- | ssh-add | ||
- | gsiscp -P 2222 user*.pem grid11.lal.in2p3.fr: | ||
- | </ | ||
- | |||
- | 2. Then log into grid11.lal.in2p3.fr and copy the pem files to the Nagios box grid04: | ||
- | |||
- | < | ||
- | gsissh -AX -p 2222 grid11.lal.in2p3.fr | ||
- | scp user*pem fmichel@grid04.lal.in2p3.fr:/ | ||
- | </ | ||
- | |||
- | 3. Then test the new pem files: | ||
- | |||
- | < | ||
- | ssh fmichel@grid04.lal.in2p3.fr | ||
- | voms-proxy-init --voms biomed | ||
- | </ | ||
- | |||
- | ==== Proxy certificate renewal ==== | ||
- | |||
- | Ssh to any UI or to the Nagios server: grid04.lal.in2p3.fr | ||
- | |||
- | * Create a valid proxy certificate: | ||
- | |||
- | < | ||
- | $ voms-proxy-init --voms biomed | ||
- | </ | ||
- | |||
- | * Renew the proxy: | ||
- | |||
- | < | ||
- | $ myproxy-init --cred_lifetime 672 --credname NagiosRetrieve-grid04.lal.in2p3.fr-biomed --pshost myproxy.grif.fr --username nagios --regex_dn_match --retrievable_by_cert "/ | ||
- | </ | ||
- | |||
- | * Check the proxy: | ||
- | |||
- | < | ||
- | $ myproxy-info -l nagios -s myproxy.grif.fr | ||
- | </ | ||
- | |||
- | * Test the proxy retrieval probe: | ||
- | |||
- | < | ||
- | # Ssh to the Nagios server: grid04.lal.in2p3.fr | ||
- | $ sudo su - nagios | ||
- | $ / | ||
- | </ | ||
- | |||
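As an illustration of the "Soft/Hard states vs. max_check_attempts" note above, the following is a minimal sketch of how long a failing service stays in a SOFT state before Nagios declares it HARD. It assumes the default Nagios interval_length of 60 seconds, so that intervals are expressed in minutes. The function name and the example values (4 attempts, 15-minute retry interval) are hypothetical and are not taken from the biomed configuration.

```shell
#!/bin/sh
# Minutes between the first failed check and the HARD state.
# Arguments (hypothetical values, not read from the biomed ARGO/Nagios setup):
#   $1  max_check_attempts
#   $2  retry_check_interval, in minutes (assumes interval_length = 60 s)
minutes_to_hard_state() {
    max_check_attempts=$1
    retry_check_interval=$2
    # The first failed check already counts as attempt 1, so only
    # (max_check_attempts - 1) retries remain before the state turns HARD.
    echo $(( (max_check_attempts - 1) * retry_check_interval ))
}

# Example: 4 attempts, retried every 15 minutes.
minutes_to_hard_state 4 15
```

With these example values, the problem is confirmed as HARD 45 minutes after the first failed check. The normal_check_interval only determines how soon that first failure can be detected.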