======  Identification of issues  ======
  
Link to the Biomed ARGO page: [[https://biomed.ui.argo.grnet.gr/]]
=====  VOMS server  =====
The proxy certificate creation should work.
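For example, assuming the standard VOMS client tools are installed on the UI (the exact command may differ slightly from the one used elsewhere on this page):
<code>voms-proxy-init --voms biomed    # create a proxy with biomed VO attributes
voms-proxy-info --all            # check the attributes and remaining lifetime</code>
Any error at this stage points either to the VOMS server or to the shifter's own credentials.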
The VOMS administration interface should be available. From a UI, run the command:
<code>voms-admin --vo=biomed --host voms-biomed.in2p3.fr --port 8443 list-cas</code>
  
=====  Monitoring SEs  =====
From the biomed-ui.fedcloud.fr VM, where gfal2 is already installed:
  
1. Build the Storage URL following the model <code>srm://marsedpm.in2p3.fr:8446/dpm/in2p3.fr/home/biomed</code>

NOTE 1: the model works for DPM SEs, not sure about StoRM or dCache (a StoRM example is srm://storm-01.roma3.infn.it:8444/srm/managerv2?SFN=/biomed)
  
NOTE 2: it would be interesting to use the probe for building this URL (a possible BDII-based approach is sketched after step 4)
  
2. Use gfal-ls to check that we can list the folder
<code>gfal-ls srm://marsedpm.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/user/s/scamarasu</code>
3. Use gfal-copy to copy a file (in this case, job.jdl) to the above URL
<code>gfal-copy job.jdl srm://marsedpm.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/user/s/scamarasu/
  Copying file:///home/spop/dirac/job.jdl   [DONE]  after 17s </code>
4. Check that the file was copied and is now listed
<code>gfal-ls srm://marsedpm.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/user/s/scamarasu
  job.jdl </code>
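As mentioned in NOTE 2, the Storage URL could in principle be built automatically instead of by hand. A minimal sketch, assuming the Glue 1.3 schema published in a top-level BDII (the BDII host, the filter attributes and the SRM port below are assumptions to be checked, not part of the procedure above):
<code># Hypothetical: ask a top-level BDII for the biomed path on a given SE (Glue 1.3)
ldapsearch -x -LLL -H ldap://lcg-bdii.cern.ch:2170 -b o=grid \
  "(&(objectClass=GlueVOInfo)(GlueVOInfoAccessControlBaseRule=VO:biomed)(GlueChunkKey=GlueSEUniqueID=marsedpm.in2p3.fr))" \
  GlueVOInfoPath
# Expected attribute: GlueVOInfoPath: /dpm/in2p3.fr/home/biomed
# SURL = srm://<SE hostname>:<SRM port>/<GlueVOInfoPath></code>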
  
Note that in some cases gfal-ls (as well as gfal-mkdir) may work but gfal-copy does not, typically because the SE's GridFTP door (port 2811) is unreachable even though the SRM interface still answers:
<code>gfal-mkdir srm://clrlcgse01.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/scamarasu
gfal-ls srm://clrlcgse01.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/scamarasu
gfal-copy dirac/job.jdl srm://clrlcgse01.in2p3.fr:8446/dpm/in2p3.fr/home/biomed/scamarasu/
gfal-copy error: 70 (Communication error on send) - Could not open destination: globus_xio: Unable to connect to clrlcgse01.in2p3.fr:2811 globus_xio: System error in connect: Connection refused globus_xio: A system call failed: Connection refused </code>
  
When a SE is planned for decommissioning, launch the specific [[Biomed-Shifts:decommisioning|SE decommissioning procedure]].
  
The older decommissioning page is available here: [[Biomed-Shifts:old:old-decommisioning|Old SE decommissioning procedure]].
=====  Monitoring CEs  =====
  
====  Identify the problems  ====
The ARGO box is the best way to identify faulty resources.
====  Reproduce the problem  ====
  
1. Manual ARC CE submission
  
-====  Reproduce the problem ​ ==== +- see https://​www.nordugrid.org/​arc/​arc6/​users/​submit_job.html for more details and a job description example 
-Reproduce the problem by one of the two methods below.+ 
- submit with "arcsub job.xrsl -c CENAME"
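For illustration, a minimal job description and submission sequence (a sketch only: the executable, file names and CE hostname are placeholders, and the exact options should be checked against the NorduGrid documentation linked above). Save the job description as job.xrsl:
<code>&(executable="/bin/hostname")
 (stdout="stdout.txt")
 (stderr="stderr.txt")
 (jobname="biomed-ce-test")</code>
then, with a valid biomed proxy, submit it and follow its status:
<code>arcproxy -S biomed                 # proxy with biomed VOMS attributes
arcsub job.xrsl -c <CE hostname>   # submit to the CE under test
arcstat <job URL returned by arcsub></code>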
Further ARC CE documentation is available in French: https://grand-est.fr/support-utilisateurs/documentation-en-ligne/guide-dutilisation-de-arc-ce/
  
and DIRAC documentation: https://grand-est.fr/support-utilisateurs/documentation-en-ligne/guide-dutilisation-de-dirac/
  
2. Manual HTCondorCE submission
  
TO BE DONE
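A possible starting point, to be confirmed (an assumption, not a validated biomed procedure), is the HTCondor-CE client test utility, which submits a short test job using the current proxy:
<code>condor_ce_trace <CE hostname></code>
It requires a valid proxy and reports where the submission fails (authentication, scheduling or job execution).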
==== Ignored alarms ====
Shifters shall give priority to failed job submissions: probes ''emi.cream.CREAMCE-AllowedSubmission''.