[Ncep.list.fv3-announce] SLURM workflow updates now in global-workflow master!

Kate Friedman - NOAA Federal Kate.Friedman at noaa.gov
Tue May 14 16:28:43 UTC 2019


More updates:

A) The rocoto viewer should still work if you have the right versions of
rocoto and python loaded on Theia. I imagine most folks having issues have
one or more of the following unset or missing:


   1. rocoto: use the rocoto/1.3.0-RC5 module (it also worked
   with rocoto/1.3.0rc2 for me; rocoto/1.3.0-RC3 or rocoto/1.3.0-RC4
   may also work, but I advise using rocoto/1.3.0-RC5 anyway)
   2. python: confirm you are using python2 (it also works
   with python/3.6.1-emc loaded)
   3. TERM: set TERM=xterm
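
A quick way to check the three items above before launching the viewer is a
small shell function. This is only a sketch: the function name viewer_env_ok
is made up for illustration and is not part of global-workflow or rocoto, and
it assumes that a loaded rocoto module puts rocotostat on your PATH and that
the interpreter is invoked as python.

```shell
# Illustrative pre-flight check (hypothetical helper, not part of global-workflow).
# Load the modules first on Theia, for example:
#   module load rocoto/1.3.0-RC5

viewer_env_ok() {
    rc=0

    # Item 3: the terminal type should be xterm.
    if [ "${TERM:-}" != "xterm" ]; then
        echo "TERM is '${TERM:-unset}'; try: export TERM=xterm"
        rc=1
    fi

    # Item 1: assumes a loaded rocoto module provides rocotostat on PATH.
    if ! command -v rocotostat >/dev/null 2>&1; then
        echo "rocotostat not found; load a rocoto module"
        rc=1
    fi

    # Item 2: assumes the interpreter is available as "python".
    if ! command -v python >/dev/null 2>&1; then
        echo "python not found; load a python module"
        rc=1
    fi

    return "$rc"
}
```

After loading your modules, run viewer_env_ok. It prints one line per item
that needs attention and returns non-zero if anything is missing. It does not
check which rocoto or python version is loaded.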

B) Fixes for free-forecast mode are progressing and the global-workflow
code management team hopes to bring them into the global-workflow master
very soon! Cycled mode should still be working and is unaffected by this
oversight.

Thanks all!

Kate Friedman (formerly Howard)
NOAA/NWS/NCEP/EMC Engineering and Implementation Branch


On Mon, May 13, 2019 at 3:13 PM Kate Friedman - NOAA Federal <
Kate.Friedman at noaa.gov> wrote:

> Additional updates:
>
> - Free-forecast mode is not working after the SLURM update; a patch for
> that is in the works and will come into master ASAP!
> - The rocoto_viewer no longer works on Theia after the SLURM update. There
> is currently no support for this script, so until we can either replace or
> fix it, please use rocoto commands to view the status of your experiments
> on Theia. See slide 43 of the FV3GFS how-to for the most-used commands, or
> other online rocoto command documentation for the full set:
>
>
> https://docs.google.com/presentation/d/1DegO_WU8i_3BcyaxrbXaeQ2w3s-yQgLNRNNyaZMMEJk/edit#slide=id.g2c24c0b1fb_0_1
>
>
> Thanks!
>
> Kate Friedman (formerly Howard)
> NOAA/NWS/NCEP/EMC Engineering and Implementation Branch
>
>
> On Fri, May 10, 2019 at 4:59 PM Kate Friedman - NOAA Federal <
> Kate.Friedman at noaa.gov> wrote:
>
>> Additionally the following locally installed copies of the
>> global-workflow master have been updated:
>>
>> Theia: /scratch4/NCEPDEV/global/save/glopara/git/global-workflow/master
>>
>> WCOSS-Dells: /gpfs/dell2/emc/modeling/noscrub/emc.glopara/git/global-workflow/master
>>
>> Kate Friedman (formerly Howard)
>> NOAA/NWS/NCEP/EMC Engineering and Implementation Branch
>>
>>
>> On Fri, May 10, 2019 at 4:11 PM Kate Friedman - NOAA Federal <
>> Kate.Friedman at noaa.gov> wrote:
>>
>>> All,
>>> Thank you for your patience! The SLURM updates for the global-workflow
>>> are now in its master! Thank you to everyone who worked hard to test and
>>> resolve any final issues this week! Some additional action items were
>>> produced during the final testing phase, so look for those mentioned below
>>> and in upcoming commits. Please let me know if you have any issues with the
>>> workflow after this commit.
>>>
>>> One very important note is that some developers (our workflow team
>>> included) noticed differences in SLURM runs done before and after this
>>> week's Theia maintenance. These differences are NOT seen between repeated
>>> runs done after the maintenance with the exact same inputs and settings.
>>> These differences are very small (butterflies) but will be discussed with
>>> the machine admins. Any needed fixes will be implemented ASAP.
>>>
>>> SLURM workflow updates for R&D machines
>>> <https://vlab.ncep.noaa.gov/redmine/news/767>
>>>
>>> SLURM workflow updates for R&D machines (Theia - no impact on WCOSS)
>>>
>>> Redmine Issue: https://vlab.ncep.noaa.gov/redmine/issues/58894
>>>
>>> Summary of changes:
>>>
>>>    - env/THEIA.env - change launcher format for SLURM
>>>    - modulefiles/module_base.theia - updates to add slurm prod_util
>>>    module
>>>    - ush/rocoto/rocoto.py - added partition statements
>>>    - ush/rocoto/setup_expt.py - set icsdir option requirement to false
>>>    - ush/rocoto/setup_workflow.py - add SLURM checks for Theia parts
>>>    - ush/rocoto/workflow_utils.py - add SLURM checks for Theia parts
>>>    - util/sorc/compile_gfs_util_wcoss.sh - small fixes
>>>    - updated fit2obs jobcard for SLURM
>>>
>>> Things still to address in later commits:
>>>
>>>    - Downstream jobs (e.g. gempak) are not yet supported/tested on R&D
>>>    machines or under SLURM
>>>
>>> Caveats:
>>>
>>>    - Differences have been observed between SLURM tests done before and
>>>    after the May 7th Theia maintenance. This will be investigated and any
>>>    necessary fixes will be tested and committed to the global-workflow master
>>>    ASAP. Please report any similar behavior observed!
>>>
>>> How to incorporate changes into your own copy. Your options:
>>>
>>>    1. Sync merge the global-workflow master into your branch
>>>    (preferred/advised option)
>>>    2. Apply changes via patch:
>>>
>>>    See changes in following file on Theia:
>>>
>>>
>>>    /scratch4/NCEPDEV/global/save/glopara/utilities/global-workflow_slurm_master.diff
>>>
>>>    Apply changes to your branch while within clone:
>>>
>>>    git apply --reject --whitespace=fix /scratch4/NCEPDEV/global/save/glopara/utilities/global-workflow_slurm_master.diff
>>>
>>>
>>> Kate Friedman (formerly Howard)
>>> NOAA/NWS/NCEP/EMC Engineering and Implementation Branch
>>>
>>