[Ncep.list.fv3-announce] SLURM workflow updates now in global-workflow master!

Kate Friedman - NOAA Federal Kate.Friedman at noaa.gov
Mon May 13 19:13:09 UTC 2019


Additional updates:

- Free-forecast mode is not working after the SLURM update. A patch is in
the works and the fix will be in master ASAP!
- The rocoto_viewer no longer works on Theia after the SLURM update. There is
currently no support for this script, so until we can either replace or fix
it, please use rocoto commands to view the status of your experiments on Theia.
See slide 43 of the FV3GFS how-to for the most used commands, or other
online rocoto command documentation for the full set:

https://docs.google.com/presentation/d/1DegO_WU8i_3BcyaxrbXaeQ2w3s-yQgLNRNNyaZMMEJk/edit#slide=id.g2c24c0b1fb_0_1
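
As a rough illustration of checking experiment status without rocoto_viewer,
here is a minimal sketch built on rocotostat and rocotocheck. PSLOT, EXPDIR,
the cycle, and the task name are hypothetical placeholders, and the
<PSLOT>.xml / <PSLOT>.db file names are an assumption about how the experiment
was set up. The helper functions are illustrative only and just print the
commands so they can be reviewed before running; the slide linked above
remains the reference for the most used commands.

```shell
#!/bin/bash
# Sketch only: PSLOT, EXPDIR, the cycle, and the task name are hypothetical
# placeholders. The <PSLOT>.xml / <PSLOT>.db file names are an assumption
# about how the experiment was set up; adjust them to match your EXPDIR.
PSLOT="${PSLOT:-testexp}"
EXPDIR="${EXPDIR:-/path/to/expdir/$PSLOT}"

# Build the rocotostat command: status of all cycles, or of one cycle
# when a cycle (YYYYMMDDHHMM) is passed as the first argument.
rocoto_stat_cmd() {
  local cycle="${1:-}"
  local cmd="rocotostat -w $EXPDIR/$PSLOT.xml -d $EXPDIR/$PSLOT.db"
  if [ -n "$cycle" ]; then
    cmd="$cmd -c $cycle"
  fi
  echo "$cmd"
}

# Build the rocotocheck command: detailed state of one task in one cycle.
rocoto_check_cmd() {
  local cycle="$1" task="$2"
  echo "rocotocheck -w $EXPDIR/$PSLOT.xml -d $EXPDIR/$PSLOT.db -c $cycle -t $task"
}

# Print the commands for review; paste them into your shell on Theia to run.
rocoto_stat_cmd
rocoto_check_cmd 201905100000 gfsfcst
```

Printing rather than executing keeps the sketch safe to try anywhere; on Theia
the printed lines can be run directly once the paths point at a real
experiment.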


Thanks!

Kate Friedman (formerly Howard)
NOAA/NWS/NCEP/EMC Engineering and Implementation Branch


On Fri, May 10, 2019 at 4:59 PM Kate Friedman - NOAA Federal <
Kate.Friedman at noaa.gov> wrote:

> Additionally the following locally installed copies of the global-workflow
> master have been updated:
>
> Theia: /scratch4/NCEPDEV/global/save/glopara/git/global-workflow/master
>
> WCOSS-Dells: /gpfs/dell2/emc/modeling/noscrub/emc.glopara/git/global-workflow/master
>
> Kate Friedman (formerly Howard)
> NOAA/NWS/NCEP/EMC Engineering and Implementation Branch
>
>
> On Fri, May 10, 2019 at 4:11 PM Kate Friedman - NOAA Federal <
> Kate.Friedman at noaa.gov> wrote:
>
>> All,
>> Thank you for your patience! The SLURM updates for the global-workflow
>> are now in master! Thank you to everyone who worked hard to test and
>> resolve the final issues this week! Some additional action items came
>> out of the final testing phase, so look for those below and in
>> upcoming commits. Please let me know if you have any issues with the
>> workflow after this commit.
>>
>> One very important note is that some developers (our workflow team
>> included) noticed differences between SLURM runs done before and after this
>> week's Theia maintenance. These differences are NOT seen between repeated
>> runs done after the maintenance with the exact same inputs and settings.
>> The differences are very small (butterflies) but will be discussed with
>> the machine admins. Any needed fixes will be implemented ASAP.
>>
>> SLURM workflow updates for R&D machines
>> <https://vlab.ncep.noaa.gov/redmine/news/767>
>>
>> SLURM workflow updates for R&D machines (Theia - no impact on WCOSS)
>>
>> Redmine Issue: https://vlab.ncep.noaa.gov/redmine/issues/58894
>>
>> Summary of changes:
>>
>>    - env/THEIA.env - change launcher format for SLURM
>>    - modulefiles/module_base.theia - updates to add slurm prod_util
>>    module
>>    - ush/rocoto/rocoto.py - added partition statements
>>    - ush/rocoto/setup_expt.py - set icsdir option requirement to false
>>    - ush/rocoto/setup_workflow.py - add SLURM checks for Theia parts
>>    - ush/rocoto/workflow_utils.py - add SLURM checks for Theia parts
>>    - util/sorc/compile_gfs_util_wcoss.sh - small fixes
>>    - updated fit2obs jobcard for SLURM
>>
>> Things still to address in later commits:
>>
>>    - Downstream jobs (e.g. gempak) are not yet supported/tested on R&D
>>    machines or under SLURM
>>
>> Caveats:
>>
>>    - Differences have been observed between SLURM tests done before and
>>    after the May 7th Theia maintenance. This will be investigated and any
>>    necessary fixes will be tested and committed to the global-workflow master
>>    ASAP. Please report any similar behavior observed!
>>
>> How to incorporate changes into your own copy. Your options:
>>
>>    1. Sync merge the global-workflow master into your branch
>>    (preferred/advised option)
>>    2. Apply changes via patch:
>>
>>    See the changes in the following file on Theia:
>>
>>    /scratch4/NCEPDEV/global/save/glopara/utilities/global-workflow_slurm_master.diff
>>
>>    Apply the changes to your branch from within your clone:
>>
>>    git apply --reject --whitespace=fix /scratch4/NCEPDEV/global/save/glopara/utilities/global-workflow_slurm_master.diff
>>
>>
>> Kate Friedman (formerly Howard)
>> NOAA/NWS/NCEP/EMC Engineering and Implementation Branch
>>
>
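
For the patch option in the quoted announcement, one possible way to be a bit
more careful is to test the patch with git apply --check before applying it,
and to fall back to --reject only if it does not apply cleanly. The sketch
below assumes it is run from the top of a global-workflow clone;
apply_slurm_patch is a hypothetical helper name and not part of the workflow.

```shell
#!/bin/bash
# Patch path as given in the quoted announcement (Theia only).
PATCH=/scratch4/NCEPDEV/global/save/glopara/utilities/global-workflow_slurm_master.diff

# apply_slurm_patch is a hypothetical helper name, not part of global-workflow.
# Run it from the top of your global-workflow clone.
apply_slurm_patch() {
  local patch="$1"
  if [ ! -r "$patch" ]; then
    echo "Cannot read patch file: $patch" >&2
    return 1
  fi
  # Dry run first: --check reports whether the patch applies, changing nothing.
  if git apply --check --whitespace=fix "$patch"; then
    git apply --whitespace=fix "$patch"
  else
    # Apply the hunks that fit; failed hunks are left in *.rej files.
    echo "Patch does not apply cleanly; falling back to --reject" >&2
    git apply --reject --whitespace=fix "$patch"
    echo "Rejected hunks to resolve by hand:" >&2
    find . -name '*.rej' -print
    return 2
  fi
}
```

On Theia this would be called as apply_slurm_patch "$PATCH". A return code of
2 means some hunks were rejected and the listed .rej files need to be merged
by hand before committing.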