[Ncep.list.nems.announce] two upcoming NEMS commits

Samuel Trahan - NOAA Affiliate samuel.trahan at noaa.gov
Thu Sep 28 23:19:03 UTC 2017


Hi all,

At long last, my testing is complete, and I've fixed all bugs I've found in
NEMS and apps.  Details are below.  I plan on committing at 4 PM Eastern
time Friday September 29.

Tests run:

NEMSGSM (Theia, WCOSS Phase 1)
  - created a baseline, validated against it, validated against original
baseline
https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSGSM/branches/nccmp-deliver

NEMSfv3gfs (Theia; Jet; WCOSS Phase 1, Phase 2, Cray)
  - created a baseline, validated against it, validated against original
baseline
https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/branches/nccmp-deliver

WW3-GSM (Theia, WCOSS Phase 2)
  - created a baseline, validated against it, validated against original
baseline
https://svnemc.ncep.noaa.gov/projects/nems/apps/WW3-GSM/branches/nccmp-deliver

HYCOM-GSM-CICE (Theia only)
  - created a baseline, validated against it, validated against original
baseline
https://svnemc.ncep.noaa.gov/projects/nems/apps/HYCOM-GSM-CICE/branches/nccmp-deliver

GSM-MOM5-CICE5
  - added netcdf variable-by-variable comparisons to both enabled tests
  - created new baseline
  - validated against new baseline

I did not validate against the old baseline for GSM-MOM5-CICE5 because the
test now has additional baseline files that are not in the old baseline
area.

https://svnemc.ncep.noaa.gov/projects/nems/apps/GSM-MOM5-CICE5/branches/nccmp-deliver

NEMS branch:

https://svnemc.ncep.noaa.gov/projects/nems/branches/nccmp-deliver





In addition to the previously mentioned fixes, this commit also corrects the
following:



1. produtil: When generating a baseline, the NEMSCompsetRun --resume
feature forgot it was generating a baseline and reset to execution
(verification) mode.  This is corrected.

2. NEMS + apps: All /gpfs/hps/emc paths I found are now /gpfs/hps3/emc.
(produtil had none)

3. NEMS: Removed an erroneous fingerprint file check that ran during
baseline generation before the target directory existed.

4. NEMSGSM: The gfs_slg_rsthst compset needs a dependency on gfs_slg
because it requires files created by that job when run in baseline
generation mode.

5. NEMSfv3gfs: On Theia, the NEMSAppBuilder and compile.sh do not generate
bitwise identical executables but do generate bitwise identical output.  In
all of the NEMSfv3gfs tests, there is a checksum run on the executable to
make sure it did not change during the test.  The fv3_appbuilder regression
test was mistakenly comparing its NEMSAppBuilder-compiled executable's
checksum against the compile.sh-compiled executable's checksum.  This
failed on Theia due to the executables not matching bit-for-bit.  The
regression test is updated to check the NEMSAppBuilder-compiled executable
against itself.

6. NEMSfv3gfs: Three of the compsets had their output directories set to
the same directory as the fv3_control.  (Not just the control directory,
but the output directory too.)  Occasionally, more than one would delete
and recreate that directory at the same time, causing intermittent
failures.  They now each have their own output directory.

7. Most apps: This does not work during baseline creation:

    BASELINE="/path/to/some/dir"
    BASELINE_TEMPLATE="@[BASELINE]"

In baseline creation mode, the BASELINE variable is replaced with a scrub
or user-specified directory.  A job, prep_baseline, copies the contents of
BASELINE_TEMPLATE to BASELINE.  If BASELINE_TEMPLATE="@[BASELINE]" then it
will copy the directory to itself.  The correct settings are:

    BASELINE="/path/to/some/dir"
    BASELINE_TEMPLATE="/path/to/some/dir"

This is fixed now.

8. HYCOM-GSM-CICE: had no directory with logs of past regression tests.  I
added one with the results of this commit's test.
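
For illustration, the self-copy described in item 7 can be reproduced with
a minimal Python sketch.  This is not the NEMSCompsetRun or produtil code:
the expand and resolve_paths helpers and the /scrub/new_baseline path are
hypothetical, and the @[NAME] expansion is simplified to a plain text
substitution.

```python
import re


def expand(text, variables):
    """Replace each @[NAME] in text with variables[NAME] (simplified, hypothetical)."""
    return re.sub(r"@\[(\w+)\]", lambda m: variables[m.group(1)], text)


def resolve_paths(baseline, template, override=None):
    """Return (source, destination) for the prep_baseline copy.

    `override` stands in for the scrub or user-specified directory that
    replaces BASELINE in baseline creation mode (an assumption of this sketch).
    """
    effective = override if override is not None else baseline
    source = expand(template, {"BASELINE": effective})
    return source, effective


# Hypothetical scrub directory used only for this example.
scrub = "/scrub/new_baseline"

# BASELINE_TEMPLATE="@[BASELINE]": the template follows the override,
# so the source and destination of the copy are the same directory.
broken = resolve_paths("/path/to/some/dir", "@[BASELINE]", override=scrub)

# BASELINE_TEMPLATE="/path/to/some/dir": the template stays fixed,
# so the copy goes from the original baseline to the new one.
fixed = resolve_paths("/path/to/some/dir", "/path/to/some/dir", override=scrub)

print("broken:", broken)
print("fixed: ", fixed)
```

With the template written as a literal path, the source of the copy no
longer changes when BASELINE is replaced.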




On Wed, Sep 27, 2017 at 9:57 AM, Samuel Trahan - NOAA Affiliate <
samuel.trahan at noaa.gov> wrote:

> Hi all,
>
> Still testing.  If you're wondering why it is taking so long: most of the
> apps cannot generate their baseline even when run from their unmodified
> trunk.  The only one that works reliably is GSM-MOM5-CICE5.  Usually FV3
> will generate its baseline correctly.  The others simply don't work.  I've
> fixed about 2/3 of the app*platform combinations but I'm still working on
> the rest.
>
> Sincerely,
> Sam Trahan
>
> On Mon, Sep 25, 2017 at 1:27 PM, Samuel Trahan - NOAA Affiliate <
> samuel.trahan at noaa.gov> wrote:
>
>> Hi all,
>>
>> I'm still testing this.  I'm hoping to finish soon.
>>
>> Sincerely,
>> Sam Trahan
>>
>> On Wed, Sep 20, 2017 at 11:40 AM, Samuel Trahan - NOAA Affiliate <
>> samuel.trahan at noaa.gov> wrote:
>>
>>> Hi all,
>>>
>>> There are two NEMS commits going in soon, both of which modify the
>>> NEMSCompsetRun.  The commits are backwards-compatible and low risk.
>>> However, they will need twice as much testing as usual because they affect
>>> baseline generation.  I'll send out branches soon.  Testing will start
>>> Friday, and I'll try to finish by the end of the weekend.
>>>
>>> The timing of the commit is partly contingent on whether there is enough
>>> disk space somewhere on Theia to run the tests.  NEMS areas and all four
>>> stmps hit quota over the past few days, and emc.nemspara cannot write
>>> anywhere else.  I'll test on Gyre, Surge, and Jet first to detect any
>>> likely issues before I test on Theia.
>>>
>>>
>>> - The Changes -
>>>
>>>
>>> 1. Bug fix from Jun Wang to .bitcmp. comparison.  If an output file is
>>> compared to a baseline by specifying the baseline directory (instead of
>>> filename within the directory) then the baseline creation fails.  It does
>>> not realize that the target is a directory because the directory does not
>>> exist before the baseline is created:
>>>
>>>     "RESTART/my_output.rst" .bitcmp. "@[BASELINE_DIR]/RESTART/"
>>>
>>> The fix is to assume the target is a directory if:
>>>
>>>  - it ends with a /
>>>  - the last path component is .
>>>  - the last path component is ..
>>>
>>> Hence, after the commit, "@[BASELINE_DIR]/RESTART/" will work correctly,
>>> provided it ends in a /
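
The three conditions above can be sketched in Python as follows.  This is
only an illustration of the stated rule, not the actual produtil change;
the function name looks_like_directory is hypothetical.

```python
def looks_like_directory(target):
    """Guess whether a comparison target names a directory.

    The check uses only the text of the path, so it works even when the
    directory does not exist yet (as during baseline creation).
    """
    # Rule 1: the path ends with a slash.
    if target.endswith("/"):
        return True
    # Rules 2 and 3: the last path component is "." or "..".
    last_component = target.rsplit("/", 1)[-1]
    return last_component in (".", "..")


for path in ("@[BASELINE_DIR]/RESTART/",
             "@[BASELINE_DIR]/RESTART",
             "@[BASELINE_DIR]/RESTART/.",
             "RESTART/my_output.rst"):
    print(path, looks_like_directory(path))
```

Note that under this rule a target written without the trailing slash is
still treated as a filename.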
>>>
>>>
>>> 2. Add the capability to NEMSCompsetRun to compare NetCDF files
>>> variable-by-variable instead of bit-for-bit.  This is needed for the CICE5
>>> files, which contain a global metadata value containing the date at which
>>> the file was created.  The actual contents of the variables should match.
>>>
>>>
>>> The affected repos are:
>>>
>>> - produtil
>>> - NEMS
>>> - NEMSfv3gfs
>>> - NEMSGSM
>>> - HYCOM-GSM-CICE
>>> - GSM-MOM5-CICE5
>>> - WW3-GSM
>>>
>>> Sincerely,
>>> Sam Trahan
>>>
>>
>>
>