[Ncep.list.nems.announce] Proposed NEMS commit for Performance Improvements to RRTM-G
john.michalakes at noaa.gov
Wed Jul 16 17:54:11 UTC 2014
I have an update on the effect of the changes on GFS performance. My
earlier results were on Zeus, which has older Westmere processors.
Rerunning the test on WCOSS (Tide) this morning, there is a marked
improvement from the changes for GFS too. On the GFS_16_32 regression case,
the time for the top of the trunk was 30.0 seconds; with the updates to
RRTMG I'm seeing 27.4 seconds. This test also uses threading, and I think
the Sandy Bridge processors on WCOSS do a better job of latency hiding with
two threads per core than the Westmeres on Zeus.
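For reference, the relative improvement implied by the two GFS_16_32 timings can be worked out as below. This is only a sketch of the arithmetic; the function name `percent_reduction` is illustrative and not part of NEMS.

```python
def percent_reduction(before_s, after_s):
    """Return the run-time reduction as a percentage of the original time."""
    if before_s <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (before_s - after_s) / before_s

# GFS_16_32 on WCOSS (Tide): top of trunk vs. updated RRTMG, in seconds
gfs_gain = percent_reduction(30.0, 27.4)
print(f"GFS_16_32: {gfs_gain:.1f}% less wall-clock time")
```

With the numbers quoted above this comes to a reduction of roughly 8.7%.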
So at this point I'm in the process of looking at differences in solution
output from the GFS, and I'll provide a report on that soon as part of my
commit proposal. Aside from Moorthi's questions, I haven't heard back from
anyone else. Please let me know if you have questions or concerns. I'm
still shooting for having this committed by end of July, if possible.
PS... I've been doing my testing on WCOSS with the latest version of ESMF
(configure.nems.Wcoss.intel_ESMF_630r) and I've run into a couple of issues:
1. I noticed that even though the code appears to run correctly, at the end
of the run each MPI task gives the following message:
GFS FINALIZE STEP SUCCEEDED
Abort(0) on node 15 (rank 15 in comm -2080374783): application called
MPI_Abort(comm=0x84000001, 0) - process 15
Is that normal and expected?
2. When compiling for one of the tests with "make nmm_gsm GOCART_MODE=full"
I'm seeing a compilation error:
mpiifort -g -openmp -mkl=sequential -O2
-I/nwprod/lib/incmod/nemsio -convert big_endian -assume byterecl -fp-model
precise -xAVX -fno-alias -free -O3 -r8 -free -I../../share
ifort: command line warning #10120: overriding '-O2' with '-O3'
atmos_phy_chem_cpl_comp_mod.f90(284): error #6285: There is no matching
specific subroutine for this generic subroutine call.
call ESMF_FieldBundleAdd(Bundle, Field, rc=RC )
compilation aborted for atmos_phy_chem_cpl_comp_mod.f90 (code 1)
Is this a known issue?
From: John Michalakes [mailto:john.michalakes at noaa.gov]
Sent: Monday, July 14, 2014 5:15 PM
To: 'ncep.list.nems.announce at lstsrv.ncep.noaa.gov'
Subject: Proposed NEMS commit for Performance Improvements to RRTM-G
I would like to start the process for obtaining approval to commit
performance related changes to the RRTMG in the NEMS trunk.
Performance improvement on Zeus for the NMM_CNTRL workload:
Original (revision 42943) : 93.5s (0.794s)
With optimized RRTMG: 89.2s (0.739s)
The first time is the time to run the atmospheric component from beginning
to end (that is, the time spent in the ATM_RUN subroutine). In parentheses
is the time per radiation call averaged over the 1125 calls (25 calls on 45
Performance improvement is configuration and workload dependent and will be
the subject of ongoing work. With regard to the GFS cases, the RRTM changes
degrade performance a little bit, but assuming output is okay, I'm
requesting approval to commit the changes into the trunk now and then work
more on improving performance afterwards. (I've been spending a lot of time
and energy keeping up with changes to the trunk).
The attached pdf file is a small sampling of "old-new" grads difference
plots at 03, 12, and 48 hours. There are differences, but they are generally
"snow-like" (more or less random), especially at the 03hr time period.
Since the new output does not agree bit-for-bit, I modified the regression
test script to report the difference and continue on through the next tests.
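The "report the difference and continue" behavior described above can be sketched as follows. This is not the NEMS regression script itself; the function name `compare_outputs`, the test names, and the throwaway files are hypothetical and only illustrate collecting mismatches instead of stopping at the first one.

```python
import filecmp
import os
import tempfile

def compare_outputs(pairs):
    """Compare (test_name, new_file, reference_file) triples byte for byte.

    Mismatches are reported and collected instead of aborting the run.
    """
    mismatched = []
    for name, new_path, ref_path in pairs:
        # shallow=False forces a content comparison, not just os.stat() data
        if filecmp.cmp(new_path, ref_path, shallow=False):
            print(f"{name}: bit-for-bit identical")
        else:
            print(f"{name}: differs from reference, continuing")
            mismatched.append(name)
    return mismatched

# Demo with throwaway files standing in for model output (hypothetical names)
with tempfile.TemporaryDirectory() as tmp:
    def write(fname, data):
        path = os.path.join(tmp, fname)
        with open(path, "wb") as f:
            f.write(data)
        return path

    pairs = [
        ("test_a", write("a_new", b"\x00\x01"), write("a_ref", b"\x00\x01")),
        ("test_b", write("b_new", b"\x00\x02"), write("b_ref", b"\x00\x01")),
    ]
    failed = compare_outputs(pairs)
```

Returning the list of mismatched tests keeps the pass/fail decision separate from the comparison loop, so every test still runs.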
All 13 tests run to completion:
but at this point the only output I've looked at is the NMM_CNTRL test for
I would like to also do difference plots for the GFS cases but I'm not sure
how to do that since the output isn't grads (as far as I can tell). I'm
looking for suggestions, please.
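Independent of plotting, old-minus-new differences can also be summarized numerically once both fields are available as arrays. A mean difference close to zero relative to the RMS difference would be consistent with the "snow-like" (unbiased) character noted above. The sketch below assumes the two fields have already been read into flat lists of floats; the reading step, the function name `diff_stats`, and the sample values are all hypothetical.

```python
import math

def diff_stats(old, new):
    """Summarize old-minus-new differences between two equal-length fields."""
    if len(old) != len(new) or not old:
        raise ValueError("fields must be non-empty and the same length")
    diffs = [o - n for o, n in zip(old, new)]
    mean = sum(diffs) / len(diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    max_abs = max(abs(d) for d in diffs)
    return {"mean": mean, "rms": rms, "max_abs": max_abs}

# Made-up values for illustration only; not model output
old_field = [280.0, 281.5, 279.2, 283.1]
new_field = [280.1, 281.4, 279.3, 283.0]
stats = diff_stats(old_field, new_field)
print(stats)
```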
Code location and summary of changes:
The code is in my home directory on Zeus:
and is relative to Revision: 42438 (thus, a little bit behind the top of
the trunk right now, which is at Revision: 42943, but that's just Weiyu's
new changes to the regression scripts).
A summary of the modified files is here:
Performing status on external item at 'src/atmos/gsm':
Requested action by group:
I would appreciate it if those interested could please review and get back
to me with questions, suggestions and concerns within the next week. Then,
based on the resulting discussions, I would hope to have the modifications
committed to the trunk and passing the reg-tests (with a new set of
reference data sets) by the end of this month (July).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RRTMG plots.pdf
Size: 470217 bytes
Desc: not available
Url : https://lstsrv.ncep.noaa.gov/pipermail/ncep.list.nems.announce/attachments/20140716/f6d1d81d/attachment-0001.pdf