[Ncep.list.fv3-announce] fv3gfs release beta test

Jun Wang - NOAA Affiliate jun.wang at noaa.gov
Fri May 12 16:40:54 UTC 2017


Jim,

Sam is working on that; we hope to get it into this release. Thanks.

Jun

On Fri, May 12, 2017 at 12:37 PM, James Rosinski - NOAA Affiliate <
james.rosinski at noaa.gov> wrote:

> Jun and others;
>
> I noticed there is a build capability for fv3gfs on jet, but not any run
> instructions or runjob_jet.sh for jet. Is jet run capability intended, or
> is that for later?
>
> Regards,
> Jim Rosinski
>
>
>
> On Fri, May 12, 2017 at 10:16 AM, Rusty Benson - NOAA Federal <
> rusty.benson at noaa.gov> wrote:
>
>> Eugene,
>>
>> Responding only to point 4, what options do you see that are Cray
>> compilation flags?
>>
>> Rusty
>> --
>> Rusty Benson, PhD
>> Modeling Systems Group
>> NOAA Geophysical Fluid Dynamics Lab
>> Princeton, NJ
>>
>> On Fri, May 12, 2017 at 11:58 AM, Eugene Mirvis <eugene.mirvis at noaa.gov>
>> wrote:
>>
>>> Gerard,
>>>
>>> Just several points to clarify.
>>>
>>> 1. The NEMS practice of calling "modulefiles" what are actually scripts
>>> (scripts that call module commands and mostly require a bash environment
>>> so they can be sourced and keep their environment) has always been a
>>> non-standard use.
>>>
>>> Therefore, if the developers take Sam's advice and turn the scripts into
>>> real modulefiles (starting with #%Module), "export" and other shell
>>> constructs will no longer make sense there, while the Modules utility
>>> commands and Tcl/Tk will work.
>>>
>>> 2. There is another dilemma: keeping the needed environment while the
>>> environment changes within a workflow.
>>> By the way, the unsetenv, append-path, prepend-path and remove-path module
>>> commands are very useful for controlling that, but you have to keep
>>> $LOADEDMODULES and $MODULEPATH in a consistent order.
>>>
>>> 3. Speaking of which,
>>> module purge
>>> and
>>> module switch
>>> are very useful for unloading application-driven modules.
>>> a/ You just cannot do that at any moment; do it before the application's
>>> module is loaded. After that, use unload <appsModulefile>: not purge, but
>>> unload.
>>> b/ On Crays, there are some internal dependencies inside PrgEnv, so you
>>> have to keep all "module use <knowns>" lines in order to recover; otherwise
>>> you might get "module not found".
>>>
>>> 4.
>>> Compiling on Theia, I'm just wondering why Cray's compilation flags are
>>> utilized... all over.
>>> See, for instance:
>>> ...
>>> mpiifort -I/apps/netcdf/4.3.0-intel/include -fno-alias -auto
>>> -safe-cray-ptr -ftz -assume byterecl -nowarn -sox -align array64byte -i4
>>> -real-size 64 -no-prec-div -no-prec-sqrt -xCORE-AVX2
>>> -qno-opt-dynamic-align -O2 -debug minimal -fp-model source
>>> -qoverride-limits -qopt-prefetch=3 -qopenmp
>>> -I/apps/esmf/7.0.0/intel/intelmpi/mod/modO/Linux.intel.64.intelmpi.default
>>> -I/apps/esmf/7.0.0/intel/intelmpi/include -I/apps/netcdf/4.3.0-intel/include
>>> -IENS_Cpl -I.
>>> -I/scratch4/NCEPDEV/global/noscrub/Eugene.Mirvis/fv3gfs.v0beta/FV3/nems_dir
>>> -c module_MEDIATOR_methods.f90
>>> ...
>>>
>>> Thanks,
>>> -Eugene
>>> On 5/12/2017 3:27 AM, Samuel Trahan - NOAA Affiliate wrote:
>>>
>>> Gerard,
>>>
>>> The "module load module.fre-nctools" is failing because that file contains
>>> commands that are not valid in a modulefile.  You can have as many "module"
>>> commands as you want in a modulefile, but you cannot have bash code.  Also,
>>> it needs to start with #%Module
>>>
>>> export A=B # bad
>>> setenv A B
>>> prepend-path A (something)
>>>
>>> No "source /etc/profile"
>>> No "source /.../init/bash"
>>> No "module purge" because that causes infinite loops on some platforms.
>>> (The module is trying to unload itself.)
>>>
>>> Obtaining the "module" command and purging modules has to be done before
>>> loading the module.  We have a pair of scripts (one csh, one sh) that do
>>> that on all NOAA machines.  They are aware of csh, tcsh, ksh, bash, and sh;
>>> and run the correct initialization for each.  The master copies are in
>>> NEMS/src/conf/module-setup.*sh.inc (csh and sh versions).  Those scripts
>>> work on all NOAA machines and have a test suite inside NEMS/tests.
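
The rules above lend themselves to a quick pre-flight check before running "module load" on a file. The following is a minimal sketch in plain sh, not something from NEMS: the function name check_modulefile is made up for illustration, and the patterns only cover the cases listed above (the #%Module first line, export, source, and module purge).

```shell
#!/bin/sh
# Hypothetical helper (not part of NEMS): flag content that does not belong
# in a real modulefile, following the rules listed in the message above.
check_modulefile() {
    file=$1
    rc=0

    # A modulefile must begin with the magic cookie "#%Module".
    case "$(head -n 1 "$file")" in
        '#%Module'*) ;;
        *) echo "$file: first line is not #%Module"; rc=1 ;;
    esac

    # Shell built-ins and "module purge" are not valid inside a modulefile.
    # grep -n prints each offending line with its line number.
    if grep -n -E '^[[:space:]]*(export|source|\.)[[:space:]]|^[[:space:]]*module[[:space:]]+purge' "$file"; then
        echo "$file: contains shell code or 'module purge'"
        rc=1
    fi

    return $rc
}

# Example call (file name taken from the thread):
#   check_modulefile ./module.fre-nctools
```

A check like this only catches the obvious cases; the modulefile man page remains the reference for what is allowed.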
>>>
>>> man 4 modulefile
>>> http://modules.sourceforge.net/c/modulefile.html
>>>
>>>
>>> Sincerely,
>>> Sam Trahan
>>>
>>> On Thu, May 11, 2017 at 5:01 PM, Gerard Ketefian - NOAA Affiliate <gerard.ketefian at noaa.gov> wrote:
>>>
>>>
>>> Hi Jun,
>>>
>>> Yes, after the missing "cp" fix, I (and Jim too) still had a problem with
>>> this line in runjob_theia.sh:
>>>
>>> module load module.fre-nctools
>>>
>>> This syntax (where you specify a file name after "module load") is not
>>> supported in our environment.  So what we did was first change the above
>>> line to:
>>>
>>> . ./module.fre-nctools
>>>
>>> to just source the module.fre-nctools file.  Also, we changed the contents
>>> of the file ${BASE_DIR}/fv3gfs.v0beta/release/v0/modulefiles/fv3gfs/fre-nctools.theia
>>> to:
>>>
>>>    module load impi/5.1.2.150
>>>    module load netcdf/4.3.0
>>>    module load hdf5/1.8.14
>>>    export HDF5_DIR=$HDF5
>>>    export NETCDF_DIR=$NETCDF
>>>    export LIBRARY_PATH=${LIBRARY_PATH}:${NETCDF}/lib:${HDF5}/lib
>>>
>>> (Btw, should LIBRARY_PATH be changed to LD_LIBRARY_PATH in the last line?)
>>> With this, the remapping worked without any error messages.  When I
>>> compared the output with the baseline run, all the files were the same
>>> EXCEPT the three 1deg files from the remapping.  I don't know why those
>>> were different.  They weren't exactly the same but had almost the same
>>> sizes.
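
On the LIBRARY_PATH question: compilers such as GCC consult LIBRARY_PATH at link time, while the dynamic loader reads LD_LIBRARY_PATH at run time, so which one the remap step needs depends on whether its executables are being built or only run there. The thread does not settle that. Either way, the plain "VAR=${VAR}:dir" form leaves an empty leading entry when the variable starts out unset, and an empty entry is commonly treated as the current directory. A minimal sketch of an append that avoids this follows; append_path is a hypothetical helper, not part of the release.

```shell
#!/bin/sh
# Hypothetical helper: append DIR to the colon-separated variable named VAR
# without leaving an empty leading entry when VAR is unset or empty.
append_path() {
    var=$1
    dir=$2
    # Read the current value of the variable whose name is stored in $var.
    eval "current=\${$var:-}"
    if [ -n "$current" ]; then
        eval "$var=\$current:\$dir"
    else
        eval "$var=\$dir"
    fi
    export "$var"
}

# Example calls, assuming the loaded modules have set NETCDF and HDF5:
#   append_path LIBRARY_PATH "$NETCDF/lib"
#   append_path LIBRARY_PATH "$HDF5/lib"
```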
>>>
>>> It would be nice to be able to use the syntax
>>>
>>> module load module.fre-nctools
>>>
>>> in our environment as well.  When I do a "module help", I don't see this
>>> syntax as an option (where you're specifying a file name that contains
>>> several commands).  Is there a way to get this to work on theia?
>>>
>>> Thanks,
>>> Gerard
>>>
>>>
>>> On Thu, May 11, 2017 at 7:00 AM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>>>
>>>
>>> Gerard,
>>>
>>> With the fix last night, do you still have a problem loading the module
>>> module.fre-nctools?
>>>
>>> Thanks for pointing out the forecast time specified in runjob_theia.sh, I
>>> changed it to be consistent with the baseline.
>>>
>>> Another fix to the forecast executable, from Jim Abeles, is also committed
>>> to the tag.
>>>
>>> I am planning to commit all the changes made to the temporary beta tag to
>>> trunk later today so that Sam can add the jet extension; please send me any
>>> fixes or suggestions. Thanks to all who are doing testing.
>>>
>>>
>>> Jun
>>>
>>> On Thu, May 11, 2017 at 5:03 AM, Gerard Ketefian - NOAA Affiliate <gerard.ketefian at noaa.gov> wrote:
>>>
>>>
>>> Hi all,
>>>
>>> With Ligia's hints and Jun's last fix, I was able to complete the run
>>> but not the remap.  I think the remap fails because some modules don't get
>>> loaded properly.
>>>
>>> To get the remap to also work, I replaced the following line in
>>> runjob_theia.sh
>>>
>>> module load module.fre-nctools
>>>
>>> with the following block (copied and modified from the file
>>> module.fre-nctools):
>>>
>>> module load impi/5.1.2.150
>>> module load netcdf/4.3.0
>>> module load hdf5/1.8.14
>>> export HDF5_DIR=$HDF5
>>> export NETCDF_DIR=$NETCDF
>>> export LIBRARY_PATH=${LIBRARY_PATH}:${NETCDF}/lib:${HDF5}/lib
>>>
>>> This change should allow the 1deg remapped netcdf files to be generated.
>>>
>>> When I compare the sample run's netcdf files with the baseline, there is
>>> about a factor of 5 difference in size (the sample run files being
>>> larger).  This is because there are only 8 output times in the baseline
>>> files but 40 in the run output.
>>>
>>> Gerard
>>>
>>>
>>> On Wed, May 10, 2017 at 9:56 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>>>
>>>
>>> Ligia,
>>>
>>> Thanks for the feedback.  The suggestions on the instructions have been put
>>> in readme.txt. We found that "cp " was missing in line 124 of
>>> runjob_theia.sh. I committed the changes to the beta test tag:
>>> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta
>>>
>>> Please check again. The differences in results will need further
>>> investigation. Thanks.
>>>
>>> Jun
>>>
>>> On Wed, May 10, 2017 at 10:28 PM, Ligia Bernardet - NOAA Affiliate <ligia.bernardet at noaa.gov> wrote:
>>>
>>>
>>> Folks,
>>>
>>> Here is some feedback
>>>
>>>
>>> *About the instructions*
>>>
>>>    1. Minor typo. The word "trunk" should be removed: Four executable
>>>    files will be created under fv3gfs.v0beta/*trunk*/NEMS/exe
>>>    2. runjob_theia.sh: Non-NCEPDEV people need to change directories
>>>    DATA and ROTDIR to an area they can write to
>>>    3. diff_baseline.sh:
>>>       1. It would be helpful to tell users to add arguments
>>>       to diff_baseline.sh to set resolution and machine.
>>>       2. Non-NCEPDEV people need to change directory dir1 to location
>>>       of their output
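
For the DATA and ROTDIR point above, a small guard near the top of a run script can fail early when a directory is not writable. This is only a sketch: ensure_writable_dir is a hypothetical helper, and how runjob_theia.sh actually sets DATA and ROTDIR is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: create DIR if needed and confirm the current user can
# write to it. Returns non-zero when the directory cannot be created or used.
ensure_writable_dir() {
    dir=$1
    mkdir -p "$dir" 2>/dev/null || return 1
    [ -w "$dir" ]
}

# Example calls, assuming DATA and ROTDIR are already set in the script:
#   ensure_writable_dir "$DATA"   || { echo "cannot write to $DATA" >&2; exit 1; }
#   ensure_writable_dir "$ROTDIR" || { echo "cannot write to $ROTDIR" >&2; exit 1; }
```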
>>>
>>> *Outcome*
>>> It seems I was able to get through the forecast but failed in the remap.
>>> The problem seems related to loading modules; I have not fully investigated
>>> yet. Output is in
>>> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp
>>>
>>> When running diff, NetCDF files differ from the baseline. I noticed
>>> the file sizes are different (mine are larger than the baseline).
>>>
>>> Ligia
>>>
>>> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp/../modulefiles/fv3gfs/fre-nctools.theia module.fre-nctools
>>>
>>> /var/spool/torque/mom_priv/jobs/23566207.bqs3.SC: line 126:
>>> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp/../modulefiles/fv3gfs/fre-nctools.theia: Permission denied
>>>
>>> + module load module.fre-nctools
>>>
>>> ++ /apps/lmod/lmod/libexec/lmod bash load module.fre-nctools
>>>
>>> Lmod has detected the following error: The following module(s) are
>>> unknown:
>>>
>>> "module.fre-nctools"
>>>
>>> On Wed, May 10, 2017 at 5:47 PM, James Rosinski - NOAA Affiliate <james.rosinski at noaa.gov> wrote:
>>>
>>>
>>> Hi Jun;
>>>
>>> I am about to head home for the day, but here are my comments so far,
>>> after following the instructions for theia:
>>>
>>> o The builds of models and remap codes completed successfully. One
>>> suggestion might be to have the user specify 32 or 64-bit, and nh vs. hydro
>>> in order to cut down compilation time by a factor of 4.
>>>
>>> o The batch job attempting to run the model failed. Relevant lines in
>>> err_theia are:
>>>
>>> + cd /scratch4/NCEPDEV/stmp3/James.Rosinski/C96fv3gfs2016092900
>>> + /bin/cp -p /scratch3/BMC/gsd-hpcs/rosinski/fv3gfs.v0beta/release/v0/exp/../../../NEMS/exe/fv3_gfs_nh.prod.32bit.x /scratch4/NCEPDEV/stmp3/James.Rosinski/C96fv3gfs2016092900/.
>>> + -prepend-rank -np 288 ./fv3_gfs_nh.prod.32bit.x
>>> + ERR=127
>>> + export ERR
>>> + err=127
>>>
>>> Looks like somehow "mpirun" was not found (note there is nothing in
>>> front of "-prepend-rank"). FYI I use csh for my login shell--not sure if
>>> this is behind the error. I had no modules loaded when submitting the job.
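
An exit status of 127 is what a POSIX shell returns when a command is not found, which fits the trace: "-prepend-rank" itself was executed as the command name. A guard like the one below would report that more directly. It is a sketch under assumptions: require_launcher is a hypothetical helper, and the idea that the script keeps its launcher in a variable that expanded to nothing is a guess from the trace, not something confirmed in the thread.

```shell
#!/bin/sh
# Hypothetical guard: refuse to launch when the launcher name is empty or
# does not resolve to a command. Returns 127, matching the shell's own
# "command not found" status.
require_launcher() {
    launcher=$1
    if [ -z "$launcher" ]; then
        echo "launcher is empty; was the MPI module loaded?" >&2
        return 127
    fi
    if ! command -v "$launcher" >/dev/null 2>&1; then
        echo "$launcher: not found in PATH" >&2
        return 127
    fi
    return 0
}

# Example call before the model is started:
#   require_launcher mpirun || exit $?
```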
>>>
>>> If  you'd like to examine the output you should have read access to
>>> it here on theia:
>>>
>>> /scratch3/BMC/gsd-hpcs/rosinski/fv3gfs.v0beta/release/v0/exp
>>>
>>> More info tomorrow...
>>>
>>> Regards,
>>> Jim Rosinski
>>>
>>>
>>> On Wed, May 10, 2017 at 2:55 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>>>
>>>
>>> Dear all,
>>>
>>> I noticed that some directory names in the readme.txt were not
>>> correct. I just updated the tag version; please let me know if you have any
>>> further questions. Thanks.
>>>
>>> Jun
>>>
>>> On Wed, May 10, 2017 at 4:39 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>>>
>>>
>>> Rusty,
>>>
>>> Thanks for the quick feedback. Today we had a VLAB meeting on how to
>>> provide information for the public release. Vijay mentioned that EMC
>>> will be setting up an FV3GFS community web page through VLAB, and some
>>> basic documentation will be provided there. Formal instructions on how to
>>> get the release code, compile it, and run an experiment will be on that web
>>> page too. For questions and feedback, a forum will be set up where users
>>> can post questions and others can provide answers and feedback. The purpose
>>> is for all developers to see the questions and answers, so it is suggested
>>> not to send questions or feedback to any individual's personal email. (If
>>> people receive questions from users, we suggest that they post the
>>> questions along with their answers to the forum.) The readme.txt is a
>>> temporary solution to get the testing started; it may be changed in the
>>> final release.
>>>
>>> Kate Howard (kate.howard at noaa.gov) is working on the VLAB fv3gfs
>>> web page. She can add the GFDL fv3gfs support email to the web page too. If
>>> you have any fv3 documentation for general developers, please send it to
>>> her.
>>>
>>> Thanks.
>>>
>>>
>>> Jun
>>>
>>> On Wed, May 10, 2017 at 4:02 PM, Rusty Benson - NOAA Federal <rusty.benson at noaa.gov> wrote:
>>>
>>>
>>> Hi Jun and Vijay,
>>>
>>> In the readme.txt Q&A, you mention where to get help.  Has there
>>> been any thought to putting together a single email for tracking all
>>> questions/requests that can be used as a basis for creating a knowledgebase
>>> via a wiki or other forum?  By segmenting FV3 and physics support, I think
>>> we are missing an opportunity for personnel to get exposure to and learn
>>> about system pieces for which they may not necessarily be responsible.
>>>
>>> If we do go the route of a single support email, we have an
>>> existing email for FV3 support which could be used as an alias member.
>>> Otherwise, we would want to publish the email inside of the readme.txt and
>>> not have individual team members being contacted directly:
>>> <oar.gfdl.fvgfs_support at noaa.gov>
>>>
>>>
>>>
>>> Rusty
>>> --
>>> Rusty Benson, PhD
>>> Modeling Systems Group
>>> NOAA Geophysical Fluid Dynamics Lab
>>> Princeton, NJ
>>>
>>> On Wed, May 10, 2017 at 2:08 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>>>
>>>
>>> Dear all,
>>>
>>> The following email is for people who are willing to do beta
>>> testing for the fv3gfs May 15 release. Please ignore the email if you are
>>> not going to run the test.
>>>
>>> The svn tag for beta testing is located at:
>>>
>>> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta
>>>
>>> The instruction file on how to get and compile the code and to
>>> run an experiment is at:
>>> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta/release/v0/readme.txt
>>>
>>> Please follow the instructions to see if you can run an
>>> experiment.
>>>
>>>  Thanks.
>>>
>>>
>>> Jun
>>>
>>>
>>>
>>> _______________________________________________
>>> Ncep.list.fv3-announce mailing list
>>> Ncep.list.fv3-announce at lstsrv.ncep.noaa.gov
>>> https://www.lstsrv.ncep.noaa.gov/mailman/listinfo/ncep.list.fv3-announce
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Gerard Ketefian
>>> Research Scientist
>>> NOAA/OAR/ESRL/GSD/EMB, R/GSD1
>>> 325 Broadway
>>> Boulder, CO 80305
>>> phone: 303-497-6209
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> EUGENE MIRVIS, Tech Lead,
>>> Senior Computational Scientist, IMSG @
>>>  Global Climate & Weather Modeling Branch of
>>>   NOAA/NCEP Environmental Modeling Center
>>>             NCWCP,  rm  2183
>>>      5830 University Research Ct.
>>>         College Park, MD 20740
>>>             Ph. 301.683.3809
>>>
>>>
>>>
>>>
>>
>>
>>
>