[Ncep.list.fv3-announce] fv3gfs release beta test

Rusty Benson - NOAA Federal rusty.benson at noaa.gov
Fri May 12 16:16:49 UTC 2017


Eugene,

Responding only to point 4, what options do you see that are Cray
compilation flags?

Rusty
--
Rusty Benson, PhD
Modeling Systems Group
NOAA Geophysical Fluid Dynamics Lab
Princeton, NJ

On Fri, May 12, 2017 at 11:58 AM, Eugene Mirvis <eugene.mirvis at noaa.gov>
wrote:

> Gerard,
>
> Just several points to clarify.
>
> 1. The NEMS practice of calling "modulefiles" what are actually scripts
> (scripts that call module commands and mostly need a bash environment so
> they can be sourced and keep the environment) has always been a
> nonstandard use.
>
> Therefore, if the developers take Sam's advice and turn the scripts into
> real modulefiles (starting with #%Module), "export" and other shell
> constructs will no longer make sense, while the Module utility commands
> and Tcl/Tk will work.
>
> 2. There is another dilemma: keeping the needed environment while the
> environment changes within a workflow.
> By the way, the unsetenv, append-path, prepend-path and remove-path module
> commands are very useful for controlling that, but you have to keep
> $LOADEDMODULES and $MODULEPATH in a consistent order.
>
> 3. Speaking of which,
> module purge
> and
> module switch
> are very useful for unloading application-driven modules.
> a/ You just have to do that not at any moment, but before the app's module
> is loaded; afterwards, unload <appsModulefile> (unload, not purge).
> b/ On Crays, there are some internal dependencies inside PrgEnv, so you
> have to keep all "module use <knowns>" to recover; otherwise you might
> get "module not found".
>
> 4.
> Compiling on Theia, I'm just wondering why Cray compilation flags are
> used... all over.
> See for instance:
> ...
> mpiifort -I/apps/netcdf/4.3.0-intel/include -fno-alias -auto
> -safe-cray-ptr -ftz -assume byterecl -nowarn -sox -align array64byte -i4
> -real-size 64 -no-prec-div -no-prec-sqrt -xCORE-AVX2
> -qno-opt-dynamic-align -O2 -debug minimal -fp-model source
> -qoverride-limits -qopt-prefetch=3 -qopenmp
> -I/apps/esmf/7.0.0/intel/intelmpi/mod/modO/Linux.intel.64.intelmpi.default
> -I/apps/esmf/7.0.0/intel/intelmpi/include -I/apps/netcdf/4.3.0-intel/include
> -IENS_Cpl -I. -I/scratch4/NCEPDEV/global/noscrub/Eugene.Mirvis/fv3gfs.v0beta/FV3/nems_dir
> -c module_MEDIATOR_methods.f90
> ...
>
> Thanks,
> -Eugene
> On 5/12/2017 3:27 AM, Samuel Trahan - NOAA Affiliate wrote:
>
> Gerard,
>
> The "module load module.fre-nctools" is failing because that file contains
> commands that are not valid in a modulefile.  You can have as many "module"
> commands as you want in a modulefile, but you cannot have bash code.  Also,
> it needs to start with #%Module
>
> export A=B # bad
> setenv A B
> prepend-path A (something)
>
> No "source /etc/profile"
> No "source /.../init/bash"
> No "module purge" because that causes infinite loops on some platforms.
> (The module is trying to unload itself.)
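
The first requirement Sam lists (a real modulefile starts with #%Module) can be checked before a file is handed to "module load". A minimal sketch in POSIX sh; the helper name is_modulefile and the temporary sample files are hypothetical and not part of NEMS:

```shell
#!/bin/sh
# is_modulefile FILE: succeed if FILE's first line begins with the
# "#%Module" magic cookie that marks a Tcl modulefile.
# (is_modulefile is a hypothetical helper name.)
is_modulefile() {
    [ -r "$1" ] || return 1
    first_line=''
    # read returns non-zero on an empty file or a missing final newline;
    # ignore that and inspect whatever was read.
    IFS= read -r first_line < "$1" || :
    case "$first_line" in
        '#%Module'*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Demonstration with two throwaway files.
tmpdir=$(mktemp -d)
printf '#%%Module1.0\nsetenv A B\n' > "$tmpdir/good"
printf 'export A=B\n' > "$tmpdir/bad"

if is_modulefile "$tmpdir/good"; then echo "good: looks like a modulefile"; fi
if ! is_modulefile "$tmpdir/bad"; then echo "bad: shell script, not a modulefile"; fi

rm -rf "$tmpdir"
```

A wrapper script could use such a check to decide between "module load" and sourcing the file, which is the workaround Gerard describes further down the thread.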
>
> Obtaining the "module" command and purging modules has to be done before
> loading the module.  We have a pair of scripts (one csh, one sh) that do
> that on all NOAA machines.  They are aware of csh, tcsh, ksh, bash, and sh;
> and run the correct initialization for each.  The master copies are in
> NEMS/src/conf/module-setup.*sh.inc (csh and sh versions).  Those scripts
> work on all NOAA machines and have a test suite inside NEMS/tests.
>
> man 4 modulefile
> http://modules.sourceforge.net/c/modulefile.html
>
> Sincerely,
> Sam Trahan
>
> On Thu, May 11, 2017 at 5:01 PM, Gerard Ketefian - NOAA Affiliate <gerard.ketefian at noaa.gov> wrote:
>
>
> Hi Jun,
>
> Yes, after the missing "cp" fix, I (and Jim too) still had a problem with
> this line in runjob_theia.sh:
>
> module load module.fre-nctools
>
> This syntax (where you specify a file name after "module load") is not
> supported in our environment.  So what we did was first change the above
> line to:
>
> . ./module.fre-nctools
>
> to just source the module.fre-nctools file.  Also, we changed the contents
> of the file ${BASE_DIR}/fv3gfs.v0beta/release/v0/modulefiles/fv3gfs/fre-nctools.theia
> to:
>
>    module load impi/5.1.2.150
>    module load netcdf/4.3.0
>    module load hdf5/1.8.14
>    export HDF5_DIR=$HDF5
>    export NETCDF_DIR=$NETCDF
>    export LIBRARY_PATH=${LIBRARY_PATH}:${NETCDF}/lib:${HDF5}/lib
>
> (Btw, should LIBRARY_PATH be changed to LD_LIBRARY_PATH in the last line?)
> With this, the remapping worked without any error messages.  When I
> compared the output with the baseline run, all the files were the same
> EXCEPT the three 1deg files from the remapping.  I don't know why those
> were different.  They weren't exactly the same but had almost the same
> sizes.
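
LIBRARY_PATH is normally consulted by the compiler driver at link time, whereas LD_LIBRARY_PATH is read by the dynamic loader at run time, so the two are not interchangeable. Separately, the export line as written leaves a leading ":" (an empty entry) when LIBRARY_PATH starts out empty. A minimal sketch of a safer append, assuming POSIX sh; the helper name append_path and the /opt/example paths are hypothetical:

```shell
#!/bin/sh
# append_path VAR DIR: append DIR to the colon-separated variable VAR,
# skipping duplicates and avoiding a leading ":" when VAR is empty or unset.
# (append_path is a hypothetical helper name.)
append_path() {
    eval "current=\${$1-}"
    case ":$current:" in
        *":$2:"*) return 0 ;;          # DIR is already present
    esac
    if [ -n "$current" ]; then
        eval "$1=\$current:\$2"
    else
        eval "$1=\$2"
    fi
    export "$1"
}

# Hypothetical install prefixes standing in for $NETCDF and $HDF5.
NETCDF=/opt/example/netcdf
HDF5=/opt/example/hdf5

unset LIBRARY_PATH
append_path LIBRARY_PATH "$NETCDF/lib"
append_path LIBRARY_PATH "$HDF5/lib"
append_path LIBRARY_PATH "$NETCDF/lib"   # duplicate, ignored
echo "$LIBRARY_PATH"
```

The same helper can be applied to LD_LIBRARY_PATH if the libraries are also needed at run time.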
>
> It would be nice to be able to use the syntax
>
> module load module.fre-nctools
>
> in our environment as well.  When I do a "module help", I don't see this
> syntax as an option (where you're specifying a file name that contains
> several commands).  Is there a way to get this to work on theia?
>
> Thanks,
> Gerard
>
>
> On Thu, May 11, 2017 at 7:00 AM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>
>
> Gerard,
>
> With the fix last night, do you still have a problem loading the module
> module.fre-nctools?
>
> Thanks for pointing out the forecast time specified in runjob_theia.sh, I
> changed it to be consistent with the baseline.
>
> Another fix for the forecast executable, from Jim Abeles, is also
> committed to the tag.
>
> I am planning to commit all the changes made to the temporary beta tag to
> trunk later today so that Sam can add the jet extension. Please send me
> any fixes or suggestions. Thanks to all who are doing testing.
>
>
> Jun
>
> On Thu, May 11, 2017 at 5:03 AM, Gerard Ketefian - NOAA Affiliate <gerard.ketefian at noaa.gov> wrote:
>
>
> Hi all,
>
> With Ligia's hints and Jun's last fix, I was able to complete the run
> but not the remap.  I think the remap fails because some modules don't get
> loaded properly.
>
> To get the remap to also work, I replaced the following line in
> runjob_theia.sh
>
> module load module.fre-nctools
>
> with the following block (copied and modified from the file
> module.fre-nctools):
>
> module load impi/5.1.2.150
> module load netcdf/4.3.0
> module load hdf5/1.8.14
> export HDF5_DIR=$HDF5
> export NETCDF_DIR=$NETCDF
> export LIBRARY_PATH=${LIBRARY_PATH}:${NETCDF}/lib:${HDF5}/lib
>
> This change should allow the 1deg remapped netcdf files to be generated.
>
> When I do the comparison of the sample run's netcdf files with baseline,
> there is about a factor of 5 difference (the sample run files being
> larger).  This is because there are only 8 output times in the baseline
> files but 40 in the run output.
>
> Gerard
>
>
> On Wed, May 10, 2017 at 9:56 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>
>
> Ligia,
>
> Thanks for the feedback.  The suggestion on the instructions has been put
> in readme.txt. It was found that "cp " is missing in line 124 of
> runjob_theia.sh. I committed the changes to the beta test tag:
> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta
>
> Please check again. The differences in results will need further
> investigation. Thanks.
>
> Jun
>
> On Wed, May 10, 2017 at 10:28 PM, Ligia Bernardet - NOAA Affiliate <ligia.bernardet at noaa.gov> wrote:
>
>
> Folks,
>
> Here is some feedback
>
>
> *About the instructions*
>
>    1. Minor typo. The word "trunk" should be removed: Four executable
>    files will be created under fv3gfs.v0beta/*trunk*/NEMS/exe
>    2. runjob_theia.sh: Non-NCEPDEV people need to change directories
>    DATA and ROTDIR to an area they can write to
>    3. diff_baseline.sh:
>       1. It would be helpful to tell users to add arguments
>       to diff_baseline.sh to set resolution and machine.
>       2. Non-NCEPDEV people need to change directory dir1 to location
>       of their output
>
> *Outcome*
> It seems I was able to get through the forecast but failed in remap.
> The problem seems related to loading modules; I have not fully
> investigated yet. Output is in
> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp
>
> When running diff, NetCDF files differ from the baseline. I noticed
> the file sizes are different (mine are larger than the baseline).
>
> Ligia
>
> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp/../modulefiles/fv3gfs/fre-nctools.theia module.fre-nctools
>
> /var/spool/torque/mom_priv/jobs/23566207.bqs3.SC: line 126:
> /scratch4/BMC/gmtb/Ligia.Bernardet/fv3gfs.v0beta/release/v0/exp/../modulefiles/fv3gfs/fre-nctools.theia: *Permission denied*
>
> + module load module.fre-nctools
>
> ++ /apps/lmod/lmod/libexec/lmod bash load module.fre-nctools
>
> Lmod has detected the following error: The following module(s) are
> unknown:
>
> "module.fre-nctools"
>
> On Wed, May 10, 2017 at 5:47 PM, James Rosinski - NOAA Affiliate <james.rosinski at noaa.gov> wrote:
>
>
> Hi Jun;
>
> I am about to head home for the day, but here are my comments so far,
> after following the instructions for theia:
>
> o The builds of models and remap codes completed successfully. One
> suggestion might be to have the user specify 32 or 64-bit, and nh vs. hydro
> in order to cut down compilation time by a factor of 4.
>
> o The batch job attempting to run the model failed. Relevant lines in
> err_theia are:
>
> + cd /scratch4/NCEPDEV/stmp3/James.Rosinski/C96fv3gfs2016092900
> + /bin/cp -p /scratch3/BMC/gsd-hpcs/rosinski/fv3gfs.v0beta/release/v0/exp/../../../NEMS/exe/fv3_gfs_nh.prod.32bit.x /scratch4/NCEPDEV/stmp3/James.Rosinski/C96fv3gfs2016092900/.
> + -prepend-rank -np 288 ./fv3_gfs_nh.prod.32bit.x
> + ERR=127
> + export ERR
> + err=127
>
> Looks like somehow "mpirun" was not found (note there is nothing in
> front of "-prepend-rank"). FYI I use csh for my login shell--not sure if
> this is behind the error. I had no modules loaded when submitting the job.
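
Exit status 127 is the shell's "command not found" code, which is consistent with the launcher expanding to nothing so that "-prepend-rank" itself was executed as the command. A job script can fail earlier and more clearly by checking for the launcher first. A minimal sketch, assuming POSIX sh; require_cmd and MPI_LAUNCHER are hypothetical names, not taken from runjob_theia.sh:

```shell
#!/bin/sh
# require_cmd NAME: print a clear message and fail if NAME cannot be
# resolved as a command on PATH. (require_cmd is a hypothetical helper.)
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found on PATH; were the MPI modules loaded?" >&2
        return 1
    fi
}

# MPI_LAUNCHER is a hypothetical variable standing in for whatever the job
# script expands in front of "-prepend-rank"; default to mpirun if unset.
MPI_LAUNCHER=${MPI_LAUNCHER:-mpirun}

if require_cmd "$MPI_LAUNCHER"; then
    echo "would run: $MPI_LAUNCHER -prepend-rank -np 288 ./fv3_gfs_nh.prod.32bit.x"
fi
```

With a guard like this, a job submitted with no modules loaded stops with an explicit message instead of trying to execute the first option as a program.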
>
> If  you'd like to examine the output you should have read access to
> it here on theia:
>
> /scratch3/BMC/gsd-hpcs/rosinski/fv3gfs.v0beta/release/v0/exp
>
> More info tomorrow...
>
> Regards,
> Jim Rosinski
>
>
> On Wed, May 10, 2017 at 2:55 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>
>
> Dear all,
>
> I noticed that some directory names in the readme.txt are not
> correct. I just updated the tag version; please let me know if you have any
> further questions. Thanks.
>
> Jun
>
> On Wed, May 10, 2017 at 4:39 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>
>
> Rusty,
>
> Thanks for the quick feedback. Today we had a VLAB meeting on how
> to provide information for the public release. Vijay mentioned that EMC
> will be setting up an FV3GFS community web page through VLAB, and some
> basic documentation will be provided there. Formal instructions on how to
> get the release code, compile it and run an experiment will be on that web
> page too. For questions and feedback, a forum will be set up where users
> can post questions and others can provide answers and feedback. The
> purpose is that all the developers will see the questions and answers, so
> it is suggested not to send questions or feedback to any individual's
> personal email. (If people receive questions from users, we suggest that
> they post the questions along with their answers to the forum.) The
> readme.txt is a temporary solution to get the testing started; it may be
> changed in the final release.
>
> Kate Howard (kate.howard at noaa.gov) is working on the VLAB fv3gfs
> web page. She can add the gfdl fv3gfs support email to the web page too. If
> you have any fv3 documentation for general developers, please send it to her.
>
> Thanks.
>
>
> Jun
>
> On Wed, May 10, 2017 at 4:02 PM, Rusty Benson - NOAA Federal <rusty.benson at noaa.gov> wrote:
>
>
> Hi Jun and Vijay,
>
> In the readme.txt Q&A, you mention where to get help.  Has there
> been any thought to putting together a single email for tracking all
> questions/requests that can be used as a basis for creating a knowledgebase
> via a wiki or other forum?  By segmenting FV3 and physics support, I think
> we are missing an opportunity for personnel to get exposure to and learn
> about system pieces for which they may not necessarily be responsible.
>
> If we do go the route of a single support email, we have an
> existing email for FV3 support which could be used as an alias member.
> Otherwise, we would want to publish the email inside of the readme.txt and
> not have individual team members being contacted directly
> <oar.gfdl.fvgfs_support at noaa.gov>
>
>
>
> Rusty
> --
> Rusty Benson, PhD
> Modeling Systems Group
> NOAA Geophysical Fluid Dynamics Lab
> Princeton, NJ
>
> On Wed, May 10, 2017 at 2:08 PM, Jun Wang - NOAA Affiliate <jun.wang at noaa.gov> wrote:
>
>
> Dear all,
>
> The following email is for people who are willing to do beta
> testing for the fv3gfs May 15 release. Please ignore the email if you are
> not going to run the test.
>
> The svn tag for beta testing is located at:
>
> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta
>
> The instruction file on how to get and compile the code and to
> run an experiment is at:
> https://svnemc.ncep.noaa.gov/projects/nems/apps/NEMSfv3gfs/tags/fv3gfs.v0beta/release/v0/readme.txt
>
> Please follow the instructions to see if you can run an
> experiment.
>
>  Thanks.
>
>
> Jun
>
>
>
> _______________________________________________
> Ncep.list.fv3-announce mailing list
> Ncep.list.fv3-announce at lstsrv.ncep.noaa.gov
> https://www.lstsrv.ncep.noaa.gov/mailman/listinfo/ncep.list.fv3-announce
>
> --
> Gerard Ketefian
> Research Scientist
> NOAA/OAR/ESRL/GSD/EMB, R/GSD1
> 325 Broadway
> Boulder, CO 80305
> phone: 303-497-6209
>
> --
> EUGENE MIRVIS, Tech Lead,
> Senior Computational Scientist, IMSG @
>  Global Climate & Weather Modeling Branch of
>   NOAA/NCEP Environmental Modeling Center
>             NCWCP,  rm  2183
>      5830 University Research Ct.
>         College Park, MD 20740
>             Ph. 301.683.3809
>
>