[Ncep.hmon] Running Complex Jobs Under Slurm

Samuel Trahan - NOAA Affiliate samuel.trahan at noaa.gov
Thu Apr 18 14:58:39 UTC 2019


Hi all,

Until a few months ago, the pyprodutil package supported the --multi-prog
option that the admins are now asking us to use (look inside their two Perl
scripts).  We deliberately removed --multi-prog support from pyprodutil and
replaced it with pack groups (Slurm's heterogeneous job support) because the
admins asked us to.  Now that the admins have found that pack groups don't
work, they're asking us to use --multi-prog again.  It may be easiest to
resurrect the old Slurm support code.
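
For anyone who has not seen the format: an srun --multi-prog configuration
file maps MPI rank ranges to commands, one mapping per line.  Below is a
minimal sketch of generating such a file in Python.  It is not the old
pyprodutil code; the function name, the file name multi.conf, and the
executable names are all hypothetical and only illustrate the idea.

```python
def build_multi_prog_conf(programs):
    """Build the text of an srun --multi-prog configuration file.

    programs: list of (command, ntasks) pairs, in rank order.
    Returns (config_text, total_tasks).
    """
    lines = []
    first = 0  # first MPI rank assigned to the current command
    for command, ntasks in programs:
        if ntasks < 1:
            raise ValueError(f"ntasks must be positive for {command!r}")
        last = first + ntasks - 1
        # A single rank is written as "N", a range as "N-M".
        ranks = str(first) if ntasks == 1 else f"{first}-{last}"
        lines.append(f"{ranks} {command}")
        first = last + 1
    return "\n".join(lines) + "\n", first


# Hypothetical three-executable coupled job.
conf, total = build_multi_prog_conf([
    ("./ocean.exe", 4),
    ("./atmos.exe", 8),
    ("./coupler.exe", 1),
])
print(conf, end="")
# 0-3 ./ocean.exe
# 4-11 ./atmos.exe
# 12 ./coupler.exe
```

If that text were saved as multi.conf, the job step would be launched with
something like "srun --ntasks=13 --multi-prog multi.conf", where the task
count matches the total the function returns.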

Sincerely,
Sam Trahan

On Wed, 17 Apr 2019 at 19:56, Avichal Mehra - NOAA Federal <
avichal.mehra at noaa.gov> wrote:

> FYI.
>
> ---------- Forwarded message ---------
> From: Leslie Hart - NOAA Federal <leslie.b.hart at noaa.gov>
> Date: Wed, Apr 17, 2019 at 7:23 PM
> Subject: Running Complex Jobs Under Slurm
> To: _OAR RDHPCS theia-notify <rdhpcs.theia.notify at noaa.gov>
>
>
> Dear Users,
>
> We consider a complex job to be any job that requires more than one
> executable within a single MPI execution, a job that requires a varying
> number of tasks per node, or a combination of both. During the Quick Start
> Training, we suggested using Heterogeneous Job submission to create a
> complex job request. After working with this method for a while, we believe
> this is the wrong approach. We now have another recommended method for
> working with complex jobs under Slurm.
>
> We have created two scripts in the /contrib area of Jet and Theia. (We
> believe these methods will also function well on Gaea, but have only done
> limited testing there.) One is called arbitrary.pl, which allows for
> varying numbers of tasks per node. The other is layout.pl, which allows
> for multiple executables within a single MPI execution. To access them,
> run "module load contrib sutils".
>
> We have updated the April training slides to have examples of various
> situations that a user may encounter. The updated material starts around
> slide 34. The new slide deck is available at
> https://docs.google.com/presentation/d/1OhGP1j7Irx61iqDq0jagTWCMRXIGgnUiwNrGfk9NMmM/edit?usp=sharing.
> (These slides are still under development but are pretty close to final.)
>
> We will have a short (approximately 30 minute) training session on
> Wednesday, April 24th at 11 AM EDT that covers only these updates.
> Details regarding location and webinar information will follow in the next
> few days.  In early to mid-May we will repeat the entire Quick Start
> training session.
>
> Thanks,
> Leslie Hart & Raghu Reddy (and many others)
>
>
>
> --
>     Dr. Avichal Mehra                               Avichal.Mehra at noaa.gov
>
>     Lead Physical Scientist                      NOAA/NWS/NCEP/EMC
>     5830 University Research Court           Room 2104
>     College Park                                      Ph.   301-683-3746
>     MD 20740                                          Fax: 301-683-3703
>