[Ncep.list.nems.announce] Update on NEMSCompsetRun SLURM Support
Samuel Trahan - NOAA Affiliate
samuel.trahan at noaa.gov
Thu Feb 7 16:17:11 UTC 2019
As you know, all NOAA R&D machines are moving away from Moab/Torque/ALPS to
SLURM. The changes to NEMS, NEMSfv3gfs, NCEPLIBS-pyprodutil, and Rocoto
for SLURM on Jet and Theia are all working. The changes still support
non-SLURM Jet and Theia. However, Theia's SLURM is misconfigured; if you
ask for complex task geometry, then it thinks Theia has 12 cores per node
instead of 24. I'm waiting to hear back from the admins on that. Once
that bug is fixed, I'm planning on committing the SLURM support to the
On Thu, 20 Dec 2018 at 20:54, Samuel Trahan - NOAA Affiliate
<samuel.trahan at noaa.gov> wrote:
> Hi all,
> In an earlier email, I mentioned that I was going to commit a change to
> allow NEMS to use SLURM. I will delay this commit until further notice.
> The system SLURM configuration keeps changing, so it is not possible to
> keep a stable NEMSCompsetRun implementation for it. I'm going to keep
> updating the "slurm" branch as the SLURM systems update, but I'll wait to
> commit to NEMS "master" until the SLURM systems are stable.
> I should mention that the SLURM systems are being reconfigured out of
> necessity, to fix problems and add missing features. Decisions are made in
> collaboration with technical contacts in various parts of NOAA. These are
> critical changes that cannot be skipped or delayed, and you would sorely
> regret not having them. For example, in the past few weeks, admins
> achieved fully functional task affinity, correct resource accounting, and
> support for complex task geometries. Getting those features to work
> together properly required using some of the more advanced aspects of
> SLURM, plus some bug reports to the developers of SLURM.
> Sam Trahan