[Ncep.list.fv3-announce] NEMS-fv3 issues

Rusty Benson - NOAA Federal rusty.benson at noaa.gov
Wed May 17 22:50:12 UTC 2017


Phil,

There's a good chance the system went unstable and the actual crash
occurred in a non-master OpenMP thread.  The remaining active MPI processes
would have continued until the first synchronization point, which is an
implicit barrier associated with MPI communication (a halo update).
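
As a rough illustration of that failure mode (not FV3 or MPI code), here is a
minimal Python sketch that uses threads as stand-ins for MPI ranks and a
threading.Barrier as a stand-in for the halo update. The function name
run_ranks, the rank count, and the timeout are all hypothetical choices made
for this example. A real MPI barrier has no timeout, so the surviving ranks
would simply wait until the job is killed.

```python
import threading


def run_ranks(n_ranks, failing_rank, timeout_s=0.5):
    """Simulate n_ranks workers that compute, then meet at a barrier.

    The worker numbered failing_rank dies during the compute step and never
    reaches the barrier. Returns a dict mapping rank -> outcome string.
    Pass failing_rank=None for a run in which no worker fails.
    """
    barrier = threading.Barrier(n_ranks)
    outcome = {}
    lock = threading.Lock()

    def worker(rank):
        try:
            if rank == failing_rank:
                # Stand-in for the model going unstable on one rank.
                raise FloatingPointError("model went unstable")
            # Stand-in for the halo update: blocks until every rank arrives.
            # The timeout exists only so this demo terminates.
            barrier.wait(timeout=timeout_s)
            result = "passed barrier"
        except FloatingPointError:
            result = "crashed before barrier"
        except threading.BrokenBarrierError:
            result = "stuck at barrier (timed out)"
        with lock:
            outcome[rank] = result

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    for rank, result in sorted(run_ranks(4, failing_rank=2).items()):
        print(rank, result)
```

In this sketch the healthy workers never learn that one of them failed; they
only notice that the barrier never completes. That would be consistent with a
traceback from a killed job pointing at the communication call rather than at
the code that actually went unstable.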

The Intel runtime is notorious for poor signal handling of errors in
non-master threads, and frequently there is no traceback from the thread
that actually encountered the problem.

You can contact me directly with more details of the case you are trying to
run.

Rusty
--
Rusty Benson, PhD
Modeling Systems Group
NOAA Geophysical Fluid Dynamics Lab
Princeton, NJ

On Wed, May 17, 2017 at 6:26 PM, Philip Pegion - NOAA Affiliate <
philip.pegion at noaa.gov> wrote:

> Hi,
>   I am trying to run the trunk of the NEMS version of FV3GFS on Theia, and
> if the model blows up, I do not get any output; the model just hangs while
> doing a halo update.  Has anyone else seen this behavior?  Below is the
> traceback after killing the job.
>
> Thanks,
>
> Phil
>
>
> [70] forrtl: error (78): process killed (SIGTERM)
> [70] Image              PC                Routine            Line        Source
> [70] libirc.so          00007FDF60288961  Unknown               Unknown  Unknown
> [70] libirc.so          00007FDF602870B7  Unknown               Unknown  Unknown
> [70] fv3_1.exe          00000000013BE5B4  Unknown               Unknown  Unknown
> [70] fv3_1.exe          00000000013BE3C6  Unknown               Unknown  Unknown
> [70] fv3_1.exe          000000000133DC24  Unknown               Unknown  Unknown
> [70] fv3_1.exe          000000000134313B  Unknown               Unknown  Unknown
> [70] libpthread.so.0    00007FDF6279A7E0  Unknown               Unknown  Unknown
> [70] libmpi.so.12       00007FDF5EFDFD2A  Unknown               Unknown  Unknown
> [70] libmpi.so.12       00007FDF5EFDA9CA  Unknown               Unknown  Unknown
> [70] libmpi.so.12       00007FDF5F278A61  Unknown               Unknown  Unknown
> [70] libmpi.so.12       00007FDF5F2786B7  Unknown               Unknown  Unknown
> [70] libmpifort.so.12   00007FDF5F6D8F67  Unknown               Unknown  Unknown
> [70] fv3_1.exe          0000000000F67229  mpp_mod_mp_mpp_sy         223  mpp_util_mpi.inc
> [70] fv3_1.exe          0000000001061810  mpp_domains_mod_m         245  mpp_do_update.h
> [70] fv3_1.exe          0000000001068520  mpp_domains_mod_m         147  mpp_update_domains2D.h
> [70] fv3_1.exe          00000000005EDE6D  fv_tracer2d_mod_m         264  fv_tracer2d.F90
> [70] fv3_1.exe          00000000004E1D8F  fv_dynamics_mod_m         475  fv_dynamics.F90
> [70] fv3_1.exe          00000000004AE69E  atmosphere_mod_mp         374  atmosphere.F90
> [70] fv3_1.exe          00000000004A9AE8  atmos_model_mod_m         451  atmos_model.F90
> [70] fv3_1.exe          00000000004A496D  fv3gfs_cap_mod_mp         628  fv3_cap.F90