<div dir="ltr">Phil,<div><br></div><div>There's a good chance the model went unstable and the actual crash occurred in a non-master OpenMP thread. The remaining active MPI processes would then have continued until the first synchronization point, the implicit barrier associated with MPI communication (here, a halo update), and waited there for the process that crashed.</div><div><br></div><div>The Intel compiler is notorious for poor signal handling of errors in non-master threads, and frequently there is no traceback from the thread that actually encountered the problem.<br></div><div><br></div><div>You can contact me directly with more details of the case you are trying to run.</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div style="font-size:12.8px">Rusty</div><div style="font-size:12.8px">--</div><div style="font-size:12.8px">Rusty Benson, PhD</div><div style="font-size:12.8px">Modeling Systems Group</div><div style="font-size:12.8px">NOAA Geophysical Fluid Dynamics Lab</div><div style="font-size:12.8px">Princeton, NJ</div></div></div></div></div></div></div></div>
<br><div class="gmail_quote">On Wed, May 17, 2017 at 6:26 PM, Philip Pegion - NOAA Affiliate <span dir="ltr">&lt;<a href="mailto:philip.pegion@noaa.gov" target="_blank">philip.pegion@noaa.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi,<div> I am trying to run the trunk of the NEMS version of FV3GFS on Theia. If the model blows up, I do not get any output; the model just hangs in a halo update. Has anyone else seen this behavior? Below is the traceback after killing the job.</div><div><br></div><div>Thanks,</div><div><br></div><div>Phil</div><div><br></div><div><br></div><div><div>[70] forrtl: error (78): process killed (SIGTERM)</div><div>[70] Image PC Routine Line Source</div><div>[70] libirc.so 00007FDF60288961 Unknown Unknown Unknown</div><div>[70] libirc.so 00007FDF602870B7 Unknown Unknown Unknown</div><div>[70] fv3_1.exe 00000000013BE5B4 Unknown Unknown Unknown</div><div>[70] fv3_1.exe 00000000013BE3C6 Unknown Unknown Unknown</div><div>[70] fv3_1.exe 000000000133DC24 Unknown Unknown Unknown</div><div>[70] fv3_1.exe 000000000134313B Unknown Unknown Unknown</div><div>[70] libpthread.so.0 00007FDF6279A7E0 Unknown Unknown Unknown</div><div>[70] libmpi.so.12 00007FDF5EFDFD2A Unknown Unknown Unknown</div><div>[70] libmpi.so.12 00007FDF5EFDA9CA Unknown Unknown Unknown</div><div>[70] libmpi.so.12 00007FDF5F278A61 Unknown Unknown Unknown</div><div>[70] libmpi.so.12 00007FDF5F2786B7 Unknown Unknown Unknown</div><div>[70] libmpifort.so.12 00007FDF5F6D8F67 Unknown Unknown Unknown</div><div>[70] fv3_1.exe 0000000000F67229 mpp_mod_mp_mpp_sy 223 mpp_util_mpi.inc</div><div>[70] fv3_1.exe 0000000001061810 mpp_domains_mod_m 245 mpp_do_update.h</div><div>[70] fv3_1.exe 0000000001068520 mpp_domains_mod_m 147 mpp_update_domains2D.h</div><div>[70] fv3_1.exe 00000000005EDE6D fv_tracer2d_mod_m 264 fv_tracer2d.F90</div><div>[70] fv3_1.exe 00000000004E1D8F fv_dynamics_mod_m 475 fv_dynamics.F90</div><div>[70] fv3_1.exe 00000000004AE69E atmosphere_mod_mp 374 atmosphere.F90</div><div>[70] fv3_1.exe 00000000004A9AE8 atmos_model_mod_m 451 atmos_model.F90</div><div>[70] fv3_1.exe 00000000004A496D fv3gfs_cap_mod_mp 628 fv3_cap.F90</div></div></div>
<br></blockquote></div><br></div>