[Ncep.nhc.nco_contacts] TSB Morning Rounds - Tue Sep 20, 2016

TSB Admin nhc.tsbadmin at noaa.gov
Tue Sep 20 14:54:45 UTC 2016


Here is today's summary of NHC computer operations:
--- Craig Mattocks

----------

TAFB:
1. The exhaustion of user processes on nhc-lx-comp01 (see NHC-wide items  
below) played a role in the following problems:
a. NWPS output files for the Atlantic and Pacific had not been available in
N-AWIPS since yesterday. NWPS data is now flowing into the /model2/nwps
mount point on the NetApp again.
b. GFE/NCP issues: when attempting to send grids to NMAP overnight from
both Atlantic and East Pacific workstations, every attempt to send to
comp01 failed with an error. Sends to comp03 succeeded in each case, and
grids were available for use in NMAP.

2. TAFB Surface Analyst/Forecaster Marshall Huffman noticed that the  
Atlantic Marine Wx discussion (MIMATS):
http://www.nhc.noaa.gov/text/MIAMIMATS.shtml
was being truncated when posted on NHC's web site, with the bottom portion  
of the product and warnings cut off. TSB Web Developer Dave Zelinsky  
investigated the issue - it is working fine now. There might have been a  
fatal typo (non-standard invisible character) introduced in the partially  
hand-typed product which caused dissemination to fail.
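
A minimal sketch of how a hand-edited text product could be scanned for
invisible or non-standard characters before dissemination. This is only an
illustration of the suspected failure mode described above: the function name
find_suspect_chars and the sample text are hypothetical, and the actual cause
of the truncation was not confirmed.

```python
import unicodedata


def find_suspect_chars(text):
    """Return (line, column, name) for every character that is not printable ASCII.

    Line and column numbers are 1-based. Newlines are treated as line
    separators and are not reported.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch.isascii() and ch.isprintable():
                continue
            # Fall back to the code point when Unicode has no name (e.g. control chars).
            name = unicodedata.name(ch, f"U+{ord(ch):04X}")
            hits.append((line_no, col_no, name))
    return hits


if __name__ == "__main__":
    # Hypothetical product text containing a no-break space and a zero-width space.
    sample = "SEAS 8 FT\u00a0OR GREATER\nWINDS 20 KT\u200b"
    for line_no, col_no, name in find_suspect_chars(sample):
        print(f"line {line_no}, column {col_no}: {name}")
```

Running a check like this on the product text before posting would flag the
line and column of any character that a forecaster cannot see in an editor.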

HSU:
1. The exhaustion of user processes on nhc-lx-comp01 (see NHC-wide items  
below) played a role in the following problems:
a. ATCF issues: some of the microwave data were not updating in a timely
fashion for use in fixing the Best Track in real time.
b. N-AWIPS/AWIPS2 issues: the UKMET tracker did not update in N-AWIPS.  
Called SDM, and they were able to visualize the model fields in N-AWIPS  
there. Must be a data flow issue on our end?
c. Other technical issues: the FNMOC page has been updating erratically.
Sometimes, for example, TD 13 is on the page; we check again and it is
gone. Very unreliable. ASCAT data have also been erratic. A large swath
of data was missed earlier yesterday evening (e.g., over Karl). Luckily, they
were available at the FNMOC site. When the data did arrive, it was often
after a long delay. No HCCA available. :( We tried to retrieve it multiple
times for each active cyclone but to no avail.

2. The ATCF GUI crashed and "core dumped" at 2000Z yesterday when the
Hurricane Specialists attempted to renumber Invest 96L to TD-13/AL13,
possibly due to the simultaneous ingest of satellite images from NRL. TSB
Developer Monica Bozeman cleaned out the corrupt AL13 entries from the ATCF
decks, allowing the renumbering of Invest 96L to AL13 to proceed normally.

Other items:
1. A number of operational jobs failed on NHC's primary compute server
(nhc-lx-comp01) due to a runaway application (AMSU wind radii data
processing) combined with a low default limit on the number of user
processes (threads), a limit that exists to prevent accidental "fork
bombs". TSB Developer Dave Zelinsky switched to NHC's backup server
(nhc-lx-comp03) while NCO SysAdmin Curt Steinmetz raised the user process
limit from 1024 to 4096. This enabled Dave to log in to the naprod account
and stop the runaway application.

More information on user process limits:
https://access.redhat.com/solutions/30316
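
As an illustration of the limit discussed above, the following sketch reads
the per-user process limit (RLIMIT_NPROC) with Python's standard resource
module on Linux and estimates how much headroom remains. The helper names
describe_limit and headroom are hypothetical, and the in-use count of 1000 is
an assumed example value, not a measurement from nhc-lx-comp01.

```python
import resource


def describe_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def headroom(in_use, soft_limit):
    """Processes/threads still available under the soft limit.

    Returns None when the limit is unlimited, and 0 when the limit is
    already reached or exceeded.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return max(soft_limit - in_use, 0)


if __name__ == "__main__":
    # Soft and hard limits for the account running this script.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print("soft limit:", describe_limit(soft))
    print("hard limit:", describe_limit(hard))

    # Assumed example: 1000 processes/threads in use under the old and new limits.
    print("headroom at 1024:", headroom(1000, 1024))
    print("headroom at 4096:", headroom(1000, 4096))
```

Once the headroom reaches zero, new processes and threads for that user fail
to start, which is consistent with the job failures and login problem
reported above.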

NHC RFCs:

