[Ncep.list.nws-ncep-management] UPDATE #7 - IDP Boulder declared a viable backup (Re: Problems at Boulder - IDP applications failed over to College Park)

NOAA SDM sdm at noaa.gov
Sat Aug 12 21:16:03 UTC 2017


7th update: Boulder MADIS database systems continue to experience a
lingering technical issue whose only impact would be a minor one: degraded
website displays that use MADIS data.  NCO technical support will revisit
this issue with database administrator support on Monday, during business
hours.

Otherwise, NCO technical support have restored all remaining Boulder
systems to healthy functioning, to the point where the IDP Boulder
facility can be declared a viable backup.  This will be the final update
for IDP Boulder.


On 12 August 2017 at 19:28, NOAA SDM <sdm at noaa.gov> wrote:

> 6th update: NCO tech support have brought nearly all Boulder systems
> back on-line.  They are working with database administrator personnel to
> resolve a lingering technical issue with Boulder MADIS database systems.
>
> On 12 August 2017 at 17:26, NOAA SDM <sdm at noaa.gov> wrote:
>
>> 5th update: NCO tech support has made further progress in bringing Boulder
>> systems back on-line, including dataflow systems that are now reported as
>> "healthy."  Otherwise, recovery efforts continue towards declaring Boulder
>> a viable backup.
>>
>> On 12 August 2017 at 16:14, NOAA SDM <sdm at noaa.gov> wrote:
>>
>>> 4th update: NCO tech support is continuing recovery efforts in Boulder,
>>> which include applications checking.  In addition, database support
>>> personnel have joined to assess the health of database systems in Boulder.
>>>
>>> On 12 August 2017 at 15:08, NOAA SDM <sdm at noaa.gov> wrote:
>>>
>>>> 3rd update: NCO tech support has begun performing applications
>>>> checking, to determine whether Boulder is healthy enough to be declared a
>>>> viable backup.
>>>>
>>>> On 12 August 2017 at 14:09, NOAA SDM <sdm at noaa.gov> wrote:
>>>>
>>>>> 2nd update of the day: NCO tech support has brought Boulder IDP IRIS
>>>>> and MADIS databases on-line, but in single user only mode.  They are
>>>>> working to resolve the single user mode issue.
>>>>>
>>>>> On 12 August 2017 at 13:09, NOAA SDM <sdm at noaa.gov> wrote:
>>>>>
>>>>>> 1st update of the day: NCO tech support is currently on track to have
>>>>>> enough Boulder systems restored for Boulder to be declared a viable
>>>>>> backup.
>>>>>>
>>>>>> On 12 August 2017 at 04:06, NOAA SDM <sdm at noaa.gov> wrote:
>>>>>>
>>>>>>> UPDATE..
>>>>>>>
>>>>>>> NSB's Mark Reeser provided the following update..
>>>>>>>
>>>>>>> Recovered from Boulder outage at approximately 6:30 pm MDT (0030Z).
>>>>>>> Root cause is still unknown.  We will be collecting data and sending
>>>>>>> to Cisco for root cause analysis.
>>>>>>> Initial hypothesis is that the (relatively old) code that we're
>>>>>>> running on the Data Center Core switches (BLDRCR1 and BLDRCR2) was
>>>>>>> impacted by a code bug, which was injected into the system earlier in the
>>>>>>> day during an approved accelerated change.
>>>>>>> Once the change was reverted, the system normalized...but Cisco still
>>>>>>> cannot connect the dots on how our change, which was done correctly,
>>>>>>> would have had this impact. More to come as we continue with the root cause
>>>>>>> analysis.
>>>>>>>
>>>>>>> Boulder IDP apps will remain off line until deemed stable..
>>>>>>> Hopefully all apps will be brought back tomorrow..
>>>>>>>
>>>>>>> Another Impact that was unforeseen..
>>>>>>>
>>>>>>> 00Z NAM ran without upper air data for the 00Z 08/12/17 cycle due to
>>>>>>> the network outage having an impact on WCOSS (TIDE Primary) ingest...  00Z
>>>>>>> GFS had near normal data counts.   Also the 00Z LAMP job had to be scrubbed
>>>>>>> due to a missing MRMS file that it needed, which was also attributed to
>>>>>>> the Boulder network outage..
>>>>>>> SDM
>>>>>>>
>>>>>>> Randy
>>>>>>>
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: NOAA SDM <sdm at noaa.gov>
>>>>>>> Date: 11 August 2017 at 23:42
>>>>>>> Subject: Fwd: Problems at Boulder - IDP applications failed over to
>>>>>>> College Park
>>>>>>> To: "_NCEP.List.NWS-NCEP-management" <ncep.list.nws-ncep-management@noaa.gov>
>>>>>>>
>>>>>>>
>>>>>>> Problem at Boulder continues.
>>>>>>>
>>>>>>> *Boulder currently is not a viable backup.*  All IDP applications
>>>>>>> have been moved to the College Park Facility with the exception of GIS,
>>>>>>> which is pending Akamai processing that should be complete by 2300Z.
>>>>>>> Unfortunately, GIS at the College Park Facility is older; this will result
>>>>>>> in the loss of NWS_Forecasts_Guidance_Warnings / National Digital
>>>>>>> Guidance Database image services.
>>>>>>>
>>>>>>>
>>>>>>> The Following Impacts were noted by IDP and our DataFlow Groups
>>>>>>>
>>>>>>>
>>>>>>> TGFTP
>>>>>>>
>>>>>>>    - Some data stopped updating on TGFTP at 17:57z, customers moved
>>>>>>>      to CP at 18:58z
>>>>>>>
>>>>>>>
>>>>>>> NOMADS
>>>>>>>
>>>>>>>    - Outage started at 18:12z, customers moved to CP 18:54z
>>>>>>>
>>>>>>>
>>>>>>> IDP data to the TG
>>>>>>>
>>>>>>>    - MADIS data stopped updating to the SBN between 18:11z and 20:22z
>>>>>>>      (2hr 11min)
>>>>>>>
>>>>>>>
>>>>>>> RADAR2
>>>>>>>
>>>>>>>    - no impact, customers moved before radar2 systems in BD impacted
>>>>>>>
>>>>>>>
>>>>>>> RADAR3
>>>>>>>
>>>>>>>    - 22min outage in BD between 18:44z and 19:06z, users failed
>>>>>>>      over to CP starting at 19:16z
>>>>>>>
>>>>>>>
>>>>>>> Data pushes from local centers to Boulder for IDP apps there
>>>>>>>
>>>>>>>    - 1hr 20min stoppage in data flow, data backfilled when flow
>>>>>>>      restored (stopped 18:12z, started up again at 19:32z)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *MADIS* Customers down for two hours (between 1840Z and 2040Z)
>>>>>>>
>>>>>>> *NLETS* Customers down for over an hour and a half (between 1810Z
>>>>>>> and 1955Z)
>>>>>>>
>>>>>>> *IRIS/iNWS* down about 40 minutes (between 1818Z and 1900Z)
>>>>>>>
>>>>>>> *EDIS/FTPMail* down 40 minutes (between 1820Z and 1900Z)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> SDM
>>>>>>> Randy
>>>>>>>
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: NOAA SDM <sdm at noaa.gov>
>>>>>>> Date: 11 August 2017 at 20:40
>>>>>>> Subject: Fwd: Problems at Boulder - IDP applications failed over to
>>>>>>> College Park
>>>>>>> To: "_NCEP.List.NWS-NCEP-management" <ncep.list.nws-ncep-management@noaa.gov>
>>>>>>>
>>>>>>>
>>>>>>> Problems at Boulder continue.. NCO will be failing over the
>>>>>>> remainder of the applications (NLETS/EMWIN, MADIS and GIS) to College Park
>>>>>>> while support investigates the issue at Boulder..
>>>>>>>
>>>>>>> SDM
>>>>>>>
>>>>>>> Randy
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: NOAA SDM <sdm at noaa.gov>
>>>>>>> Date: 11 August 2017 at 19:03
>>>>>>> Subject: Problems at Boulder - IDP applications failed over to
>>>>>>> College Park
>>>>>>> To: "_NCEP.List.NWS-NCEP-management" <ncep.list.nws-ncep-management@noaa.gov>
>>>>>>>
>>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> Shortly after 1800Z (2 pm ET), various IDP applications hosted in
>>>>>>> Boulder became non-functional.  NCO technical support are failing
>>>>>>> applications over to be hosted in College Park.  Stay tuned for more
>>>>>>> details as we receive new information.
>>>>>>>
>>>>>>> --
>>>>>>> Senior Duty Meteorologist
>>>>>>> NOAA/NWS/NCEP/NCO/OMB
>>>>>>>



-- 
Senior Duty Meteorologist
NOAA/NWS/NCEP/NCO/OMB