[Ncep.list.nws-ncep-management] Luna WCOSS storage issue

NOAA SDM sdm at noaa.gov
Fri Jan 27 04:05:14 UTC 2017


Latest Update... From Cray SA Rich Bach


We are continuing to work directly with the DDN support team.



The verifies are still running at this time. We have completed 8 of the 48,
and based on the timing so far we estimate it will be 3 to 5 hours before the
rest complete.



Once they are completed, we will need to run mmfsck on the HPS file system to
resolve any file system inconsistencies.  We will then do file system cleanup
to clear any bad blocks.



I will send the next update at 0100 EST.


SDM

Randy

On 27 January 2017 at 02:09, NOAA SDM <sdm at noaa.gov> wrote:

> IBM and Cray personnel continue to work on this issue. The latest update
> from Cray is as follows:
>
> Cray SAs have noted that, due to the nature of the issue, the RAID
> controllers on rack 2 have gone into a force verify state. That means that
> there is an active scan/verify of the data stored on the volumes in
> progress, which needs to finish before we continue. Currently the verifies
> are at about 35 percent.  After this is complete, they can proceed
> with their work to resolve this issue.
>
> We will provide another update once more information becomes available.
>
> SDM Grant Newby
>
>
>
>
> On 26 January 2017 at 23:13, NOAA SDM <sdm at noaa.gov> wrote:
>
>> IBM and Cray personnel continue to work on this issue. The latest
>> update from Cray is as follows:
>>
>> We have supplied some additional data and the escalation is at the
>> highest level.
>>
>>  We are waiting for a response with next steps within the hour.
>>
>> As noted previously,  LUNA has been unmounted from TIDE in the meantime,
>> allowing for that portion of the Reston WCOSS to still be available.
>> However, we do not currently have a fully viable backup WCOSS.
>>
>> Operations continue to run normally on the primary WCOSS in Orlando (GYRE
>> and SURGE).
>>
>> We will continue to provide updates as we get more information.
>>
>> SDM Grant Newby
>>
>>
>>
>> --
>> Senior Duty Meteorologist
>> NOAA/NWS/NCEP/NCO/OMB
>>
>> On 26 January 2017 at 18:42, NOAA SDM <sdm at noaa.gov> wrote:
>>
>>> FYI -
>>>
>>> Around 12:30pm ET, a file system problem developed on LUNA - IBM and
>>> Cray personnel are currently investigating.  LUNA has been unmounted from
>>> TIDE in the meantime, allowing for that portion of the Reston WCOSS to
>>> still be available.  However, we do not currently have a fully viable
>>> backup WCOSS.
>>>
>>> Operations continue to run normally on the primary WCOSS in Orlando
>>> (GYRE and SURGE).
>>>
>>> We will continue to provide updates as we get more information.
>>>
>>> Rob Handel
>>> --
>>> Senior Duty Meteorologist
>>> NOAA/NWS/NCEP/NCO/OMB
>>>
>>
>>
>>
>> --
>> Senior Duty Meteorologist
>> NOAA/NWS/NCEP/NCO/OMB
>>
>
>
>
> --
> Senior Duty Meteorologist
> NOAA/NWS/NCEP/NCO/OMB
>



-- 
Senior Duty Meteorologist
NOAA/NWS/NCEP/NCO/OMB