CASE : MSA2312fc MULTIDISKs LEFTOVER at the same time

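Summary:
Today something strange happened on an HP MSA2312FC storage array: several disks across several vdisks (VDs) changed to LEFTOVER status at the same time.
As a result, several VDs went into the QTOF state, as shown below: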
# show vd
Name Size     Free    Own Pref   RAID   Disks Spr Chk  Status Jobs      
  Serial Number                    
------------------------------------------------------------------------
vd01 3996.7GB 751.5MB A   A      RAID5  5     0   64k  QTOF             
  00c0ff10386b0000b519384c00000000 
vd02 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000dd19384c00000000 
vd03 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL             
  00c0ff10386b0000f919384c00000000 
vd04 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b100002d1a384c00000000 
vd05 3996.7GB 751.5MB A   A      RAID5  5     0   64k  QTOF             
  00c0ff10386b0000c19f554e00000000 
vd06 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000ce9f554e00000000 
vd07 3996.7GB 751.5MB A   A      RAID5  5     0   64k  QTOF             
  00c0ff10386b0000da9f554e00000000 
vd08 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000fc9f554e00000000 
------------------------------------------------------------------------
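
With eight vdisks it is easy to overlook one that is not FTOL. The following is a minimal sketch, not an HP tool, that parses captured `show vd` text in the two-line-per-vdisk layout shown above and lists the vdisks whose status is not FTOL. The function names `parse_show_vd` and `unhealthy` and the `SAMPLE` text are made up for illustration, and the column layout is assumed to match this firmware's output exactly.

```python
import re

# One vdisk row: ten whitespace-separated columns, then an optional Jobs field.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<size>\S+)\s+(?P<free>\S+)\s+(?P<own>\S+)\s+(?P<pref>\S+)\s+"
    r"(?P<raid>\S+)\s+(?P<disks>\d+)\s+(?P<spares>\d+)\s+(?P<chunk>\S+)\s+(?P<status>\S+)"
    r"(?:\s+(?P<jobs>\S.*?))?\s*$"
)

def parse_show_vd(text):
    """Parse captured 'show vd' text into a list of dicts, one per vdisk."""
    vdisks = []
    for line in text.splitlines():
        # Skip the prompt line, the separator lines, and the column header.
        if line.startswith(("#", "-", "Name")):
            continue
        if line[:1].isspace():
            # Indented continuation line: serial number of the previous vdisk.
            serial = line.strip()
            if vdisks and serial and serial != "Serial Number":
                vdisks[-1]["serial"] = serial
            continue
        match = ROW.match(line)
        if match:
            vdisks.append(match.groupdict())
    return vdisks

def unhealthy(vdisks):
    """Names of vdisks whose status is anything other than FTOL."""
    return [v["name"] for v in vdisks if v["status"] != "FTOL"]

# Shortened sample in the same layout as the output above.
SAMPLE = """\
# show vd
Name Size     Free    Own Pref   RAID   Disks Spr Chk  Status Jobs
  Serial Number
------------------------------------------------------------------------
vd01 3996.7GB 751.5MB A   A      RAID5  5     0   64k  QTOF
  00c0ff10386b0000b519384c00000000
vd05 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL   VRSC 56%
  00c0ff10386b0000c19f554e00000000
------------------------------------------------------------------------
"""

print(unhealthy(parse_show_vd(SAMPLE)))  # prints ['vd01']
```

Each parsed entry also keeps the Jobs field (for example `VRSC 56%`), so the same structure can be used to watch a running job on a recovered vdisk.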

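After running rescan, several disks were found again and the affected VDs returned to FTOL. However, vd01 still had 4 disks in the leftover state.
Following the HP engineer's instructions, I closed every web page that was logged in to the MSA2312FC and then connected to the msa2312fc from the command line.
I ran: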
# trust enable
# trust vdisk vd01
It failed with this error:
Error: Command failed. (vd01) - Vdisk is not online or fault tolerant. Cannot be trusted.

This was getting frustrating, and HP escalated the case.
A new solution then came back: first go to the web interface and dequarantine vd01, as follows.
[Image: CASE : MSA2312fc MULTIDISKs LEFTOVER at the same time - 德哥@Digoal - The Heart,The World.]

Right-click vd01 and choose Tools -> Dequarantine Vdisk.
Follow the prompts to dequarantine vd01.
Then check vd01 from the command line; its status changes to OFFL.
# show vd                                                   
Name Size     Free    Own Pref   RAID   Disks Spr Chk  Status Jobs      
  Serial Number                    
------------------------------------------------------------------------
vd01 3996.7GB 751.5MB A   A      RAID5  5     0   64k  OFFL             
  00c0ff10386b0000b519384c00000000 
vd02 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000dd19384c00000000 
vd03 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL             
  00c0ff10386b0000f919384c00000000 
vd04 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b100002d1a384c00000000 
vd05 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL   VRSC 56%  
  00c0ff10386b0000c19f554e00000000 
vd06 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000ce9f554e00000000 
vd07 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL   VRSC 59%  
  00c0ff10386b0000da9f554e00000000 
vd08 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000fc9f554e00000000 
------------------------------------------------------------------------

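Then run the following from the command line (per the help text at the end of this post, trust must be enabled with trust enable before each use):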
# trust vdisk vd01
The VD returns to the FTOL state.
# show vd         
Name Size     Free    Own Pref   RAID   Disks Spr Chk  Status Jobs      
  Serial Number                    
------------------------------------------------------------------------
vd01 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL             
  00c0ff10386b0000b519384c00000000 
vd02 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000dd19384c00000000 
vd03 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL             
  00c0ff10386b0000f919384c00000000 
vd04 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b100002d1a384c00000000 
vd05 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL   VRSC 56%  
  00c0ff10386b0000c19f554e00000000 
vd06 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000ce9f554e00000000 
vd07 3996.7GB 751.5MB A   A      RAID5  5     0   64k  FTOL   VRSC 59%  
  00c0ff10386b0000da9f554e00000000 
vd08 3996.7GB 751.5MB B   B      RAID5  5     0   64k  FTOL             
  00c0ff1035b10000fc9f554e00000000 
------------------------------------------------------------------------
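
The sequence that worked in this case can be summarized as a small lookup. This is only a sketch of what was observed here (trust rejected while vd01was QTOF, accepted once it was OFFL), not general HP guidance; it assumes rescan has already been tried, and the function name `recovery_steps` is hypothetical.

```python
# Description of the manual web UI step; this is a label, not a CLI command.
WEB_DEQUARANTINE = "web UI: right-click {vd} -> Tools -> Dequarantine Vdisk"

def recovery_steps(vd, status):
    """Return the ordered steps used in this post for a vdisk in the given status."""
    if status == "FTOL":
        return []  # already fault tolerant and online, nothing to do
    steps = []
    if status == "QTOF":
        # trust was rejected in this state, so the quarantine is lifted first.
        steps.append(WEB_DEQUARANTINE.format(vd=vd))
    elif status != "OFFL":
        # Any other status did not occur in this case; do not guess.
        raise ValueError(f"no procedure recorded for status {status!r}")
    # The help text says trust must be enabled before each use.
    steps += ["trust enable", f"trust vdisk {vd}"]
    return steps

for step in recovery_steps("vd01", "QTOF"):
    print(step)
```

Anything other than FTOL, QTOF, or OFFL raises an error on purpose: the post has no evidence for other states, and trust is a disaster-recovery command.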
 
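According to HP, this procedure may cause some data loss, possibly meaning dirty data in the cache, or bad data on the disks that went leftover.
HP also recommends watching the array for a few days and, if no problems appear, upgrading the firmware.
For a leftover that happens under normal circumstances, for example when a disk's error count reaches a certain threshold, the recommendation is to back up the data, rebuild the vdisk, and then restore.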

The volume's WR Policy may have been changed to write-through; write-back may need to be re-enabled manually afterwards.

The trust command:
# trust 
DESCRIPTION
Enables an offline vdisk to be brought online for emergency data collection.
This command must be enabled before each use.

Caution: This command can cause unstable operation and data loss if used
improperly. It is intended for disaster recovery only.

The trust command resynchronizes the time and date stamp and any other metadata
on a bad disk disk. This makes the disk an active member of the vdisk again.
You might need to do this when:
- One or more disks in a vdisk start up more slowly or were powered on after
  the rest of the disks in the vdisk. This causes the date and time stamps to
  differ, which the system interprets as a problem with the "late" disks.
  In this case, the vdisk functions normally after being trusted.
- A vdisk is offline because a disk is failing, you have no data backup, and
  you want to try to recover the data from the vdisk. In this case, trust may
  work, but only as long as the failing disk continues to operate.

When the "trusted" vdisk is back online, back up its data and audit the data
to make sure that it is intact. Then delete that vdisk, create a new vdisk, and
restore data from the backup to the new vdisk. Using a trusted vdisk is only a
disaster-recovery measure; the vdisk has no tolerance for any additional
failures.
                                      
INPUT
To enable the trust command:
trust enable

To trust a vdisk:
trust vdisk <vdisk>

enable
  Enables the trust command before use.

vdisk <vdisk>
  Name or serial number of the vdisk to trust. For syntax, type "help syntax".

EXAMPLE
Enable the trust command and then trust vdisk VD1:

  # trust enable
  Success: Command completed successfully.

  # trust vdisk VD1
  Success: Command completed successfully.
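
The help text above says that trust resynchronizes the time and date stamp on disks the system considers "late". The toy model below illustrates that idea only; it is not the firmware's algorithm, and the `Disk` class, the integer `stamp` field, and both functions are invented for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Disk:
    slot: int
    stamp: int  # hypothetical metadata stamp; a higher value means newer

def late_slots(disks):
    """Slots whose stamp is older than the newest stamp in the vdisk."""
    newest = max(d.stamp for d in disks)
    return [d.slot for d in disks if d.stamp < newest]

def trust(disks):
    """Toy 'trust': rewrite every member's stamp to the newest one."""
    newest = max(d.stamp for d in disks)
    for d in disks:
        d.stamp = newest

vd = [Disk(0, 7), Disk(1, 7), Disk(2, 6), Disk(3, 7), Disk(4, 7)]
print(late_slots(vd))  # prints [2]
trust(vd)
print(late_slots(vd))  # prints []
```

In this model trust changes only the stamp, never the data on the disk, which fits HP's warning: if a "late" disk really holds stale or bad blocks, the vdisk comes back online with them.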