ASM diskgroup dismount with "Waited 15 secs for write IO to PST" (Doc ID 1581684.1)


In this Document

  Symptoms
  Cause
  Solution
  References


APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.3 to 12.1.0.1 [Release 11.2 to 12.1]
Information in this document applies to any platform.

SYMPTOMS

 

A normal or high redundancy diskgroup is dismounted after the following WARNING messages appear.

//ASM alert.log


Mon Jul 01 09:10:47 2013
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
....
GMON dismounting group 6 at 72 for pid 44, osid 8782162 
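
To see which disks and groups are affected, the warnings can be tallied from the alert log. The following is a minimal Python sketch, assuming the message wording matches the excerpt above exactly (other releases may word these lines differently). The function name summarize_pst_waits is hypothetical and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Patterns taken from the alert.log excerpt in this note (assumed wording).
PST_WAIT = re.compile(
    r"WARNING: Waited (\d+) secs for write IO to PST disk (\d+) in group (\d+)\."
)
DISMOUNT = re.compile(r"GMON dismounting group (\d+) at")


def summarize_pst_waits(lines):
    """Count PST write-wait warnings per (group, disk) and list dismounted groups."""
    waits = Counter()
    dismounted = []
    for line in lines:
        match = PST_WAIT.search(line)
        if match:
            group, disk = int(match.group(3)), int(match.group(2))
            waits[(group, disk)] += 1
            continue
        match = DISMOUNT.search(line)
        if match:
            dismounted.append(int(match.group(1)))
    return waits, dismounted


# Sample lines copied from the excerpt above.
sample = """Mon Jul 01 09:10:47 2013
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 4 in group 6.
GMON dismounting group 6 at 72 for pid 44, osid 8782162
""".splitlines()

waits, dismounted = summarize_pst_waits(sample)
for (group, disk), count in sorted(waits.items()):
    print(f"group {group} disk {disk}: {count} warning(s)")
print("dismounted groups:", dismounted)
```

For a real log, pass an open file object instead of the sample list; the function only iterates over lines.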



 

 

CAUSE

These messages generally appear in the ASM alert log in the following situation:

ASM PST heartbeats to the disks of a normal or high redundancy diskgroup are delayed
beyond the heartbeat I/O wait limit, so the ASM instance dismounts the diskgroup. By default, the limit is 15 seconds.

 

Heartbeat delays are largely ignored for an external redundancy diskgroup.
The ASM instance stops issuing further PST heartbeats until PST revalidation succeeds,
but the heartbeat delays do not directly dismount an external redundancy diskgroup.

An ASM disk can become unresponsive, typically in the following scenarios:

+    Some of the physical paths of the multipath device are offline or lost
+    During path 'failover' in a multipath setup
+    Server load, or any sort of storage/multipath/OS maintenance

 

Doc ID 10109915.8 describes Bug 10109915, whose fix introduced the underscore parameter _asm_hbeatiowait. The underlying issue is that there is no OS/storage tunable timeout mechanism in the case of a hung NFS server/filer; _asm_hbeatiowait allows this timeout to be set.


 

 


 

SOLUTION

1]    Check with the OS and storage administrators whether the disks are becoming unresponsive.

2]    If possible, keep disk unresponsiveness below 15 seconds.

This depends on several factors, such as:
+    Operating system
+    Presence of multipath (and multipath type)
+    Any relevant kernel parameters

You therefore need to determine the maximum possible disk unresponsiveness for your setup.

For example, on AIX the rw_timeout setting affects this and defaults to 30 seconds.

Another example is Linux with native multipathing. In such a setup, the number of physical paths and the polling_interval value in the multipath.conf file dictate the maximum disk unresponsiveness.

Determine this value for your own combination of OS, multipath, and storage.
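
As a starting point for the Linux case, the polling_interval value can be read out of multipath.conf. The sketch below is a simplified, line-based Python reader and not a full multipath.conf parser. The function name read_polling_interval and the sample configuration values are hypothetical. It only retrieves the setting; this note does not give a formula for turning it into a maximum unresponsiveness figure, so none is computed here.

```python
import re


def read_polling_interval(conf_text):
    """Return the first polling_interval value (seconds) found, or None if absent."""
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        match = re.match(r'polling_interval\s+"?(\d+)"?$', line)
        if match:
            return int(match.group(1))
    return None


# Hypothetical multipath.conf fragment, for illustration only.
sample_conf = """
defaults {
    user_friendly_names yes
    polling_interval    10   # hypothetical value
}
"""

interval = read_polling_interval(sample_conf)
print("polling_interval:", interval)
```

On a real system the text would come from the multipath.conf file itself, and the number of physical paths has to be gathered separately.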

3]    If you cannot keep disk unresponsiveness below 15 seconds, set the following parameter in the ASM instance (on all nodes of the RAC cluster):

    _asm_hbeatiowait
    
According to internal bug 17274537, internal testing indicates that the value should be increased to 120 seconds; the same will be fixed in 12.2.

 

Run the following in the ASM instance to set the desired value for _asm_hbeatiowait:

alter system set "_asm_hbeatiowait"=<value> scope=spfile sid='*';

Then restart the ASM instance / CRS so that the new parameter value takes effect.
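
The decision in steps 2 and 3 can be expressed as a small helper. This is a Python sketch under stated assumptions: the 15 and 120 second figures come from this note, the function name hbeat_statement is hypothetical, and the check that the new value must exceed the measured worst case is an added assumption, not something this note states. The helper only builds the statement text; it does not connect to any instance.

```python
DEFAULT_HBEAT_IO_WAIT = 15     # seconds, default stated in this note
SUGGESTED_HBEAT_IO_WAIT = 120  # seconds, value cited from internal bug 17274537


def hbeat_statement(max_unresponsive_secs, value=SUGGESTED_HBEAT_IO_WAIT):
    """Return the ALTER SYSTEM text if the default wait is too short, else None."""
    if max_unresponsive_secs < DEFAULT_HBEAT_IO_WAIT:
        # Disks respond within the default window; no change suggested.
        return None
    if value <= max_unresponsive_secs:
        # Assumption: the new wait should be longer than the worst-case stall.
        raise ValueError("value must exceed the worst-case unresponsiveness")
    return f"alter system set \"_asm_hbeatiowait\"={value} scope=spfile sid='*';"


# Example: a 30-second worst case, such as the AIX rw_timeout default.
print(hbeat_statement(30))
print(hbeat_statement(10))
```

The returned statement has the same form as the one shown above and still has to be run in the ASM instance, followed by the restart.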



 

REFERENCES

BUG:17043894 - DISKGROUP DISMOUNTS IF 2 OUT OF 8 PATHS LOST
BUG:10109915 - ASM HANGS IN HIGH REDUNDANCY CONFIG IF 1 OF 5 DISKS GOES OFFLINE
NOTE:1910315.1 - How to Create a Normal Redundancy Diskgroup Best Practices