[20171109]数据库与asm实例的通讯问题.txt

简介: [20171109]数据库与asm实例的通讯问题.txt http://www.toadworld.com/platforms/oracle/b/weblog/archive/2017/11/07/solving-communication-problems-bet...
[20171109]数据库与asm实例的通讯问题.txt

http://www.toadworld.com/platforms/oracle/b/weblog/archive/2017/11/07/solving-communication-problems-between-db-and-asm-instances

--//摘要其中的段落:
--//建立新表空间报如下错误:

A long time ago I received a call from a customer saying that there were some errors in the database instance. Well,
interestingly the databases were executing DMLs properly without any issue. I asked the customer if these errors
appeared only with one specific operation like an Insert, or like a CREATE <something>, etc.; and he said that he was
running a script received from the application team to create several tablespaces with its datafiles.  When he was
running the script he was receiving the following errors:

ORA-01119: error in creating database file '+DATA'
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-27300: OS system dependent operation:open failed with status: 2
ORA-27301: OS failure message: No such file or directory
ORA-27302: failure occurred at: sskgmsmr_7

--//有人意外删除 $ORACLE_HOME/dbs/ab_<ASM SID>.dat.注意是grid用户下dbs目录.
# locate ab_+ASM1.dat
/u01/app/11.2.0/grid/dbs/ab_+ASM1.dat


First, you can see that the set of errors says that there is a directory or file that don't exist in the OS on the other
hand, it points to the ASM disk group, which in this case is "+DATA". So this is confusing, because either the file that
the database is looking for is in ASM or it is in the OS.  I did a quick check of the database instance and it was OK.
There were no errors in the alert log, all the disks were healthy. On the database side, however, there seemed to be
some issues, specifically with the sentences "CREATE TABLESPACE" which the customer had in the script provided by the
application team.

So, the clues were:

    No issues with the ASM Instance
    DMLs were being executed successfully in the database instance.
    CREATE TABLESPACE statements fail in the database instance.
    ASM and OS are both involved in a "file" or "directory" that doesn't exist.

With these four clues to go by, you should be on the right track if your concepts are solid. The root cause you would be
thinking about would involve the file that the database instance uses to communicate with the ASM instance This file is
named "ab_<ASM SID>.dat" and it is located in the $ORACLE_HOME/dbs. You need to know about the existence of this file
and what its function is.This file rarely has issues, or rarely causes problems…but sometimes it happens,

Let's define this file:

What is the "ab_<ASM SID>.dat" file? This file is used by the database instance to message an ASM instance. When the
database instance needs to send a message to the ASM instance, the database instance reads this file in order to find
out the information required for getting connected to the ASM instance. This file is in $ORACLE_HOME/dbs. If this file
doesn't exist the database will not be able to connect to the ASM instance and you will receive an error. This file is
important because it is involved in the database instance work.

Some time ago I wrote an article with several tests of where this file is required to execute some sentences in the
database and in which sentences the file is not required. You can read the details here.

The conclusion of that earlier article indicates:

    Tablespace creation – required
    Datafile creation – required
    Table creation – not required
    DML operations – not required
    Drop tablespace – not required
    Delete datafile – not required
    Startup database instance – required
    Shutdown database instance – not required

Well, taking that into account, to solve this customer's issue, I listed all the files in $ORACLE_HOME/dbs and the root
cause was confirmed. The file "ab_<ASM SID>.dat" did not exist in the directory. I asked the customer if he had moved
the file somewhere else or if he'd deleted it and he said that the day before the junior DBA was "cleaning" logs and
traces that were using space and that could be deleted. I think that one of those files that "could be deleted" was
"ab_<ASM SID>.dat". As I said before, this situation happens rarely. Solving the problem is not a big deal; what we have
to do is reboot the ASM instance, but in order to do that we have to reboot the database instance as well.  After
rebooting the ASM instance the file was recreated and the database was able to use it. The script that the customer had
was executed successfully and all the CREATE TABLESPACE operations were success.

Conclusion

Sometimes there are issues whose root cause is very rare, and in order to determine it quickly we have to have all our
concepts solid; otherwise, we might spend several hours trying to figure out what's going on, reading notes and so on.

In this case, it was very important to identify the clues. We had four clues here which pointed us to the right root
cause.  Sometimes the customer is stressed and under pressure and wants us to fix the problem fast, but DBAs have to
stay calm, we have to extract the clues (syntomps), to think about the root cause,  to create an hypothesis and work to
prove it. To shorten diagnostic time make sure you're on solid ground conceptually, which you can do by practicing
various scenarios while you are getting prepared for a certification.

目录
相关文章
|
1月前
|
NoSQL Java 数据库
【问题篇】springboot项目通过数据库限制实例端口号
【问题篇】springboot项目通过数据库限制实例端口号
19 0
|
3月前
|
弹性计算 关系型数据库 MySQL
快速上手阿里云RDS MySQL实例创建,轻松管理数据库
快速上手阿里云RDS MySQL实例创建,轻松管理数据库 在数字化时代,数据已成为企业的核心资产。如何高效、安全地存储和管理这些数据,成为企业在云计算时代亟待解决的问题。阿里云的RDS(关系型数据库服务)应运而生,为用户提供稳定、可靠的云上数据库解决方案。本文将详细介绍如何通过阿里云RDS管理控制台快速创建RDS MySQL实例,让您轻松上手,快速部署数据库。
170 2
|
4月前
|
数据安全/隐私保护 时序数据库
InfluxData【部署 03】时序数据库 InfluxDB 离线安装配置使用(下载+安装+端口绑定+管理员用户创建+开启密码认证+开机自启配置)完整流程实例分享
InfluxData【部署 03】时序数据库 InfluxDB 离线安装配置使用(下载+安装+端口绑定+管理员用户创建+开启密码认证+开机自启配置)完整流程实例分享
128 0
|
4月前
|
存储 JSON 关系型数据库
Pandas载入txt、csv、Excel、JSON、数据库文件讲解及实战(超详细 附源码)
Pandas载入txt、csv、Excel、JSON、数据库文件讲解及实战(超详细 附源码)
62 0
|
27天前
|
Java 数据库
java面向对象高级分层实例_数据库操作类
java面向对象高级分层实例_数据库操作类
10 1
|
28天前
|
SQL Java 数据库连接
springboot解析txt文件顺便加到数据库中(nohup文件)
springboot解析txt文件顺便加到数据库中(nohup文件)
108 1
|
1月前
|
SQL 关系型数据库 数据库
sql如何新建数据库实例
sql如何新建数据库实例
|
2月前
|
存储 Oracle 关系型数据库
【数据库数据恢复】Oracle数据库ASM磁盘组掉线的数据恢复案例
oracle数据库ASM磁盘组掉线,ASM实例不能挂载。数据库管理员尝试修复数据库,但是没有成功。
【数据库数据恢复】Oracle数据库ASM磁盘组掉线的数据恢复案例
|
4月前
|
监控 前端开发 应用服务中间件
Zabbix【部署 01】Zabbix企业级分布式监控系统部署配置使用实例(在线安装及问题处理)程序安装+数据库初始+前端配置+服务启动+Web登录
Zabbix【部署 01】Zabbix企业级分布式监控系统部署配置使用实例(在线安装及问题处理)程序安装+数据库初始+前端配置+服务启动+Web登录
411 0
|
4月前
|
NoSQL Java Redis
Redis【应用 01】Java实现动态切换写入不同Redis数据库的方法实例(动态切换Redis数据库)
Redis【应用 01】Java实现动态切换写入不同Redis数据库的方法实例(动态切换Redis数据库)
80 0