RAC Script Checks

flzhang 2017-08-17 08:27:15
Availability (RAC)
  PISAORA_R.B.1 Interconnect network availability
Multiple NICs are interconnected on the private network.
Benefits of configuring more than one: load balancing, failover, and higher private-network bandwidth.
oifcfg getif
OK: Redundant configuration for the interconnect network and switch using IPMP, APA, etc.
NO: No redundancy
[Phenomenon] The DB RAC interconnect is not configured redundantly.
[Problem] An interconnect network failure caused one DB server to reboot.
[Improvement] Configure redundancy, for example with HP's APA (Auto Port Aggregation).
[Note] Oracle interfaces
lan1 185.191.120.0 global cluster_interconnect
lan4 17.91.220.0 global public => lan4 (Active) / lan5 (Standby)
  PISAORA_R.B.2 ok [>=10g]
Dynamic Resource Mastering (DRM) disable
(10g: _gc_affinity_time & _gc_undo_affinity)
(11g: _gc_policy_time)
SELECT max(ksppinm) name, max(ksppstvl) value
  FROM x$ksppi a, x$ksppsv b
 WHERE a.indx = b.indx
   AND ksppinm IN ('_gc_affinity_time', '_gc_policy_time');

SELECT max(ksppinm) name, max(ksppstvl) value
  FROM x$ksppi a, x$ksppsv b
 WHERE a.indx = b.indx
   AND ksppinm = '_gc_undo_affinity';

Bug 6960699 (fixed in 10.2.0.5):
"latch: cache buffers chains" contention / ORA-481 / kjfcdrmrfg: SYNC TIMEOUT / OERI[kjbldrmrpst:!master]

OK: DRM disabled
    10g: _gc_affinity_time = 0, _gc_undo_affinity = false
    11g: _gc_policy_time = 0, _gc_undo_affinity = false
NO: DRM enabled (*default)

** Online change is not possible.
The parameter values must be the same on all RAC nodes.
[Phenomenon] The RAC Dynamic Resource Mastering feature is enabled.
[Problem] DRM is affected by known bugs that cause functional and performance problems.
[Improvement] 10g: _gc_affinity_time = 0, _gc_undo_affinity = false
              11g: _gc_policy_time = 0, _gc_undo_affinity = false

[Note] DRM is effective when the application is partitioned by node, so that each node mainly accesses its own objects and blocks. When the nodes access the same objects and blocks, DRM adds load and degrades performance. In that case, disabling DRM is recommended.
  PISAORA_R.B.3 [~11gR1]
IP=FIRST in listener.ora
The (IP=FIRST) statement makes the listener create a listening endpoint on the IP address to which the given HOST resolves. By default, without (IP=FIRST), the listener listens on all network interfaces (i.e. INADDR_ANY).
OK: IP=FIRST in listener.ora
NO: A hostname is used in the "HOST=" clause, but "IP=FIRST" is not specified in listener.ora

** Not needed from 11g on, because CRS sets it automatically.
[Phenomenon] IP=FIRST is not set in listener.ora.
[Problem] By default the listener binds to INADDR_ANY, so it accepts connections on every network interface of the host.
[Improvement] Specify IP=FIRST in listener.ora.

[Note] What is IP=FIRST in the LISTENER.ORA file ? [ID 300729.1]

How to apply: add IP = FIRST to each ADDRESS entry in listener.ora.
(Example)
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1-vip)(PORT = 1521)(IP = FIRST))
        (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521)(IP = FIRST))
      )
    )
  )

[Why the listener must not listen on a relocated VIP]
When the NIC on node 1 fails and VIP1 moves to node 2, client-side CTF (connect-time failover) only works properly if the listener on node 2 does not accept connections on VIP1. Otherwise, when VIP1 later moves back to node 1, the Oracle shadow processes on node 2 that were connected through VIP1 lose their network connection to the client and keep holding resources and locks for up to (tcp_keepalive_interval + tcp_ip_abort_interval).
  PISAORA_R.B.4 ok DB parameter consistency between instances
select name, max(value) max, min(value) min
  from gv$parameter
 group by name
having max(value) <> min(value)
   and name not in ('audit_file_dest', 'instance_name', 'instance_number', 'local_listener',
                    'parallel_instance_group', 'undo_tablespace', 'user_dump_dest',
                    'background_dump_dest', 'core_dump_dest', 'service_names', 'thread')
;
OK: No rows selected
    Same value on each instance
    (Exception: intentionally different settings, such as SGA size on servers of different capacity)
NO: Rows returned
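[Phenomenon] On instance 2, $ORACLE_HOME/dbs/initDWDB2.ora sets parameters in addition to the spfile= entry.
[Problem] When an instance's initSID.ora contains settings other than the spfile entry, parameter changes may not be applied consistently across instances.
[Improvement] Make instance 2 use the spfile in the same way as the other instance: keep only the spfile entry in the init file.
-------------------------------------------------------
[Phenomenon*] The DB parameter settings differ between the primary server and the standby server.
[Problem*] Problems can occur on node 2 after a failover.
[Improvement*] Apply the primary server's settings to the standby server so that both are identical.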
  PISAORA_R.B.5 IPC protocol setting in listener.ora
IPC is the inter-process communication mechanism.
In RAC, Cache Fusion memory-to-memory communication is faster over IPC.
listener.ora

[Note: MOS ID 403743.1] VIP Failover Take Long Time After Network Cable Pulled

OK: The IPC address is listed above the TCP addresses
NO: No IPC protocol setting, OR
    TCP is listed above the IPC protocol
[Phenomenon] The IPC (Interprocess Communication) protocol is not set for the listener.
[Problem] When the network cable is pulled or the public network fails, failover takes a long time (3~4 minutes).
[Improvement] Add the IPC protocol setting to listener.ora.
  PISAORA_R.B.6 Sequence cache size
SQL> select SEQUENCE_NAME, CACHE_SIZE from dba_sequences where SEQUENCE_NAME in ('AUDSES$', 'IDGEN1$');

OK: AUDSES$ 10000 (the default cache size is 20; raise it to 10000)
    IDGEN1$ 1000 (~11gR1)
NO: Smaller values than the "OK" specification

[Phenomenon] The internal Oracle sequences still use the default cache size.
[Problem] Contention occurs during connection storms or heavy LOB inserts.
[Improvement] Set the cache of the AUDSES$ and IDGEN1$ sequences to 10,000 and 1,000 respectively.

alter sequence sys.audses$ cache 10000;
alter sequence sys.idgen1$ cache 1000;

[Note] High SQ Enqueue Contention with LOBs or Advanced Replication [ID 432508.1]
The cache size for IDGEN1$ is increased to 1000 by bug 7694580 (fixed in 11.2).
  PISAORA_R.B.7 instance_groups
(avoid unintended inter-node parallelism)
SELECT *
  FROM (SELECT version
             , to_number(replace(version, '.', '')) AS version_number
             , parallel
          FROM v$instance) ins,
       ------
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx AND ksppinm = 'parallel_instance_group') par1,
       ------
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx AND ksppinm = 'instance_groups') par2,
       ------
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx AND ksppinm = 'parallel_force_local') par3
/
OK: [10g] parallel_instance_group and instance_groups are set
    [11g] parallel_force_local=true
NO: Not set

[Phenomenon] instance_groups is not specified.
[Problem] When a parallel query runs, query processes are started unintentionally on all RAC nodes, which causes performance problems.
[Improvement]
[10g] Specify parallel_instance_group and instance_groups.
parallel_instance_group: can be changed online
instance_groups: cannot be changed online

[11g] Set parallel_force_local = true.
  PISAORA_R.B.8  [Only 10gR2 10.2.0.5 ~11gR1]
Prevent Unexpected VIP relocation
# $GRID_HOME/bin/racgvip check
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT="

If there is a network problem, the "ping" command above takes longer than 1 second, which causes VIPs to go offline unexpectedly and relocate to another node.

In the VIP trace file (ora.nodename.vip.log) under "CRS_HOME/log/node02":
2011-02-18 15:30:39.481: [ RACG][1] [4587556][1][ora.node02.vip]: Fri Feb 18 15:30:37 GMT+08:00 2011 [ 8257768 ]
About to execute command: /usr/sbin/ping -S 192.168.220.36 -c 1 -w 1 192.168.220.33
Fri Feb 18 15:30:39 GMT+08:00 2011 [ 8257768 ]
IsIfAlive: RX packets checked if=en1 failed
In other words, the command is "/usr/sbin/ping -S <local public IP> -c 1 -w 1 <gateway IP>": it sends one packet (64 bytes by default, not 64K) from the local IP to the gateway address. If no reply arrives within the 1-second timeout, the check fails. Oracle then considers the public IP unreachable, relocates the VIP, and takes the local listener offline.



OK: PING_TIMEOUT=" -c 1 -w 3"
NO: PING_TIMEOUT=" -c 1 -w 1"




# $GRID_HOME/bin/racgvip check
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT=" -c 1 -w 1"
==>
# timeout of ping in number of loops (3 sec)
PING_TIMEOUT=" -c 1 -w 3"
srvctl start nodeapps -n
[Phenomenon] The VIP check uses a 1-second timeout.
[Problem] Under system load the check can exceed the timeout, so the VIP is relocated even though nothing is actually wrong.
[Improvement] Change the timeout in the VIP check script from 1 second to 3 seconds.

[ID 1297867.1] VIPs Often Go Offline Unexpectedly and Relocate to Another Node
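
The OK/NO rule for PING_TIMEOUT can be checked mechanically. The following is only a sketch: the function names ping_wait_seconds and ping_timeout_status are hypothetical helpers (not Oracle tools), the script is assumed to contain a line of the form PING_TIMEOUT=" -c 1 -w N" as shown above, and any wait of 3 seconds or more is treated as OK.

```shell
# Hypothetical helper (not an Oracle tool): print the ping wait time in seconds
# taken from the first uncommented PING_TIMEOUT= line of the given script.
ping_wait_seconds() {
    # $1: path of the script to inspect
    sed -n 's/^[[:space:]]*PING_TIMEOUT=.*-w[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Map the value to the checklist verdict.
# Assumption: 3 seconds or more counts as OK; a missing value counts as NO.
ping_timeout_status() {
    secs=$(ping_wait_seconds "$1")
    if [ -n "$secs" ] && [ "$secs" -ge 3 ]; then
        echo "OK"
    else
        echo "NO"
    fi
}
```

A call might look like: ping_timeout_status "$GRID_HOME/bin/racgvip". The helper only reads the file; editing the script remains a manual step.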
 
  PISAORA_R.B.9 Synchronization of the default OS time zone and the oracle user's TZ environment
As the oracle user, grid user, and root user:

env | grep TZ
OK: Same TZ for the oracle, grid, and root users
NO: TZ differs among the users

[Phenomenon] The OS default time zone does not match the users' TZ.
[Problem] If the time zone in effect when the DB server boots differs from the OS users' TZ environment variable, the system log and other logs record timestamps in different time zones.
[Improvement] Set the users' TZ to the same value as the OS default time zone (/etc/default/tz).
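
Once the TZ value has been collected for each user, the comparison itself is simple. A minimal sketch, assuming the values are gathered by hand (for example from "env | grep TZ" run as each user); tz_status is a hypothetical helper name.

```shell
# Hypothetical helper: pass one TZ value per user (oracle, grid, root, ...).
# Prints OK when all values are identical, NO otherwise.
tz_status() {
    first=$1
    for tz in "$@"; do
        if [ "$tz" != "$first" ]; then
            echo "NO"
            return 0
        fi
    done
    echo "OK"
}
```

For example, tz_status "$tz_oracle" "$tz_grid" "$tz_root", where the three variables hold the values collected for each user. An empty value (TZ unset) is compared like any other string.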
  PISAORA_R.B.10 CRS version
crsctl query crs activeversion
crsctl query crs softwareversion
OK: Same as or higher than the DB version
NO: Lower than the DB version
[Phenomenon] The CRS version is lower than the DB version.
[Problem] The CRS version must be the same as or higher than the DB version.
[Improvement] Upgrade the CRS version.
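
Dotted version strings have to be compared field by field as numbers, because a plain string comparison would rank 9.2 above 10.2. A sketch under the assumption that both versions have already been extracted as strings such as 11.2.0.3.0; version_ge and crs_version_status are hypothetical helper names.

```shell
# Hypothetical helper: succeeds when version $1 >= version $2.
# Fields are compared numerically from left to right; missing fields count as 0.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Leading field of each version string.
        fa=${a%%.*}
        fb=${b%%.*}
        [ -n "$fa" ] || fa=0
        [ -n "$fb" ] || fb=0
        if [ "$fa" -gt "$fb" ]; then return 0; fi
        if [ "$fa" -lt "$fb" ]; then return 1; fi
        # Drop the field just compared, together with its dot.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# $1: CRS version, $2: DB version
crs_version_status() {
    if version_ge "$1" "$2"; then echo "OK"; else echo "NO"; fi
}
```

For example, crs_version_status 11.2.0.3.0 11.2.0.4.0 reports NO, since the CRS version is lower in the fourth field.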
  PISAORA_R.B.11 [Except for Exadata]
Interconnect network switch
OK: A switch is used for the interconnect network
NO: Crossover cable without a switch
[Phenomenon] The CRS interconnect is connected directly, without a switch.
[Problem] A directly connected interconnect is not a supported configuration.
          If the LAN card on one server fails, the interconnect link goes down on both sides, so CRS cannot tell which server's LAN card has the problem.
[Improvement] Configure the interconnect through a switch.

[Note]
Oracle supports the interconnect network only when it is built with a switch; a direct connection is not supported.
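  PISAORA_R.B.12 [Except for Exadata]
Jumbo frames for the interconnect
On new systems that support jumbo frames, set the frame size of the private network to 9000 bytes. The jumbo frame size configured on the switch and on the private NICs must be the same.
(HP) netstat -in (others) ifconfig -a (to view the frame size)
default: 1500, jumbo frame: MTU 9000
OK: Jumbo frames are used (MTU: 9000)
NO: Jumbo frames are NOT used (MTU: 1500)

*** Only for new systems, that is, pre-production systems.
Changing it on production systems is not recommended.
[Phenomenon] The MTU of the interconnect network is set to 1500.
[Problem] With DB block sizes of 8k, 16k, or 32k, a small MTU means each block is split into several packets and merged again on receipt, which adds overhead.
[Improvement] Set the MTU to 9000 on a switch that supports it.
[Note] Apply this only to systems that have not gone live yet.
Changing systems already in operation is not recommended.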
  PISAORA_R.B.13 [Except for ASM]
CRS auto start
(HP/Solaris): cat /var/opt/oracle/scls_scr/$host/root/crsstart
==> check for "disable"
(IBM/Linux): cat /etc/oracle/scls_scr/`hostname`/root/crsstart
==> check for "disable"
(From 11gR2, check ohasdstr instead of crsstart)
OK: CRS auto start is disabled
NO: Not set (*default)

# crsctl disable crs
[Phenomenon] CRS auto start is enabled.
[Problem] If CRS starts before the OS cluster software has finished starting, the CRS resources may not come up properly.
[Improvement] Disable CRS auto start:
os>crsctl disable crs
  PISAORA_R.B.14 [Except for ASM]
Voting disk configuration
CRS user>
os>crsctl query css votedisk
OK: # of voting disks is 3 or 5
NO: # of voting disks is 1
[Phenomenon] The voting disk is not multiplexed.
[Problem] A single voting file is a single point of failure.
[Improvement] Multiplex the voting file (placing the copies on physically separate disks is recommended).
  PISAORA_R.B.15 [Except for ASM]
File system separation for each voting file: place each voting file on its own file system
CRS user>
os>crsctl query css votedisk
OK: Each file is located on a separate file system
NO: The files are located in the same directory
[Phenomenon] All voting files are located in the same directory.
[Problem] A physical failure can affect all voting files at the same time.
[Improvement] Place each voting file in a different directory
               (placing the copies on physically separate disks is recommended).
  PISAORA_R.B.16 OCR disk configuration
CRS user>
os>ocrcheck
OK: # of OCR disks >= 2
NO: # of OCR disks = 1
[Phenomenon] The OCR disk is not multiplexed.
[Problem] A single OCR disk is a single point of failure.
[Improvement] Multiplex the OCR (placing the copies on physically separate disks is recommended).
  PISAORA_R.B.17 File system separation for each OCR disk
CRS user>
os>ocrcheck
OK: Each file is located on a separate file system
NO: The files are located in the same directory OR the same ASM disk group
[Phenomenon] All OCR files are located in the same directory.
[Problem] A physical failure can affect all OCR files at the same time.
[Improvement] Place each OCR file in a different directory
               (placing the copies on physically separate disks is recommended).
  PISAORA_R.B.19 CRS network heartbeat misscount
The default misscount is 60s on Linux and 30s on other platforms; with third-party vendor clusterware it is 600s.
$ crsctl get css misscount
The private-network ping must complete within the misscount (MC) time.
OK: Less than 200 seconds

If the vendor (OS) clusterware heartbeat can detect a failure in less than 200 seconds, an Oracle misscount larger than 200 is acceptable.

default: 10g: CRS only: 30 (Linux: 60), vendor cluster: 600
         11g: CRS only: 30, vendor cluster: 600

Steps To Change CSS Misscount, Reboottime and Disktimeout [ID 284752.1]
[Phenomenon] css misscount is not set within the recommended range.
[Problem] False detection.
[Improvement] Setting css misscount below 200 is recommended.
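
The threshold rule above can be expressed as a small function. A sketch only: misscount_status is a hypothetical helper that takes the misscount as a plain number of seconds; extracting that number from the crsctl output is left out because the output format differs between releases.

```shell
# Hypothetical helper: classify a misscount value given in seconds.
# Prints OK below 200, NO at 200 or above, UNKNOWN for non-numeric input.
# The vendor-clusterware exception described above is not modelled here.
misscount_status() {
    case $1 in
        ''|*[!0-9]*)
            echo "UNKNOWN"
            return 0
            ;;
    esac
    if [ "$1" -lt 200 ]; then
        echo "OK"
    else
        echo "NO"
    fi
}
```

With vendor clusterware (default 600) the function reports NO, which then has to be reviewed against the exception for vendor heartbeats.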
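  PISAORA_R.B.20 [10gR1~11gR1]
CRS diagwait
CRS user>
os>crsctl get css diagwait
This makes OPROCD reboot the system only when it has not responded for more than 13 seconds,
that is, a CPU hang of more than 10s plus a CSS restart margin of more than 3s, more than 13s in total.

** Online change is not allowed
OK: 13
NO: Not set

Not needed from 11gR2 on [ID 559365.1]

/ora_crs/bin/crsctl stop crs
/ora_crs/bin/crsctl get css diagwait
/ora_crs/bin/crsctl set css diagwait 13 -force
/ora_crs/bin/crsctl get css diagwait
[Phenomenon] The CRS logging setting (diagwait) is left at its default.
[Problem] When CRS reboots a node, not enough log information is left to analyze the cause.
[Improvement] Change the diagwait value from 0 to 13.
[Note] Make the change only after CRS has been stopped on both DB nodes.
Changing it while CRS is still running on a node can corrupt the OCR, so the change cannot be applied in a rolling fashion.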
  PISAORA_R.B.21 CRS client log management
ls -al /ora_crs/log/`hostname`/client | wc -l
OK: Less than 1000
NO: More than 1000

** You have to remove them regularly
[Phenomenon] There are many files under $ORA_CRS_HOME/log/`hostname`/client.
[Problem] srvctl commands slow down.
          As log files keep accumulating, the file limit can be reached, and scanning the logs delays CRS commands.
[Improvement] Keep fewer than 1000 files under $ORA_CRS_HOME/log/`hostname`/client
              and remove old ones (delete them periodically).
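
Note that "ls -al | wc -l" also counts the "total" line and the "." and ".." entries, so it reports three more than the real number of files. A sketch of a stricter count follows; client_log_status is a hypothetical helper, the limit of 1000 comes from the criteria above, and treating exactly 1000 as NO is an assumption.

```shell
# Hypothetical helper: count regular files directly under a directory and
# compare the count with a limit (default 1000).
client_log_status() {
    dir=$1
    limit=${2:-1000}
    if [ ! -d "$dir" ]; then
        echo "UNKNOWN"
        return 0
    fi
    # "! -name . -prune" keeps find from descending into subdirectories;
    # tr strips the padding that some wc implementations add.
    count=$(cd "$dir" && find . ! -name . -prune -type f | wc -l | tr -d '[:space:]')
    if [ "$count" -lt "$limit" ]; then
        echo "OK ($count files)"
    else
        echo "NO ($count files)"
    fi
}
```

A call might look like: client_log_status "$ORA_CRS_HOME/log/$(hostname)/client". File names containing newlines would be miscounted, which is unlikely for these logs.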
  PISAORA_R.B.22 OCR/OLR auto backup status
CRS user>
For OCR backup:
os>ocrconfig -showbackup

For OLR backup:
os>ocrconfig -local -showbackup
OK: OCR backup OR OLR backup taken within the last week
NO: No OCR backup OR no OLR backup
 
  PISAORA_R.B.23 [If the switch is used with VLANs]
Interconnect network switch configuration
How should RAC be deployed on VLANs? Is it not acceptable to carve three virtual networks (public, private, SAN) out of a single switch?
[NOTE: MOS ID 220970.1] RAC: Frequently Asked Questions
If deploying the interconnect on a VLAN, there should be a 1:1 mapping of the VLAN to a non-routable subnet and the VLAN should not span multiple VLANs (tagged) or multiple switches.
OK: 1:1 mapping of the VLAN to a non-routable subnet
NO: The interlink is shared using tagging

** Consult with the network administrator
[Phenomenon] The interconnect switch is configured with VLANs.
[Problem] The cluster interconnect VLAN must be mapped to a non-routable IP subnet.
[Improvement] Configure the interlink with a 1:1 mapping.
[Note] [ID 220970.1] RAC: Frequently Asked Questions
  PISAORA_R.B.24 [11gR2, XA or DB link case]
_clusterwide_global_transactions
SQL> SELECT max(ksppinm) name, max(ksppstvl) value
       FROM x$ksppi a, x$ksppsv b
      WHERE a.indx = b.indx AND ksppinm = '_clusterwide_global_transactions';
OK: _clusterwide_global_transactions=false
NO: _clusterwide_global_transactions=true (*default)

** Online change is not permitted
[Phenomenon] Clusterwide global transactions are enabled.
[Problem] From 11g, transactions that use XA or DB links and span several RAC instances are handled as one clusterwide global transaction instead of separate per-instance transactions. Related bugs can cause hangs and lock problems.
[Improvement] _clusterwide_global_transactions=false

[Note] [ID 1361615.1] High rdbms ipc reply and DFS lock handle in 11gR2 RAC With XA Enabled Application
[ID 8588540.8] Bug 8588540 - Corruption / ORA-8102 in RAC with loopback DB links between instances
[ID 13605839.8] Bug 13605839 - ORA-600 [ktbsdp1] ORA-600 [kghfrempty:ds]. Corruption in Rollback with Clusterwide Global Transactions in RAC
  PISAORA_R.B.25 Owner/permission of OCR/voting disks
OK: ** Recommended value
NO: Other value

** Recommended value
OCR disk: root:oinstall - 640
Voting disk: oracle:oinstall - 644 OR 640
For the group, oinstall and dba are both OK
[Phenomenon] The owner:group of the OCR and voting files is not set to the recommended values.
[Problem] Oracle Clusterware (Grid Infrastructure) can run into problems.
[Improvement] Set the owner and permissions of the OCR and voting disks as follows:
              OCR - root:dba - 640
              Voting disk - oracle:dba - 644
  PISAORA_R.B.28 [10gR2~]
OS Watcher configuration
$ ps -ef | grep OSW
OK: OSW is being used
NO: Not configured

(How to run)
cd ./OSW/OSWbb
nohup ./startOSWbb.sh 30 720 &
[Phenomenon] OS Watcher is not running.
[Problem] When a problem occurs, there is no CPU, memory, I/O, or network information available to analyze the cause.
[Improvement] Configure and run OS Watcher.
[Note] For how to set up OS Watcher, see the Technical Memo.

- A script that periodically collects OS status information using the commands appropriate to each OS platform
- Recommended for RAC environments.

[Reference] Technical Memo-OSW(OS_Watcher)_Black_Box.doc
  PISAORA_R.B.29 [>= VERITAS CFS (cluster file system) 4.1]
ODM (Oracle Disk Manager) library with VERITAS CFS
ODM improves file system performance so that a file system can match raw-device performance. ODM requires the third-party vendor to provide the corresponding interface, such as the ODM library provided by Veritas.
Check whether $ORACLE_HOME/lib/libodm* is soft-linked to the Veritas ODM library:

# ls -l $ORACLE_HOME/lib/libodm*
/opt/VRTSodm/lib/libodm.sl -- HP PA Systems
/opt/VRTSodm/lib/libodm.so -- HP IA Systems
OK: VERITAS ODM library is linked
NO: Not linked with VERITAS ODM
[Phenomenon] VERITAS CFS is in use, but the ODM library is not linked.
[Problem] With the default Oracle library, I/O does not get the performance benefit of ODM.
[Improvement] Link the Veritas ODM library.
              When VERITAS CFS is used, linking the ODM (Oracle Disk Manager) library is recommended.
(How to apply)
1. Login as Oracle user
2. Shutdown database
3. Link the Oracle Disk Manager library into Oracle home
For Oracle 10g on HP 9000 Systems:
$ rm ${ORACLE_HOME}/lib/libodm10.sl
$ ln -s /opt/VRTSodm/lib/libodm.sl ${ORACLE_HOME}/lib/libodm10.sl
For Oracle 10g on Integrity Systems:
$ rm ${ORACLE_HOME}/lib/libodm10.so
$ ln -s /opt/VRTSodm/lib/libodm.so ${ORACLE_HOME}/lib/libodm10.so
For Oracle 11g on HP 9000 Systems:
$ rm ${ORACLE_HOME}/lib/libodm11.sl
$ ln -s /opt/VRTSodm/lib/libodm.sl ${ORACLE_HOME}/lib/libodm11.sl
For Oracle 11g on Integrity Systems:
$ rm ${ORACLE_HOME}/lib/libodm11.so
$ ln -s /opt/VRTSodm/lib/libodm.so ${ORACLE_HOME}/lib/libodm11.so
4. Start Oracle database

  PISAORA_R.B.30 [11.1 ~ ]
_gc_bypass_readers
SQL> SELECT max(ksppinm) name, max(ksppstvl) value
       FROM x$ksppi a, x$ksppsv b
      WHERE a.indx = b.indx AND ksppinm = '_gc_bypass_readers';
(show parameter lists a hidden parameter only after it has been set explicitly)

[Note] When one instance crashes, the other instance hangs on "buffer busy" waits under high workload (real case)
OK: _gc_bypass_readers=false
NO: _gc_bypass_readers=true (*default)
[Phenomenon] The 11g system is running with _gc_bypass_readers = true.
[Problem] Defects in the reader bypass feature cause hangs or slow recovery.
[Improvement] Disable the new feature:
              _gc_bypass_readers = false

[Note] How to apply:
SQL> alter system set "_gc_bypass_readers" = false;
The parameter can be changed online in a running environment (a rolling change is also possible). If the value is changed in initSID.ora instead, it takes effect at the next restart during maintenance.

** Online change: possible; rolling change: possible
When the parameter is changed online, sessions that are already connected stop using reader bypass immediately and switch to block locks, so their work may wait briefly. In a test under load, this took 1 to 2 seconds.
  PISAORA_R.B.31 [11.2 ~ ]
_gc_read_mostly_locking
An 11g new feature: it improves read performance, but not write performance.
See "DRM and read-mostly locking".
OK: _gc_read_mostly_locking=false
NO: _gc_read_mostly_locking=true (*default)

[Phenomenon] The 11g system is running with _gc_read_mostly_locking = true.
[Problem] In 11g, defects in this new feature cause instance crashes or internal errors.
[Improvement] Disable the new feature:
              _gc_read_mostly_locking = false
** Online change: not possible; rolling change: not possible
[Note]
Read-mostly locking: an 11g new feature that applies read-mostly locks to objects that are mainly read and rarely changed.
S locks are acquired faster, and interconnect traffic can be reduced.
However, if the objects change frequently, it causes inefficient I/O.

Bug 13457582 - ora-600 [kclantilock_8] [ID 13457582.8]
Instance crash.
  PISAORA_R.B.32 [9.2 ~ 10.2.x]
_gc_integrity_checks
[Note] In 11g, the "fusion assert" issue occurs when "_gc_integrity_checks" is 2,
so the default value (1) causes no problem.
(9i~10g: fusion assert is related to value 1)

OK: _gc_integrity_checks=0
NO: _gc_integrity_checks=1 (*default)
[Phenomenon] _gc_integrity_checks is enabled.
[Problem] 1. Instances restart in configurations that use a crossover cable
          2. Under interconnect load, fusion assert errors can occur
[Improvement] Disable the check:
              _gc_integrity_checks = 0

** ONLINE
??, ROLLABLE ??
1. RAC ???? cross ??? ??? restart??? ??
( RAC cross ???? oracle??? not support ?? ???)

2.
??? resource? ?? ??? process? ????? threshold??? ?? operation?? ???? assert code? ????, ?? process? ???? ????? F/G?, ??? ??? ?? ??? LMS process? fusion assert  ??? ????. ?? ??, RAC ?? ??? ??? ? ??.

Fusion assert
?? block status message ???? ?? RAC?? ??? ??? ? ????, disable(0) ??? ????? ??
(??)
  PISAORA_R.B.33 [11.2.0 ~ ]
_disable_system_state
OK: _disable_system_state=10
NO: _disable_system_state=4294967294 (*default)
[Phenomenon] The 11g system is running with _disable_system_state = 4294967294.
[Problem] When Oracle internally takes a system state dump during an abnormal condition, the dump level is set high, which generates heavy load and very large dump files.
          In particular, the file system holding the diag directory can fill up.
[Improvement] Lower the limit:
              alter system set "_disable_system_state" = 10 scope = both;
** Online change: possible; rolling change: possible
[Note]
This defines the level limit for system state dumps that Oracle code takes automatically, as opposed to dumps requested explicitly. (Only dumps below the specified level are taken.)

Case at another site: a level 267 system state dump caused "latch: gc element" contention and disk I/O bottlenecks; this setting prevents such delays.
  PISAORA_R.B.34 Parameter check for handling failover connections
(Presumably: if too few connections are allowed, the surviving instance cannot take over the failover connections.)
SQL> select 'Parameter ==> '||resource_name||' inst#: '||inst_id||' current: '||current_utilization||' max: '||max_utilization||' limit: '||limit_value
       from gv$resource_limit
      where resource_name in ('processes', 'sessions')
      order by resource_name, inst_id;
OK: The processes and sessions parameters are set larger than the sum of the "sessions high water mark (max)" values of the other instance(s)

NO: The processes and sessions parameters are not large enough to accept the failover connections (based on the sessions high water mark (max))

[Phenomenon]
[Problem]
[Improvement]
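
The OK/NO rule for this check is simple arithmetic on the query output. The sketch below reads the rule as "the limit on the surviving instance must exceed the sum of the max_utilization values of every instance whose connections it would have to carry"; that interpretation and the helper name failover_capacity_status are assumptions.

```shell
# Hypothetical helper.
# $1: limit_value of the surviving instance
# remaining arguments: max_utilization of each instance whose sessions it
# would have to carry after a failover (its own value included)
failover_capacity_status() {
    limit=$1
    shift
    total=0
    for m in "$@"; do
        total=$((total + m))
    done
    if [ "$limit" -gt "$total" ]; then
        echo "OK"
    else
        echo "NO"
    fi
}
```

For example, with a limit of 1000 and high water marks of 300 and 400, the sum is 700 and the result is OK; with a limit of 600 the result is NO. Run it once for processes and once for sessions.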
  PISAORA_R.B.35 [11.2.0.1 to 11.2.0.3]
VIP intermediate attribute
Because the network is monitored every second, a busy public network can cause unnecessary listener offline events and VIP failovers.
crsctl stat res ora.rac1.vip -p | grep STOP_DEPENDENCIES
OK: STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)

NO: STOP_DEPENDENCIES=hard(ora.net1.network)
[Phenomenon] CRS resources such as the listener and services are stopped.

[Problem] When a check timeout puts the network resource into an unknown state even though it is actually online, the resources that depend on it are stopped and then brought online again or failed over.

[Improvement] Apply GI (CRS) PSU 11.2.0.3.3 or later, then change the VIP stop dependency to 'intermediate'.

1. Check the current setting
$ crsctl stat res ora.bjmcspd1.vip -p | grep STOP_DEPENDENCIES
--> Current value: STOP_DEPENDENCIES=hard(ora.net1.network)

2. Change (VIP/SCAN VIP)
$ crsctl modify res ora.bjmcspd1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
$ crsctl modify res ora.bjmcspd2.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
$ crsctl modify res ora.scan1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
3. Restart nodeapps/VIP

[Note] [ID 1333165.1] VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup
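
The STOP_DEPENDENCIES check can be scripted by piping the resource profile through a small filter. A sketch only: vip_dependency_status is a hypothetical helper, and the input is assumed to contain a line starting with STOP_DEPENDENCIES= in the form shown in the criteria above.

```shell
# Hypothetical helper: reads a resource profile on stdin and classifies
# the STOP_DEPENDENCIES setting.
vip_dependency_status() {
    # Keep only the first STOP_DEPENDENCIES line; sed exits cleanly even
    # when nothing matches.
    line=$(sed -n '/^STOP_DEPENDENCIES=/p' | head -n 1)
    case $line in
        *'hard(intermediate:'*) echo "OK" ;;
        '') echo "UNKNOWN" ;;
        *) echo "NO" ;;
    esac
}
```

A call might look like: crsctl stat res ora.rac1.vip -p | vip_dependency_status, repeated for each VIP and SCAN VIP resource.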
