Let's start with the error log:

InnoDB: Warning: a long semaphore wait:
--Thread 140593224754944 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
a writer (thread id 140570526021376) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.c line 535
Last time write locked in file /pb2/build/sb_0-10180689-1378752874.69/mysql-5.5.34/storage/innobase/btr/btr0cur.c line 528
InnoDB: Warning: a long semaphore wait:
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
a writer (thread id 140570526021376) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.c line 535
Last time write locked in file /pb2/build/sb_0-10180689-1378752874.69/mysql-5.5.34/storage/innobase/btr/btr0cur.c line 528
……………………
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
140101 4:32:58 InnoDB: Assertion failure in thread 140570570065664 in file srv0srv.c line 2502
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/...-recovery.html
InnoDB: about forcing recovery.
20:32:58 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=608
max_threads=1600
thread_count=516
connection_count=515
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 444459 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x7a5f15]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x403)[0x673a13]
/lib/libpthread.so.0(+0xef60)[0x7fde6901cf60]
/lib/libc.so.6(gsignal+0x35)[0x7fde68219165]
/lib/libc.so.6(abort+0x180)[0x7fde6821bf70]
/usr/local/mysql/bin/mysqld[0x7ff2ce]
/lib/libpthread.so.0(+0x68ba)[0x7fde690148ba]
/lib/libc.so.6(clone+0x6d)[0x7fde682b602d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
131231 04:34:11 mysqld_safe Number of processes running now: 0
131231 04:34:11 mysqld_safe mysqld restarted


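The MySQL process on this machine crashed in the early hours of the morning, and the error log was full of entries like this: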
InnoDB: Warning: a long semaphore wait
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838


Here is the monitoring graph (see the 25th through the 31st):

[Figure: monitoring graph of InnoDB spin waits and OS waits]

The graph shows that spin waits and OS waits were very high. Looking through the manual, I found these passages:

You can monitor the use of the adaptive hash index and the contention for its use in the SEMAPHORES section of the output of the SHOW ENGINE INNODB STATUS command. If you see many threads waiting on an RW-latch created in btr0sea.c, then it might be useful to disable adaptive hash indexing.

Sometimes, the read/write lock that guards access to the adaptive hash index can become a source of contention under heavy workloads, such as multiple concurrent joins.
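
The check the manual describes comes down to counting waiting threads by the source file that created the latch they are waiting on. Below is a minimal Python sketch of that count. It assumes wait records shaped like the ones in the error log above (the SEMAPHORES section of SHOW ENGINE INNODB STATUS uses the same record layout); the function name `count_waits_by_latch_origin` is my own, and mutex waits, which are printed in a different format, are not covered.

```python
import re
from collections import Counter

# Matches the "created in file <name> line <n>" part of an RW-latch wait record.
# Assumption: records look like the ones quoted from the error log in this post.
LATCH_RE = re.compile(r"RW-latch at \S+ created in file (\S+) line (\d+)")


def count_waits_by_latch_origin(text):
    """Return a Counter mapping latch-creating source file -> number of waiters."""
    return Counter(m.group(1) for m in LATCH_RE.finditer(text))


# Two wait records copied from the error log above.
sample = """\
--Thread 140593224754944 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
"""

counts = count_waits_by_latch_origin(sample)
for origin, n in counts.most_common():
    print(f"{n:4d} waiter(s) on latches created in {origin}")
```

Run against a full error log or a saved status dump, this gives a quick tally of which latches the waiters are queued on; btr0sea.c is the file the manual associates with the adaptive hash index.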


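The adaptive hash index was causing heavy latch contention, which blocked a large number of threads and eventually led MySQL to crash and restart.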


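Having found the cause, I disabled the adaptive hash index. After a day of observation (see January 1 on the performance graph), spin waits and OS waits gradually decreased.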

SET GLOBAL innodb_adaptive_hash_index = 0;
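
To confirm the effect without a monitoring graph, one option is to compare the wait counters from two SHOW ENGINE INNODB STATUS snapshots taken some time apart. The Python sketch below assumes counter lines of the form `Mutex spin waits N, rounds N, OS waits N` (and likewise for `RW-shared spins` and `RW-excl spins`), which is what I would expect from a 5.5-era server; other versions may print them differently. The function names and the snapshot numbers are made up for illustration.

```python
import re

# Assumed counter format; the ", rounds N" part is optional because some
# versions omit it for the RW counters.
COUNTER_RE = re.compile(
    r"(Mutex spin waits|RW-shared spins|RW-excl spins) (\d+)"
    r"(?:, rounds \d+)?, OS waits (\d+)"
)


def parse_wait_counters(status_text):
    """Extract cumulative spin and OS wait counters from a status dump."""
    counters = {}
    for kind, spins, os_waits in COUNTER_RE.findall(status_text):
        name = kind.split()[0]  # "Mutex", "RW-shared" or "RW-excl"
        counters[name] = {"spins": int(spins), "os_waits": int(os_waits)}
    return counters


def counter_deltas(before, after):
    """Difference between two snapshots, for counters present in both."""
    return {
        name: {key: after[name][key] - before[name][key] for key in after[name]}
        for name in after
        if name in before
    }


# Made-up snapshots, for illustration only.
snapshot_1 = """\
Mutex spin waits 1000, rounds 9000, OS waits 200
RW-shared spins 300, rounds 9000, OS waits 250
RW-excl spins 50, rounds 1500, OS waits 40
"""
snapshot_2 = """\
Mutex spin waits 1100, rounds 9600, OS waits 210
RW-shared spins 340, rounds 10200, OS waits 255
RW-excl spins 55, rounds 1650, OS waits 41
"""

deltas = counter_deltas(parse_wait_counters(snapshot_1), parse_wait_counters(snapshot_2))
for name, values in deltas.items():
    print(name, values)
```

The counters are cumulative since server start, so only the difference over a fixed interval is meaningful, and a restart between snapshots resets them.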


With the root cause found, the problem was resolved.


Reference manual:

[Figure: screenshot of the relevant page of the MySQL reference manual]