追踪源码揪出mysql崩溃的真凶

  1. 云栖社区>
  2. 博客>
  3. 正文

追踪源码揪出mysql崩溃的真凶

门牙没了 2016-01-08 19:35:57 浏览777
展开阅读全文

前言

      在MySQL服务器断电或者kill mysqld进程后,再操作数据库会发生莫名其妙的重启,我们下面就根据一个案例,来逐步的进行分析和诊断,解决这种常见的问题。

故障现象

从一张表中查询数据时报如下错误



随后整理表时也报同样错误:

mysql> optimize table tablex ;

ERROR 2013 (HY000): Lost connection to MySQL server during query

这意味着无法直接对表进行修复工作。

如果加上limit从表中直接查询数据(select * from tablex limit 10000,1;),limit的值增长至198900时,重现上述错误。


故障初步分析

1.我们首先来看下当时的错误日志:

点击(此处)折叠或打开

  1. The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
  2. information that should help you find out what is causing the crash.
  3. 151203 12:31:23 mysqld_safe Number of processes running now: 0
  4. 151203 12:31:23 mysqld_safe mysqld restarted
  5. 2015-12-03 12:31:24 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
  6. 2015-12-03 12:31:24 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.26-log) starting as process 2207 ...
  7. 2015-12-03 12:31:24 2207 [Note] Plugin 'FEDERATED' is disabled.
  8. 2015-12-03 12:31:24 7ff8243a6720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
  9. 2015-12-03 12:31:24 2207 [Note] InnoDB: Using atomics to ref count buffer pool pages
  10. 2015-12-03 12:31:24 2207 [Note] InnoDB: The InnoDB memory heap is disabled
  11. 2015-12-03 12:31:24 2207 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
  12. 2015-12-03 12:31:24 2207 [Note] InnoDB: Memory barrier is not used
  13. 2015-12-03 12:31:24 2207 [Note] InnoDB: Compressed tables use zlib 1.2.3
  14. 2015-12-03 12:31:24 2207 [Note] InnoDB: Using Linux native AIO
  15. 2015-12-03 12:31:24 2207 [Note] InnoDB: Using CPU crc32 instructions
  16. 2015-12-03 12:31:24 2207 [Note] InnoDB: Initializing buffer pool, size = 2.5G
  17. 2015-12-03 12:31:24 2207 [Note] InnoDB: Completed initialization of buffer pool
  18. 2015-12-03 12:31:24 2207 [Note] InnoDB: Highest supported file format is Barracuda.
  19. 2015-12-03 12:31:24 2207 [Note] InnoDB: The log sequence numbers 20043317101 and 20043317101 in ibdata files do not match the log sequence number 20540489747 in the ib_logfiles!
  20. 2015-12-03 12:31:24 2207 [Note] InnoDB: Database was not shutdown normally!
  21. 2015-12-03 12:31:24 2207 [Note] InnoDB: Starting crash recovery.
  22. 2015-12-03 12:31:24 2207 [Note] InnoDB: Reading tablespace information from the .ibd files...
  23. 2015-12-03 12:31:24 2207 [Note] InnoDB: Restoring possible half-written data pages
  24. 2015-12-03 12:31:24 2207 [Note] InnoDB: from the doublewrite buffer...
  25. InnoDB: Last MySQL binlog file position 0 47771, file name mysql-bin.000036
  26. 2015-12-03 12:31:24 2207 [Note] InnoDB: 128 rollback segment(s) are active.
  27. 2015-12-03 12:31:24 2207 [Note] InnoDB: Waiting for purge to start
  28. 2015-12-03 12:31:24 2207 [Note] InnoDB: 5.6.26 started; log sequence number 20540489747
  29. 2015-12-03 12:31:24 2207 [Note] Recovering after a crash using /data/mysql3306/logs/mysql-bin
  30. 2015-12-03 12:31:24 2207 [Note] Starting crash recovery...
  31. 2015-12-03 12:31:24 2207 [Note] Crash recovery finished.
  32. 2015-12-03 12:31:25 2207 [Note] Server hostname (bind-address): '*'; port: 3306
  33. 2015-12-03 12:31:25 2207 [Note] IPv6 is available.
  34. 2015-12-03 12:31:25 2207 [Note] - '::' resolves to '::';
  35. 2015-12-03 12:31:25 2207 [Note] Server socket created on IP: '::'.
  36. 2015-12-03 12:31:25 2207 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
  37. 2015-12-03 12:31:25 2207 [Note] Event Scheduler: Loaded 0 events
  38. 2015-12-03 12:31:25 2207 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
  39. Version: '5.6.26-log' socket: '/data/mysql3306/mysql.sock' port: 3306 MySQL Community Server (GPL)
  40. 2015-12-03 12:31:25 2207 [Note] Slave I/O thread: connected to master 'replication@192.168.30.218:3306',replication started in log 'mysql-bin.000020' at position 47355
  41. 2015-12-03 12:31:25 2207 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
  42. 2015-12-03 12:31:25 2207 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000020' at position 47355, relay log './mysqld-relay.000103' position: 283
  43. 2015-12-03 14:13:15 7ff723baa700 InnoDB: Assertion failure in thread 140699433084672 in file btr0pcur.cc line 430
  44. InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
  45. InnoDB: We intentionally generate a memory trap.
  46. InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
  47. InnoDB: If you get repeated assertion failures or crashes, even
  48. InnoDB: immediately after the mysqld startup, there may be
  49. InnoDB: corruption in the InnoDB tablespace. Please refer to
  50. InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
  51. InnoDB: about forcing recovery.
  52. 06:13:15 UTC - mysqld got signal 6 ;
  53. This could be because you hit a bug. It is also possible that this binary
  54. or one of the libraries it was linked against is corrupt, improperly built,
  55. or misconfigured. This error can also be caused by malfunctioning hardware.
  56. We will try our best to scrape up some info that will hopefully help
  57. diagnose the problem, but since we have already crashed,
  58. something is definitely wrong and this may fail.
  59. key_buffer_size=8388608
  60. read_buffer_size=131072
  61. max_used_connections=51
  62. max_threads=5000
  63. thread_count=52
  64. connection_count=50
  65. It is possible that mysqld could use up to
  66. key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1992176 K bytes of memory
  67. Hope that's ok; if not, decrease some variables in the equation.
  68. Thread pointer: 0x65d2370
  69. Attempting backtrace. You can use the following information to find out
  70. where mysqld died. If you see no messages after this, something went
  71. terribly wrong...
  72. stack_bottom = 7ff723ba9e10 thread_stack 0x40000
  73. /usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x90d8e5]
  74. /usr/local/mysql/bin/mysqld(handle_fatal_signal+0x3d8)[0x674078]
  75. /lib64/libpthread.so.0(+0xf790)[0x7ff823f89790]
  76. /lib64/libc.so.6(gsignal+0x35)[0x7ff822c34625]
  77. /lib64/libc.so.6(abort+0x175)[0x7ff822c35e05]
  78. /usr/local/mysql/bin/mysqld[0xa29954]
  79. /usr/local/mysql/bin/mysqld[0x9cfd85]
  80. /usr/local/mysql/bin/mysqld[0x931341]
  81. /usr/local/mysql/bin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x64)[0x58c2d4]
  82. /usr/local/mysql/bin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x37)[0x83ebc7]
  83. /usr/local/mysql/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x181)[0x6d2691]
  84. /usr/local/mysql/bin/mysqld(_ZN4JOIN4execEv+0x391)[0x6d04b1]
  85. /usr/local/mysql/bin/mysqld[0x717329]
  86. /usr/local/mysql/bin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x717dec]
  87. /usr/local/mysql/bin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x717ff5]
  88. /usr/local/mysql/bin/mysqld[0x6f26e9]
  89. /usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x3574)[0x6f6f14]
  90. /usr/local/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x338)[0x6faa48]
  91. /usr/local/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xc68)[0x6fbe28]
  92. /usr/local/mysql/bin/mysqld(_Z10do_commandP3THD+0xd7)[0x6fdc37]
  93. /usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x116)[0x6c4bb6]
  94. /usr/local/mysql/bin/mysqld(handle_one_connection+0x45)[0x6c4c95]
  95. /usr/local/mysql/bin/mysqld(pfs_spawn_thread+0x126)[0xaf92f6]
  96. /lib64/libpthread.so.0(+0x7a51)[0x7ff823f81a51]
  97. /lib64/libc.so.6(clone+0x6d)[0x7ff822cea93d]
  98. Trying to get some variables.
  99. Some pointers may be invalid and cause the dump to abort.
  100. Query (7ff6d8004df0): is an invalid pointer
  101. Connection ID (thread ID): 7
  102. Status: NOT_KILLED
  103. The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
  104. information that should help you find out what is causing the crash.
  105. 151203 14:13:15 mysqld_safe Number of processes running now: 0
  106. 151203 14:13:15 mysqld_safe mysqld restarted
  107. 2015-12-03 14:13:16 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
  108. 2015-12-03 14:13:16 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.26-log) starting as process 2481 ...
  109. 2015-12-03 14:13:16 2481 [Note] Plugin 'FEDERATED' is disabled.
  110. 2015-12-03 14:13:16 7fec16233720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
  111. 2015-12-03 14:13:16 2481 [Note] InnoDB: Using atomics to ref count buffer pool pages
  112. 2015-12-03 14:13:16 2481 [Note] InnoDB: The InnoDB memory heap is disabled
  113. 2015-12-03 14:13:16 2481 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
  114. 2015-12-03 14:13:16 2481 [Note] InnoDB: Memory barrier is not used
  115. 2015-12-03 14:13:16 2481 [Note] InnoDB: Compressed tables use zlib 1.2.3
  116. 2015-12-03 14:13:16 2481 [Note] InnoDB: Using Linux native AIO
  117. 2015-12-03 14:13:16 2481 [Note] InnoDB: Using CPU crc32 instructions
  118. 2015-12-03 14:13:16 2481 [Note] InnoDB: Initializing buffer pool, size = 2.5G
  119. 2015-12-03 14:13:17 2481 [Note] InnoDB: Completed initialization of buffer pool
  120. 2015-12-03 14:13:17 2481 [Note] InnoDB: Highest supported file format is Barracuda.
  121. 2015-12-03 14:13:17 2481 [Note] InnoDB: The log sequence numbers 20043317101 and 20043317101 in ibdata files do not match the log sequence number 20612097468 in the ib_logfiles!
  122. 2015-12-03 14:13:17 2481 [Note] InnoDB: Database was not shutdown normally!
  123. 2015-12-03 14:13:17 2481 [Note] InnoDB: Starting crash recovery.
  124. 2015-12-03 14:13:17 2481 [Note] InnoDB: Reading tablespace information from the .ibd files...
  125. 2015-12-03 14:13:17 2481 [Note] InnoDB: Restoring possible half-written data pages
  126. 2015-12-03 14:13:17 2481 [Note] InnoDB: from the doublewrite buffer...
  127. InnoDB: Last MySQL binlog file position 0 366526, file name mysql-bin.000037
  128. 2015-12-03 14:13:17 2481 [Note] InnoDB: 128 rollback segment(s) are active.
  129. 2015-12-03 14:13:17 2481 [Note] InnoDB: Waiting for purge to start
  130. 2015-12-03 14:13:17 2481 [Note] InnoDB: 5.6.26 started; log sequence number 20612097468
  131. 2015-12-03 14:13:17 2481 [Note] Recovering after a crash using /data/mysql3306/logs/mysql-bin
  132. 2015-12-03 14:13:17 2481 [Note] Starting crash recovery...
  133. 2015-12-03 14:13:17 2481 [Note] Crash recovery finished.
  134. 2015-12-03 14:13:17 2481 [Note] Server hostname (bind-address): '*'; port: 3306
  135. 2015-12-03 14:13:17 2481 [Note] IPv6 is available.
  136. 2015-12-03 14:13:17 2481 [Note] - '::' resolves to '::';
  137. 2015-12-03 14:13:17 2481 [Note] Server socket created on IP: '::'.
  138. 2015-12-03 14:13:17 2481 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
  139. 2015-12-03 14:13:17 2481 [Note] Event Scheduler: Loaded 0 events
  140. 2015-12-03 14:13:17 2481 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
  141. Version: '5.6.26-log' socket: '/data/mysql3306/mysql.sock' port: 3306 MySQL Community Server (GPL)
  142. 2015-12-03 14:13:17 2481 [Note] Slave I/O thread: connected to master 'replication@192.168.30.218:3306',replication started in log 'mysql-bin.000020' at position 409891
  143. 2015-12-03 14:13:17 2481 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
  144. 2015-12-03 14:13:17 2481 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000020' at position 409891, relay log './mysqld-relay.000105' position: 283

主要注意其中几行:

InnoDB: Assertion failure in thread 140699433084672 in file btr0pcur.cc line 430
InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
InnoDB: We intentionally generate a memory trap.

我们只需要简单的了解一些源码相关的知识,就可以判断此类的问题。

相关知识

MySQL源码目录结构

源码目录结构请参考彭立勋的PPT
http://ofslides.com/plinux-6880/presentation-36798

故障深入分析

大致了解了mysql源码的目录结构,我们再来看报错的那个文件:btr0pcur.cc。

注释显示这是个和索引树的游标相关的文件。


430行所处的函数是btr_pcur_move_to_next_page()。这个函数的功能是: 移动持续的游标到下一条记录。

这时我们就不难理解为什么服务器掉电后,读取数据会发生崩溃。就是因为游标读取索引时遇到了损坏的数据块。

故障排除

确定了问题所在之后,我们就可以开始修复了。

1.在my.cnf中[mysqld]下添加innodb_force_recovery=1,然后重启mysql服务。

2.尝试使用mysqldump备份所有数据。如果备份失败,逐个增加innodb_force_recovery的值(最大为6)后再尝试备份。

3.将备份的数据在别处还原,并对还原的数据进行校验。

4.原mysql实例建议重建数据目录。因为服务器掉电不仅会损坏ibd文件,共享表空间ibdata和日志也可能有损坏。

注:

也可以使用创建MyISAM表的方式导出损坏的表的数据。但mysqldump是首选。

CREATE TABLE new_table ... ENGINE=MYISAM;

INSERT (IGNORE) INTO new_table SELECT * FROM old_table;

故障总结

1.要定期对数据进行备份

2.缓存的电池也要做好保养

3.不要轻易对mysqld进程做kill操作

网友评论

登录后评论
0/500
评论