SSD hang caused by NCQ: failed command: WRITE FPDMA QUEUED / tag 28 ncq 4096 out

德哥 2016-03-31 15:46:33

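While measuring IOPS on a newly purchased LITE-ON (建兴) ZETA 256G SSD under CentOS 7.2 with pg_test_fsync, the fsync test tool that ships with PostgreSQL, IO suddenly hung.
dmesg reported a series of timeouts like these: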
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  895.604149] ata1.00: status: { DRDY }
[  895.606940] ata1.00: failed command: WRITE FPDMA QUEUED
[  895.609389] ata1.00: cmd 61/08:e0:38:bd:06/00:00:00:00:00/40 tag 28 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  895.614144] ata1.00: status: { DRDY }
[  895.616516] ata1.00: failed command: WRITE FPDMA QUEUED
[  895.618665] ata1.00: cmd 61/10:e8:00:90:06/02:00:00:00:00/40 tag 29 ncq 270336 out
         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  895.622940] ata1.00: status: { DRDY }
[  895.625089] ata1.00: failed command: WRITE FPDMA QUEUED
[  895.627236] ata1.00: cmd 61/00:f0:00:8c:06/04:00:00:00:00/40 tag 30 ncq 524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  895.631176] ata1.00: status: { DRDY }
[  895.633133] ata1: hard resetting link
[  895.937682] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  895.940816] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[  895.940830] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[  895.941234] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[  895.941243] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[  895.941314] ata1.00: configured for UDMA/133
[  895.941356] ata1.00: device reported invalid CHS sector 0
[  895.941362] ata1.00: device reported invalid CHS sector 0
[  895.941366] ata1.00: device reported invalid CHS sector 0
[  895.941369] ata1.00: device reported invalid CHS sector 0
[  895.941374] ata1.00: device reported invalid CHS sector 0
[  895.941377] ata1.00: device reported invalid CHS sector 0
[  895.941381] ata1.00: device reported invalid CHS sector 0
[  895.941384] ata1.00: device reported invalid CHS sector 0
[  895.941388] ata1.00: device reported invalid CHS sector 0
[  895.941392] ata1.00: device reported invalid CHS sector 0
[  895.941395] ata1.00: device reported invalid CHS sector 0
[  895.941399] ata1.00: device reported invalid CHS sector 0
[  895.941403] ata1.00: device reported invalid CHS sector 0
[  895.941408] ata1.00: device reported invalid CHS sector 0
[  895.941434] ata1: EH complete
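
The byte count printed after `ncq` on each `cmd` line can be cross-checked against the register dump on the same line. The sketch below assumes the usual libata layout (`command/feature:nsect:lbal:lbam:lbah/hob_feature:.../device`), that for NCQ commands the sector count is carried in the feature registers and the tag in bits 7:3 of `nsect`, and a 512-byte logical sector. `decode_ncq_cmd` is a hypothetical helper name, not part of any tool.

```shell
# Decode the tag and transfer size from a libata "cmd" register dump.
# Assumption: 512-byte logical sectors.
decode_ncq_cmd() {
    tf=$1
    rest=${tf#*/}              # drop the command byte (61 = WRITE FPDMA QUEUED)
    low=${rest%%/*}            # feature:nsect:lbal:lbam:lbah
    rest=${rest#*/}
    high=${rest%%/*}           # hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah
    feature=${low%%:*}         # low byte of the sector count
    nsect=${low#*:}
    nsect=${nsect%%:*}         # NCQ tag lives in bits 7:3
    hob_feature=${high%%:*}    # high byte of the sector count
    sectors=$(( (0x$hob_feature << 8) | 0x$feature ))
    tag=$(( 0x$nsect >> 3 ))
    echo "tag=$tag sectors=$sectors bytes=$(( sectors * 512 ))"
}
```

For the first failed command in the log, `decode_ncq_cmd 61/08:e0:38:bd:06/00:00:00:00:00/40` prints `tag=28 sectors=8 bytes=4096`, which agrees with `tag 28 ncq 4096 out`.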


The symptoms match reports found online; many SSDs have this problem.
https://bugzilla.kernel.org/show_bug.cgi?id=15573
https://communities.intel.com/thread/77801?start=0&tstart=0
http://www.cnblogs.com/welhzh/p/4469206.html
http://patchwork.ozlabs.org/patch/49365/
The recommended fix is to disable NCQ.

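What is NCQ?
http://baike.baidu.com/view/17501.htm
NCQ (Native Command Queuing) is a technology designed to improve drive performance and stability under growing load. When applications send multiple commands to the drive, an NCQ-capable drive can reorder them to reduce mechanical work and improve performance. In other words, the drive optimizes the execution order of its internal command queue, easing the performance limits imposed by its mechanical parts.
That description targets mechanical drives. An SSD can still use the queue to work on several requests at once, so disabling NCQ may cost some throughput, but it is a reasonable workaround for an SSD that hangs like this one.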

Checking dmesg shows that NCQ is enabled on the controller:
# dmesg | grep ncq

[    4.157792] ahci 0000:00:1f.2: flags: 64bit ncq sntf ilck pm led clo pio slum part ems apst 


Solution:
Disable NCQ by adding libata.force=noncq to the kernel boot parameters.
[root@digoal ahci]# vi /etc/default/grub 

GRUB_CMDLINE_LINUX="rhgb quiet libata.force=noncq"

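Regenerate the GRUB configuration (grub2-mkconfig -o /boot/grub2/grub.cfg on a BIOS-booted CentOS 7 system), then reboot.
Alternatively, edit /boot/grub2/grub.cfg directly and append the following after rhgb quiet:
libata.force=noncq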
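
After the reboot it is worth confirming that the parameter actually reached the kernel. A minimal sketch, assuming the running kernel's command line is readable from /proc/cmdline; `has_noncq` is a hypothetical helper name, and the optional file argument exists only so the check can be tried against a sample file.

```shell
# Return success if the kernel command line contains a libata.force
# option that includes noncq.
has_noncq() {
    cmdline_file=${1:-/proc/cmdline}
    for word in $(cat "$cmdline_file"); do
        case $word in
            libata.force=*noncq*) return 0 ;;
        esac
    done
    return 1
}
```

Running `has_noncq && echo "noncq is active on the command line"` after boot gives a quick yes/no answer. It only checks the command line, not whether the drive itself stopped queuing.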


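(What if the machine has both mechanical disks and SSDs?)
(The mechanical disks benefit from NCQ, while the SSD should run without it.)
(A global libata.force=noncq does not distinguish between them; one approach is to patch libata so that it handles specific drive models.)
Alternatively, set a different queue_depth for each disk; a queue_depth of 1 is effectively the same as disabling NCQ.
For example, put the following in a boot-time script (/etc/conf.d/local.start on Gentoo; /etc/rc.d/rc.local is the CentOS 7 counterpart):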
echo 1 > /sys/block/sdX/device/queue_depth 
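
On a machine with mixed disks, the same write can be limited to SSDs. The sketch below assumes that the kernel reports SSDs as non-rotational in /sys/block/sdX/queue/rotational (value 0), which depends on the drive identifying itself correctly. `set_ssd_queue_depth` is a hypothetical helper name, and the optional root argument is only there so the function can be tried against a scratch directory instead of the real /sys.

```shell
# Set queue_depth to 1 for every sd* device that reports itself as
# non-rotational; rotational disks are left untouched.
set_ssd_queue_depth() {
    sysroot=${1:-/sys}
    for dev in "$sysroot"/block/sd*; do
        # Skip entries without a rotational flag (also covers an unmatched glob).
        [ -e "$dev/queue/rotational" ] || continue
        if [ "$(cat "$dev/queue/rotational")" = "0" ] && [ -w "$dev/device/queue_depth" ]; then
            echo 1 > "$dev/device/queue_depth"
            echo "${dev##*/}: queue_depth set to 1"
        fi
    done
}
```

Calling `set_ssd_queue_depth` with no argument operates on the real /sys and needs root. The setting does not survive a reboot, so it would have to run from the boot-time script.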


An explanation of libata.force=noncq
Start with the libata module information:
[root@digoal ~]# modinfo libata
filename:       /lib/modules/3.10.0-327.el7.x86_64/kernel/drivers/ata/libata.ko
version:        3.00
license:        GPL
description:    Library module for ATA devices
author:         Jeff Garzik
rhelversion:    7.2
srcversion:     042B7B276FD3988FFBEFB88
depends:        
intree:         Y
vermagic:       3.10.0-327.el7.x86_64 SMP mod_unload modversions 
signer:         CentOS Linux kernel signing key
sig_key:        79:AD:88:6A:11:3C:A0:22:35:26:33:6C:0F:82:5B:8A:94:29:6A:B3
sig_hashalgo:   sha256
parm:           acpi_gtf_filter:filter mask for ACPI _GTF commands, set to filter out (0x1=set xfermode, 0x2=lock/freeze lock, 0x4=DIPM, 0x8=FPDMA non-zero offset, 0x10=FPDMA DMA Setup FIS auto-activate) (int)
parm:           force:Force ATA configurations including cable type, link speed and transfer mode (see Documentation/kernel-parameters.txt for details) (string)
parm:           atapi_enabled:Enable discovery of ATAPI devices (0=off, 1=on [default]) (int)
parm:           atapi_dmadir:Enable ATAPI DMADIR bridge support (0=off [default], 1=on) (int)
parm:           atapi_passthru16:Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default]) (int)
parm:           fua:FUA support (0=off [default], 1=on) (int)
parm:           ignore_hpa:Ignore HPA limit (0=keep BIOS limits, 1=ignore limits, using full disk) (int)
parm:           dma:DMA enable/disable (0x1==ATA, 0x2==ATAPI, 0x4==CF) (int)
parm:           ata_probe_timeout:Set ATA probing timeout (seconds) (int)
parm:           noacpi:Disable the use of ACPI in probe/suspend/resume (0=off [default], 1=on) (int)
parm:           allow_tpm:Permit the use of TPM commands (0=off [default], 1=on) (int)
parm:           atapi_an:Enable ATAPI AN media presence notification (0=0ff [default], 1=on) (int)


There is a force parameter, and its description points to the kernel documentation.
[root@digoal ~]# less /usr/share/doc/kernel-doc-3.10.0/Documentation/kernel-parameters.txt

The corresponding explanation is there:
        libata.force=   [LIBATA] Force configurations.  The format is comma
                        separated list of "[ID:]VAL" where ID is
                        PORT[.DEVICE].  PORT and DEVICE are decimal numbers
                        matching port, link or device.  Basically, it matches
                        the ATA ID string printed on console by libata.  If
                        the whole ID part is omitted, the last PORT and DEVICE
                        values are used.  If ID hasn't been specified yet, the
                        configuration applies to all ports, links and devices.

                        If only DEVICE is omitted, the parameter applies to
                        the port and all links and devices behind it.  DEVICE
                        number of 0 either selects the first device or the
                        first fan-out link behind PMP device.  It does not
                        select the host link.  DEVICE number of 15 selects the
                        host link and device attached to it.

                        The VAL specifies the configuration to force.  As long
                        as there's no ambiguity shortcut notation is allowed.
                        For example, both 1.5 and 1.5G would work for 1.5Gbps.
                        The following configurations can be forced.

                        * Cable type: 40c, 80c, short40c, unk, ign or sata.
                          Any ID with matching PORT is used.

                        * SATA link speed limit: 1.5Gbps or 3.0Gbps.

                        * Transfer mode: pio[0-7], mwdma[0-4] and udma[0-7].
                          udma[/][16,25,33,44,66,100,133] notation is also
                          allowed.

                        * [no]ncq: Turn on or off NCQ.  # the part relevant to this post

                        * nohrst, nosrst, norst: suppress hard, soft
                          and both resets.

                        * rstonce: only attempt one reset during
                          hot-unplug link recovery

                        * dump_id: dump IDENTIFY data.

                        * atapi_dmadir: Enable ATAPI DMADIR bridge support

                        * disable: Disable this device.

                        If there are multiple matching configurations changing
                        the same attribute, the last one is used.
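
Going by the "[ID:]VAL" format described above, NCQ could in principle be turned off for a single device rather than globally. The fragment below is a sketch that was not tested in this post; it assumes the SSD is the device libata prints as ata1.00, as in the dmesg output earlier, and the ID would need adjusting on another machine.

```shell
# /etc/default/grub: disable NCQ only for port 1, device 00 (ata1.00).
# Other ports keep NCQ enabled.
GRUB_CMDLINE_LINUX="rhgb quiet libata.force=1.00:noncq"
```

As with the global form, the GRUB configuration has to be regenerated and the machine rebooted before the setting takes effect.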

The module parameters can also be viewed here:
[root@digoal ~]# cd /sys/module/libata/parameters/
[root@digoal parameters]# ll
total 0
-rw-r--r-- 1 root root 4096 Dec 20 21:17 acpi_gtf_filter
-r--r--r-- 1 root root 4096 Dec 20 21:17 allow_tpm
-r--r--r-- 1 root root 4096 Dec 20 21:17 atapi_an
-r--r--r-- 1 root root 4096 Dec 20 21:17 atapi_dmadir
-r--r--r-- 1 root root 4096 Dec 20 21:17 atapi_enabled
-r--r--r-- 1 root root 4096 Dec 20 21:17 atapi_passthru16
-r--r--r-- 1 root root 4096 Dec 20 21:17 ata_probe_timeout
-r--r--r-- 1 root root 4096 Dec 20 21:17 dma
-r--r--r-- 1 root root 4096 Dec 20 21:17 fua
-rw-r--r-- 1 root root 4096 Dec 20 21:17 ignore_hpa
-r--r--r-- 1 root root 4096 Dec 20 21:17 noacpi
