ZFS fsync IOPS performance in FreeBSD

Introduction:
In the previous article I covered ZFS performance tuning.
That article pointed out that on Linux (CentOS 6.5 x64) ZFS's fsync performance is poor and falls well short of ext4, so I installed FreeBSD 10 x64 on the same host and tested fsync performance on the same hardware.
For the PostgreSQL installation, see the earlier reference.
First, look at the block devices. Twelve 4 TB SATA disks are used here.
# gpart list -a
Geom name: mfid[1-12]
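
The mfid devices are LUNs exposed by the LSI MegaRAID (mfi) driver. As a side note (these commands were not part of the original run), standard FreeBSD tools can confirm what sits behind them: geom disk list shows every disk provider with its media size, mfiutil show drives lists the physical drives behind the RAID controller, and camcontrol devlist shows devices visible to the CAM layer.
# geom disk list
# mfiutil show drives
# camcontrol devlist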

Create the zpool:
# zpool create zp1 mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 mfid7 mfid8 mfid9 mfid10 mfid11 mfid12
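
Note that this builds a plain 12-disk stripe with no redundancy, which is fine for a throwaway benchmark but not for real data. For production, a layout such as two 6-disk raidz2 vdevs would be more typical; a possible variant (not used in this test) would be:
# zpool create zp1 raidz2 mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 raidz2 mfid7 mfid8 mfid9 mfid10 mfid11 mfid12
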
# zpool get all zp1
NAME  PROPERTY                       VALUE                          SOURCE
zp1   size                           43.5T                          -
zp1   capacity                       0%                             -
zp1   altroot                        -                              default
zp1   health                         ONLINE                         -
zp1   guid                           8490038421326880416            default
zp1   version                        -                              default
zp1   bootfs                         -                              default
zp1   delegation                     on                             default
zp1   autoreplace                    off                            default
zp1   cachefile                      -                              default
zp1   failmode                       wait                           default
zp1   listsnapshots                  off                            default
zp1   autoexpand                     off                            default
zp1   dedupditto                     0                              default
zp1   dedupratio                     1.00x                          -
zp1   free                           43.5T                          -
zp1   allocated                      285K                           -
zp1   readonly                       off                            -
zp1   comment                        -                              default
zp1   expandsize                     0                              -
zp1   freeing                        0                              default
zp1   feature@async_destroy          enabled                        local
zp1   feature@empty_bpobj            active                         local
zp1   feature@lz4_compress           enabled                        local
zp1   feature@multi_vdev_crash_dump  enabled                        local
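
Instead of dumping every property, the vdev layout and health can also be checked with zpool status, and individual properties queried directly, e.g.:
# zpool status zp1
# zpool get size,capacity,health zp1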

Create the ZFS dataset:
# zfs create -o mountpoint=/data01 -o atime=off zp1/data01
# zfs get all zp1/data01
NAME        PROPERTY              VALUE                  SOURCE
zp1/data01  type                  filesystem             -
zp1/data01  creation              Thu Jun 26 23:52 2014  -
zp1/data01  used                  32K                    -
zp1/data01  available             42.8T                  -
zp1/data01  referenced            32K                    -
zp1/data01  compressratio         1.00x                  -
zp1/data01  mounted               yes                    -
zp1/data01  quota                 none                   default
zp1/data01  reservation           none                   default
zp1/data01  recordsize            128K                   default
zp1/data01  mountpoint            /data01                local
zp1/data01  sharenfs              off                    default
zp1/data01  checksum              on                     default
zp1/data01  compression           off                    default
zp1/data01  atime                 off                    local
zp1/data01  devices               on                     default
zp1/data01  exec                  on                     default
zp1/data01  setuid                on                     default
zp1/data01  readonly              off                    default
zp1/data01  jailed                off                    default
zp1/data01  snapdir               hidden                 default
zp1/data01  aclmode               discard                default
zp1/data01  aclinherit            restricted             default
zp1/data01  canmount              on                     default
zp1/data01  xattr                 off                    temporary
zp1/data01  copies                1                      default
zp1/data01  version               5                      -
zp1/data01  utf8only              off                    -
zp1/data01  normalization         none                   -
zp1/data01  casesensitivity       sensitive              -
zp1/data01  vscan                 off                    default
zp1/data01  nbmand                off                    default
zp1/data01  sharesmb              off                    default
zp1/data01  refquota              none                   default
zp1/data01  refreservation        none                   default
zp1/data01  primarycache          all                    default
zp1/data01  secondarycache        all                    default
zp1/data01  usedbysnapshots       0                      -
zp1/data01  usedbydataset         32K                    -
zp1/data01  usedbychildren        0                      -
zp1/data01  usedbyrefreservation  0                      -
zp1/data01  logbias               latency                default
zp1/data01  dedup                 off                    default
zp1/data01  mlslabel                                     -
zp1/data01  sync                  disabled               local
zp1/data01  refcompressratio      1.00x                  -
zp1/data01  written               32K                    -
zp1/data01  logicalused           16K                    -
zp1/data01  logicalreferenced     16K                    -
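
The dataset is left at the default recordsize of 128K with compression off. For a PostgreSQL data directory it is common (though not done in this test) to match the recordsize to the 8 kB database page and enable lz4 compression, which the lz4_compress feature shown above already allows:
# zfs set recordsize=8K zp1/data01
# zfs set compression=lz4 zp1/data01
# zfs get recordsize,compression zp1/data01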

Test fsync. Compared with Linux this is a big improvement, essentially reaching the bottleneck of the block devices.
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /data01/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                            6676.001 ops/sec     150 usecs/op
        fsync_writethrough                            n/a
        open_sync                        6087.783 ops/sec     164 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                            4750.841 ops/sec     210 usecs/op
        fsync_writethrough                            n/a
        open_sync                        3065.099 ops/sec     326 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        4965.249 ops/sec     201 usecs/op
         2 *  8kB open_sync writes       3039.074 ops/sec     329 usecs/op
         4 *  4kB open_sync writes       1598.735 ops/sec     625 usecs/op
         8 *  2kB open_sync writes       1326.517 ops/sec     754 usecs/op
        16 *  1kB open_sync writes        620.992 ops/sec    1610 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close              5422.742 ops/sec     184 usecs/op
        write, close, fsync              5552.278 ops/sec     180 usecs/op

Non-Sync'ed 8kB writes:
        write                           67460.621 ops/sec      15 usecs/op
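
pg_test_fsync reports fdatasync and open_datasync as n/a on this platform, so the realistic wal_sync_method choices for PostgreSQL here are fsync or open_sync, and the numbers above favour fsync. A corresponding postgresql.conf snippet (illustrative only, not the configuration used in this test):
wal_sync_method = fsync
fsync = on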

# zpool iostat -v 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zp1          747M  43.5T      0  7.31K      0  39.1M
  mfid1     62.8M  3.62T      0    638      0  3.27M
  mfid2     61.9M  3.62T      0    615      0  3.23M
  mfid3     62.8M  3.62T      0    615      0  3.23M
  mfid4     62.0M  3.62T      0    615      0  3.23M
  mfid5     62.9M  3.62T      0    616      0  3.24M
  mfid6     62.0M  3.62T      0    616      0  3.24M
  mfid7     62.9M  3.62T      0    620      0  3.24M
  mfid8     61.6M  3.62T      0    620      0  3.24M
  mfid9     62.2M  3.62T      0    619      0  3.23M
  mfid10    61.8M  3.62T      0    615      0  3.23M
  mfid11    62.2M  3.62T      0    648      0  3.41M
  mfid12    62.1M  3.62T      0    650      0  3.29M
----------  -----  -----  -----  -----  -----  -----
zroot       2.69G   273G      0      0      0      0
  mfid0p3   2.69G   273G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----

# iostat -x 1
                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
mfid0      0.0   0.0     0.0     0.0    0   0.0   0 
mfid1      0.0 416.6     0.0  7468.5    0   0.1   2 
mfid2      0.0 416.6     0.0  7468.5    0   0.0   2 
mfid3      0.0 429.6     0.0  7480.0    0   0.1   2 
mfid4      0.0 433.6     0.0  7484.0    0   0.1   3 
mfid5      0.0 433.6     0.0  7495.9    0   0.1   2 
mfid6      0.0 421.6     0.0  7484.5    0   0.1   3 
mfid7      0.0 417.6     0.0  7488.5    0   0.1   3 
mfid8      0.0 438.6     0.0  7638.3    0   0.1   2 
mfid9      0.0 437.6     0.0  7510.4    0   0.1   2 
mfid10     0.0 428.6     0.0  7494.4    0   0.1   4 
mfid11     0.0 416.6     0.0  7468.5    0   0.1   2 
mfid12     0.0 416.6     0.0  7468.5    0   0.1   2
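
Besides zpool iostat and iostat, FreeBSD's gstat shows per-provider latency and busy percentage, which helps spot a single slow disk; for example:
# gstat -f mfid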


With sync disabled, FreeBSD and Linux perform about the same.
# zfs set sync=disabled zp1/data01
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /data01/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                           115687.300 ops/sec       9 usecs/op
        fsync_writethrough                            n/a
        open_sync                       126789.698 ops/sec       8 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                                 n/a
        fdatasync                                     n/a
        fsync                           65027.801 ops/sec      15 usecs/op
        fsync_writethrough                            n/a
        open_sync                       60239.232 ops/sec      17 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       115246.114 ops/sec       9 usecs/op
         2 *  8kB open_sync writes      63999.355 ops/sec      16 usecs/op
         4 *  4kB open_sync writes      33661.426 ops/sec      30 usecs/op
         8 *  2kB open_sync writes      18960.527 ops/sec      53 usecs/op
        16 *  1kB open_sync writes       8251.087 ops/sec     121 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             47380.701 ops/sec      21 usecs/op
        write, close, fsync             50214.128 ops/sec      20 usecs/op

Non-Sync'ed 8kB writes:
        write                           78263.057 ops/sec      13 usecs/op
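
A caution on sync=disabled: it makes ZFS ignore fsync entirely, so a crash or power loss can throw away the last few seconds of transactions that PostgreSQL believed were committed. It is only appropriate for benchmarks or disposable data; to restore the default behaviour:
# zfs set sync=standard zp1/data01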
