Linux File Synchronization Tools: Real-Time Sync with inotify + rsync

This article is adapted from: http://ixdba.blog.51cto.com/2895551/580280

An earlier article covered how to synchronize files with rsync, but rsync has some shortcomings:

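I. Strengths and Limitations of rsync
 Compared with traditional backup methods such as cp and tar, rsync is more secure, faster, and supports incremental backups. It handles backup jobs that do not need to be real-time, such as periodically backing up file-server data to a remote server or periodically mirroring a local disk.
 As application systems grow and place higher demands on data safety and reliability, rsync shows several weaknesses in high-end business systems. First, to synchronize data rsync must scan every file and compare it before transferring the differences. When the file count reaches millions or tens of millions, that scan is very time-consuming, and because usually only a small fraction of the files has changed, it is also very inefficient. Second, rsync cannot monitor and synchronize data in real time. It can be triggered periodically by a Linux daemon, but there is always a gap between two runs, so the server and the client may hold inconsistent data and the data cannot be fully recovered after an application failure. These limitations led to the rsync+inotify combination.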


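II. Introducing inotify
 inotify is a powerful, fine-grained, asynchronous mechanism for monitoring file system events. The Linux kernel has supported inotify since version 2.6.13. Through it you can watch for file creation, deletion, modification, moves, and other fine-grained events in a file system. Third-party software can use this kernel interface to track changes to files, and inotify-tools is one such piece of software.
As noted in the previous section, rsync can be triggered to synchronize files, but when the trigger is a crontab job the synchronized copy lags behind the real data. inotify can watch the file system for changes and trigger rsync whenever a file changes, which solves the real-time problem.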


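III. Installing inotify-tools
 Because inotify depends on kernel support, first confirm that the Linux kernel is version 2.6.13 or later before installing inotify-tools. If the kernel is older than 2.6.13, it must be rebuilt with inotify support. You can also check whether the kernel supports inotify as follows: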
[root@localhost webdata]# uname -r
2.6.18-164.11.1.el5PAE
[root@localhost webdata]# ll /proc/sys/fs/inotify
total 0
-rw-r--r-- 1 root root 0 04-13 19:56 max_queued_events
-rw-r--r-- 1 root root 0 04-13 19:56 max_user_instances
-rw-r--r-- 1 root root 0 04-13 19:56 max_user_watches
If these three entries are listed, the system supports inotify by default and you can go on to install inotify-tools.
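The manual check above can be wrapped in a small script. The following is a minimal sketch, not part of the original article: the function names version_field, version_ge, and inotify_supported are hypothetical helpers, and only the first three numeric fields of the kernel release string are compared.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative, not part of inotify-tools.

# Print the numeric part of field $2 (1-based, dot-separated) of version $1.
# ".0.0" is appended so short versions such as "3" still have three fields.
version_field() {
    printf '%s.0.0\n' "$1" | cut -d. -f"$2" | sed 's/[^0-9].*$//'
}

# Succeed when version $1 >= version $2, comparing major, minor, patch.
version_ge() {
    for i in 1 2 3; do
        a=$(version_field "$1" "$i")
        b=$(version_field "$2" "$i")
        if [ "${a:-0}" -gt "${b:-0}" ]; then return 0; fi
        if [ "${a:-0}" -lt "${b:-0}" ]; then return 1; fi
    done
    return 0
}

# inotify needs kernel 2.6.13 or later and the /proc interface listed above.
inotify_supported() {
    version_ge "$(uname -r)" 2.6.13 && [ -d /proc/sys/fs/inotify ]
}

if inotify_supported; then
    echo "inotify is available"
else
    echo "inotify is not available on this kernel"
fi
```

Suffixes such as "-164.11.1.el5PAE" are ignored because each field is cut off at its first non-digit character.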


Download the appropriate inotify-tools release from http://inotify-tools.sourceforge.net/, then build and install it:
[root@localhost  ~]# tar zxvf inotify-tools-3.14.tar.gz 
[root@localhost  ~]# cd inotify-tools-3.14
[root@localhost inotify-tools-3.14]# ./configure
[root@localhost inotify-tools-3.14]# make
[root@localhost inotify-tools-3.14]# make install
[root@localhost inotify-tools-3.14]# ll /usr/local/bin/inotifywa*
-rwxr-xr-x 1 root root 37264 04-14 13:42 /usr/local/bin/inotifywait
-rwxr-xr-x 1 root root 35438 04-14 13:42 /usr/local/bin/inotifywatch
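Once installed, inotify-tools provides two commands, inotifywait and inotifywatch. inotifywait waits for a specific event on a file or set of files; it can watch any set of files and directories and can recursively watch an entire directory tree.
inotifywatch collects statistics about the watched file system, such as how many times each inotify event occurred.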


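IV. inotify Kernel Parameters
inotify exposes the following parameters, which limit how much kernel memory inotify may consume. Because they are kernel parameters exposed under /proc, they can be adjusted at run time to suit the application. Each is described briefly below.
    /proc/sys/fs/inotify/max_queued_events: the maximum number of events that can be queued in an inotify instance, allocated when inotify_init is called. Events beyond this limit are dropped, but an IN_Q_OVERFLOW event is generated.
    /proc/sys/fs/inotify/max_user_instances: the maximum number of inotify instances that each real user ID may create.
    /proc/sys/fs/inotify/max_user_watches: the maximum number of watches that each real user ID may create. A recursive inotifywait uses one watch per directory, so raise this value as needed when monitoring a very large tree, for example: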

echo 30000000 > /proc/sys/fs/inotify/max_user_watches


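V. inotifywait Options
inotifywait is a command that waits for file system events and works well in shell scripts. Commonly used options:
 -m, i.e. --monitor: keep listening for events instead of exiting after the first one.
 -r, i.e. --recursive: watch directories recursively.
 -q, i.e. --quiet: print less; only event information is output.
 -e, i.e. --event: select the events to watch for; common ones are modify, delete, create, and attrib.
For full details, see man inotifywait: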

NAME
       inotifywait - wait for changes to files using inotify


SYNOPSIS
       inotifywait [-hcmrq] [-e <event> ] [-t <seconds> ] [--format <fmt> ] [--timefmt <fmt> ] <file> [ ... ]


DESCRIPTION
       inotifywait efficiently waits for changes to files using Linux's inotify(7) interface.  It is suitable
       for waiting for changes to files from shell scripts.  It can either exit once an event occurs, or con-
       tinually execute and output events as they occur.


OUTPUT
       inotifywait will output diagnostic information on standard error and  event  information  on  standard
       output.   The  event  output  can  be configured, but by default it consists of lines of the following
       form:


       watched_filename EVENT_NAMES event_filename


       watched_filename
              is the name of the file on which the event occurred.  If the file is a  directory,  a  trailing
              slash is output.


       EVENT_NAMES
              are the names of the inotify events which occurred, separated by commas.


       event_filename
              is  output  only  when the event occurred on a directory, and in this case the name of the file
              within the directory which caused this event is output.


              By default, any special characters in filenames are not escaped in any way.  This can make  the
              output  of  inotifywait  difficult  to parse in awk scripts or similar.  The --csv and --format
              options will be helpful in this case.


OPTIONS
       -h, --help
              Output some helpful usage information.


       @<file>
              When watching a directory tree recursively, exclude the specified file from being watched.  The
              file  must  be  specified  with  a relative or absolute path according to whether a relative or
              absolute path is given for watched directories.  If a specific path is explicitly both included
              and excluded, it will always be watched.


              Note:  If  you  need  to  watch a directory or file whose name starts with @, give the absolute
              path.


       --fromfile <file>
              Read filenames to watch or exclude from a file, one filename per line.  If filenames begin with
              @  they  are  excluded  as described above.  If <file> is '-', filenames are read from standard
              input.  Use this option if you need to watch too many files to pass in as  command  line  argu-
              ments.


       -m, --monitor
              Instead of exiting after receiving a single event, execute indefinitely.  The default behaviour
              is to exit after the first event occurs.


       -d, --daemon
              Same as --monitor, except run in the background logging events to a file that must be specified
              by --outfile. Implies --syslog.


       -o, --outfile <file>
              Output events to <file> rather than stdout.


       -s, --syslog
              Output errors to syslog(3) system log module rather than stderr.


       -r, --recursive
              Watch all subdirectories of any directories passed as arguments.  Watches will be set up recur-
              sively to an unlimited depth.  Symbolic links are not traversed.  Newly created  subdirectories
              will also be watched.


              Warning:  If you use this option while watching the root directory of a large tree, it may take
              quite a while until all inotify watches are established, and events will  not  be  received  in
              this  time.  Also, since one inotify watch will be established per subdirectory, it is possible
              that the maximum amount of inotify watches per user will be reached.  The  default  maximum  is
              8192; it can be increased by writing to /proc/sys/fs/inotify/max_user_watches.


       -q, --quiet
              If  specified  once, the program will be less verbose.  Specifically, it will not state when it
              has completed establishing all inotify watches.


              If specified twice, the program will output nothing at all, except in the case of fatal errors.


       --exclude <pattern>
              Do  not  process any events whose filename matches the specified POSIX extended regular expres-
              sion, case sensitive.


       --excludei <pattern>
              Do not process any events whose filename matches the specified POSIX extended  regular  expres-
              sion, case insensitive.


       -t <seconds>, --timeout <seconds>
              Exit  if  an  appropriate event has not occurred within <seconds> seconds. If <seconds> is zero
              (the default), wait indefinitely for an event.


       -e <event>, --event <event>
              Listen for specific event(s) only.  The events which can be listened  for  are  listed  in  the
              EVENTS  section.  This option can be specified more than once.  If omitted, all events are lis-
              tened for.


       -c, --csv
              Output in CSV (comma-separated values) format.  This  is  useful  when  filenames  may  contain
              spaces, since in this case it is not safe to simply split the output at each space character.


       --timefmt <fmt>
              Set  a  time  format  string as accepted by strftime(3) for use with the '%T' conversion in the
              --format option.


       --format <fmt>
              Output in a user-specified format, using printf-like syntax.  The event strings output are lim-
              ited to around 4000 characters and will be truncated to this length.  The following conversions
              are supported:


       %w     This will be replaced with the name of the Watched file on which an event occurred.


       %f     When an event occurs within a directory, this will be replaced with the name of the File  which
              caused the event to occur.  Otherwise, this will be replaced with an empty string.


       %e     Replaced with the Event(s) which occurred, comma-separated.


       %Xe    Replaced  with the Event(s) which occurred, separated by whichever character is in the place of
              'X'.


       %T     Replaced with the current Time in the format specified by the --timefmt option, which should be
              a format string suitable for passing to strftime(3).


EXIT STATUS
       0      The program executed successfully, and an event occurred which was being listened for.


       1      An  error  occurred  in execution of the program, or an event occurred which was not being lis-
              tened for.  The latter generally occurs if something happens which forcibly removes the inotify
              watch,  such  as a watched file being deleted or the filesystem containing a watched file being
              unmounted.


       2      The -t option was used and an event did not occur in the specified interval of time.


EVENTS
       The following events are valid for use with the -e option:


       access A watched file or a file within a watched directory was read from.


       modify A watched file or a file within a watched directory was written to.


       attrib The metadata of a watched file or a  file  within  a  watched  directory  was  modified.   This
              includes timestamps, file permissions, extended attributes etc.


       close_write
              A watched file or a file within a watched directory was closed, after being opened in writeable
              mode.  This does not necessarily imply the file was written to.


       close_nowrite
              A watched file or a file within a watched directory was closed, after being opened in read-only
              mode.


       close  A  watched  file  or  a  file  within  a watched directory was closed, regardless of how it was
              opened.  Note that this is actually implemented simply by listening for  both  close_write  and
              close_nowrite, hence all close events received will be output as one of these, not CLOSE.


       open   A watched file or a file within a watched directory was opened.


       moved_to
              A  file or directory was moved into a watched directory.  This event occurs even if the file is
              simply moved from and to the same directory.


       moved_from
              A file or directory was moved from a watched directory.  This event occurs even if the file  is
              simply moved from and to the same directory.


       move   A  file  or  directory  was  moved  from or to a watched directory.  Note that this is actually
              implemented simply by listening for both  moved_to  and  moved_from,  hence  all  close  events
              received will be output as one or both of these, not MOVE.


       move_self
              A  watched  file  or  directory was moved. After this event, the file or directory is no longer
              being watched.


       create A file or directory was created within a watched directory.


       delete A file or directory within a watched directory was deleted.


       delete_self
              A watched file or directory was deleted.  After this event the file or directory is  no  longer
              being watched.  Note that this event can occur even if it is not explicitly being listened for.


       unmount
              The filesystem on which a watched file or directory resides was unmounted.   After  this  event
              the file or directory is no longer being watched.  Note that this event can occur even if it is
              not explicitly being listened to.


EXAMPLES
   Example 1
       Running inotifywait at the command-line to wait for any file in the 'test' directory to  be  accessed.
       After running inotifywait, 'cat test/foo' is run in a separate console.


       % inotifywait test
       Setting up watches.
       Watches established.
       test/ ACCESS foo


   Example 2
       A  short shell script to efficiently wait for httpd-related log messages and do something appropriate.


       #!/bin/sh
       while inotifywait -e modify /var/log/messages; do
         if tail -n1 /var/log/messages | grep httpd; then
           kdialog --msgbox "Apache needs love!"
         fi
       done


   Example 3
       A custom output format is used to watch '~/test'.  Meanwhile,  someone  runs  'touch  ~/test/badfile;
       touch ~/test/goodfile; rm ~/test/badfile' in another console.


       % inotifywait -m -r --format '%:e %f' ~/test
       Setting up watches.  Beware: since -r was given, this may take a while!
       Watches established.
       CREATE badfile
       OPEN badfile
       ATTRIB badfile
       CLOSE_WRITE:CLOSE badfile
       CREATE goodfile
       OPEN goodfile
       ATTRIB goodfile
       CLOSE_WRITE:CLOSE goodfile
       DELETE badfile


BUGS
       There are race conditions in the recursive directory watching code which can cause events to be missed
       if they occur in a directory immediately after that directory is created.  This is probably  not  fix-
       able.


       It is assumed the inotify event queue will never overflow.


AUTHORS
       inotifywait is written and maintained by Rohan McGovern <rohan@mcgovern.id.au>.


       inotifywait  is  part  of  inotify-tools.   The  inotify-tools  website is located at: http://inotify-
       tools.sourceforge.net/


SEE ALSO
       inotifywatch(1), strftime(3), inotify(7)


inotifywait 3.14                March 14, 2010                  inotifywait(1)


VI. An Enterprise Use Case for rsync+inotify

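 Case description
This is a CMS content publishing system whose back end is a load-balanced cluster made up of one load-scheduling node, three service nodes, and one content publishing node. The content publishing node turns the data that users publish into static pages and transfers those pages to the three service nodes. The load-scheduling node dispatches user requests to a service node according to its balancing algorithm. The requirement is that the pages users see on the front end are always current and consistent.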
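Solution
To keep the data that users see consistent and current, the three service nodes must always hold the same data as the content publishing node, which calls for a file synchronization tool; rsync is used here. The data must also be current, which calls for inotify. In short: inotify watches the files on the content publishing node, and whenever a file changes it starts rsync to push the files to the three service nodes in real time.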
System environment
All servers run Linux. Kernel versions and node details are listed in Table 1:
Table 1


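1 Installing rsync and inotify-tools
inotify-tools monitors file system changes, so it must be installed on the content publishing node; the service nodes do not need it. rsync must be installed on the web1, web2, web3, and webserver nodes. Installation is straightforward and is not covered here.
In this case the content publishing node (server) acts as the rsync client and the three service nodes act as rsync servers. The whole synchronization is a push from the client to the servers, the reverse of the case described earlier.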



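2 Configuring rsync on the service nodes
 The rsync configuration files for the three service nodes are given below for reference; adjust them to your environment.
rsyncd.conf on the web1 node: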
uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log

[web1]
path = /web1/wwwroot/
comment = web1 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web1user
secrets file = /etc/web1.pass
rsyncd.conf on the web2 node:
uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log

[web2]
path = /web2/wwwroot/
comment = web2 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web2user
secrets file = /etc/web2.pass
rsyncd.conf on the web3 node:
uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log

[web3]
path = /web3/wwwroot/
comment = web3 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web3user
secrets file = /etc/web3.pass
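After rsyncd.conf is in place on all three service nodes, start the rsync daemon on each, then add it to the startup file so it starts at boot:
echo "/usr/local/bin/rsync --daemon" >>/etc/rc.local
The three web service nodes are now fully configured.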
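
The configurations above name a secrets file (/etc/web1.pass and so on), and the publishing script reads /etc/server.pass, but the article does not show how these files are created. A minimal sketch follows, under these assumptions: the helper name write_secret and the placeholder password are hypothetical; the rsync daemon's secrets file holds user:password lines; the client's --password-file holds only the password; and with strict modes = yes the daemon rejects a secrets file that other users can read, so the files are created with mode 600.

```shell
#!/bin/bash
# Sketch only: write_secret is a hypothetical helper; CHANGE_ME is a placeholder.

# Write $2 to file $1 so that only the owner can read or write it.
write_secret() {
    (
        umask 077                    # a new file is created as 600
        printf '%s\n' "$2" > "$1"
    ) || return 1
    chmod 600 "$1"                   # also covers a pre-existing file
}

# On web1 (daemon side, "user:password"):
#   write_secret /etc/web1.pass 'web1user:CHANGE_ME'
# On the content publishing node (client side, password only):
#   write_secret /etc/server.pass 'CHANGE_ME'
```

Because the publishing script uses a single /etc/server.pass for all three nodes, this setup presumes that web1user, web2user, and web3user share one password.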

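3 Configuring the content publishing node
 The main job on the content publishing node is to synchronize the generated static pages to the three service nodes in real time. A shell script can do this; it looks roughly like the following: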
#!/bin/bash
# Service nodes (rsync daemons)
host1=192.168.12.131
host2=192.168.12.132
host3=192.168.12.133

# Local source directory, remote module names, and rsync users
src=/web/wwwroot/
dst1=web1
dst2=web2
dst3=web3
user1=web1user
user2=web2user
user3=web3user

# Watch $src recursively and run one rsync pass per node for every event
/usr/local/bin/inotifywait -mrq --timefmt '%d/%m/%y %H:%M' --format '%T %w%f%e' -e modify,delete,create,attrib "$src" \
| while read -r files
do
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass "$src" "$user1@$host1::$dst1"
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass "$src" "$user2@$host2::$dst2"
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass "$src" "$user3@$host3::$dst3"
    echo "${files} was rsynced" >>/tmp/rsync.log 2>&1
done
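Notes on the script:
--timefmt: sets the output format of the timestamp.
--format: sets which details of the changed file are printed.
These two options are normally used together. With the format above, the log contains lines like: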
15/04/10 00:29 /web/wwwroot/ixdba.shDELETE,ISDIR was rsynced
15/04/10 00:30 /web/wwwroot/index.htmlMODIFY was rsynced
15/04/10 00:31 /web/wwwroot/pcre-8.02.tar.gzCREATE was rsynced
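In the sample lines above the path and the event names run together, because the format string '%w%f%e' has no separator between them. One option, offered here as an assumption rather than the author's setup, is to put a delimiter between the fields (for example --format '%T|%w%f|%e') and split each line inside the loop. The function names parse_event and is_dir_event are hypothetical, and paths that contain '|' would be split incorrectly.

```shell
#!/bin/bash
# Sketch only: assumes inotifywait was started with --format '%T|%w%f|%e'.

# Split one event line into the variables ev_time, ev_path, and ev_names.
parse_event() {
    IFS='|' read -r ev_time ev_path ev_names <<EOF
$1
EOF
}

# Succeed when the comma-separated event list in $1 contains ISDIR.
is_dir_event() {
    case ",$1," in
        *,ISDIR,*) return 0 ;;
        *)         return 1 ;;
    esac
}

parse_event '15/04/10 00:29|/web/wwwroot/ixdba.sh|DELETE,ISDIR'
echo "time=$ev_time path=$ev_path events=$ev_names"
```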
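The script uses inotify to watch the directory for changes and triggers rsync in response. Because the trigger is event-driven and comes from the kernel, it is far more efficient than approaches that scan the whole directory tree.
One situation to watch for: when a very large file is written into the watched directory (/web/wwwroot/ here), the write takes a while, and inotify keeps reporting that the file was modified for the whole duration. Each report triggers another rsync run and consumes a lot of system resources. Ideally rsync should be triggered only after the file has been completely written. To do that, change the watched events to "-e close_write,delete,create,attrib".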
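Even with close_write, copying many files at once still produces one rsync run per event. A further refinement, sketched here as an assumption and not taken from the original article, is to coalesce a burst of events into a single run: after the first event, keep reading until no new event has arrived for a short quiet window, then run the command once. coalesce_events and sync_all are hypothetical names, and read -t is a bash feature.

```shell
#!/bin/bash
# Sketch only: coalesce_events is a hypothetical helper, not part of inotify-tools.

# Usage: <event stream> | coalesce_events SECONDS COMMAND [ARGS...]
# Runs COMMAND once per burst of input lines, where a burst ends after
# SECONDS pass without a new line (or when the stream ends).
coalesce_events() {
    local window=$1 line
    shift
    while IFS= read -r line; do
        # Drain further events until the stream stays quiet for $window seconds.
        while IFS= read -r -t "$window" line; do
            :
        done
        "$@"
    done
}

# Intended use on the content publishing node (sync_all would wrap the three
# rsync commands from the script above):
#   inotifywait -mrq -e close_write,delete,create,attrib /web/wwwroot/ \
#     | coalesce_events 2 sync_all
```

The trade-off is latency: a change is pushed at the earliest one quiet window after the last event in its burst.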
Next, name the script inotifyrsync.sh, put it in /web/wwwroot, make it executable, and run it in the background:
chmod 755 /web/wwwroot/inotifyrsync.sh
/web/wwwroot/inotifyrsync.sh &
Finally, add the script to the system startup file:
echo "/web/wwwroot/inotifyrsync.sh &" >>/etc/rc.local
This completes the configuration of the content publishing node.

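4 Testing rsync+inotify real-time synchronization
 With everything configured, add, delete, or modify a file under /web/wwwroot on the content publishing node, then check the corresponding directory on each of the three service nodes. If the files there change in step with the publishing node, the system is working.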
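
Checking by eye works for a few files; for a larger tree the comparison can be scripted. The sketch below is one possible approach, not the author's procedure: tree_listing prints a sorted checksum line for every file under a directory, and two trees match when their listings are identical. The function names are hypothetical, and how the listing from each service node is collected (for example over ssh) is left as an assumption about the environment.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Print "checksum size ./relative/path" for every regular file under $1, sorted.
tree_listing() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort )
}

# Succeed when directories $1 and $2 hold the same files with the same content.
trees_match() {
    [ "$(tree_listing "$1")" = "$(tree_listing "$2")" ]
}

# Example: save the listing on the publishing node and on a service node,
# copy both files to one host, and compare them with diff:
#   tree_listing /web/wwwroot  > /tmp/server.list    # on the publishing node
#   tree_listing /web1/wwwroot > /tmp/web1.list      # on web1
#   diff /tmp/server.list /tmp/web1.list && echo "web1 is in sync"
```

Paths are printed relative to the directory given, so trees rooted at different locations, such as /web/wwwroot and /web1/wwwroot, can be compared directly.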

