Handling a job abend caused by a tape that has been read and written too many times


dz993013@bkj0d101# ./cs 10 | grep ZLF07A 
2016/03/01-716,ZLF07A,Backup (full),Tue Mar 1 16:40:55 2016,Tue Mar 1 19:59:31 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/03-711,ZLF07A,Backup (full),Thu Mar 3 16:48:03 2016,Thu Mar 3 20:02:10 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/04-698,ZLF07A,Backup (full),Fri Mar 4 16:33:00 2016,Fri Mar 4 19:43:47 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/05-649,ZLF07A,Backup (full),Sat Mar 5 16:44:23 2016,Sat Mar 5 20:09:05 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/06-747,ZLF07A,Backup (full),Sun Mar 6 16:32:26 2016,Sun Mar 6 19:42:38 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/07-714,ZLF07A,Backup (full),Mon Mar 7 16:30:54 2016,Mon Mar 7 19:41:31 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/08-698,ZLF07A,Backup (full),Tue Mar 8 16:32:23 2016,Tue Mar 8 19:51:35 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/09-695,ZLF07A,Backup (full),Wed Mar 9 16:31:45 2016,Wed Mar 9 19:44:20 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/10-710,ZLF07A,Backup (full),Thu Mar 10 16:37:24 2016,Thu Mar 10 20:18:27 2016,Completed ,0,0,root,sys,bkj0d101 
2016/03/11-695,ZLF07A,Backup (full),Fri Mar 11 16:30:26 2016,Fri Mar 11 21:24:51 2016,Completed/Errors ,0,1,root,sys,bkj0d101 
dz993013@bkj0d101# omnidb -sess 2016/03/11-695 
Object Name                         Object Type     Object Status     CopyID 
=============================================================================== 
bkj1d111 'CLONE_07'                   RawDisk         Completed       7933 (O) 
dz993013@bkj0d101# omnidb -sess 2016/03/11-695 -report 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 16:30:26 
        Backup session 2016/03/11-695 started. 
 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 16:30:27 
        Starting to execute "SHELL/pre_vnx.sh pre ZLF07A.conf"... 
 
        Starting pre-dpbackup jobs.. 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 16:31:09 
        The exec script "SHELL/pre_vnx.sh pre ZLF07A.conf" has completed. 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 16:31:10 
        STARTING Media Agent "ESL3_DRV14" 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 16:31:11 
        By: UMA@bkj1d110@/dev/rchgr/autoch2 
        Loading medium from slot 221 to device /dev/rtape/tape7314_BESTn 
 
[Normal] From: RBDA@bkj0d101 "CLONE_07"  Time: 03/11/16 16:32:15 
        STARTING Disk Agent for bkj0d101 "CLONE_07". 
 
[Normal] From: RBDA@bkj0d101 "CLONE_07"  Time: 03/11/16 16:32:15 
        Rawdisk Section Statistics: 
 
                Objects Total ......         1 
                Total Size .........   1.10 TB 
 
[Critical] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:12:01 
        /dev/rtape/tape7314_BESTn "00000093:51b9823a:4190:0009" 
Tape Alert [ 4]: Your data is at risk: 
        1. Copy any data you require from this tape. 
        2. Do not use this tape again. 
        3. Restart the operation with a different tape. 

 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:12:33 
        /dev/rtape/tape7314_BESTn 
        Medium header verification completed, 0 errors found. 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:13:11 
        Ejecting medium '221'. 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:13:11 
        By: UMA@bkj1d110@/dev/rchgr/autoch2 
        Unloading medium to slot 221 from device /dev/rtape/tape7314_BESTn 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:13:26 
        By: UMA@bkj1d110@/dev/rchgr/autoch2 
        Loading medium from slot 69 to device /dev/rtape/tape7314_BESTn 
 
[Normal] From: RBDA@bkj0d101 "CLONE_07"  Time: 03/11/16 21:23:57 
        Backup Profile: 
 
                Run Time ........... 4:51:42 
                Backup Speed ....... 65.94 MB/s 
 
[Normal] From: RBDA@bkj0d101 "CLONE_07"  Time: 03/11/16 21:23:57 
        COMPLETED Disk Agent for bkj0d101 "CLONE_07". 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:24:19 
        /dev/rtape/tape7314_BESTn 
        Medium header verification completed, 0 errors found. 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:24:38 
        Ejecting medium '69'. 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:24:38 
        By: UMA@bkj1d110@/dev/rchgr/autoch2 
        Unloading medium to slot 69 from device /dev/rtape/tape7314_BESTn 
 
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:24:51 
        COMPLETED Media Agent "ESL3_DRV14" 
 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 21:24:51 
 
        Backup Statistics: 
 
                Session Queuing Time (hours)         0.00 
                ------------------------------------------- 
                Completed Disk Agents ........          1 
                Failed Disk Agents ...........          0 
                Aborted Disk Agents ..........          0 
                ------------------------------------------- 
                Disk Agents Total  ...........          1 
                =========================================== 
                Completed Media Agents .......          1 
                Failed Media Agents ..........          0 
                Aborted Media Agents .........          0 
                ------------------------------------------- 
                Media Agents Total  ..........          1 
                =========================================== 
                Mbytes Total ................. 1158573 MB 
                Used Media Total .............          2 
                Disk Agent Errors Total ......          0 
 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 21:24:51 
        Starting to execute "SHELL/pre_vnx.sh post ZLF07A.conf"... 
 
        Starting post-dpbackup jobs.. 
[Normal] From: BSM@bkj0d101 "ZLF07A"  Time: 03/11/16 21:25:26 
        The exec script "SHELL/pre_vnx.sh post ZLF07A.conf" has completed. 
 
dz993013@bkj0d101# omnimm -media_info 00000093:51b9823a:4190:0009 
 
Medium Label        Medium ID                           Pool          Library 
=============================================================================== 
[S21579L4] SAN_31_  00000093:51b9823a:4190:0009         SAN_31          ESL#3 
dz993013@bkj0d101# omnimm -media_info 00000093:51b9823a:4190:0009 -detail 
 
MediumID               : 00000093:51b9823a:4190:0009 
Pool name              : SAN_31 
Library                : ESL#3 
Medium Label           : [S21579L4] SAN_31_38 
Location               : [ESL#3:   221] 
Medium Owner           : bkj1d110 
Used blocks [KB]       : 1083985664 
Total blocks [KB]      : 1083985664 
Number of writes       : 2 
Number of overwrites   : 137 
Number of errors       : 0 
Creation time          : Wed Dec 26 18:29:40 2012 
Time of last write     : Fri Mar 11 16:32:41 2016 
Time of last overwrite : Fri Mar 11 16:32:14 2016 
Time of last access    : Fri Mar 11 16:32:41 2016 
Medium type            : FULL 
Write-protected        : No 
Encrypted              : No 
dz993013@bkj0d101# omnimm -list_pool 
 
Status  Pool name                 Media Type        MS   # of media   Free [MB] 
=============================================================================== 
Good    Default AIT               AIT               No           0           0 
Good    Default DDS               DDS               No           0           0 
Good    Default DLT               DLT               No           0           0 
Good    Default DTF               DTF               No           0           0 
Good    Default Exabyte           ExaByte           No           0           0 
Good    Default File              File              No           1         100 
Poor    Default LTO-Ultrium       LTO-Ultrium       No         106     1811640 
Good    Default Optical           Optical           No           0           0 
Good    Default QIC               QIC               No           0           0 
Good    Default SAIT              SAIT              No           0           0 
Good    Default SD-3              SD-3              No           0           0 
Good    Default SuperDLT          SuperDLT          No           0           0 
Good    Default T10000            T10000            No           0           0 
Good    Default T3480/T4890/T9490 T3480/T4890/T9490 No           0           0 
Good    Default T3590             T3590             No           0           0 
Good    Default T3592             T3592             No           0           0 
Good    Default T9840             T9840             No           0           0 
Good    Default T9940             T9940             No           0           0 
Good    Default Tape              Tape              No           0           0 
Poor    GOLDEN MEDIA              LTO-Ultrium       No         324     7037876 
Fair    GOLDEN_MEDIA_ESL1         LTO-Ultrium       No           6     6715062 
Poor    GOLDEN_MEDIA_ESL2         LTO-Ultrium       No           8     3138195 
Poor    GOLDEN_MEDIA_ESL3         LTO-Ultrium       No          17     8167628 
Poor    GOLDEN_MEDIA_ESL4         LTO-Ultrium       No          21     3480282 
Poor    IDB                       LTO-Ultrium       No          19    13592545 
Good    LTO5                      LTO-Ultrium       No           2      193744 
Poor    Legalhold_Exchange        LTO-Ultrium       No         204     2969243 
Poor    Legalhold_Expired         LTO-Ultrium       No         171     3475722 
Poor    NAS_31                    LTO-Ultrium       No          82    19870085 
Poor    NAS_32                    LTO-Ultrium       No          62     5634432 
Poor    NAS_LTO3                  LTO-Ultrium       No          54    12551232 
Poor    NAS_NetAPP_ESL1           LTO-Ultrium       No          52    10729648 
Poor    NAS_NetAPP_ESL2           LTO-Ultrium       No          67    15286656 
Poor    NAS_NetAPP_ESL3           LTO-Ultrium       No          77    10852480 
Poor    NAS_NetAPP_ESL4           LTO-Ultrium       No          84    59127213 
Poor    POOR MEDIA                LTO-Ultrium       No          12      926232 
Poor    SAN_21                    LTO-Ultrium       No         912   621231014 
Poor    SAN_22                    LTO-Ultrium       No         968   503534266 
Poor    SAN_31                    LTO-Ultrium       No         538   167208557 
Poor    SAN_32                    LTO-Ultrium       No         995   481114186 
Poor    SAN_LTO3_1                LTO-Ultrium       No         125    52554493 
Poor    SAN_LTO3_2                LTO-Ultrium       No         145    89462158 
Good    TEMP                      LTO-Ultrium       No           0           0 
dz993013@bkj0d101# omnimm -move_medium S21579L4 "POOR MEDIA" 
Medium S21579L4 successfully moved to pool POOR MEDIA. 
dz993013@bkj0d101# omnimm -media_info 00000093:51b9823a:4190:0009 -detail 

 
MediumID               : 00000093:51b9823a:4190:0009 
Pool name              : POOR MEDIA 
Library                : ESL#3 
Medium Label           : [S21579L4] SAN_31_38 
Location               : [ESL#3:   221] 
Medium Owner           : bkj1d110 
Used blocks [KB]       : 1083985664 
Total blocks [KB]      : 1083985664 
Number of writes       : 2 
Number of overwrites   : 137 
Number of errors       : 0 
Creation time          : Wed Dec 26 18:29:40 2012 
Time of last write     : Fri Mar 11 16:32:41 2016 
Time of last overwrite : Fri Mar 11 16:32:14 2016 
Time of last access    : Fri Mar 11 16:32:41 2016 
Medium type            : FULL 
Write-protected        : No 
Encrypted              : No 

*******************************************************************************************

The key parts of the log above, step by step:

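1) ./cs 10 | grep ZLF07A

This lists the recent sessions of job ZLF07A. Session 2016/03/11-695 ended as Completed/Errors and finished at 21:24, later than on the previous days.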
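Spotting the odd session by eye works for ten lines, but the same check can be scripted. The following is a minimal sketch, assuming the comma-separated field layout that ./cs prints in the log above (session ID, job name, type, start, end, status, ...). The function names parse_sessions and problem_sessions are illustrative and are not part of Data Protector or of the cs script.

```python
from datetime import datetime

# Two lines copied from the ./cs output above.
SAMPLE = """\
2016/03/10-710,ZLF07A,Backup (full),Thu Mar 10 16:37:24 2016,Thu Mar 10 20:18:27 2016,Completed ,0,0,root,sys,bkj0d101
2016/03/11-695,ZLF07A,Backup (full),Fri Mar 11 16:30:26 2016,Fri Mar 11 21:24:51 2016,Completed/Errors ,0,1,root,sys,bkj0d101
"""

TIME_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Fri Mar 11 16:30:26 2016"


def parse_sessions(text):
    """Turn each comma-separated line into a dict (assumed field order)."""
    sessions = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 6:
            continue  # skip blank or unexpected lines
        start = datetime.strptime(fields[3], TIME_FORMAT)
        end = datetime.strptime(fields[4], TIME_FORMAT)
        sessions.append({
            "id": fields[0],
            "job": fields[1],
            "status": fields[5],
            "duration": end - start,
        })
    return sessions


def problem_sessions(sessions):
    """Sessions whose status is anything other than a clean 'Completed'."""
    return [s for s in sessions if s["status"] != "Completed"]


for s in problem_sessions(parse_sessions(SAMPLE)):
    print(s["id"], s["status"], s["duration"])
```

Comparing the duration as well as the status is deliberate: in this case the tape change added more than an hour to the run, which is visible even before reading the session report.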

2) omnidb -sess 2016/03/11-695 -report

[Critical] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:12:01 
        /dev/rtape/tape7314_BESTn "00000093:51b9823a:4190:0009" 
Tape Alert [ 4]: Your data is at risk: 
        1. Copy any data you require from this tape. 
        2. Do not use this tape again. 
        3. Restart the operation with a different tape. 

The tape is in an at-risk state.
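The Tape Alert can also be pulled out of a long session report automatically. The sketch below assumes the message layout seen in the report above: a header line of the form [Severity] From: source "name" Time: ..., followed by body lines. The names parse_report and tape_alerts are made up for this example.

```python
import re

# Excerpt copied from the session report above.
REPORT = """\
[Normal] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 16:31:10
        STARTING Media Agent "ESL3_DRV14"

[Critical] From: BMA-NDMP@bkj0d101 "ESL3_DRV14"  Time: 03/11/16 21:12:01
        /dev/rtape/tape7314_BESTn "00000093:51b9823a:4190:0009"
Tape Alert [ 4]: Your data is at risk:
        1. Copy any data you require from this tape.
        2. Do not use this tape again.
        3. Restart the operation with a different tape.
"""

HEADER = re.compile(
    r'^\[(?P<severity>\w+)\] From: (?P<source>\S+) "(?P<name>[^"]*)"\s+Time: (?P<time>.+?)\s*$'
)


def parse_report(text):
    """Split the report into messages: header fields plus body lines."""
    messages = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = dict(match.groupdict(), body=[])
            messages.append(current)
        elif current is not None and line.strip():
            current["body"].append(line.strip())
    return messages


def tape_alerts(messages):
    """Return (severity, medium ID, alert number) for each Tape Alert message."""
    found = []
    for msg in messages:
        body = "\n".join(msg["body"])
        alert = re.search(r"Tape Alert \[\s*(\d+)\]", body)
        if not alert:
            continue
        medium = re.search(r'"([0-9a-f]+(?::[0-9a-f]+){3})"', body)
        found.append((msg["severity"],
                      medium.group(1) if medium else None,
                      int(alert.group(1))))
    return found


print(tape_alerts(parse_report(REPORT)))
```

The medium ID returned here is the value passed to omnimm -media_info in the next step.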

3) omnimm -media_info 00000093:51b9823a:4190:0009 

4) omnimm -media_info 00000093:51b9823a:4190:0009 -detail 

MediumID               : 00000093:51b9823a:4190:0009 
Pool name              : SAN_31 
Library                : ESL#3 
Medium Label           : [S21579L4] SAN_31_38 
Number of overwrites   : 137

The tape has been overwritten too many times (137). The output also shows that the tape belongs to pool SAN_31 and that its label is S21579L4.
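The three facts read off the detail output (overwrite count, pool, label) can be extracted with a few lines of code. This is a sketch only: it assumes the "Key : Value" layout shown above, and MAX_OVERWRITES is a hypothetical threshold chosen for illustration, not a value taken from Data Protector or from the original post.

```python
import re

# Excerpt copied from the "omnimm -media_info ... -detail" output above.
DETAIL = """\
MediumID               : 00000093:51b9823a:4190:0009
Pool name              : SAN_31
Library                : ESL#3
Medium Label           : [S21579L4] SAN_31_38
Location               : [ESL#3:   221]
Number of writes       : 2
Number of overwrites   : 137
Number of errors       : 0
Creation time          : Wed Dec 26 18:29:40 2012
"""

MAX_OVERWRITES = 100  # hypothetical limit, for illustration only


def parse_media_detail(text):
    """Parse 'Key : Value' lines into a dict, splitting on the first ' : '."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            info[key.strip()] = value.strip()
    return info


def barcode(info):
    """Return the bracketed part of the medium label, e.g. 'S21579L4'."""
    match = re.search(r"\[([^\]]+)\]", info.get("Medium Label", ""))
    return match.group(1) if match else None


def needs_retirement(info, limit=MAX_OVERWRITES):
    """True when the overwrite count exceeds the (assumed) limit."""
    return int(info["Number of overwrites"]) > limit


info = parse_media_detail(DETAIL)
print(barcode(info), info["Pool name"], needs_retirement(info))
```

Splitting on the first " : " (with surrounding spaces) keeps values such as the medium ID and the timestamps intact, since their colons are not surrounded by spaces.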

5) omnimm -list_pool 

This shows the configured pools, including the name of the pool used for poor media (here, "POOR MEDIA").
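Because some pool names contain spaces ("POOR MEDIA", "Default LTO-Ultrium"), the pool table cannot be split on whitespace naively. The sketch below works around this under one assumption: the media type column never contains spaces, which holds for every row in the listing above. The names parse_pools and find_pools are illustrative.

```python
# Rows copied from the "omnimm -list_pool" output above.
POOLS = """\
Good    Default File              File              No           1         100
Poor    Default LTO-Ultrium       LTO-Ultrium       No         106     1811640
Fair    GOLDEN_MEDIA_ESL1         LTO-Ultrium       No           6     6715062
Poor    POOR MEDIA                LTO-Ultrium       No          12      926232
Poor    SAN_31                    LTO-Ultrium       No         538   167208557
"""


def parse_pools(text):
    """Parse pool rows; the pool name is whatever lies between the status
    and the last four whitespace-separated columns."""
    pools = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        status, rest = parts
        cols = rest.rsplit(None, 4)  # name, media type, MS, # of media, free MB
        if len(cols) != 5 or not cols[3].isdigit():
            continue  # header, separator, or unexpected line
        name, media_type, _ms, count, free_mb = cols
        pools.append({
            "status": status,
            "name": name.strip(),
            "media_type": media_type,
            "media": int(count),
            "free_mb": int(free_mb),
        })
    return pools


def find_pools(pools, text):
    """Pool names containing the given text, case-insensitively."""
    return [p["name"] for p in pools if text.lower() in p["name"].lower()]


pools = parse_pools(POOLS)
print(find_pools(pools, "poor"))
```

The exact pool name matters in the next step, because a name containing a space has to be quoted on the command line.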

6) omnimm -move_medium S21579L4 "POOR MEDIA" 

Using its label, the tape is moved to the POOR MEDIA pool.

7) omnimm -media_info 00000093:51b9823a:4190:0009 -detail 

The output confirms that the tape is now in the POOR MEDIA pool.

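The data has already been written to this tape. If, for example, the data must be kept on the tape for two months, the tape can be thrown away once those two months have passed. Tapes in the poor pool are never used for backups again.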

Then ask chops to force the job to OK.




This article is reposted from UVN2015's 51CTO blog. Original link: http://blog.51cto.com/10851095/1750106. To repost it, please contact the original author.



