mongoDB's Capped Collections

本文涉及的产品
云数据库 MongoDB,通用型 2核4GB
简介:
在mongoDB中有一个非常好用的collection : Capped Collections。
这种Collection可以设置最大的extent size或max documents,除此之外,在auto-FIFO age-out特性方面具有非常好的性能。capped collection自动维护插入顺序,在某些特殊的使用场景中非常有效,如记录日志的场景,某监控软件保留一个月的监控数据可查,在RDBMS中一般可以使用分区表,定期清理历史数据。在mongodb中使用capped collection只要预先设置好capped collection的size与,或max值.后面就不用管了。
范例一 : 固定大小的collection,size指定的大小单位为byte,指定size中包含数据库头的大小.
1. # 创建一个大小为1byte的capped collection(size是预先指派的,normal collection默认情况下是现用现得,当然normal collection也可以预先指派空间):
> db.createCollection('mycappedcol4',{'capped' : true,'size' : 1})   
{ "ok" : 1 }
# 实际存储SIZE = 256 bytes
# db.m.storageSize() - includes free space allocated to this collection 包含剩余空间
> db.mycappedcol4.storageSize()                                   
256
# 实际extents = 1
# _id上没有创建索引
# 最大的允许记录数为2147483647
> db.mycappedcol4.stats()                                         
{
        "ns" : "test.mycappedcol4",
        "count" : 0,
        "size" : 0,
        "avgObjSize" : NaN,
        "storageSize" : 256,
        " numExtents " : 1,
        " nindexes " : 0,
        "lastExtentSize" : 256,
        "paddingFactor" : 1,
        "flags" : 0,
        " totalIndexSize " : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        " max " : 2147483647,
        "ok" : 1
}
> db.mycappedcol4.getIndexes()
[ ]
# 插入测试,没有记录被插入,因为超过了限制.
> db.mycappedcol4.insert({'lastname':'digoal','firstname':'zhou'})
> db.mycappedcol4.find()

2. # 创建一个大小为8194byte的capped collection:
# 实际分配8448 bytes,256 bytes的倍数.
> db.createCollection('mycappedcol5',{'capped' : true,'size' : 8194})
{ "ok" : 1 }
> db.mycappedcol5.storageSize()
8448
> 8448/256
33
> db.mycappedcol5.stats()
{
        "ns" : "test.mycappedcol5",
        "count" : 0,
        "size" : 0,
        "avgObjSize" : NaN,
        "storageSize" : 8448,
        "numExtents" : 1,
        "nindexes" : 0,
        "lastExtentSize" : 8448,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        "max" : 2147483647,
        "ok" : 1
}
# 插入记录测试
> for (i=1;i<100;i++) {
... db.mycappedcol5.insert({'lastname' : 'digoal','firstname' : 'zhou','insertid' : i})
... }
# 已经达到设定的SIZE上限,插入记录99条,已经发生了AGE-OUT,实际存放了81条。
> db.mycappedcol5.count()
81
> db.mycappedcol5.stats()
{
        "ns" : "test.mycappedcol5",
        "count" : 81,
        "size" : 6804,
        "avgObjSize" : 84,
        "storageSize" : 8448,
        "numExtents" : 1,
        "nindexes" : 0,
        "lastExtentSize" : 8448,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        "max" : 2147483647,
        "ok" : 1
}
# 查询,由于capped collection需要追踪插入的顺序,因此find()返回结果的顺序等于插入的顺序,如下
> db.mycappedcol5.find().limit(5)
{ "_id" : ObjectId("4d23cc569ae655253f80e455"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 19 }
{ "_id" : ObjectId("4d23cc569ae655253f80e456"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 20 }
{ "_id" : ObjectId("4d23cc569ae655253f80e457"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 21 }
{ "_id" : ObjectId("4d23cc569ae655253f80e458"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 22 }
{ "_id" : ObjectId("4d23cc569ae655253f80e459"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 23 }
# 如果要按照最近插入的顺序来获得返回结果,可以使用sort({$natural:-1}),注意在normal collection中使用sort({$natural:-1})返回的只是接近的反向顺序结果,但是capped collection是精确的顺序。参考:
# For standard tables, natural order is not particularly useful because, although the order is often close to insertion order, it is not  guaranteed  to be. However, for Capped Collections, natural order is guaranteed to be the insertion order. This can be very useful.
> db.mycappedcol5.find().sort({$natural:-1}).limit(5)
{ "_id" : ObjectId("4d23cc569ae655253f80e4a5"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 99 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a4"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 98 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a3"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 97 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a2"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 96 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a1"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 95 }
# 达到设定的SIZE上限,再次插入记录,发送AGE-OUT,根据FIFO原理,insertid=19的记录被age-out。
> db.mycappedcol5.insert({'lastname' : 'digoal','firstname' : 'zhou','insertid' : 100})
> db.mycappedcol5.find().sort({$natural:-1}).limit(5)                                  
{ "_id" : ObjectId("4d23cd149ae655253f80e4a6"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 100 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a5"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 99 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a4"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 98 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a3"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 97 }
{ "_id" : ObjectId("4d23cc569ae655253f80e4a2"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 96 }
> db.mycappedcol5.find().limit(5)                                                      
{ "_id" : ObjectId("4d23cc569ae655253f80e456"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 20 }
{ "_id" : ObjectId("4d23cc569ae655253f80e457"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 21 }
{ "_id" : ObjectId("4d23cc569ae655253f80e458"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 22 }
{ "_id" : ObjectId("4d23cc569ae655253f80e459"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 23 }
{ "_id" : ObjectId("4d23cc569ae655253f80e45a"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 24 }
# 更新测试,当更新后的记录大于原有记录存放的空间,被阻止。
> db.mycappedcol5.update({"insertid" : 21},{$set : {"lastname" : "digoallllllllllllllllllllllllll"}})
failing update: objects in a capped ns cannot grow
# 当更新后的记录不大于原有记录存放的空间,被允许。或者说不会发生行迁移的情况下是允许的。
> db.mycappedcol5.update({"insertid" : 21},{$set : {"lastname" : "digoall"}})                        
> db.mycappedcol5.find().limit(4)                                            
{ "_id" : ObjectId("4d23cc569ae655253f80e457"), "firstname" : "zhou", "insertid" : 21, "lastname" : "digoall" }
{ "_id" : ObjectId("4d23cc569ae655253f80e458"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 22 }
{ "_id" : ObjectId("4d23cc569ae655253f80e459"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 23 }
{ "_id" : ObjectId("4d23cc569ae655253f80e45a"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 24 }
# 因此再次插入时,被age-out的记录还是保持和第一次插入的顺序一致,而不是以最后一次更新的顺序为准,下面可以看出insertid=21的记录被age-out
> db.mycappedcol5.insert({'lastname' : 'digoal','firstname' : 'zhou','insertid' : 101})
> db.mycappedcol5.find().limit(5)                                                      
{ "_id" : ObjectId("4d23cc569ae655253f80e458"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 22 }
{ "_id" : ObjectId("4d23cc569ae655253f80e459"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 23 }
{ "_id" : ObjectId("4d23cc569ae655253f80e45a"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 24 }
{ "_id" : ObjectId("4d23cc569ae655253f80e45b"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 25 }
{ "_id" : ObjectId("4d23cc569ae655253f80e45c"), "lastname" : "digoal", "firstname" : "zhou", "insertid" : 26 }
# 删除记录测试,不允许删除
> db.mycappedcol5.remove({"insertid" : 22})
can't remove from a capped collection
# 索引测试,虽然capped collection默认不在_id上创建索引,但是没说不可以创建索引.如下
> db.mycappedcol5.ensureIndex({"insertid" : 1})
> db.mycappedcol5.getIndexes()
[
        {
                "_id" : ObjectId("4d23d1d59ae655253f80e4a9"),
                "ns" : "test.mycappedcol5",
                "key" : {
                        "insertid" : 1
                },
                "name" : "insertid_1"
        }
]
# 执行计划输出测试,走索引
> db.mycappedcol5.find({"insertid" : 80}).explain()
{
        "cursor" : "BtreeCursor insertid_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "insertid" : [
                        [
                                80,
                                80
                        ]
                ]
        }
}
# 走索引
> db.mycappedcol5.find({"insertid" : {$gt : 1}}).explain() 
{
        "cursor" : "BtreeCursor insertid_1",
        "nscanned" : 81,
        "nscannedObjects" : 81,
        "n" : 81,
        "millis" : 0,
        "indexBounds" : {
                "insertid" : [
                        [
                                1,
                                1.7976931348623157e+308
                        ]
                ]
        }
}
# 显然不走索引
> db.mycappedcol5.find({"lastname" : "digoal"}).explain() 
{
        "cursor" : "ForwardCappedCursor",
        "nscanned" : 81,
        "nscannedObjects" : 81,
        "n" : 81,
        "millis" : 0,
        "indexBounds" : {

        }
}
> db.mycappedcol5.ensureIndex({"lastname" : 1})      
#  居然走索引              
> db.mycappedcol5.find({"lastname" : "digoal"}).explain()
{
        "cursor" : "BtreeCursor lastname_1",
        "nscanned" : 81,
        "nscannedObjects" : 81,
        "n" : 81,
        "millis" : 0,
        "indexBounds" : {
                "lastname" : [
                        [
                                "digoal",
                                "digoal"
                        ]
                ]
        }
}

范例二 : 指定最大记录数的capped collection.
1. # 创建一个最大允许记录1000条记录的capped collection
> db.createCollection('mycappedcol6',{'capped' : true,'max' : 1000})                                 
{ "ok" : 1 }
# 默认预先指派 SIZE = 8192 bytes
> db.mycappedcol6.stats()
{
        "ns" : "test.mycappedcol6",
        "count" : 0,
        "size" : 0,
        "avgObjSize" : NaN,
        "storageSize" : 8192,
        "numExtents" : 1,
        "nindexes" : 0,
        "lastExtentSize" : 8192,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        "max" : 1000,
        "ok" : 1
}
# 插入测试
> for (i=1;i<100;i++) {
... db.mycappedcol6.insert({"insertid" : i,"firstname" : "zhou","lastname" : "digoal"})
... }
# 注意到,仅仅存放了78条记录,因此max和size应该结合使用
> db.mycappedcol6.stats()
{
        "ns" : "test.mycappedcol6",
        " count " :  78 ,
        "size" : 6552,
        "avgObjSize" : 84,
        "storageSize" : 8192,
        "numExtents" : 1,
        "nindexes" : 0,
        "lastExtentSize" : 8192,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        "max" : 1000,
        "ok" : 1
}
# 从以上的输出看出平均一条记录占据 84 bytes ,需要存放1000条记录的话至少需要 84000 bytes,还不包括头的大小.
# 以下创建一个84000 bytes的capped collection,仅存放了839条记录.
> db.createCollection('mycappedcol7',{'capped' : true,'max' : 1000,'size' : 84000})
{ "ok" : 1 }
> for (i=1;i<1001;i++) {                                                           
... db.mycappedcol7.insert({"insertid" : i,"firstname" : "zhou","lastname" : "digoal"})
... }
> db.mycappedcol7.stats()
{
        "ns" : "test.mycappedcol7",
        "count" : 839,
        "size" : 70476,
        "avgObjSize" : 84,
        "storageSize" : 84224,
        "numExtents" : 1,
        "nindexes" : 0,
        "lastExtentSize" : 84224,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 0,
        "indexSizes" : {

        },
        "capped" : 1,
        "max" : 1000,
        "ok" : 1
}

范例三 :  创建一个预先指派size的normal collection 
> db.createCollection('Iamdigoal',{'capped' : false,'size' : 8192000})
{ "ok" : 1 }
# _id的索引被自动创建
> db.Iamdigoal.getIndexes()
[
{
"name" : "_id_",
"ns" : "test.Iamdigoal",
"key" : {
"_id" : 1
}
}
]
# 预先指派SIZE = 8192256
> db.Iamdigoal.stats()
{
"ns" : "test.Iamdigoal",
"count" : 0,
"size" : 0,
"avgObjSize" : NaN,
"storageSize" : 8192256,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 8192256,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}

限制:
1. Capped Collection默认不创建_id的索引.(normal collection默认情况下创建_id的索引)
2. Capped Collection are not shardable
3. 更新限制,超出原因记录大小的记录的更新操作被阻止.
4. Maximum size for a capped collection is currently 1e9 bytes on a thirty-two bit machine. The maximum size of a capped collection on a sixty-four bit machine is constrained only by system resources.
5. If you perform a find()  on the collection with no ordering specified, the objects will always be returned in insertion order.  Reverse order is always retrievable with find().sort({$natural:-1}).
6. 默认情况下指定size的max的默认值 = 2147483647 , 指定max的size的默认值 = 8192 .
相关实践学习
MongoDB数据库入门
MongoDB数据库入门实验。
快速掌握 MongoDB 数据库
本课程主要讲解MongoDB数据库的基本知识,包括MongoDB数据库的安装、配置、服务的启动、数据的CRUD操作函数使用、MongoDB索引的使用(唯一索引、地理索引、过期索引、全文索引等)、MapReduce操作实现、用户管理、Java对MongoDB的操作支持(基于2.x驱动与3.x驱动的完全讲解)。 通过学习此课程,读者将具备MongoDB数据库的开发能力,并且能够使用MongoDB进行项目开发。 &nbsp; 相关的阿里云产品:云数据库 MongoDB版 云数据库MongoDB版支持ReplicaSet和Sharding两种部署架构,具备安全审计,时间点备份等多项企业能力。在互联网、物联网、游戏、金融等领域被广泛采用。 云数据库MongoDB版(ApsaraDB for MongoDB)完全兼容MongoDB协议,基于飞天分布式系统和高可靠存储引擎,提供多节点高可用架构、弹性扩容、容灾、备份回滚、性能优化等解决方案。 产品详情: https://www.aliyun.com/product/mongodb
相关文章
|
11月前
|
缓存 NoSQL MongoDB
开心档-软件开发入门之MongoDB 固定集合(Capped Collections)
【摘要】 本章将会讲解MongoDB 固定集合(Capped Collections)是性能出色且有着固定大小的集合,对于大小固定,我们可以想象其就像一个环形队列,当集合空间用完后,再插入的元素就会覆盖最初始的头部的元素!
|
12月前
|
存储 传感器 缓存
MongoDB 功能详解之时间序列集合(Time Series Collections)
时间序列集合(Time Series Collections):MongoDB 5.0 版本中的新功能。
|
缓存 NoSQL MongoDB
MongoDB:5-MongoDB的固定集合(capped collection)
MongoDB:5-MongoDB的固定集合(capped collection)
152 0
|
监控 NoSQL 索引
mongoDB 定长集合(capped collection)
大多数情况下,mongoDB中都是普通的集合,这些集合也称为动态集合,可以自动增长以容纳更多的数据。
1018 0
|
3月前
|
JSON NoSQL 小程序
Mongodb数据库的导出和导入总结
Mongodb数据库的导出和导入总结
189 0
|
1天前
|
NoSQL MongoDB 数据库
MongoDB数据恢复—MongoDB数据库文件被破坏的数据恢复案例
服务器数据恢复环境: 一台Windows Server操作系统服务器,服务器上部署MongoDB数据库。 MongoDB数据库故障&检测: 工作人员在未关闭MongoDB数据库服务的情况下,将数据库文件拷贝到其他分区。拷贝完成后将原MongoDB数据库所在分区进行了格式化操作,然后将数据库文件拷回原分区,重新启动MongoDB服务,服务无法启动。
|
5天前
|
NoSQL MongoDB Redis
Python与NoSQL数据库(MongoDB、Redis等)面试问答
【4月更文挑战第16天】本文探讨了Python与NoSQL数据库(如MongoDB、Redis)在面试中的常见问题,包括连接与操作数据库、错误处理、高级特性和缓存策略。重点介绍了使用`pymongo`和`redis`库进行CRUD操作、异常捕获以及数据一致性管理。通过理解这些问题、易错点及避免策略,并结合代码示例,开发者能在面试中展现其技术实力和实践经验。
42 8
Python与NoSQL数据库(MongoDB、Redis等)面试问答
|
1月前
|
NoSQL 网络协议 MongoDB
Windows公网远程连接MongoDB数据库【无公网IP】
Windows公网远程连接MongoDB数据库【无公网IP】
|
1月前
|
存储 NoSQL 关系型数据库
一篇文章带你搞懂非关系型数据库MongoDB
一篇文章带你搞懂非关系型数据库MongoDB
57 0