NoSQL Databases: A Collection of Technical Resources


0 Reference

NoSQL Papers

Christof Strauch of Stuttgart Media University spent eight months (June 2010 to February 2011) writing a roughly 150-page paper on NoSQL that examines the field from many angles.

http://www.christof-strauch.de/nosqldbs.pdf

A Collection of Translated Classic Papers on Distributed Systems

http://duanple.blog.163.com/blog/static/709717672011330101333271/

2010 NoSQL Summer Reading List

http://blog.nosqlfan.com/html/1647.html

http://www.empiricalreality.com/2010/09/22/2010-nosql-summer-reading-list/

NoSQL Technology Overviews

Distributed Algorithms in NoSQL Databases

http://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/

NOSQL Patterns

http://horicky.blogspot.com/2009/11/nosql-patterns.html

 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

1 Origins and History

1.1 It All Started with Google

Google created a full mechanism that included a distributed filesystem, a column-family-oriented data store, a distributed coordination system, and a MapReduce-based parallel algorithm execution environment. Graciously enough, Google published and presented a series of papers explaining some of the key pieces of its infrastructure. The most important of these publications are as follows:


Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. “The Google File System”; pub. 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October 2003. URL: http://labs.google.com/papers/gfs.html


Jeffrey Dean and Sanjay Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”; pub. OSDI’04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004. URL: http://labs.google.com/papers/mapreduce.html


Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. “Bigtable: A Distributed Storage System for Structured Data”; pub. OSDI’06: Seventh Symposium on Operating System Design and Implementation, Seattle, WA, November 2006. URL: http://labs.google.com/papers/bigtable.html

 

Mike Burrows. “The Chubby Lock Service for Loosely-Coupled Distributed Systems”; pub. OSDI’06: Seventh Symposium on Operating System Design and Implementation, Seattle, WA, November 2006. URL: http://labs.google.com/papers/chubby.html

1.2 Open Source and Yahoo

The creators of the open-source search engine, Lucene, were the first to develop an open-source version that replicated some of the features of Google’s infrastructure. Subsequently, the core Lucene developers joined Yahoo, where with the help of a host of other contributors, they created a parallel universe that mimicked all the pieces of the Google distributed computing stack. 
This open-source alternative is Hadoop.

1.3 Amazon's Dynamo

A year after the Google papers had catalyzed interest in parallel scalable processing and nonrelational distributed data stores, Amazon decided to share some of its own success story. In 2007, Amazon presented its ideas of a distributed highly available and eventually consistent data store named Dynamo.

You can read more about Amazon Dynamo in a research paper, the details of which are as follows: 

Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall, and Werner Vogels, “Dynamo: Amazon’s Highly Available Key/value Store,” in the Proceedings of the 21st ACM Symposium on Operating Systems Principles, Stevenson, WA, October 2007. Werner Vogels, the Amazon CTO, explained the key ideas behind Amazon Dynamo in a blog post accessible online at www.allthingsdistributed.com/2007/10/amazons_dynamo.html.

Then, Everyone…

 

2 NoSQL Classification

2.1 Taxonomies by Data Model

Related blog posts:

NoSQL Data Modeling Techniques

 

Concerning the classification of NoSQL stores, Highscalability author Todd Hoff cites a presentation by Stephen Yen in his blog post “A yes for a NoSQL taxonomy” (cf. [Hof09c]). 
In the presentation “NoSQL is a Horseless Carriage” (cf. [Yen09]), Yen suggests a taxonomy that can be found in table 2.1.

Key-Value-Cache: Memcached, Repcached, Coherence, Infinispan, eXtreme Scale, JBoss Cache, Velocity, Terracotta

Key-Value-Store: keyspace, Flare, Schema Free, RAMCloud

Eventually-Consistent Key-Value-Store: Dynamo, Voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb

Ordered-Key-Value-Store: Tokyo Tyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord

Data-Structures Server: Redis

Tuple Store: Gigaspaces, Coord, Apache River

Object Database: ZopeDB, DB4O, Shoal

Document Store: CouchDB, MongoDB, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris

Wide Columnar Store: Bigtable, HBase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI

 

2.2 Classification by the CAP Theorem

Related blog posts:

CAP – Consistency, Availability, Partition Tolerance

How to beat the CAP theorem

 

image

 

3 Core NoSQL Techniques

3.1 Data Consistency

3.1.1 Theoretical Foundations of the Consistency Problem

Related blog posts:

Total Order: The Essence of Distributed Consistency

 

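Lamport's paper spends a great deal of time on partial and total orders, so what is a total order actually good for? The paper gives the example of mutually exclusive access to a resource. If that still feels abstract, take distributed data storage as an example: concurrent writes raise a consistency problem, so how do we solve the consistency problem of a distributed database?

Lamport in fact answers this in the same paper. It is the paper's second contribution, and one that is often overlooked: if every node of a distributed system is viewed as a finite state machine, then as long as every node executes the same sequence of commands, all nodes end up in the same state.

For a distributed database this means that, starting from the same initial state, it is enough to ensure that every database node applies the same sequence of updates in order to keep the data on all nodes consistent.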
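The state-machine argument above can be illustrated with a small sketch. The `Replica` class and the operation tuples are hypothetical names chosen for this example; the point is only that deterministic replicas fed the same operations in the same order agree, while a reordered log can diverge.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A database node modeled as a deterministic state machine."""
    data: dict = field(default_factory=dict)

    def apply(self, op):
        kind, key, value = op
        if kind == "set":
            self.data[key] = value
        elif kind == "append":
            self.data[key] = self.data.get(key, "") + value


def replay(ops):
    """Apply a sequence of operations to a fresh replica and return its state."""
    replica = Replica()
    for op in ops:
        replica.apply(op)
    return replica.data


log = [("set", "x", "a"), ("append", "x", "b"), ("set", "y", "1")]

state_a = replay(log)                            # replica A
state_b = replay(log)                            # replica B, same order
reordered = replay([log[1], log[0], log[2]])     # first two ops swapped
```

Here `state_a` and `state_b` are identical, whereas `reordered` ends with a different value for `x`, which is exactly why the ordering of updates matters.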

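So the consistency problem has been turned into an ordering problem.

That is why the paper discusses partial and total orders: once the ordering problem is solved, the data consistency problem is solved as well.

The question then becomes: how do we impose a total order on all write operations in a distributed environment?

1. Rely on a master or a fixed frame of reference, for example timestamps or the pessimistic and optimistic locking discussed below. 
    These methods do guarantee a total order, but they all suffer from either a single point of failure or clock synchronization problems.

2. Use the Paxos algorithm to guarantee a total order, especially where strong consistency is required. 
    The problem is that the algorithm is relatively expensive; for massive concurrent writes a high-availability solution is needed instead.

A high-availability solution must of course give something up: it cannot guarantee a total order.

The vector clock algorithm is one such solution. It can only achieve a partial order, because it is built on the partial-order theory described in the paper.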
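As a rough illustration of why vector clocks yield only a partial order, here is a minimal sketch. The function names (`increment`, `merge`, `compare`) and the dict-based representation are assumptions made for this example, not the API of any particular store.

```python
def increment(clock, node):
    """Return a copy of `clock` with the counter of `node` advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum: the smallest clock that dominates both inputs."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Classify two clocks as 'equal', 'before', 'after' or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"   # neither dominates: the versions conflict


v1 = increment({}, "A")    # first write, accepted by node A
v2 = increment(v1, "A")    # A updates its own version
v3 = increment(v1, "B")    # B updates v1 without having seen v2
resolved = merge(v2, v3)   # clock for a version that reconciles both
```

`v1` is ordered before both `v2` and `v3`, but `v2` and `v3` are concurrent: no order between them can be derived, so the conflict has to be resolved some other way (for example by the client).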

3.1.2 Overview of Consistency Techniques in NoSQL

Related blog posts:

An Overview of NoSQL Data Consistency Techniques

 

image

 

3.1.3 Quorum Read and Write

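This concept became well known through the design of Dynamo, but it is not limited to eventually consistent systems; it is a general approach to guaranteeing consistency. 
In a distributed environment it is unrealistic for W to reach N, so how can consistency be guaranteed in that case? 
In a master/slave architecture, if the master synchronously updates only W of the replicas, then a read that must see the latest data either goes through the master or reads at least R replicas, with R + W > N. 
Paxos can be built on the same idea.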
 

N: The number of replicas for the data or the piece of data to be read or written.

R: The number of machines contacted in read operations. 
W: The number of machines that have to be blocked in write operations. 
In order to provide, e.g., the read-your-own-writes consistency model, the following relation between the above parameters becomes necessary: 
R + W > N

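Some special cases: 
W = 1, R = N: writes need high performance and high availability. 
R = 1, W = N: reads need high performance and high availability, e.g. cache-like workloads. 
W = Q, R = Q where Q = N / 2 + 1 (integer division): suitable for general applications, balancing read and write performance. For example N=3, W=2, R=2.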
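The reason R + W > N works is that any read set and any write set must then share at least one replica. The following sketch checks this by brute force for small N; the function names are made up for this example.

```python
from itertools import combinations


def satisfies_quorum_rule(n, r, w):
    """The rule from the text: R + W > N."""
    return r + w > n


def every_read_overlaps_every_write(n, r, w):
    """Brute force: does every R-subset of replicas intersect every W-subset?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def majority(n):
    """Q = N / 2 + 1 with integer division."""
    return n // 2 + 1


# The arithmetic rule and the overlap property agree for all small configurations.
agreement = all(
    satisfies_quorum_rule(n, r, w) == every_read_overlaps_every_write(n, r, w)
    for n in range(1, 6)
    for r in range(1, n + 1)
    for w in range(1, n + 1)
)
```

With N=3 and W=R=`majority(3)`=2 the rule holds, so a read always contacts at least one replica that took part in the latest write; with W=1, R=1 it does not.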
 

 

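3.1.4 Eventual Consistency (BASE)

The most typical representative is, of course, Amazon Dynamo. 
In a high-availability solution any node can accept writes, which inevitably leads to divergent and conflicting versions. 
A technique is therefore needed to record the causal (partial-order) relationships between versions, and this is what vector clocks provide.

Furthermore, an update accepted at any node has to be propagated to the other replicas to reach eventual consistency, and this is the job of anti-entropy protocols.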

Related blog posts:

Vector Clocks

Why Vector Clocks Are Easy or Hard?

Anti-Entropy Protocols

 

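3.1.5 Strong Consistency

As shown in the lower right of the figure above, the master/slave approach, already described in the quotation above, is simple but very practical; Google used this design early on in both GFS and Bigtable. 
The most important algorithm here is Paxos, which is used in Google's Megastore.

Related blog posts:

Strong Consistency: An Overview of Techniques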

Paxos Made Simple

 

3.2 Data Partitioning (Sharding)

Related blog posts:

The Consistent Hashing Algorithm and Related Techniques
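This section only points to a post on consistent hashing, so the following is a generic sketch of the technique rather than anything taken from that post. The `HashRing` class, the use of MD5, and the number of virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a position on the ring (here: a 128-bit MD5 integer)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual nodes per physical node
        self._ring = []        # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning the first ring point at or after the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]   # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("node-c")
after = {k: ring.lookup(k) for k in keys}

# Only keys that lived on the removed node had to move.
moved = [k for k in keys if before[k] != after[k]]
```

The property that makes this useful for dynamic partitioning is visible in `moved`: removing a node relocates only the keys that node owned, instead of reshuffling everything as `hash(key) % n` would.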

 

3.3 Data Replication

Related blog posts:

Data Replication: Synchronization Techniques

 

3.4 Data Storage Layout

Row-Based Storage Layout

A table of a relational model gets serialized as its lines are appended and flushed to disk. 
Advantages 
a. whole datasets can be read and written in a single IO operation 
b. one has a “[g]ood locality of access (on disk and in cache) of different columns”. 
Disadvantages 
a. operating on columns is expensive as a considerable amount of data has to be read.

Columnar Storage Layout

Related blog posts:

Columnar Storage: A Comparison of Row-Based and Columnar Layouts

Serializes tables by appending their columns and flushing them to disk. 
Therefore operations on columns are fast and cheap while operations on rows are costly and can lead to seeks in a lot or all of the columns. A typical application field for this type of storage layout is analytics, where an efficient examination of columns for statistical purposes is important.
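To make the difference between the two layouts concrete, the sketch below serializes the same small table both ways into a flat list that stands in for the on-disk byte stream. The table contents and helper names are invented for this example.

```python
COLUMNS = ["id", "name", "age"]
ROWS = [
    {"id": 1, "name": "ann", "age": 31},
    {"id": 2, "name": "bob", "age": 25},
    {"id": 3, "name": "cat", "age": 40},
]


def row_layout(rows, columns):
    """Append row after row: r1c1, r1c2, r1c3, r2c1, ..."""
    return [row[c] for row in rows for c in columns]


def columnar_layout(rows, columns):
    """Append column after column: r1c1, r2c1, r3c1, r1c2, ..."""
    return [row[c] for c in columns for row in rows]


n_rows, n_cols = len(ROWS), len(COLUMNS)
row_flat = row_layout(ROWS, COLUMNS)
col_flat = columnar_layout(ROWS, COLUMNS)

# Whole row 1 from the row layout: one contiguous slice.
second_row = row_flat[1 * n_cols:2 * n_cols]

# Column "age" from the columnar layout: one contiguous slice.
age_idx = COLUMNS.index("age")
ages_columnar = col_flat[age_idx * n_rows:(age_idx + 1) * n_rows]

# Column "age" from the row layout: a strided read that skips over other columns.
ages_row_based = row_flat[age_idx::n_cols]
```

Both layouts return the same ages, but the columnar read touches one contiguous region while the row-based read has to step across every row, which is the cost the text describes.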

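Neither layout is inherently better; they suit different scenarios. If whole rows need to be read, row-based is the better fit; if only a few columns are needed, choose columnar. 
The column-family scheme below strikes a balance between the two.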

Columnar Storage Layout with Locality Groups

Similar to column-based storage but adds the feature of defining so-called locality groups, which are groups of columns expected to be accessed together by clients. 
The columns of such a group may therefore be stored together and physically separated from other columns and column groups. 
The idea of locality groups was introduced in Google’s Bigtable paper.

image 

 

3.5 Storage Implementation

The storage implementation can be pluggable: e.g., a local MySQL DB, Berkeley DB, the filesystem, or even an in-memory hash table can be used as the storage mechanism.

Systems with their own storage implementation: HBase, Couchbase

3.5.1 SSTables (Sorted String Tables) and Log-Structured Merge Trees (LSM-trees)

Related blog posts:

Big Data Indexing Techniques - B+ tree vs LSM tree

SSTable Structure and LSM-Tree Indexing in Detail

image

 

3.5.2 CouchDB Storage Implementation

Related blog posts:

NoSQL Databases - CouchDB

CouchDB has an MVCC model that uses a copy-on-modify approach. Any update causes a private copy of the modified data to be made, which in turn requires the index to be modified, so a private copy of the affected index nodes is made as well, all the way up to the root pointer.

image

Notice that the update happens in an append-only mode where the modified data is appended to the file and the old data becomes garbage. Periodic garbage collection is done to compact the data. Here is how the model is implemented in memory and on disk.

image
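The copy-on-modify idea described above can be sketched with path copying on an immutable tree. This is a simplified stand-in (a plain binary search tree held in memory), not CouchDB's actual B-tree or file format; `Node`, `insert` and `get` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, value):
    """Return a NEW root; only the nodes on the path to `key` are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)   # replace the value in a copy


def get(root, key):
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


v1 = None
for k, v in [("m", "1"), ("c", "1"), ("x", "1")]:
    v1 = insert(v1, k, v)

v2 = insert(v1, "c", "2")   # update: copies "c" and the root, nothing else
```

A reader still holding the old root `v1` keeps seeing the old value, while `v2` sees the update; the untouched subtree is shared between both versions, and the superseded nodes are the "garbage" that compaction would later reclaim.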

 

3.8 Query Models

Whereas key/value stores by design often only provide a lookup by primary key or some id field and lack capabilities to query any further fields, other datastores like the document databases CouchDB and MongoDB allow for complex queries—at least static ones predefined on the database nodes (as in CouchDB).

This is not surprising as in the design of many NoSQL databases rich dynamic querying features have been omitted in favor of performance and scalability.

On the other hand, also when using NoSQL databases, there are use-cases requiring at least some querying features for non-primary key attributes.

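NoSQL stores often support only primary-key queries and cannot handle complex queries such as range queries or queries on non-key attributes, although some, such as CouchDB and MongoDB, do support them.

Most of the purer NoSQL stores do not, however, because key/value queries are generally built on DHT (Distributed Hash Table) techniques, which support only exact matches.

If you want to use NoSQL and still have reasonably rich querying features, there are the following approaches: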

Companion SQL-database is an approach in which searchable attributes are copied to a SQL or text database. The querying capabilities of this database are used to retrieve the primary keys of matching datasets by which the NoSQL database will subsequently be accessed.

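The idea, shown in the figure, is to use a SQL database as the index. It is fairly simple, and because the index should be relatively small, scalability is less of a concern; problems remain, though, and maintaining two systems adds complexity.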
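A minimal sketch of the companion-database approach, assuming a plain dict as a stand-in for the NoSQL store and an in-memory SQLite database as the index. The table name, column names and helper functions are hypothetical.

```python
import sqlite3

kv_store = {}   # stand-in for the NoSQL store: primary key -> document

index_db = sqlite3.connect(":memory:")
index_db.execute(
    "CREATE TABLE user_index (pk TEXT PRIMARY KEY, city TEXT, age INTEGER)"
)


def put(pk, doc):
    """Write the document to the store and copy its searchable attributes to SQL."""
    kv_store[pk] = doc
    index_db.execute(
        "INSERT OR REPLACE INTO user_index (pk, city, age) VALUES (?, ?, ?)",
        (pk, doc["city"], doc["age"]),
    )


def find(where_sql, params=()):
    """Query SQL for matching primary keys, then fetch the documents by key.

    `where_sql` is trusted application code in this sketch, never user input.
    """
    cursor = index_db.execute(
        f"SELECT pk FROM user_index WHERE {where_sql} ORDER BY pk", params
    )
    return [kv_store[pk] for (pk,) in cursor]


put("u1", {"name": "ann", "city": "Berlin", "age": 31})
put("u2", {"name": "bob", "city": "Paris", "age": 25})
put("u3", {"name": "cat", "city": "Berlin", "age": 40})

older_berliners = find("city = ? AND age >= ?", ("Berlin", 35))
young = find("age BETWEEN ? AND ?", (25, 31))
```

Every write now touches two systems, which is the added complexity noted above: if the two writes in `put` are not kept in step, the index and the store can drift apart.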

 image

 

Scatter/Gather Local Search can be used if the NoSQL store allows querying and indexing within database server nodes. If this is the case, a query processor can dispatch queries to the database nodes, where the query is executed locally. The results from all database servers are sent back to the query processor, which postprocesses them (e.g. to do some aggregation) and returns the results to the client that issued the query.
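The scatter/gather flow can be sketched as follows, with in-process lists standing in for database server nodes and a thread pool standing in for network dispatch. The shard contents and function names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for one database server node.
SHARDS = [
    [{"id": 1, "city": "Berlin", "age": 31}, {"id": 4, "city": "Paris", "age": 52}],
    [{"id": 2, "city": "Berlin", "age": 25}],
    [{"id": 3, "city": "Rome", "age": 40}, {"id": 5, "city": "Berlin", "age": 47}],
]


def local_search(shard, predicate):
    """Runs on a single node: evaluate the query against local data only."""
    return [doc for doc in shard if predicate(doc)]


def scatter_gather(shards, predicate, sort_key):
    """Query processor: dispatch to all nodes, then merge and postprocess."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: local_search(shard, predicate), shards))
    merged = [doc for part in partials for doc in part]
    return sorted(merged, key=sort_key)   # postprocessing step: global ordering


berliners = scatter_gather(
    SHARDS,
    predicate=lambda doc: doc["city"] == "Berlin",
    sort_key=lambda doc: doc["age"],
)
```

Note that every query hits every node, so the approach trades query flexibility for fan-out cost as the cluster grows.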

image 
Distributed B+Trees are another alternative to implement querying features. The basic idea is to hash the searchable attribute to locate the root node of a distributed B+tree (further information on scalable, distributed B+Trees can be found in a paper by Microsoft, HP and the University of Toronto, cf. [AGS08]). The “value” of this root node then contains an id for a child node in the B+tree which can again be looked up. This process is repeated until a leaf node is reached which contains the primary-key or id of a NoSQL database entry matching search criteria.

image

Prefix Hash Table (aka Distributed Trie) is a tree data structure where every path from the root node to the leaves contains the prefix of the key and every node in the trie contains all the data whose key is prefixed by it (for further information cf. a Berkeley paper on this data structure [RRHS04]). Besides an illustration, Ho provides some code snippets in his blog post that describe how to operate on prefix hash tables / distributed tries and how to use them for querying purposes (cf. [Ho09b]).

A prefix hash table efficiently supports 1-dimensional range queries over a DHT.
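The sketch below follows the simplified description above, in which every trie node holds all keys sharing its prefix, and uses a dict as a stand-in for the DHT's exact-match get/put. It is an assumption-laden toy: a real prefix hash table keeps data only in leaves and splits them as they fill up. The key width and function names are invented for this example.

```python
from collections import defaultdict

WIDTH = 8                 # keys are fixed-width binary strings
dht = defaultdict(set)    # stand-in for the DHT: prefix label -> keys under it


def encode(n):
    return format(n, f"0{WIDTH}b")


def insert(n):
    """Register the key under every one of its prefixes (one DHT put per level)."""
    key = encode(n)
    for i in range(WIDTH + 1):
        dht[key[:i]].add(key)


def common_prefix(a, b):
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    return a[:i]


def range_query(lo, hi):
    """One exact-match DHT lookup at the longest common prefix, then a local filter."""
    lo_key, hi_key = encode(lo), encode(hi)
    node = dht.get(common_prefix(lo_key, hi_key), set())
    return sorted(int(k, 2) for k in node if lo_key <= k <= hi_key)


for value in (3, 17, 18, 20, 200):
    insert(value)

in_range = range_query(16, 23)
```

Because all fixed-width keys between `lo` and `hi` share the common prefix of the two bounds, a range query reduces to an exact-match lookup, which is the only operation a DHT offers.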

image

 

4 Mainstream NoSQL Systems

4.1 BigTable, HBase

Bigtable: A Distributed Storage System for Structured Data

HBase-TDG Introduction

HBase-TDG ClientAPI The Basics

HBase-TDG ClientAPI Advanced Features

HBase-TDG Architecture: SSTable and LSMTree

HBase-TDG Schema Design

HBase vs. BigTable Comparison

 

4.2 KV

Dynamo: Amazon’s Highly Available Key-value Store 
Cassandra - A Decentralized Structured Storage System

 

4.3 Document DB

NoSQL Databases - MongoDB

NoSQL Databases - CouchDB

Comparing Mongo DB and Couch DB

MongoDB Schema Design


This article is excerpted from cnblogs (博客园); original publication date: 2012-06-13
