PostgreSQL sharding : citus 系列2 - TPC-H

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介: 标签PostgreSQL , citus , tpc-h背景紧接着上一篇文档,本文测试citus的tpc-h能力(包括兼容性).《PostgreSQL sharding : citus 系列1 - 多机部署(含OLTP(TPC-B)测试)》《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》https://github.com/digoal/gp_tpch实际测试过程中,发现CITUS对TPC-H的SQL支持并不完整。

标签

PostgreSQL , citus , tpc-h


背景

紧接着上一篇文档,本文测试citus的tpc-h能力(包括兼容性).

《PostgreSQL sharding : citus 系列1 - 多机部署(含OLTP(TPC-B)测试)》

《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》

https://github.com/digoal/gp_tpch

实际测试过程中,发现CITUS对TPC-H的SQL支持并不完整。

citus tpc-h 测试

1、下载gp_tpch包

git clone https://github.com/digoal/gp_tpch  

2、生成200G测试数据

cd gp_tpch  
ln -s `pwd` /tmp/dss-data  
./dbgen -s 200  

3、将数据转换为PG识别的格式

for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; echo $i; done;  

4、生成测试SQL

SF=200  
  
mkdir dss/queries  
for q in `seq 1 22`  
do  
    DSS_QUERY=dss/templates ./qgen -s $SF $q > dss/queries/$q.sql  
    sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql  
done  

5、修改citus 的几个参数,确保在跑两类QUERY的时候不报错(末尾会提到报错原因)。

alter role postgres set citus.enable_repartition_joins =on;  
alter role postgres set citus.max_intermediate_result_size =-1;  

6、测试TPC-H(与coordinator同一台主机上测试)

./tpch.sh ./results 127.0.0.1 port tpch-db tpch-user password citus  

tpc-h 性能

有些SQL不支持,显示0.

2018-08-29 19:24:30 [1535541870] :     query 1 finished OK (12 seconds)
2018-08-29 19:24:30 [1535541870] :     query 2 finished OK (0 seconds)
2018-08-29 19:25:48 [1535541948] :     query 3 finished OK (77 seconds)
2018-08-29 19:25:50 [1535541950] :     query 4 finished OK (2 seconds)
2018-08-29 19:29:45 [1535542185] :     query 5 finished OK (234 seconds)
2018-08-29 19:29:47 [1535542187] :     query 6 finished OK (1 seconds)
2018-08-29 19:37:33 [1535542653] :     query 7 finished OK (465 seconds)
2018-08-29 19:44:30 [1535543070] :     query 8 finished OK (415 seconds)
2018-08-29 19:58:29 [1535543909] :     query 9 finished OK (837 seconds)
2018-08-29 20:00:26 [1535544026] :     query 10 finished OK (116 seconds)
2018-08-29 20:00:26 [1535544026] :     query 11 finished OK (0 seconds)
2018-08-29 20:00:32 [1535544032] :     query 12 finished OK (6 seconds)
2018-08-29 20:00:33 [1535544033] :     query 13 finished OK (0 seconds)
2018-08-29 20:01:40 [1535544100] :     query 14 finished OK (67 seconds)
2018-08-29 20:05:33 [1535544333] :     query 15 finished OK (232 seconds)
2018-08-29 20:05:34 [1535544334] :     query 16 finished OK (0 seconds)
2018-08-29 20:05:34 [1535544334] :     query 17 finished OK (0 seconds)
2018-08-29 20:05:34 [1535544334] :     query 18 finished OK (0 seconds)
2018-08-29 20:06:51 [1535544411] :     query 19 finished OK (76 seconds)
2018-08-29 20:06:51 [1535544411] :     query 20 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] :     query 21 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] :     query 22 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] : finished TPC-H benchmark

citus tpc-h SQL文件讲解

1、分片字段

select create_distributed_table('part','p_partkey');  
select create_distributed_table('region','r_regionkey');  
select create_distributed_table('nation','n_nationkey');  
select create_distributed_table('supplier','s_suppkey');  
select create_distributed_table('customer','c_custkey');  
select create_distributed_table('partsupp','ps_suppkey');  
select create_distributed_table('orders','o_orderkey');  
select create_distributed_table('lineitem','l_orderkey');  

2、colocate(默认情况下是同一个分组,不需要colocate)

SELECT mark_tables_colocated('part', ARRAY['region', 'nation', 'supplier', 'customer', 'partsupp', 'orders', 'lineitem']);  
  
默认分组,在创建分片表时colocate默认为default, 

只要表的shard数量、分片类型(append, or hash)、colocate组 都一致,则他们就是colocate的。

create_distributed_table

参数:colocate_with: (Optional) include current table in the co-location group of another table. 
By default tables are co-located when they are distributed by columns of the same type, 
have the same shard count, and have the same replication factor. 
Possible values for colocate_with are default, none to start a new co-location group, 
or the name of another table to co-locate with that table. (See Co-Locating Tables.)

3、索引

ALTER TABLE PART ADD constraint pk1 PRIMARY KEY (P_PARTKEY);  
ALTER TABLE SUPPLIER ADD constraint pk2 PRIMARY KEY (S_SUPPKEY);  
ALTER TABLE PARTSUPP ADD constraint pk3 PRIMARY KEY (PS_PARTKEY, PS_SUPPKEY);  
ALTER TABLE CUSTOMER ADD constraint pk4 PRIMARY KEY (C_CUSTKEY);  
ALTER TABLE ORDERS ADD constraint pk5 PRIMARY KEY (O_ORDERKEY);  
ALTER TABLE LINEITEM ADD constraint pk6 PRIMARY KEY (L_ORDERKEY, L_LINENUMBER);  
ALTER TABLE NATION ADD constraint pk7 PRIMARY KEY (N_NATIONKEY);  
ALTER TABLE REGION ADD constraint pk8 PRIMARY KEY (R_REGIONKEY);  
  
CREATE INDEX idx_nation_regionkey ON public.nation USING btree (n_regionkey);  
CREATE INDEX idx6 ON public.nation USING btree (n_nationkey, n_regionkey);  
  
CREATE INDEX idx5 ON public.region USING btree (r_name, r_regionkey);  
  
CREATE INDEX IDX_CUSTOMER_NATIONKEY ON CUSTOMER (C_NATIONKEY);  
  
CREATE INDEX IDX_LINEITEM_PART_SUPP ON LINEITEM (L_PARTKEY,L_SUPPKEY);  
CREATE INDEX idx_lineitem_shipdate ON public.lineitem USING btree (l_shipdate, l_discount, l_quantity);  
CREATE INDEX idx_lineitem__2 ON public.lineitem USING btree (l_partkey);  
CREATE INDEX idx_lineitem__3 ON public.lineitem USING btree (l_suppkey);  
CREATE INDEX idx_lineitem__11 ON public.lineitem USING btree (l_shipdate);  
CREATE INDEX idx_lineitem_orderkey ON public.lineitem USING btree (l_orderkey);  
CREATE INDEX idx1 ON public.lineitem USING btree (l_orderkey) WHERE (l_commitdate < l_receiptdate);  
  
CREATE INDEX idx_orders__6 ON public.orders USING btree (o_orderpriority);  
CREATE INDEX idx_orders_orderdate ON public.orders USING btree (o_orderdate);  
CREATE INDEX idx_orders_custkey ON public.orders USING btree (o_custkey)  
  
CREATE INDEX idx_part__5 ON public.part USING btree (p_type);  
CREATE INDEX idx_part__6 ON public.part USING btree (p_size);  
CREATE INDEX idx_part_1 ON public.part USING btree (p_container, p_brand);  
  
CREATE INDEX idx_supplier_nation_key ON public.supplier USING btree (s_nationkey);  
CREATE INDEX idx4 ON public.supplier USING btree (s_suppkey, s_nationkey);  
  
CREATE INDEX idx ON public.partsupp USING btree (ps_partkey, ps_suppkey, ps_supplycost);  
CREATE INDEX idx_partsupp_partkey ON public.partsupp USING btree (ps_partkey);  
CREATE INDEX idx_partsupp_suppkey ON public.partsupp USING btree (ps_suppkey);  
CREATE INDEX idx_partsupp__4 ON public.partsupp USING btree (ps_supplycost);  

citus测试TPC-H遇到的兼容性报错与解决方法

错误1

ERROR:  could not run distributed query with subquery outside the FROM and WHERE clauses  
HINT:  Consider using an equality filter on the distributed table's partition column.  

报错SQL

select  
        ps_partkey,  
        sum(ps_supplycost * ps_availqty) as value  
from  
        partsupp,  
        supplier,  
        nation  
where  
        ps_suppkey = s_suppkey  
        and s_nationkey = n_nationkey  
        and n_name = 'SAUDI ARABIA'  
group by  
        ps_partkey having  
                sum(ps_supplycost * ps_availqty) > (  
                        select  
                                sum(ps_supplycost * ps_availqty) * 0.0000005000  
                        from  
                                partsupp,  
                                supplier,  
                                nation  
                        where  
                                ps_suppkey = s_suppkey  
                                and s_nationkey = n_nationkey  
                                and n_name = 'SAUDI ARABIA'  
                )  
order by  
        value desc  
LIMIT 1;  

改成

select  
        ps_partkey,  
        sum(ps_supplycost * ps_availqty) as value  
from  
        partsupp,  
        supplier,  
        nation  
where  
        ps_suppkey = s_suppkey  
        and s_nationkey = n_nationkey  
        and n_name = 'SAUDI ARABIA'  
group by  
        ps_partkey having  
                sum(ps_supplycost * ps_availqty) >   
		             ( sum(ps_supplycost * ps_availqty) filter   
			       (where  
                                ps_suppkey = s_suppkey  
                                and s_nationkey = n_nationkey  
                                and n_name = 'SAUDI ARABIA'  
			        )   
			     ) * 0.0000005000  
                         
order by  
        value desc  
LIMIT 1;  

错误2

ERROR:  the query contains a join that requires repartitioning
HINT:  Set citus.enable_repartition_joins to on to enable repartitioning

报错SQL

select
        l_orderkey,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        o_orderdate,
        o_shippriority
from
        customer,
        orders,
        lineitem
where
        c_mktsegment = 'AUTOMOBILE'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < date '1995-03-08'
        and l_shipdate > date '1995-03-08'
group by
        l_orderkey,
        o_orderdate,
        o_shippriority
order by
        revenue desc,
        o_orderdate
LIMIT 10;

有些SQL需要repartition join,需要设置如下参数

alter role postgres set citus.enable_repartition_joins =on;  

错误3

ERROR:  the intermediate result size exceeds citus.max_intermediate_result_size (currently 1048576 kB)  
DETAIL:  Citus restricts the size of intermediate results of complex subqueries and CTEs to avoid accidentally pulling large result sets into once place.  
HINT:  To run the current query, set citus.max_intermediate_result_size to a higher value or -1 to disable.  

有些SQL的中间结果比较大,可以设置为无限制,重新测试

alter role postgres set citus.max_intermediate_result_size =-1;  

错误4

ERROR:  0A000: complex joins are only supported when all distributed tables are joined on their distribution columns with equal operator  

复杂JOIN,仅支持分布键作为JOIN的字段,并且只能使用等值JOIN。

postgres=# explain select          
        sum(l_extendedprice) / 7.0 as avg_yearly  
from  
        lineitem join   
        part on (p_partkey = l_partkey and p_brand = 'Brand#13' and p_container = 'SM PKG') join  
        (SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_agg  
on (agg_partkey = l_partkey) limit 10;  
  
ERROR:  0A000: complex joins are only supported when all distributed tables are joined on their distribution columns with equal operator  
LOCATION:  DeferErrorIfUnsupportedSubqueryPushdown, query_pushdown_planning.c:469  

对应的代码如下

https://github.com/citusdata/citus/blob/d63cbf382288358533b71d907f6a3a7fc8a1df5f/src/backend/distributed/planner/query_pushdown_planning.c

	else if (!RestrictionEquivalenceForPartitionKeys(plannerRestrictionContext))  
	{  
		return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,  
							 "complex joins are only supported when all distributed tables are "  
							 "joined on their distribution columns with equal operator",  
							 NULL, NULL);  
	}  

错误5

ERROR:  failed to execute task 18  

报错SQL

select  
        ps_partkey,  
        sum(ps_supplycost * ps_availqty) as value  
from  
        partsupp,  
        supplier,  
        nation  
where  
        ps_suppkey = s_suppkey  
        and s_nationkey = n_nationkey  
        and n_name = 'SAUDI ARABIA'  
group by  
        ps_partkey having  
                sum(ps_supplycost * ps_availqty) > ( sum(ps_supplycost * ps_availqty) filter (where  
                                ps_suppkey = s_suppkey  
                                and s_nationkey = n_nationkey  
                                and n_name = 'SAUDI ARABIA') ) * 0.0000005000  
                         
order by  
        value desc  
LIMIT 1;  

错误6

ERROR:  0A000: cannot run outer join query if join is not on the partition column
DETAIL:  Outer joins requiring repartitioning are not supported.
LOCATION:  FixedJoinOrderList, multi_join_order.c:189

报错SQL

select
        c_count,
        count(*) as custdist
from
        (
                select
                        c_custkey,
                        count(o_orderkey)
                from
                        customer left outer join orders on
                                c_custkey = o_custkey
                                and o_comment not like '%unusual%requests%'
                group by
                        c_custkey
        ) as c_orders (c_custkey, c_count)
group by
        c_count
order by
        custdist desc,
        c_count desc
LIMIT 1;

小结

citus的复杂SQL的语法支持还不是非常完备,如果要实现比较复杂的运算,建议Writing in SQL, thinking in MapReduce

https://www.citusdata.com/blog/2018/08/17/breaking-down-citus-real-time-executor/

这样的话,基本上你可以避免掉复杂的SQL,使用分解动作来实现。

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
7月前
|
弹性计算 关系型数据库 OLAP
快速上手并跑通AnalyticDB PostgreSQL版TPC-H测试
本案例介绍如何创建AnalyticDB PostgreSQL实例、生成测试数据、建表、导入数据,并跑通TPC-H查询。
335 0
|
11月前
|
SQL 分布式计算 关系型数据库
「PostgreSQL技巧」Citus实时执行程序如何并行化查询
「PostgreSQL技巧」Citus实时执行程序如何并行化查询
|
SQL 存储 运维
从Citus深度解密如何基于PostgreSQL做分布式数据库
从源码级别揭秘Citus如何基于PostgreSQL做一款分布式数据库,解决分布式场景的数据分片、分布式SQL、分布式事务、数据倾斜、数据迁移等难点问题,理解分布式领域设计的“取”与“舍”。
1463 3
从Citus深度解密如何基于PostgreSQL做分布式数据库
|
存储 SQL 安全
分布式 PostgreSQL,Citus(11.x) 效用函数
分布式 PostgreSQL,Citus(11.x) 效用函数
569 0
|
SQL 缓存 关系型数据库
分布式 PostgreSQL,Citus 11.x SQL 参考(中文手册)
分布式 PostgreSQL,Citus 11.x SQL 参考(中文手册)
519 0
|
SQL 关系型数据库 物联网
Hyperscale (Citus) ,分布式 PostgreSQL 实战指南
Hyperscale (Citus) ,分布式 PostgreSQL 实战指南
260 0
|
SQL 关系型数据库 PostgreSQL
Citus 11(分布式 PostgreSQL) 文档贡献与本地运行
Citus 11(分布式 PostgreSQL) 文档贡献与本地运行
160 0
Citus 11(分布式 PostgreSQL) 文档贡献与本地运行
|
存储 SQL 关系型数据库
分布式 PostgreSQL 集群(Citus),官方快速入门教程
分布式 PostgreSQL 集群(Citus),官方快速入门教程
321 0
|
SQL 存储 NoSQL
分布式 PostgreSQL 集群(Citus),分布式表中的分布列选择最佳实践
分布式 PostgreSQL 集群(Citus),分布式表中的分布列选择最佳实践
491 0
分布式 PostgreSQL 集群(Citus),分布式表中的分布列选择最佳实践
|
网络协议 Ubuntu 关系型数据库
分布式 PostgreSQL 集群(Citus)官方安装指南
分布式 PostgreSQL 集群(Citus)官方安装指南
1044 0

相关产品

  • 云原生数据库 PolarDB