Greenplum: Questions to Consider Before Partitioning

2017-11-15

Deciding on a Table Partitioning Strategy

Questions to consider before partitioning

Not all tables are good candidates for partitioning. If the answer is yes to all or most of the following questions, then table partitioning is a viable database design strategy for improving query performance. If the answer is no to most of the following questions, then table partitioning is not the right solution for that table:

• Is the table large enough?

Is the table big enough? More than ten million records?

Large fact tables are good candidates for table partitioning. If you have millions or billions of records in a table, you will see performance benefits from logically breaking that data up into smaller chunks. For smaller tables with only a few thousand rows or fewer, the administrative overhead of maintaining the partitions will outweigh any performance benefits you might see.

• Are you experiencing unsatisfactory performance?

Are you satisfied with query performance?

As with any performance tuning initiative, a table should be partitioned only if queries against that table are producing slower response times than desired.

• Do your query predicates have identifiable access patterns?

Do the WHERE conditions of your queries follow a recognizable pattern?

Examine the WHERE clauses of your query workload and look for table columns that are consistently used to access data. For example, if most of your queries tend to look up records by date, then a monthly or weekly date-partitioning design might be beneficial. Or if you tend to access records by region, consider a list-partitioning design to divide the table by region.
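One rough way to examine a workload's WHERE clauses is to count which columns appear in predicates across a sample of queries. The sketch below is only an illustration: the `sales` table, its columns, the sample queries, and the helper name `where_column_counts` are all hypothetical, and the regular expression is a naive match for "identifier followed by a comparison operator", not a real SQL parser.

```python
import re
from collections import Counter

# Hypothetical sample workload; table and column names are illustrative only.
WORKLOAD = [
    "SELECT * FROM sales WHERE sale_date >= '2017-01-01' AND region = 'east'",
    "SELECT sum(amount) FROM sales WHERE sale_date BETWEEN '2017-02-01' AND '2017-02-28'",
    "SELECT * FROM sales WHERE customer_id = 42",
    "SELECT count(*) FROM sales WHERE sale_date = '2017-03-05'",
]

# Naive pattern: an identifier immediately followed by a comparison operator.
PREDICATE_COLUMN = re.compile(
    r"\b([a-z_][a-z0-9_]*)\s*(?:=|<>|<=|>=|<|>|\bbetween\b|\bin\b|\blike\b)",
    re.IGNORECASE,
)


def where_column_counts(queries):
    """Count how many queries reference each column in their WHERE clause."""
    counts = Counter()
    for query in queries:
        # Keep only the text after the first WHERE keyword, if there is one.
        parts = re.split(r"\bwhere\b", query, maxsplit=1, flags=re.IGNORECASE)
        if len(parts) < 2:
            continue
        # Use a set so a column counts once per query, however often it appears.
        columns = {m.group(1).lower() for m in PREDICATE_COLUMN.finditer(parts[1])}
        counts.update(columns)
    return counts


if __name__ == "__main__":
    for column, hits in where_column_counts(WORKLOAD).most_common():
        print(f"{column}: used in {hits} of {len(WORKLOAD)} queries")
```

A column that dominates the counts (here the hypothetical `sale_date`) is a candidate partition key. For a real workload, a proper SQL parser or the database's own query logs would be a more reliable source than this regular expression.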

• Does your data warehouse maintain a window of historical data?

Is the data's time window of fixed length? For example, keeping only 12 months?

Another consideration for partition design is your organization's business requirements for maintaining historical data. For example, your data warehouse may only require you to keep the past twelve months' worth of data. If the data is partitioned by month, you can easily drop the oldest monthly partition from the warehouse, and load current data into the most recent monthly partition.
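The rolling-window idea can be sketched as a small planning step: given the monthly partitions that exist and the month being loaded, work out which partitions fall outside the window and which are missing. This is a minimal illustration, assuming each partition is identified by the first day of its month; the function names and the 12-month default are assumptions made for the example, not part of Greenplum.

```python
from datetime import date


def add_months(d, n):
    """Return the first day of the month n months after d (n may be negative)."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, 1)


def plan_rolling_window(existing, current_month, keep_months=12):
    """Return (drop, add): months to drop and months to create.

    existing: iterable of dates, each the first day of a partitioned month.
    current_month: first day of the month currently being loaded.
    keep_months: length of the retention window, including current_month.
    """
    oldest_kept = add_months(current_month, -(keep_months - 1))
    wanted = [add_months(oldest_kept, i) for i in range(keep_months)]
    existing = set(existing)
    drop = sorted(m for m in existing if m < oldest_kept)
    add = [m for m in wanted if m not in existing]
    return drop, add


if __name__ == "__main__":
    # Hypothetical state: partitions for 2016-11 through 2017-10 already exist.
    have = [add_months(date(2016, 11, 1), i) for i in range(12)]
    to_drop, to_add = plan_rolling_window(have, date(2017, 11, 1))
    print("drop:", to_drop)
    print("add:", to_add)
```

In Greenplum, each month in the drop list would typically become an `ALTER TABLE ... DROP PARTITION` statement and each month in the add list an `ALTER TABLE ... ADD PARTITION` statement. The exact DDL depends on how the table was defined, so it is left out of the sketch.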

• Can the data be divided into somewhat equal parts based on some defining criteria?

Is each partition roughly the same size?

You should choose partitioning criteria that will divide your data as evenly as possible. If the partitions contain a relatively equal number of records, query performance improves roughly in proportion to the number of partitions created. For example, if a large table is divided into 10 partitions, a query that needs only one of them scans about a tenth of the data and can run up to roughly 10 times faster than it would against the unpartitioned table (provided that the partitions are designed to support the query's criteria).
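The effect of even versus skewed partitions can be checked with simple arithmetic on per-partition row counts. The sketch below is an illustration under assumed numbers: the partition names, row counts, and the two helper functions are hypothetical. It also uses the fraction of rows scanned as a stand-in for query cost, which ignores fixed overheads such as planning time.

```python
def scan_fraction(partition_rows, matching_partitions):
    """Fraction of the table's rows a query scans after partition elimination."""
    total = sum(partition_rows.values())
    scanned = sum(partition_rows[name] for name in matching_partitions)
    return scanned / total


def skew_ratio(partition_rows):
    """Largest partition divided by the mean partition size (1.0 is perfectly even)."""
    sizes = list(partition_rows.values())
    return max(sizes) / (sum(sizes) / len(sizes))


if __name__ == "__main__":
    # Hypothetical row counts: ten equal partitions versus one dominant partition.
    even = {f"p{i}": 1000 for i in range(1, 11)}
    skewed = {"p1": 9100, **{f"p{i}": 100 for i in range(2, 11)}}

    for label, layout in (("even", even), ("skewed", skewed)):
        print(label, "skew ratio:", skew_ratio(layout),
              "fraction scanned for p1:", scan_fraction(layout, ["p1"]))
```

With the even layout, a query confined to one partition reads a tenth of the rows. With the skewed layout, a query that lands on the dominant partition still reads most of the table, so the partitioning yields little benefit for that query.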

This article is reposted from hexiaini235's 51CTO blog. Original link: http://blog.51cto.com/idata/1266063. To republish it, please contact the original author.
