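Tablestore Getting Started Guide: Using Global Secondary Indexes

Overview

A global secondary index has the same storage structure as the main table. Its index columns can be primary key columns or predefined columns of the main table, and its attribute columns are predefined columns of the main table. Data written to the main table is synchronized asynchronously to the global secondary index, with a delay on the order of milliseconds, after which it can be queried there.
As with the main table, read operations such as GetRow, BatchGetRow, and GetRange work on a secondary index. Unlike the main table, a secondary index cannot be written to directly by users; it only accepts data synchronized from the main table.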

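By specifying a range over a primary key prefix, you can run a range scan (GetRange) on the main table; the query range must be specified in a way that is consistent with the primary key order. If the query range cannot be expressed as a primary key prefix, you can use a secondary index to rearrange the column order. Compared with a range query (GetRange) plus a filter, a secondary index can greatly reduce the amount of data scanned and speed up the query.
This article uses an example to explain why secondary indexes speed up queries and how to use them, with complete code.
The example is call detail record queries: every call a user makes is recorded in a main table with the following columns.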

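CellNumber and StartTime form the composite primary key of the table. They represent the calling number and the time the call started.
CalledNumber, Duration, and BaseStationNumber are predefined columns of the table. They represent the called number, the call duration, and the base station number.
Consider two query scenarios: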

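Query 1: Find the records of calls made by number 234567

A GetRange on the main table is enough: set both the minimum and maximum CellNumber to 234567, and set the minimum StartTime to 0 and the maximum to INT_MAX.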
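
To make the prefix idea concrete, here is a minimal sketch that uses an in-memory TreeMap as a stand-in for the main table. It is not Tablestore code: the class name, the helper methods, and the sample rows are all made up for illustration. It only shows why fixing the first primary key column turns the query into a scan over one contiguous key range.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of the main table; not part of the Tablestore SDK.
class MainTableScanSketch {
    // Composite primary key (cellNumber, startTime), compared column by column.
    static final Comparator<long[]> PK_ORDER =
            Comparator.<long[]>comparingLong(k -> k[0]).thenComparingLong(k -> k[1]);

    // Sorted map: primary key -> calledNumber.
    static final TreeMap<long[], Long> TABLE = new TreeMap<>(PK_ORDER);

    static void put(long cellNumber, long startTime, long calledNumber) {
        TABLE.put(new long[] {cellNumber, startTime}, calledNumber);
    }

    // Start times of all calls made by cellNumber. Only keys with that prefix are visited.
    static List<Long> callsFrom(long cellNumber) {
        long[] start = {cellNumber, 0L};
        long[] end = {cellNumber, Long.MAX_VALUE};
        List<Long> startTimes = new ArrayList<>();
        for (long[] key : TABLE.subMap(start, true, end, true).keySet()) {
            startTimes.add(key[1]);
        }
        return startTimes;
    }

    public static void main(String[] args) {
        // Made-up sample rows.
        put(123456L, 1532574644L, 654321L);
        put(234567L, 1532574714L, 765432L);
        put(234567L, 1532574734L, 123456L);
        put(345678L, 1532574795L, 123456L);
        System.out.println(callsFrom(234567L)); // prints [1532574714, 1532574734]
    }
}
```

Because the rows are sorted by (CellNumber, StartTime), the two matching rows sit next to each other and the scan never touches the rows of other callers.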

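Query 2: Find the records of calls received by number 123456, and return the base station number

Querying the main table directly would require scanning the whole table and then filtering for rows whose CalledNumber is 123456, which is slow and expensive. In this case you can create a global secondary index table, as described below.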

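Global secondary index schema

Put the called number, CalledNumber, in the pk columns of the secondary index table. The system automatically appends the main table's pk columns to complete the index table's pk columns, which keeps each row unique. Because the query must return the base station number, BaseStationNumber, you can make that column an attribute column of the index table; otherwise you would have to take the pk values returned by the index query and look them up in the main table. The resulting secondary index schema is therefore as follows.

CalledNumber is the index pk column specified by the user; CellNumber and StartTime are the index pk columns that the system completes automatically.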
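
The completion rule can be sketched as a small function. This is an illustration only, assuming the rule is "keep the user-specified index columns first, then append every main-table pk column that is not already present"; the class and method names are hypothetical and are not part of the Tablestore SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that models how the index pk columns are completed.
class IndexSchemaSketch {
    // Index pk = user-chosen index columns, followed by the main-table pk columns
    // that are not already among them (assumed rule, see the lead-in).
    static List<String> completeIndexPrimaryKey(List<String> indexColumns,
                                                List<String> tablePrimaryKey) {
        List<String> result = new ArrayList<>(indexColumns);
        for (String pk : tablePrimaryKey) {
            if (!result.contains(pk)) {
                result.add(pk);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(completeIndexPrimaryKey(
                Arrays.asList("CalledNumber"),
                Arrays.asList("CellNumber", "StartTime")));
        // prints [CalledNumber, CellNumber, StartTime]
    }
}
```

Appending the main table's pk columns is what keeps two calls to the same called number from colliding in the index.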

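Querying the global secondary index

Specify a primary key prefix range with CalledNumber from 123456 to 123456, CellNumber from INT_MIN to INT_MAX, and StartTime from INT_MIN to INT_MAX. Query the index table with GetRange and include the base station number in the columns to return.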
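
The same range can be modeled in memory to show what the reordered key buys you. The sketch below is not Tablestore code: it assumes a TreeMap keyed by (calledNumber, cellNumber, startTime) as a stand-in for the index table, and the class name, method names, and sample rows are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of the index table; not part of the Tablestore SDK.
class IndexScanSketch {
    // Compare keys column by column, like a composite primary key.
    static final Comparator<long[]> KEY_ORDER = (a, b) -> {
        for (int i = 0; i < a.length; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    };

    // Index key (calledNumber, cellNumber, startTime) -> baseStationNumber.
    static final TreeMap<long[], Long> INDEX = new TreeMap<>(KEY_ORDER);

    static void addCall(long cellNumber, long startTime, long calledNumber, long baseStation) {
        INDEX.put(new long[] {calledNumber, cellNumber, startTime}, baseStation);
    }

    // Base station numbers of all calls received by calledNumber, in index key order.
    static List<Long> baseStationsForCalled(long calledNumber) {
        long[] start = {calledNumber, Long.MIN_VALUE, Long.MIN_VALUE};
        long[] end = {calledNumber, Long.MAX_VALUE, Long.MAX_VALUE};
        return new ArrayList<>(INDEX.subMap(start, true, end, true).values());
    }

    public static void main(String[] args) {
        // Made-up sample rows.
        addCall(123456L, 1532574644L, 654321L, 1001L);
        addCall(234567L, 1532574714L, 765432L, 1002L);
        addCall(234567L, 1532574734L, 123456L, 1003L);
        addCall(345678L, 1532574795L, 123456L, 1004L);
        System.out.println(baseStationsForCalled(123456L)); // prints [1003, 1004]
    }
}
```

With CalledNumber as the leading key column, the calls received by 123456 are adjacent, so the query reads two entries instead of filtering all four.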

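The main code for creating, querying, and deleting a global secondary index follows.

Create

There are two ways to create an index, and the result is the same.

Option 1: Create the global secondary index table together with the main table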

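import com.alicloud.openservices.tablestore.SyncClient;
import com.alicloud.openservices.tablestore.model.*;
import java.util.Arrays;

// IndexMeta for the secondary index
IndexMeta indexMeta = new IndexMeta(indexName);
indexMeta.addPrimaryKeyColumn(CALLED_NUMBER);       // use the main table's predefined column "called_number" as a pk column of the index table
// the remaining two pk columns of the index table, "cell_number" and "start_time", are completed automatically
indexMeta.addDefinedColumn(BASE_STATION_NUMBER);    // use the main table's predefined column "base_station_number" as an attribute column of the index table

// Create the index table together with the main table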
TableMeta tableMeta = new TableMeta(tableName);

tableMeta.addPrimaryKeyColumn(CELL_NUMBER, PrimaryKeyType.INTEGER);
tableMeta.addPrimaryKeyColumn(START_TIME, PrimaryKeyType.INTEGER);

tableMeta.addDefinedColumn(new DefinedColumnSchema(CALLED_NUMBER, DefinedColumnType.INTEGER));
tableMeta.addDefinedColumn(new DefinedColumnSchema(BASE_STATION_NUMBER, DefinedColumnType.INTEGER));

TableOptions tableOptions = new TableOptions(-1, 1);
CreateTableRequest createTableRequest = new CreateTableRequest(tableMeta, tableOptions, Arrays.asList(indexMeta));
syncClient.createTable(createTableRequest);

Option 2: Create the main table first, then add a secondary index table to the existing main table

Create the main table

TableMeta tableMeta = new TableMeta(tableName);

tableMeta.addPrimaryKeyColumn(CELL_NUMBER, PrimaryKeyType.INTEGER);
tableMeta.addPrimaryKeyColumn(START_TIME, PrimaryKeyType.INTEGER);

tableMeta.addDefinedColumn(new DefinedColumnSchema(CALLED_NUMBER, DefinedColumnType.INTEGER));
tableMeta.addDefinedColumn(new DefinedColumnSchema(BASE_STATION_NUMBER, DefinedColumnType.INTEGER));

// TTL = -1: data never expires; maxVersions = 1: only one version is kept
TableOptions tableOptions = new TableOptions(-1, 1);
CreateTableRequest createTableRequest = new CreateTableRequest(tableMeta, tableOptions);
syncClient.createTable(createTableRequest);

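Create the secondary index
includeBaseData specifies how the index is synchronized.
true: data that already exists in the main table before the index is created is also synchronized to the index table;
false: only data written to the main table after the index is created is synchronized.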

IndexMeta indexMeta = new IndexMeta(indexName);
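indexMeta.addPrimaryKeyColumn(CALLED_NUMBER);       // use the main table's predefined column "called_number" as a pk column of the index table
                                                    // the remaining two pk columns of the index table, "cell_number" and "start_time", are completed automatically
indexMeta.addDefinedColumn(BASE_STATION_NUMBER);    // use the main table's predefined column "base_station_number" as an attribute column of the index table

// Request to create the global secondary index (includeBaseData = true: synchronize the existing data first, then incremental data; false: synchronize incremental data only)
CreateIndexRequest request = new CreateIndexRequest(tableName, indexMeta, true);

// Create the global secondary index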
syncClient.createIndex(request);
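
The difference between the two includeBaseData settings can be illustrated with a toy model. This is a sketch under simplifying assumptions: synchronization is modeled as immediate rather than asynchronous, rows are plain strings, and the class and method names are hypothetical, not Tablestore APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a main table with one index; not part of the Tablestore SDK.
class IncludeBaseDataSketch {
    private final List<String> mainTable = new ArrayList<>();
    private List<String> index; // stays null until the index is created

    // Write to the main table; once an index exists, the row is also synchronized to it.
    void write(String row) {
        mainTable.add(row);
        if (index != null) {
            index.add(row); // incremental synchronization (immediate in this toy model)
        }
    }

    // includeBaseData = true copies the rows that already exist; false starts from empty.
    void createIndex(boolean includeBaseData) {
        index = includeBaseData ? new ArrayList<>(mainTable) : new ArrayList<>();
    }

    List<String> indexRows() {
        return index;
    }

    public static void main(String[] args) {
        IncludeBaseDataSketch withBase = new IncludeBaseDataSketch();
        withBase.write("old row");
        withBase.createIndex(true);
        withBase.write("new row");
        System.out.println(withBase.indexRows()); // prints [old row, new row]

        IncludeBaseDataSketch incrementalOnly = new IncludeBaseDataSketch();
        incrementalOnly.write("old row");
        incrementalOnly.createIndex(false);
        incrementalOnly.write("new row");
        System.out.println(incrementalOnly.indexRows()); // prints [new row]
    }
}
```

If queries against the index need to see rows written before the index existed, true is the setting to use.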

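Query

A secondary index is queried the same way as the main table. In this example, the user specifies CalledNumber as the pk column of the secondary index, and the system automatically completes the remaining two primary key columns, CellNumber and StartTime. The predefined column BaseStationNumber was specified as an attribute column when the index was created, so a query on the index returns the base station number directly, with no lookup in the main table. To return the call duration, the Duration column, you would first query the secondary index to get the primary key columns and then use them to look up the main table.
In this example, the query for calls received by number 123456, returning the base station number, can be written as follows: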

RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(indexName);

long calledNumber = 123456L;

// Build the start primary key
PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
startPrimaryKeyBuilder.addPrimaryKeyColumn(CALLED_NUMBER, PrimaryKeyValue.fromLong(calledNumber));
startPrimaryKeyBuilder.addPrimaryKeyColumn(CELL_NUMBER, PrimaryKeyValue.INF_MIN);
startPrimaryKeyBuilder.addPrimaryKeyColumn(START_TIME, PrimaryKeyValue.INF_MIN);
rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());

// Build the end primary key
PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
endPrimaryKeyBuilder.addPrimaryKeyColumn(CALLED_NUMBER, PrimaryKeyValue.fromLong(calledNumber));
endPrimaryKeyBuilder.addPrimaryKeyColumn(CELL_NUMBER, PrimaryKeyValue.INF_MAX);
endPrimaryKeyBuilder.addPrimaryKeyColumn(START_TIME, PrimaryKeyValue.INF_MAX);
rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());

rangeRowQueryCriteria.setMaxVersions(1);
rangeRowQueryCriteria.addColumnsToGet(BASE_STATION_NUMBER); // query the secondary index and return the pk columns plus the attribute column "base_station_number"

System.out.println(String.format("All records of calls received by number %d: ", calledNumber));
while (true) {
    GetRangeResponse getRangeResponse = syncClient.getRange(new GetRangeRequest(rangeRowQueryCriteria));
    for (Row row : getRangeResponse.getRows()) {
        System.out.println(row);
    }

    // If nextStartPrimaryKey is not null, continue reading from it.
    if (getRangeResponse.getNextStartPrimaryKey() != null) {
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
    } else {
        break;
    }
}
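
For a column that is not stored in the index, such as Duration, the two-step pattern described above (query the index for primary keys, then look each one up in the main table) can be sketched with in-memory collections. This is an illustration, not SDK code: the class name, method names, and sample rows are invented, and against the real service the second step would be a read on the main table for each returned primary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory model of the index-then-main-table lookup; not part of the Tablestore SDK.
class BackLookupSketch {
    static final Comparator<long[]> KEY_ORDER = (a, b) -> {
        for (int i = 0; i < a.length; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    };

    // Main table: primary key (cellNumber, startTime) -> duration.
    static final Map<List<Long>, Long> MAIN_TABLE = new HashMap<>();
    // Index keys (calledNumber, cellNumber, startTime); Duration is not stored here.
    static final TreeSet<long[]> INDEX = new TreeSet<>(KEY_ORDER);

    static void addCall(long cellNumber, long startTime, long calledNumber, long duration) {
        MAIN_TABLE.put(Arrays.asList(cellNumber, startTime), duration);
        INDEX.add(new long[] {calledNumber, cellNumber, startTime});
    }

    static List<Long> durationsForCalled(long calledNumber) {
        long[] start = {calledNumber, Long.MIN_VALUE, Long.MIN_VALUE};
        long[] end = {calledNumber, Long.MAX_VALUE, Long.MAX_VALUE};
        List<Long> durations = new ArrayList<>();
        // Step 1: range scan on the index yields the main table's primary keys.
        for (long[] key : INDEX.subSet(start, true, end, true)) {
            // Step 2: point lookup in the main table with (cellNumber, startTime).
            durations.add(MAIN_TABLE.get(Arrays.asList(key[1], key[2])));
        }
        return durations;
    }

    public static void main(String[] args) {
        // Made-up sample rows.
        addCall(234567L, 1532574734L, 123456L, 10L);
        addCall(345678L, 1532574795L, 123456L, 20L);
        addCall(123456L, 1532574644L, 654321L, 60L);
        System.out.println(durationsForCalled(123456L)); // prints [10, 20]
    }
}
```

The extra lookup per row is the cost of leaving a column out of the index; adding it as an index attribute column, as was done for BaseStationNumber, avoids that step.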

List secondary indexes

List the secondary indexes of a main table.

DescribeTableRequest request = new DescribeTableRequest(tableName);
DescribeTableResponse response = syncClient.describeTable(request);
for (IndexMeta indexMeta : response.getIndexMeta()) {
    System.out.println(indexMeta.getIndexName());
}

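Query secondary index meta information

You can query the meta information of a secondary index the same way you query the meta of a main table.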

DescribeTableResponse response = syncClient.describeTable(new DescribeTableRequest(indexName));
System.out.println(response.getTableMeta());

Delete

Delete a secondary index.

DeleteIndexRequest request = new DeleteIndexRequest(tableName, indexName);
syncClient.deleteIndex(request);

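Code

The complete code is here: https://github.com/aliyun/tablestore-examples/tree/master/basic/Java/GlobalIndexCRD

Summary

A secondary index reorganizes the main table's primary key columns and predefined columns in a different order. In the right scenarios it lets you avoid large range scans of the main table and makes queries much faster. For more details, see the official documentation: https://help.aliyun.com/document_detail/91947.html
If you have questions or need further online support, you are welcome to join the DingTalk group "表格存储公开交流群" (group number: 23307953). Free online expert support is available in the group.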
