5. Querying

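The second most important capability of Hibernate Search is executing Lucene queries and retrieving the entities managed by a Hibernate session.

Preparing and executing a query consists of the following steps:

Create a FullTextSession
Create a Lucene query, either via the Hibernate Search query DSL (recommended) or via the Lucene query API
Wrap the Lucene query in an org.hibernate.Query
Execute the search by calling, for example, list() or scroll()

Queries are run through a FullTextSession, which you obtain by passing in a regular Hibernate Session.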

Example 5.1. Creating a FullTextSession

Session session = sessionFactory.openSession();

...

FullTextSession fullTextSession = Search.getFullTextSession(session);

Once you have a FullTextSession, there are two ways to build the full-text query: the Hibernate Search query DSL or the native Lucene query API.

Using the DSL:

final QueryBuilder b = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity( Myth.class ).get();

org.apache.lucene.search.Query luceneQuery =
    b.keyword()
        .onField("history").boostedTo(3)
        .matching("storm")
        .createQuery();

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );
List result = fullTextQuery.list(); //return a list of managed objects

Alternatively, you can write the query with the Lucene query API. The following example uses the Lucene query parser.

Example 5.2. Creating a Lucene query via the QueryParser

SearchFactory searchFactory = fullTextSession.getSearchFactory();
org.apache.lucene.queryParser.QueryParser parser = 
    new QueryParser("title", searchFactory.getAnalyzer(Myth.class) );
//declared outside the try block so that it is still in scope below
org.apache.lucene.search.Query luceneQuery = null;
try {
    luceneQuery = parser.parse( "history:storm^3" );
}
catch (ParseException e) {
    //handle parsing failure
}

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery);
List result = fullTextQuery.list(); //return a list of managed objects

Note

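The Hibernate query built on top of the Lucene query is a regular org.hibernate.Query, which means you stay in the same paradigm as the other Hibernate query facilities (HQL, Native or Criteria). The regular list(), uniqueResult(), iterate() and scroll() methods can be used.

You can also run the query through the JPA API: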

Example 5.3. Creating a Search query using the JPA API

EntityManager em = entityManagerFactory.createEntityManager();

FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search.getFullTextEntityManager(em);

...

final QueryBuilder b = fullTextEntityManager.getSearchFactory()
    .buildQueryBuilder().forEntity( Myth.class ).get();

org.apache.lucene.search.Query luceneQuery =
    b.keyword()
        .onField("history").boostedTo(3)
        .matching("storm")
        .createQuery();
javax.persistence.Query fullTextQuery = fullTextEntityManager.createFullTextQuery( luceneQuery );

List result = fullTextQuery.getResultList(); //return a list of managed objects

Note

The examples that follow all use the Hibernate APIs, but they can easily be converted to the JPA style.

5.1. Building queries

5.1.1. Building a Lucene query using the Lucene API

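With the Lucene API you have several options: the query parser (fine for simple queries) or the Lucene programmatic API (for more complex ones).

Building Lucene queries this way is outside the scope of this document. Please refer to the Lucene documentation.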

5.1.2. Building a Lucene query with the Hibernate Search query DSL

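Writing full-text queries with the Lucene programmatic API is rather tedious.

The Hibernate Search query DSL is what is usually called a fluent API. It has a few key characteristics:

Method names are short and self-explanatory
Unnecessary configuration is left out
It often uses the method chaining pattern: each call returns an object on which the next call is made
It is easy to use and easy to read

Now let's see how to use the API. First you need a QueryBuilder bound to the entity type you want to query. The QueryBuilder knows which analyzer and which field bridge to use.

You can also override the analyzer used for a given field, but this is rarely needed. Don't do it unless you know what you are doing.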

QueryBuilder mythQB = searchFactory.buildQueryBuilder().forEntity(Myth.class).overridesForField("history","stem_analyzer_definition").get();

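Whatever you build with the query builder, the end result is always a Lucene query. For this reason you can easily combine queries produced by the Lucene query parser or the Lucene programmatic API with queries built by the Hibernate Search DSL, in case the DSL lacks a feature you need.

5.1.2.1. Keyword queries

Let's start by searching for a specific word: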

Query luceneQuery = mythQB.keyword().onField("history").matching("storm").createQuery();

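keyword() means that you are looking for a specific word. onField() specifies which field to search, and matching() supplies the word to look for.

The value storm is passed through the field bridge of history
The bridged value is then passed to the analyzer, and the analyzed terms are matched against the index

Now let's see what happens when the searched property is not a String: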

@Entity
@Indexed
public class Myth {
  @Field(analyze = Analyze.NO)
  @DateBridge(resolution = Resolution.YEAR)
  public Date getCreationDate() { return creationDate; }
  public void setCreationDate(Date creationDate) { this.creationDate = creationDate; }
  private Date creationDate;

  ...
}

Date birthdate = ...;
Query luceneQuery = mythQB.keyword().onField("creationDate").matching(birthdate).createQuery();

Note

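With plain Lucene, you have to convert the Date to its String representation yourself. With Hibernate Search you do not.

This conversion is not limited to Date. It works for other types as well, as long as the field bridge provides an objectToString method. Lucene only understands Strings, and Hibernate Search takes care of the conversion for us.

Now for a slightly harder example that uses n-gram analyzers. An n-gram analyzer indexes every sequence of n consecutive characters of a word, which helps recover results when the user makes a typo. For example, the 3-grams of the word hibernate are hib, ibe, ber, ern, rna, nat, ate.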

@AnalyzerDef(name = "ngram",
  tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
  filters = {
    @TokenFilterDef(factory = StandardFilterFactory.class),
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = StopFilterFactory.class),
    @TokenFilterDef(factory = NGramFilterFactory.class,
      params = {
        @Parameter(name = "minGramSize", value = "3"),
        @Parameter(name = "maxGramSize", value = "3") } )
  }
)
@Entity
@Indexed
public class Myth {
  @Field(analyzer = @Analyzer(definition = "ngram"))
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
  private String name;

  ...
}

Query luceneQuery = mythQB.keyword().onField("name").matching("Sisiphus").createQuery();

In the example above, the search keyword Sisiphus is first lower-cased and then split into 3-grams: sis, isi, sip, iph, phu, hus. Each of these n-grams is used as a query term.
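To make the splitting concrete, here is a minimal, library-free sketch of what lower-casing followed by a 3-gram filter does to a single token. NGramSketch and ngrams are hypothetical names used only for illustration; the real work is done by the Lucene filter factories configured above, whose internals may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NGramSketch {
    // Returns every run of n consecutive characters of the lower-cased word.
    static List<String> ngrams(String word, int n) {
        String lower = word.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= lower.length(); i++) {
            grams.add(lower.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Two spellings of the same name, to compare their 3-grams.
        System.out.println(ngrams("Sisiphus", 3));
        System.out.println(ngrams("Sysiphus", 3));
    }
}
```

Comparing the two lists shows that the two spellings share several 3-grams, which is why a query on one spelling can still hit a document indexed under the other.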

Note

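If you do not want the field bridge or the analyzer to be applied, call ignoreFieldBridge() or ignoreAnalyzer() respectively.

To search for several possible words in the same field, put them all in the matching clause:

//search documents with storm or lightning in their history
Query luceneQuery =
    mythQB.keyword().onField("history").matching("storm lightning").createQuery();

To search for the same word in several fields, use onFields():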

Query luceneQuery = mythQB.keyword().onFields("history","description","name").matching("storm").createQuery();

You can also weight fields differently. Here the name field is boosted to 5:

Query luceneQuery = mythQB.keyword().onField("history").andField("name").boostedTo(5).andField("description").matching("storm").createQuery();

5.1.2.2. Fuzzy queries

To run a fuzzy query, start like a keyword query and add the fuzzy() flag:

Query luceneQuery = mythQB.keyword().fuzzy().withThreshold(.8f).withPrefixLength(1).onField("history").matching("starm").createQuery();

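threshold is the limit above which two terms are considered a match. It is a decimal between 0 and 1 and defaults to 0.5.

prefixLength is the number of leading characters that are excluded from the fuzzy comparison. It defaults to 0, but a non-zero value is recommended for indexes that contain a large number of distinct terms.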
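Fuzzy matching of this kind is based on edit distance between terms. The sketch below is only a model of the two ideas involved: a plain Levenshtein distance, and a check that two terms share their first prefixLength characters. FuzzySketch and its methods are hypothetical names; how Lucene converts the distance into a similarity that is compared against threshold is deliberately not reproduced here.

```java
final class FuzzySketch {
    // Classic Levenshtein distance: minimum number of single-character
    // insertions, deletions and substitutions turning a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when both terms start with the same prefixLength characters.
    static boolean sharesPrefix(String a, String b, int prefixLength) {
        return a.length() >= prefixLength
            && b.length() >= prefixLength
            && a.regionMatches(0, b, 0, prefixLength);
    }

    public static void main(String[] args) {
        System.out.println(editDistance("starm", "storm"));
        System.out.println(sharesPrefix("starm", "storm", 1));
    }
}
```

A non-zero prefix length lets the engine skip every indexed term that does not start with the same characters, which is where the performance benefit on large term dictionaries comes from.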

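5.1.2.3. Wildcard queries

Wildcard queries let you search when you only know part of a word. ? stands for a single character and * for any sequence of characters. Note that, for performance reasons, the query should not start with a wildcard.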

Query luceneQuery = mythQB.keyword().wildcard().onField("history").matching("sto*").createQuery();
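As an illustration of what the two wildcard characters mean, the following sketch translates a wildcard pattern into an equivalent regular expression and tests terms against it. WildcardSketch is a hypothetical helper written for this article; it models the matching rules only and says nothing about how Lucene evaluates wildcard queries against the index.

```java
import java.util.regex.Pattern;

final class WildcardSketch {
    // Translates a wildcard pattern into a regular expression:
    // '?' becomes "any one character", '*' becomes "any sequence of characters".
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True when the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("sto*", "storm"));
        System.out.println(matches("sto?", "storm"));
    }
}
```

A pattern such as *orm has no fixed prefix to narrow the candidate terms, which gives an intuition for why leading wildcards are discouraged.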

5.1.2.4. Phrase queries

You can also search for exact or approximate sentences. Use phrase() to do so:

Query luceneQuery = mythQB.phrase().onField("history").sentence("Thou shalt not kill").createQuery();

To search for approximate sentences, add a slop factor. The slop factor allows other words to appear inside the sentence.

Query luceneQuery = mythQB.phrase().withSlop(3).onField("history").sentence("Thou kill").createQuery();

5.1.2.5. Range queries

Now for range queries, which work on numbers, dates, strings and so on. A range query searches for values between two boundaries, or above or below a given boundary:

//look for 0 <= starred < 3
Query luceneQuery = mythQB.range().onField("starred").from(0).to(3).excludeLimit().createQuery();

//look for myths strictly BC
Date beforeChrist = ...;
Query luceneQuery = mythQB.range().onField("creationDate").below(beforeChrist).excludeLimit().createQuery();

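5.1.2.6. Combining queries

Finally, queries can be combined to build more complex ones. The following operators are available:

SHOULD: the query should contain the matching elements of the subquery.
MUST: the query must contain the matching elements of the subquery.
MUST NOT: the query must not contain the matching elements of the subquery.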

//look for popular modern myths that are not urban
Date twentiethCentury = ...;
Query luceneQuery = mythQB
    .bool()
      .must( mythQB.keyword().onField("description").matching("urban").createQuery() )
        .not()
      .must( mythQB.range().onField("starred").above(4).createQuery() )
      .must( mythQB.range().onField("creationDate").above(twentiethCentury).createQuery() )
    .createQuery();

//look for popular myths that are preferably urban
Query luceneQuery = mythQB
    .bool()
      .should( mythQB.keyword().onField("description").matching("urban").createQuery() )
      .must( mythQB.range().onField("starred").above(4).createQuery() )
    .createQuery();

//look for all myths except religious ones
Query luceneQuery = mythQB
    .all()
      .except( mythQB.keyword().onField("description_stem").matching("religion").createQuery() )
    .createQuery();
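The matching rules behind SHOULD, MUST and MUST NOT can be modelled with plain predicates. The sketch below is an assumption-laden simplification: BoolSketch is a hypothetical class, documents are plain strings, and scoring is ignored entirely. It only answers whether a document matches. In Lucene, SHOULD clauses additionally influence ranking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class BoolSketch<T> {
    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();
    private final List<Predicate<T>> mustNot = new ArrayList<>();

    BoolSketch<T> must(Predicate<T> p) { must.add(p); return this; }
    BoolSketch<T> should(Predicate<T> p) { should.add(p); return this; }
    BoolSketch<T> mustNot(Predicate<T> p) { mustNot.add(p); return this; }

    boolean matches(T doc) {
        // A purely negative combination selects nothing.
        if (must.isEmpty() && should.isEmpty()) {
            return false;
        }
        for (Predicate<T> p : mustNot) {
            if (p.test(doc)) {
                return false;
            }
        }
        for (Predicate<T> p : must) {
            if (!p.test(doc)) {
                return false;
            }
        }
        // With at least one MUST clause, SHOULD clauses are optional.
        if (!must.isEmpty()) {
            return true;
        }
        // Without MUST clauses, at least one SHOULD clause has to match.
        for (Predicate<T> p : should) {
            if (p.test(doc)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BoolSketch<String> popularNotUrban = new BoolSketch<String>()
            .must(d -> d.contains("popular"))
            .mustNot(d -> d.contains("urban"));
        System.out.println(popularNotUrban.matches("popular ancient myth"));
        System.out.println(popularNotUrban.matches("popular urban myth"));
    }
}
```

In the DSL, must(...).not() plays the role of mustNot in this model.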

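5.1.2.7. Query options

boostedTo (on query type and on field): boosts the whole query or the specific field by the given factor.
withConstantScore (on query): all results matching the query get a constant score equal to the boost.
filteredBy (on query): filters the query results using a filter.
ignoreAnalyzer (on field): ignores the analyzer when processing this field.
ignoreFieldBridge (on field): ignores the field bridge when processing this field.

Here is an example: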

Query luceneQuery = mythQB
    .bool()
      .should( mythQB.keyword().onField("description").matching("urban").createQuery() )
      .should( mythQB
        .keyword()
        .onField("name")
          .boostedTo(3)
          .ignoreAnalyzer()
        .matching("urban").createQuery() )
      .must( mythQB
        .range()
          .boostedTo(5).withConstantScore()
        .onField("starred").above(4).createQuery() )
    .createQuery();

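5.1.3. Building a Hibernate Search query

So far we have only covered how to create a Lucene query, which is just the first step in the chain of actions. Let's now see how to build a Hibernate Search query from the Lucene query.

5.1.3.1. Generality

Once the Lucene query is built, it needs to be wrapped in a Hibernate Query. Unless specified otherwise, the query is executed against all indexed entities and may return instances of any indexed class.

For performance reasons, it is recommended to restrict the returned entity types.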

Example 5.4. Wrapping a Lucene query into a Hibernate Query

FullTextSession fullTextSession = Search.getFullTextSession( session );

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );

Example 5.5. Filtering the search result by entity type

fullTextQuery = fullTextSession
    .createFullTextQuery( luceneQuery, Customer.class );

// or

fullTextQuery = fullTextSession
    .createFullTextQuery( luceneQuery, Item.class, Actor.class );

In Example 5.5, the first query returns only matching Customers and the second returns matching Actors and Items. The type restriction is polymorphic: if Person has two subclasses, Salesman and Customer, you can specify just Person.class to filter the results.
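The polymorphic behaviour of the type restriction can be pictured as an instance-of check on every hit. This is a sketch under stated assumptions: TypeFilterSketch, restrictTo and the nested entity classes are hypothetical stand-ins, and Hibernate Search applies the restriction at the index level rather than on loaded objects.

```java
import java.util.ArrayList;
import java.util.List;

final class TypeFilterSketch {
    // Hypothetical entity hierarchy mirroring the names used in the text.
    static class Person {}
    static class Salesman extends Person {}
    static class Customer extends Person {}
    static class Item {}

    // Keeps only the hits that are instances of at least one requested type.
    static List<Object> restrictTo(List<Object> hits, Class<?>... types) {
        List<Object> kept = new ArrayList<>();
        for (Object hit : hits) {
            for (Class<?> type : types) {
                if (type.isInstance(hit)) {
                    kept.add(hit);
                    break;
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Object> hits = new ArrayList<>();
        hits.add(new Salesman());
        hits.add(new Customer());
        hits.add(new Item());
        // Asking for Person keeps both subclasses and drops the Item.
        System.out.println(restrictTo(hits, Person.class).size());
    }
}
```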

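5.1.3.2. Pagination

For performance reasons, it is recommended to limit the number of results returned per query. In practice, users moving from one page to the next is a very common use case. You define pagination exactly as you would for an HQL or Criteria query.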

Example 5.6. Defining pagination for a search query

org.hibernate.Query fullTextQuery =
    fullTextSession.createFullTextQuery( luceneQuery, Customer.class );
fullTextQuery.setFirstResult(15); //start from the 15th element
fullTextQuery.setMaxResults(10); //return 10 elements

Tip

You can still get the total number of matching elements, regardless of pagination, via fullTextQuery.getResultSize().
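The arithmetic that usually surrounds setFirstResult(), setMaxResults() and getResultSize() is simple but easy to get off by one. The helper below is a hypothetical sketch (PagingSketch is not part of any library) that assumes zero-based page indexes.

```java
final class PagingSketch {
    // Offset to pass to setFirstResult() for a zero-based page index.
    static int firstResult(int pageIndex, int pageSize) {
        return pageIndex * pageSize;
    }

    // Number of pages needed to display totalHits results (ceiling division).
    static int pageCount(int totalHits, int pageSize) {
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 10;
        // Third page (index 2) of 10 results each.
        System.out.println(firstResult(2, pageSize));
        // Pages needed for a result size of 3245.
        System.out.println(pageCount(3245, pageSize));
    }
}
```

The value fed into pageCount would come from getResultSize(), so the page navigation can be rendered without loading any entity beyond the current page.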

5.1.3.3. Sorting

Apache Lucene provides a powerful and convenient way to sort results. To change the sort order, set a Lucene Sort on the query:

Example 5.7. Specifying a Lucene Sort in order to sort the results

org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, Book.class );

org.apache.lucene.search.Sort sort = new Sort(new SortField("title", SortField.STRING));
query.setSort(sort);
List results = query.list();

Tip

Fields used for sorting must not be tokenized.

5.1.3.4. Fetching strategy

Example 5.8. Specifying FetchMode on a query

Criteria criteria =
    s.createCriteria( Book.class ).setFetchMode( "authors", FetchMode.JOIN );

s.createFullTextQuery( luceneQuery ).setCriteriaQuery( criteria );

In the example above, the query returns all Books matching luceneQuery, and the authors collection is loaded in the same query using an outer join.

Important

Only the fetch mode should be adjusted on the Criteria. Refrain from applying any other restriction to it.

Important

You cannot use setCriteriaQuery if more than one entity type is expected to be returned.

5.1.3.5. Projection

Sometimes you do not need the whole domain object, only a few of its fields. Hibernate Search lets you return just those fields:

Example 5.9. Using projection instead of returning the full domain object

org.hibernate.search.FullTextQuery query =
    s.createFullTextQuery( luceneQuery, Book.class );
query.setProjection( "id", "summary", "body", "mainAuthor.name" );
List results = query.list();

//each row is an Object[], so every element has to be cast
Object[] firstResult = (Object[]) results.get(0);
Integer id = (Integer) firstResult[0];
String summary = (String) firstResult[1];
String body = (String) firstResult[2];
String authorName = (String) firstResult[3];

5.1.3.6. Customizing object initialization strategies

You can configure whether Hibernate Search looks entities up in the second-level cache first or goes straight to the database:

Example 5.11. Check the second-level cache before using a query

FullTextQuery query = session.createFullTextQuery(luceneQuery, User.class);
query.initializeObjectWith(
    ObjectLookupMethod.SECOND_LEVEL_CACHE,
    DatabaseRetrievalMethod.QUERY
);

ObjectLookupMethod.PERSISTENCE_CONTEXT: useful if most of the matching entities are already in the persistence context (ie loaded in the Session or EntityManager)
ObjectLookupMethod.SECOND_LEVEL_CACHE: check first the persistence context and then the second-level cache.

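5.1.3.7. Limiting the time of a query

When running full-text queries with Hibernate Search, you can limit the time a query takes in two ways:

raise an exception when the time limit is reached
limit the number of results retrieved when the time limit is reached (EXPERIMENTAL)

The two approaches are not compatible with each other.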
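The difference between the two strategies can be shown with a small stand-alone model. This is not the Hibernate Search API: TimeLimitSketch, its two methods and the injected clock are all hypothetical, chosen so that the behaviour is deterministic and easy to follow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

final class TimeLimitSketch {
    // Strategy 1: give up with an exception once the time budget is spent.
    static List<String> collectOrFail(Iterator<String> hits, LongSupplier clock, long budget) {
        long deadline = clock.getAsLong() + budget;
        List<String> out = new ArrayList<>();
        while (hits.hasNext()) {
            if (clock.getAsLong() > deadline) {
                throw new IllegalStateException("query timed out");
            }
            out.add(hits.next());
        }
        return out;
    }

    // Strategy 2: keep whatever was collected before the budget ran out.
    static List<String> collectPartial(Iterator<String> hits, LongSupplier clock, long budget) {
        long deadline = clock.getAsLong() + budget;
        List<String> out = new ArrayList<>();
        while (hits.hasNext() && clock.getAsLong() <= deadline) {
            out.add(hits.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Fake clock that advances by one tick each time it is read.
        long[] ticks = {0};
        LongSupplier clock = () -> ticks[0]++;
        List<String> all = Arrays.asList("a", "b", "c", "d");
        System.out.println(collectPartial(all.iterator(), clock, 2));
    }
}
```

The first strategy returns either everything or nothing, while the second may return a truncated list, which helps explain why the two cannot be combined on the same query.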

5.2. Retrieving the results

Once the Hibernate Search query is built, executing it is no different from executing an HQL or Criteria query. The usual operations are available: list(), uniqueResult(), iterate() and scroll().

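5.2.1. Performance considerations

If you expect a reasonable number of results (for example when using pagination) and intend to work on all of them, list() or uniqueResult() are recommended.

list() works best when the entity batch-size is set up properly. Note that when using list(), uniqueResult() and iterate(), Hibernate Search has to process all the Lucene hits within the pagination window.

If you want to load as few Lucene documents as possible, scroll() is a better fit. Don't forget to close the ScrollableResults object when you are done with it.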

Important

Pagination is preferred over scrolling.

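5.2.2. Result size

It is sometimes useful to know the total number of matching documents:

to display something like the "1-10 of about 888,000,000" line shown by Google
to implement pagination
to implement a multi-step search engine (adding approximation if the restricted query returns no or not enough results)

Retrieving all the matching Lucene documents just to count them would of course be costly.

Hibernate Search lets you retrieve the total number of matching documents even when pagination parameters are set. Even more interesting, you can get this number without loading a single object.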

Example 5.16. Determining the result size of a query

org.hibernate.search.FullTextQuery query =
    s.createFullTextQuery( luceneQuery, Book.class );
//return the number of matching books without loading a single one
assert 3245 == query.getResultSize();

query = s.createFullTextQuery( luceneQuery, Book.class );
query.setMaxResults(10);
List results = query.list();
//return the total number of matching books regardless of pagination
assert 3245 == query.getResultSize();

Note

As with Google, the number of results is only an approximation if the index is not fully up to date.

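5.2.3. ResultTransformer

As seen in Section 5.1.3.5, “Projection”, projection results are returned as Object arrays.

This data structure is not always what the application needs. In that case a ResultTransformer can convert it: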

Example 5.17. Using ResultTransformer in conjunction with projections

org.hibernate.search.FullTextQuery query =
    s.createFullTextQuery( luceneQuery, Book.class );
query.setProjection( "title", "mainAuthor.name" );
query.setResultTransformer(
    new StaticAliasToBeanResultTransformer(
        BookView.class,
        "title",
        "author" )
);
List<BookView> results = (List<BookView>) query.list();
for (BookView view : results) {
    log.info( "Book: " + view.getTitle() + ", " + view.getAuthor() );
}

In the example above, the two projected fields title and mainAuthor.name are wrapped by the ResultTransformer into BookView objects with title and author properties.
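Conceptually, a result transformer of this kind turns each projected Object[] row into a typed object. The following stand-alone sketch shows that mapping with plain Java. RowMappingSketch and its nested BookView are hypothetical and only mimic the shape of the example; they are not how the transformer itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class RowMappingSketch {
    // Hypothetical view class with the two properties used in the example.
    static final class BookView {
        final String title;
        final String author;

        BookView(String title, String author) {
            this.title = title;
            this.author = author;
        }
    }

    // Converts each projected row [title, authorName] into a BookView.
    static List<BookView> toViews(List<Object[]> rows) {
        List<BookView> views = new ArrayList<>();
        for (Object[] row : rows) {
            views.add(new BookView((String) row[0], (String) row[1]));
        }
        return views;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { "Iliad", "Homer" });
        for (BookView view : toViews(rows)) {
            System.out.println("Book: " + view.title + ", " + view.author);
        }
    }
}
```

The benefit is that calling code no longer depends on the position of each field inside the array.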

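5.2.4. Understanding results

Sometimes a query does not return what you expect, for example an empty result or a confusing one. Luke is a good tool for investigating such cases. Hibernate Search also gives you access to the Lucene Explanation object, in two ways:

Use the fullTextQuery.explain(int) method
Use projection

The first approach takes a document id as a parameter and returns the Explanation object. The document id can be retrieved using projection and the FullTextQuery.DOCUMENT_ID constant.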

Warning

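The document id is not the same thing as the entity id. Do not confuse the two.

The second approach projects the Explanation object using the FullTextQuery.EXPLANATION constant.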

Example 5.18. Retrieving the Lucene Explanation object using projection

FullTextQuery ftQuery = s.createFullTextQuery( luceneQuery, Dvd.class )
        .setProjection(
              FullTextQuery.DOCUMENT_ID,
              FullTextQuery.EXPLANATION,
              FullTextQuery.THIS );
@SuppressWarnings("unchecked") List<Object[]> results = ftQuery.list();
for (Object[] result : results) {
    Explanation e = (Explanation) result[1];
    display( e.toString() );
}

Be careful: building the Explanation object is expensive, roughly as expensive as running the Lucene query again. Only use it when you really need it.

5.3. Filters

Apache Lucene lets you filter query results with a filter, including custom ones. Typical use cases:

security
temporal data (eg. view only last month's data)
population filter (eg. search limited to a given category)
and many more

Hibernate Search filters work much like Hibernate filters:

Example 5.19. Enabling fulltext filters for a given query

fullTextQuery = s.createFullTextQuery( query, Driver.class );

fullTextQuery.enableFullTextFilter("bestDriver");

fullTextQuery.enableFullTextFilter("security").setParameter( "login", "andre" );

fullTextQuery.list(); //returns only best drivers where andre has credentials

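In the example above we enabled two filters on the query.

Filters are declared with the @FullTextFilterDef annotation, which can be placed on any @Indexed entity.

The filter class must implement the methods of the Lucene Filter type: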

Example 5.20. Defining and implementing a Filter

@Entity
@Indexed
@FullTextFilterDefs( {
    @FullTextFilterDef(name = "bestDriver", impl = BestDriversFilter.class), 
    @FullTextFilterDef(name = "security", impl = SecurityFilterFactory.class) 
})
public class Driver { ... }

public class BestDriversFilter extends org.apache.lucene.search.Filter {

    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
        OpenBitSet bitSet = new OpenBitSet( reader.maxDoc() );
        TermDocs termDocs = reader.termDocs( new Term( "score", "5" ) );
        while ( termDocs.next() ) {
            bitSet.set( termDocs.doc() );
        }
        return bitSet;
    }
}
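The idea behind a filter like BestDriversFilter is that it produces a set of allowed document ids, which is then intersected with the documents matched by the query. The sketch below models that idea with java.util.BitSet instead of Lucene's own classes. FilterBitSetSketch and its methods are hypothetical, and the list of score values stands in for the index.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class FilterBitSetSketch {
    // Marks every document id whose "score" value equals the wanted term.
    static BitSet docsWithScore(List<String> scoreByDocId, String wanted) {
        BitSet allowed = new BitSet(scoreByDocId.size());
        for (int docId = 0; docId < scoreByDocId.size(); docId++) {
            if (wanted.equals(scoreByDocId.get(docId))) {
                allowed.set(docId);
            }
        }
        return allowed;
    }

    // Keeps only the query hits that the filter allows.
    static BitSet applyFilter(BitSet queryHits, BitSet allowed) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(allowed);
        return result;
    }

    public static void main(String[] args) {
        // Document ids 0..3 with their "score" field values.
        List<String> scores = Arrays.asList("5", "3", "5", "4");
        BitSet queryHits = new BitSet();
        queryHits.set(1);
        queryHits.set(2);
        queryHits.set(3);
        System.out.println(applyFilter(queryHits, docsWithScore(scores, "5")));
    }
}
```

Because the filter result does not depend on the query, the same bit set can be reused across many queries, which is what makes filters attractive for things like security or category restrictions.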

