[lucene] Highlighting example
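Below is a complete highlighting example based on Lucene 8.5.0 that can be copied and run as-is (JDK 8 or later).
The example demonstrates:
1. Building the index. The field must store its original text. Term vectors are enabled as well, although the classic `Highlighter` re-analyzes the stored text and only `FastVectorHighlighter` strictly requires them.
2. Highlighting the search results with `QueryScorer` + `Highlighter`.
3. Printing the highlighted fragments, with each matched term wrapped in an HTML tag.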
---
① Maven dependencies
```xml
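<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>8.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-analyzers-common</artifactId>
  <version>8.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-queryparser</artifactId>
  <version>8.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-highlighter</artifactId>
  <version>8.5.0</version>
</dependency>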
```
---
② Complete code
```java
package demo;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.search.highlight.*;
import org.apache.lucene.store.*;

import java.nio.file.Paths;

public class Lucene85HighlighterDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("idx_highlight"));

        /* ---------- 1. Build the index ---------- */
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig cfg = new IndexWriterConfig(analyzer);
        // Recreate the index on each run so re-running does not add duplicate documents
        cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

        // Stored text plus term vectors (with positions and offsets).
        // Highlighter only needs the stored text, which it re-analyzes;
        // the term vectors matter if you switch to FastVectorHighlighter.
        FieldType myType = new FieldType(TextField.TYPE_STORED);
        myType.setStoreTermVectors(true);
        myType.setStoreTermVectorPositions(true);
        myType.setStoreTermVectorOffsets(true);
        myType.freeze();

        try (IndexWriter w = new IndexWriter(dir, cfg)) {
            Document doc1 = new Document();
            doc1.add(new StringField("id", "1", Field.Store.YES));
            doc1.add(new Field("content", "Apache Lucene is a high-performance search engine library.", myType));
            w.addDocument(doc1);

            Document doc2 = new Document();
            doc2.add(new StringField("id", "2", Field.Store.YES));
            doc2.add(new Field("content", "Lucene powers Elasticsearch and Solr to provide amazing search features.", myType));
            w.addDocument(doc2);
        }

        /* ---------- 2. Search + highlight ---------- */
        try (DirectoryReader r = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(r);
            Query query = new QueryParser("content", analyzer).parse("lucene");
            TopDocs docs = searcher.search(query, 10);

            // Highlighting components. Any tag pair works here;
            // the no-arg SimpleHTMLFormatter constructor defaults to <B>...</B>.
            SimpleHTMLFormatter fmt = new SimpleHTMLFormatter("<em>", "</em>");
            QueryScorer scorer = new QueryScorer(query);
            Highlighter hl = new Highlighter(fmt, scorer);
            hl.setTextFragmenter(new SimpleFragmenter(50)); // fragments of roughly 50 characters

            for (ScoreDoc sd : docs.scoreDocs) {
                Document doc = searcher.doc(sd.doc);
                String raw = doc.get("content");
                // Re-analyzes the stored text and returns the best-scoring fragment (null if nothing matches)
                String best = hl.getBestFragment(analyzer, "content", raw);
                System.out.println("id=" + doc.get("id") + ", score=" + sd.score);
                System.out.println("highlight=" + best);
            }
        }
        dir.close();
    }
}
```
---
③ Sample output
The scores are illustrative and depend on the Lucene version and similarity in use. With `SimpleFragmenter(50)`, each fragment is cut at a token boundary near the 50-character mark, so the tail of each sentence is not part of the best fragment.
```
id=1, score=0.5753648
highlight=Apache <em>Lucene</em> is a high-performance search engine
id=2, score=0.5753648
highlight=<em>Lucene</em> powers Elasticsearch and Solr to provide
```
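The `SimpleFragmenter(50)` setting in the code above does not cut at exactly 50 characters. As far as the Lucene 8.x source reads, a new fragment starts at the first token whose end offset reaches the next multiple of the fragment size. The plain-Java sketch below imitates that rule so the boundaries are easier to predict. It is an approximation under stated assumptions, not Lucene code: the class `FragmentBoundarySketch` and its `split` method are hypothetical names, and whitespace splitting stands in for `StandardAnalyzer`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mimics the fragment-boundary rule; not part of Lucene.
final class FragmentBoundarySketch {

    // Splits text into fragments; whitespace tokenization is an assumption of this sketch.
    static List<String> split(String text, int fragmentSize) {
        List<String> fragments = new ArrayList<>();
        Matcher token = Pattern.compile("\\S+").matcher(text);
        int fragStart = 0;     // offset where the current fragment begins
        int lastTokenEnd = 0;  // end offset of the previous token (0 means no token seen yet)
        int fragCount = 1;     // number of fragments started so far

        while (token.find()) {
            // From the second token on: a token whose end offset reaches
            // fragmentSize * fragCount opens a new fragment, and the current
            // fragment is closed at the end of the previous token.
            if (lastTokenEnd > 0 && token.end() >= fragmentSize * fragCount) {
                fragments.add(text.substring(fragStart, lastTokenEnd));
                fragStart = lastTokenEnd;
                fragCount++;
            }
            lastTokenEnd = token.end();
        }
        fragments.add(text.substring(fragStart)); // remainder of the text
        return fragments;
    }

    public static void main(String[] args) {
        String text = "Apache Lucene is a high-performance search engine library.";
        for (String fragment : split(text, 50)) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

For the first sample sentence and a size of 50, the sketch produces two pieces: the first ends at `engine` (49 characters) and the second holds the remaining ` library.`. If whole sentences are wanted in the highlight, a larger fragment size is the simplest adjustment.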
---
④ Key points
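| Item | Requirement |
| --- | --- |
| Original text must be stored | `Field.Store.YES` (here via `TextField.TYPE_STORED`), so the text can be handed to the highlighter |
| Term vectors | `setStoreTermVectors(true)` plus positions and offsets; required by `FastVectorHighlighter`, optional for `Highlighter`, which re-analyzes the stored text |
| Highlighter choice | `Highlighter` (plain text) / `FastVectorHighlighter` (better performance, but needs the term-vector configuration) |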
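
The `FastVectorHighlighter` option mentioned above can be sketched roughly as follows. This is a minimal illustration rather than part of the original demo: the class `FastVectorHighlighterSketch`, the `highlight` helper, the in-memory `ByteBuffersDirectory`, the `TermQuery`, and the fragment size of 100 are all choices made for the example. The highlighter reads positions and offsets from the term vectors instead of re-analyzing the text, and with its default settings it wraps matches in `<b>` tags.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.vectorhighlight.FastVectorHighlighter;
import org.apache.lucene.search.vectorhighlight.FieldQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical helper for illustration; requires lucene-core and lucene-highlighter 8.5.0.
final class FastVectorHighlighterSketch {

    // Indexes one document in memory and returns its best fragment for the given
    // (already lowercased) term, or null if the term does not match.
    static String highlight(String text, String term) throws Exception {
        // FastVectorHighlighter needs a stored field with term vectors,
        // including positions and offsets.
        FieldType type = new FieldType(TextField.TYPE_STORED);
        type.setStoreTermVectors(true);
        type.setStoreTermVectorPositions(true);
        type.setStoreTermVectorOffsets(true);
        type.freeze();

        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                Document doc = new Document();
                doc.add(new Field("content", text, type));
                w.addDocument(doc);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                Query query = new TermQuery(new Term("content", term));
                ScoreDoc[] hits = new IndexSearcher(reader).search(query, 1).scoreDocs;
                if (hits.length == 0) {
                    return null;
                }
                FastVectorHighlighter fvh = new FastVectorHighlighter();
                FieldQuery fieldQuery = fvh.getFieldQuery(query, reader);
                // Last argument: target fragment size in characters
                return fvh.getBestFragment(fieldQuery, reader, hits[0].doc, "content", 100);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(highlight("Apache Lucene is a high-performance search engine library.", "lucene"));
    }
}
```

Because no analyzer runs at highlight time, this approach tends to pay off on large fields, at the cost of a bigger index from the stored term vectors.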
---
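One-sentence summary
> As long as the field stores its original text, Lucene 8.5's `Highlighter` + `QueryScorer` is enough for keyword highlighting, so you can get tag-wrapped matches (for example `<em>` or `<b>`) without Elasticsearch; enable term vectors as well if you plan to use `FastVectorHighlighter`.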