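A quick experiment to see how Lucene is used.
References: http://www.importnew.com/12715.html (a fairly simple example)
http://www.yiibai.com/lucene/lucene_first_application.html (a more complex example)
There is another example here: http://www.tuicool.com/articles/aqIZNnE
I am using a fairly recent version, 6.2.1. Its API documentation is at:
http://lucene.apache.org/core/6_2_1/core/index.html
First, create a Maven project in IntelliJ named lucene-demo. (This mainly follows http://www.importnew.com/12715.html )
The relevant part of pom.xml is as follows: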
<groupId>com.myapp</groupId>
<artifactId>lucene-demo</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-core -->
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>6.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-queryparser</artifactId>
        <version>6.2.1</version>
    </dependency>
</dependencies>
Create a package com.myapp.lucene containing a class LuceneDemo with the following content:
package com.myapp.lucene;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.store.Directory;
import java.io.IOException;
/**
* Created by baidu on 16/10/20.
*/
public class LuceneDemo {
    // 0. Specify the analyzer for tokenizing text.
    // The same analyzer should be used for indexing and searching
    static StandardAnalyzer analyzer;
    static Directory index;

    static void prepareDoc() throws IOException {
        // 0. init analyzer
        analyzer = new StandardAnalyzer();
        // 1. create index
        index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "lucence tutorial", "123456");
        addDoc(w, "hi hi hi", "222");
        addDoc(w, "ok LUCENCE", "123");
        w.close();
    }

    static void addDoc(IndexWriter w, String text, String more) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("text", text, Field.Store.YES));
        doc.add(new StringField("more", more, Field.Store.YES));
        w.addDocument(doc);
    }

    static void search(String str) throws ParseException, IOException {
        // 2. query
        Query q = new QueryParser("text", analyzer).parse(str);
        // 3. search
        int listNum = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(listNum);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        // 4. display
        System.out.printf("Found %d docs.\n", hits.length);
        for (int i = 0; i < hits.length; i++) {
            int docId = hits[i].doc;
            Document doc = searcher.doc(docId);
            System.out.printf("Doc %d: text: %s, more: %s\n", i + 1, doc.get("text"), doc.get("more"));
        }
        reader.close();
    }

    public static void main(String[] args) {
        try {
            prepareDoc();
            search("Lucence");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }
}
Then run it. It works and prints:
Found 2 docs.
Doc 1: text: lucence tutorial, more: 123456
Doc 2: text: ok LUCENCE, more: 123
Process finished with exit code 0
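Because the index lives in a RAMDirectory, no actual directories or files should be created on disk.
There are also a few points to note about the code and its logic: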
Use TextField for content that needs to be tokenized, and StringField for content such as an id that should not be tokenized.
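To make the difference concrete, here is a small plain-Java sketch of what "tokenized" versus "not tokenized" means for the terms that end up in an index. It does not call Lucene at all: the class FieldKindSketch and the methods analyzedTerms and keywordTerms are names made up for this illustration, and the split-and-lowercase rule is only a rough approximation of what StandardAnalyzer does. It also suggests why the query "Lucence" matched both "lucence tutorial" and "ok LUCENCE" in the run above: the query text is analyzed the same way as the indexed text, so case differences disappear.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FieldKindSketch {
    // Rough stand-in for an analyzed (TextField-like) value: split on
    // non-alphanumeric characters and lowercase each piece.
    // Assumption: a real analyzer does more than this.
    static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String piece : value.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Rough stand-in for an untokenized (StringField-like) value:
    // the whole string is kept as a single term, unchanged.
    static List<String> keywordTerms(String value) {
        List<String> terms = new ArrayList<>();
        terms.add(value);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyzedTerms("ok LUCENCE")); // [ok, lucence]
        System.out.println(keywordTerms("123456"));      // [123456]
        // The query goes through the same analysis, so "Lucence" matches.
        System.out.println(
            analyzedTerms("ok LUCENCE").containsAll(analyzedTerms("Lucence"))); // true
    }
}
```

With this picture in mind, an id stored in a StringField would only be found by its exact value, while a TextField can be found by any of its individual words.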
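While writing the code I ran into several compile errors about checked exceptions, which must either be caught (and wrapped) or declared with throws.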
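The two ways out of such an error can be sketched in plain Java as follows. This is only an illustration: CheckedExceptionSketch, load, loadOrThrow, loadWrapped and describe are hypothetical names, and load merely simulates a library call that declares IOException, much like the IndexWriter and DirectoryReader calls in the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class CheckedExceptionSketch {
    // Hypothetical stand-in for a library call that declares a checked exception.
    static String load(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated failure");
        }
        return "ok";
    }

    // Option 1: declare the exception with throws and let the caller handle it.
    static String loadOrThrow(boolean fail) throws IOException {
        return load(fail);
    }

    // Option 2: catch the checked exception and wrap it in an unchecked one.
    static String loadWrapped(boolean fail) {
        try {
            return load(fail);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demonstration: reports what came out of loadWrapped.
    static String describe(boolean fail) {
        try {
            return loadWrapped(fail);
        } catch (UncheckedIOException e) {
            return "wrapped: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(false)); // ok
        System.out.println(describe(true));  // wrapped: simulated failure
    }
}
```

The demo above uses the first option in prepareDoc, addDoc and search, and only catches the exceptions once, in main.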
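Some APIs have changed between versions, so their parameters differ from what the older tutorials show. I adjusted the code to what the current version requires; in most cases the calls have become simpler.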