Journal of Learning Apache Lucene - Boosting documents and fields

Not all documents and fields are created equal—or at least you can make sure that’s the case by using boosting. Boosting may be done during indexing or during searching.

#1 Boosting Documents
Some documents matter more than others, and document boosting makes that requirement simple to implement. By default, all documents have no boost, or rather, they all have the same boost factor of 1.0. By changing a document’s boost factor, you can instruct Lucene to consider it more or less important with respect to other documents in the index when computing relevance.

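For example, an email-indexing application might boost a message according to the sender’s domain. Note that Document.setBoost(float) belongs to the Lucene 3.x API and was removed in Lucene 4.0:

// lowerDomain holds the lower-cased domain of the sender's address
if (isImportant(lowerDomain)) {
    doc.setBoost(1.5F);   // rank above documents with the default boost of 1.0
} else if (isUnimportant(lowerDomain)) {
    doc.setBoost(0.1F);   // rank well below the default
}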
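The snippet above leaves isImportant, isUnimportant, and lowerDomain undefined. A minimal sketch of how they might look follows, using only plain Java. The class DomainBoost, the method boostFor, and both domain names are assumptions made for illustration; none of them come from Lucene or from the original application.

```java
// Hypothetical helper, not part of Lucene: it picks the factor that the
// snippet above would pass to doc.setBoost(...). The domains are made up.
final class DomainBoost {
    static final float DEFAULT_BOOST = 1.0F;      // same as "no boost"
    static final float IMPORTANT_BOOST = 1.5F;
    static final float UNIMPORTANT_BOOST = 0.1F;

    static boolean isImportant(String lowerDomain) {
        return lowerDomain.endsWith("example.com");   // assumed important domain
    }

    static boolean isUnimportant(String lowerDomain) {
        return lowerDomain.endsWith("example.net");   // assumed unimportant domain
    }

    // Extracts the lower-cased domain from an address and maps it to a boost.
    static float boostFor(String senderEmail) {
        String lowerDomain = senderEmail
                .substring(senderEmail.indexOf('@') + 1)
                .toLowerCase(java.util.Locale.ROOT);
        if (isImportant(lowerDomain)) {
            return IMPORTANT_BOOST;
        } else if (isUnimportant(lowerDomain)) {
            return UNIMPORTANT_BOOST;
        }
        return DEFAULT_BOOST;
    }

    public static void main(String[] args) {
        System.out.println(boostFor("Alice@Example.COM"));
        System.out.println(boostFor("bob@example.net"));
        System.out.println(boostFor("carol@example.org"));
    }
}
```

Returning the factor from one method keeps the policy in a single place, so the indexing code only needs a call such as doc.setBoost(DomainBoost.boostFor(sender)).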

#2 Boosting Fields

Just as you can boost documents, you can also boost individual fields. When you boost a document, Lucene internally uses the same boost factor to boost each of its fields. Imagine that another requirement for the email-indexing application is to consider the subject field more important than the field holding the sender’s name. In other words, search matches made in the subject field should be more valuable than equivalent matches in the senderName field. To achieve this behavior, we use the setBoost(float) method of the Field class:

Field subjectField = new Field("subject", subject, Field.Store.YES, Field.Index.ANALYZED);
subjectField.setBoost(1.2F);
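Because a document boost is applied to every field of that document, the two kinds of boost combine. In the Lucene 3.x scoring model the index-time boost of a field is the document boost multiplied by the field boost (Lucene then folds in a length normalization and stores the result in a lossy one-byte encoding, which this sketch ignores). The following plain-Java model illustrates only the multiplication; BoostModel and effectiveFieldBoost are hypothetical names, not Lucene API.

```java
// Illustrative model only; BoostModel is not part of Lucene.
final class BoostModel {
    // Combined index-time boost of one field: document boost times field boost.
    static float effectiveFieldBoost(float docBoost, float fieldBoost) {
        return docBoost * fieldBoost;
    }

    public static void main(String[] args) {
        float docBoost = 1.5F;          // document marked as important
        float subjectBoost = 1.2F;      // as in subjectField.setBoost(1.2F)
        float senderNameBoost = 1.0F;   // default: no explicit field boost

        System.out.println("subject:    " + effectiveFieldBoost(docBoost, subjectBoost));
        System.out.println("senderName: " + effectiveFieldBoost(docBoost, senderNameBoost));
    }
}
```

With these values the subject field ends up with a larger combined boost than senderName, so an otherwise equivalent match in subject contributes more to the score.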
