Lucene example source code file (JRE_VERSION_MIGRATION.txt)

This example Lucene source code file (JRE_VERSION_MIGRATION.txt) is included in the DevDaily.com "Java Source Code Warehouse" project. The intent of this project is to help you "Learn Java by Example" TM.

The Lucene JRE_VERSION_MIGRATION.txt source code

If possible, use the same JRE major version at both index and search time.
When upgrading to a different JRE major version, consider re-indexing. 
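
One way to act on this advice is to record the JRE version when the index is built and compare it when the index is opened for searching. The following is a minimal sketch, not part of Lucene: the class JreVersionCheck and its methods are hypothetical names, and it assumes the version string comes from the standard "java.specification.version" system property and was stored somewhere alongside the index.

```java
/** Sketch: compare the JRE major version recorded at index time with the current one. */
class JreVersionCheck {

    /** Extracts the major version from strings such as "1.4", "1.8", "11" or "17.0.2". */
    static int majorVersion(String version) {
        String[] parts = version.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Releases before Java 9 report "1.x", where x is the major version.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** Returns true when both version strings belong to the same JRE major version. */
    static boolean sameMajor(String indexTimeVersion, String searchTimeVersion) {
        return majorVersion(indexTimeVersion) == majorVersion(searchTimeVersion);
    }

    public static void main(String[] args) {
        String current = System.getProperty("java.specification.version");
        // Hypothetical: the version recorded at index time is passed as the first argument.
        String recorded = args.length > 0 ? args[0] : current;
        if (sameMajor(recorded, current)) {
            System.out.println("Same JRE major version at index and search time.");
        } else {
            System.out.println("JRE major version changed (" + recorded + " -> " + current
                    + "); consider re-indexing.");
        }
    }
}
```

A mismatch does not prove that results will differ; it only signals that the considerations below apply.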

Different JRE major versions may implement different versions of Unicode,
which will change the way some parts of Lucene treat your text.

For example, with Java 1.4, LetterTokenizer will split around the character U+02C6,
but with Java 5 it will not.
This is because Java 1.4 implements Unicode 3.0, but Java 5 implements Unicode 4.0.
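
The effect can be observed without Lucene on the classpath. The sketch below assumes that LetterTokenizer defines tokens as maximal runs of characters for which java.lang.Character.isLetter returns true; the class LetterSplitSketch and its method are hypothetical stand-ins for illustration, not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split text at non-letters, as decided by the running JRE's Unicode tables. */
class LetterSplitSketch {

    /** Returns maximal runs of code points that Character.isLetter accepts. */
    static List<String> splitOnNonLetters(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // U+02C6 is MODIFIER LETTER CIRCUMFLEX ACCENT.
        String sample = "ab\u02C6cd";
        System.out.println("isLetter(U+02C6) = " + Character.isLetter(0x02C6));
        System.out.println("tokens = " + splitOnNonLetters(sample));
    }
}
```

On a JRE whose Unicode data classifies U+02C6 as a letter, the sample yields one token; on a JRE that does not, it yields two. A token indexed as one term under one JRE may therefore never match a query analyzed under another.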

For reference, JRE major versions with their corresponding Unicode versions:
Java 1.4, Unicode 3.0
Java 5, Unicode 4.0
Java 6, Unicode 4.0
Java 7, Unicode 6.0

In general, whether or not you need to re-index depends largely on the data that
you are searching and on what changed between the Unicode versions involved. For
example, if you are completely sure that your content is limited to the "Basic Latin"
range of Unicode (U+0000 to U+007F), you can safely ignore this advice.
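
Being "completely sure" can be backed by a scan of the content before it is indexed. The following is a minimal sketch under the assumption that the text is available as Java strings; BasicLatinCheck is a hypothetical helper, not a Lucene class.

```java
/** Sketch: test whether a string stays inside the Basic Latin block (U+0000 to U+007F). */
class BasicLatinCheck {

    /** Returns true if every UTF-16 code unit in the text is at most U+007F. */
    static boolean isBasicLatinOnly(String text) {
        for (int i = 0; i < text.length(); i++) {
            // Surrogates are above U+007F, so supplementary characters fail this check too.
            if (text.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = { "plain ascii text", "caf\u00E9", "ab\u02C6cd" };
        for (String s : samples) {
            System.out.println(isBasicLatinOnly(s) + "  " + s);
        }
    }
}
```

If any document fails the check, the JRE's Unicode version can influence how that document is tokenized.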

Special Notes:

LUCENE 2.9 TO 3.0, JAVA 1.4 TO JAVA 5 TRANSITION

* StandardAnalyzer will return the same results under Java 5 as it did under
Java 1.4. This is because it is largely independent of the runtime JRE for
Unicode support (with the exception of lowercasing). No changes to casing that
affect StandardAnalyzer occurred in Unicode 4.0, so if you are using this
Analyzer you are NOT affected.

* SimpleAnalyzer, StopAnalyzer, LetterTokenizer, LowerCaseFilter, and 
LowerCaseTokenizer may return different results, along with many other Analyzers
and TokenStreams in Lucene's contrib area. If you are using one of these 
components, you may be affected.

Copyright 1998-2021 Alvin Alexander, alvinalexander.com
All Rights Reserved.

A percentage of advertising revenue from
pages under the /java/jwarehouse URI on this website is
paid back to open source projects.