Java example source code file (BidiBase.java)
/*
 * Copyright (c) 2009, 2012, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
/*
 *******************************************************************************
 * (C) Copyright IBM Corp. and others, 1996-2009 - All Rights Reserved         *
 *                                                                             *
 * The original version of this source code and documentation is copyrighted   *
 * and owned by IBM, These materials are provided under terms of a License     *
 * Agreement between IBM and Sun. This technology is protected by multiple     *
 * US and International patents. This notice and attribution to IBM may not    *
 * to removed.                                                                 *
 *******************************************************************************
 */

/* FOOD FOR THOUGHT: currently the reordering modes are a mixture of
 * algorithm for direct BiDi, algorithm for inverse Bidi and the bizarre
 * concept of RUNS_ONLY which is a double operation.
 * It could be advantageous to divide this into 3 concepts:
 * a) Operation: direct / inverse / RUNS_ONLY
 * b) Direct algorithm: default / NUMBERS_SPECIAL / GROUP_NUMBERS_WITH_L
 * c) Inverse algorithm: default / INVERSE_LIKE_DIRECT / NUMBERS_SPECIAL
 * This would allow combinations not possible today like RUNS_ONLY with
 * NUMBERS_SPECIAL.
 * Also allow to set INSERT_MARKS for the direct step of RUNS_ONLY and
 * REMOVE_CONTROLS for the inverse step.
 * Not all combinations would be supported, and probably not all do make sense.
 * This would need to document which ones are supported and what are the
 * fallbacks for unsupported combinations.
 */

package sun.text.bidi;

import java.io.IOException;
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.InvocationTargetException;
import java.text.AttributedCharacterIterator;
import java.text.Bidi;
import java.util.Arrays;
import java.util.MissingResourceException;
import sun.text.normalizer.UBiDiProps;
import sun.text.normalizer.UCharacter;
import sun.text.normalizer.UTF16;

/**
 *
 * <h2>Bidi algorithm for ICU</h2>
 *
 * This is an implementation of the Unicode Bidirectional algorithm. The
 * algorithm is defined in the <a
 * href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard Annex #9</a>,
 * version 13, also described in The Unicode Standard, Version 4.0.
 * <p>
 *
 * Note: Libraries that perform a bidirectional algorithm and reorder strings
 * accordingly are sometimes called "Storage Layout Engines". ICU's Bidi and
 * shaping (ArabicShaping) classes can be used at the core of such "Storage
 * Layout Engines".
 *
 * <h3>General remarks about the API:</h3>
 *
 * The "limit" of a sequence of characters is the position just after
 * their last character, i.e., one more than that position.
 * <p>
 *
 * Some of the API methods provide access to "runs".
 * Such a "run" is defined as a sequence of characters that are at the same
 * embedding level after performing the Bidi algorithm.
 * <p>
 *
 * <h3>Basic concept: paragraph</h3>
 * A piece of text can be divided into several paragraphs by characters
 * with the Bidi class <code>Block Separator</code>. For handling of
 * paragraphs, see:
 * <ul>
 * <li>{@link #countParagraphs}
 * <li>{@link #getParaLevel}
 * <li>{@link #getParagraph}
 * <li>{@link #getParagraphByIndex}
 * </ul>
 *
 * <h3>Basic concept: text direction</h3>
 * The direction of a piece of text may be:
 * <ul>
 * <li>{@link #LTR}
 * <li>{@link #RTL}
 * <li>{@link #MIXED}
 * </ul>
 *
 * <h3>Basic concept: levels</h3>
 *
 * Levels in this API represent embedding levels according to the Unicode
 * Bidirectional Algorithm.
 * Their low-order bit (even/odd value) indicates the visual direction.<p>
 *
 * Levels can be abstract values when used for the
 * <code>paraLevel</code> and ...
 * ... OPTION_STREAMING option is used, it is
* recommended to call <code>orderParagraphsLTR()</code> with argument
* <code>orderParagraphsLTR</code> set to <code>true</code> before calling
* <code>setPara()</code> so that later paragraphs may be concatenated to
* previous paragraphs on the right.
* </p>
*
* @see #setReorderingMode
* @see #setReorderingOptions
* @see #getProcessedLength
* @see #orderParagraphsLTR
* @stable ICU 3.8
*/
private static final int OPTION_STREAMING = 4;
/*
* Comparing the description of the Bidi algorithm with this implementation
* is easier with the same names for the Bidi types in the code as there.
* See UCharacterDirection
*/
private static final byte L = 0;
private static final byte R = 1;
private static final byte EN = 2;
private static final byte ES = 3;
private static final byte ET = 4;
private static final byte AN = 5;
private static final byte CS = 6;
static final byte B = 7;
private static final byte S = 8;
private static final byte WS = 9;
private static final byte ON = 10;
private static final byte LRE = 11;
private static final byte LRO = 12;
private static final byte AL = 13;
private static final byte RLE = 14;
private static final byte RLO = 15;
private static final byte PDF = 16;
private static final byte NSM = 17;
private static final byte BN = 18;
private static final int MASK_R_AL = (1 << R | 1 << AL);
private static final char CR = '\r';
private static final char LF = '\n';
static final int LRM_BEFORE = 1;
static final int LRM_AFTER = 2;
static final int RLM_BEFORE = 4;
static final int RLM_AFTER = 8;
/*
* reference to parent paragraph object (reference to self if this object is
* a paragraph object); set to null in a newly opened object; set to a
* real value after a successful execution of setPara or setLine
*/
BidiBase paraBidi;
final UBiDiProps bdp;
/* character array representing the current text */
char[] text;
/* length of the current text */
int originalLength;
/* if the option OPTION_STREAMING is set, this is the length of
* text actually processed by <code>setPara</code>, which may be shorter
* than the original length. Otherwise, it is identical to the original
* length.
*/
public int length;
/* if option OPTION_REMOVE_CONTROLS is set, and/or Bidi
* marks are allowed to be inserted in one of the reordering modes, the
* length of the result string may be different from the processed length.
*/
int resultLength;
/* indicators for whether memory may be allocated after construction */
boolean mayAllocateText;
boolean mayAllocateRuns;
/* arrays with one value per text-character */
byte[] dirPropsMemory = new byte[1];
byte[] levelsMemory = new byte[1];
byte[] dirProps;
byte[] levels;
/* must block separators receive level 0? */
boolean orderParagraphsLTR;
/* the paragraph level */
byte paraLevel;
/* original paraLevel when contextual */
/* must be one of DEFAULT_xxx or 0 if not contextual */
byte defaultParaLevel;
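/*
 * Illustration only, not part of the original source: a minimal sketch of how
 * the low-order bit of a level encodes direction, and of the "least greater
 * even/odd level" arithmetic that the explicit embedding codes rely on.
 * The class and method names are hypothetical, and the sketch deliberately
 * ignores the override flag and the maximum explicit level.
 */

```java
/** Hypothetical demo class, not part of BidiBase. */
public class LevelParityDemo {

    /** Odd levels are right-to-left, even levels are left-to-right. */
    static boolean isRtl(int level) {
        return (level & 1) != 0;
    }

    /** Least even level strictly greater than the given level (LRE/LRO case). */
    static int nextEvenLevel(int level) {
        return (level + 2) & ~1;
    }

    /** Least odd level strictly greater than the given level (RLE/RLO case). */
    static int nextOddLevel(int level) {
        return (level + 1) | 1;
    }

    public static void main(String[] args) {
        for (int level = 0; level <= 3; level++) {
            System.out.println(level + ": rtl=" + isRtl(level)
                    + " nextEven=" + nextEvenLevel(level)
                    + " nextOdd=" + nextOddLevel(level));
        }
    }
}
```

/*
 * In the sketch, an embedding opened at level 1 by an LRE-style code moves to
 * level 2, while an RLE-style code at the same level moves to level 3.
 */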
/* the following is set in setPara, used in processPropertySeq */
ImpTabPair impTabPair; /* reference to levels state table pair */
/* the overall paragraph or line directionality */
byte direction;
/* flags is a bit set for which directional properties are in the text */
int flags;
/* lastArabicPos is index to the last AL in the text, -1 if none */
int lastArabicPos;
/* characters after trailingWSStart are WS and are */
/* implicitly at the paraLevel (rule (L1)) - levels may not reflect that */
int trailingWSStart;
/* fields for paragraph handling */
int paraCount; /* set in getDirProps() */
int[] parasMemory = new int[1];
int[] paras; /* limits of paragraphs, filled in
resolveExplicitLevels() or checkExplicitLevels() */
/* for single paragraph text, we only need a tiny array of paras (no allocation) */
int[] simpleParas = {0};
/* fields for line reordering */
int runCount; /* ==-1: runs not set up yet */
BidiRun[] runsMemory = new BidiRun[0];
BidiRun[] runs;
/* for non-mixed text, we only need a tiny array of runs (no allocation) */
BidiRun[] simpleRuns = {new BidiRun()};
/* mapping of runs in logical order to visual order */
int[] logicalToVisualRunsMap;
/* flag to indicate that the map has been updated */
boolean isGoodLogicalToVisualRunsMap;
/* for inverse Bidi with insertion of directional marks */
InsertPoints insertPoints = new InsertPoints();
/* for option OPTION_REMOVE_CONTROLS */
int controlCount;
/*
* Sometimes, bit values are more appropriate
* to deal with directionality properties.
* Abbreviations in these method names refer to names
* used in the Bidi algorithm.
*/
static int DirPropFlag(byte dir) {
return (1 << dir);
}
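/*
 * Illustration only, not part of the original source: a minimal sketch of how
 * one-bit-per-class flags can summarize which directional properties occur in
 * a text, and how a mask such as MASK_R_AL tests for several classes at once.
 * The class name and the helper methods collectFlags and hasStrongRtl are
 * hypothetical; the constant values are copied from the fields above.
 */

```java
/** Hypothetical demo class, not part of BidiBase. */
public class DirPropFlagDemo {

    // Values copied from the directional-class constants listed above.
    static final byte L = 0;
    static final byte R = 1;
    static final byte AN = 5;
    static final byte AL = 13;
    static final int MASK_R_AL = (1 << R | 1 << AL);

    /** Same idea as DirPropFlag: one bit per directional class. */
    static int dirPropFlag(byte dir) {
        return 1 << dir;
    }

    /** ORs together the flag of every directional class in the array. */
    static int collectFlags(byte[] dirProps) {
        int flags = 0;
        for (byte prop : dirProps) {
            flags |= dirPropFlag(prop);
        }
        return flags;
    }

    /** True if at least one R or AL class was recorded in the flags. */
    static boolean hasStrongRtl(int flags) {
        return (flags & MASK_R_AL) != 0;
    }

    public static void main(String[] args) {
        int latinWithDigits = collectFlags(new byte[] {L, AN, L});
        int latinWithArabic = collectFlags(new byte[] {L, AL});
        System.out.println(hasStrongRtl(latinWithDigits));
        System.out.println(hasStrongRtl(latinWithArabic));
    }
}
```

/*
 * A single integer test against a mask replaces a scan over the whole text,
 * which is why the flags are collected once while the properties are read.
 */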
/*
* The following bit is ORed to the property of characters in paragraphs
* with contextual RTL direction when paraLevel is contextual.
*/
static final byte CONTEXT_RTL_SHIFT = 6;
static final byte CONTEXT_RTL = (byte)(1 << CONTEXT_RTL_SHIFT);

/* ...
 * The number of runs depends on the actual text and may be anywhere between
 * 1 and <code>maxLength</code>. It is typically small.
 *
 * @throws IllegalArgumentException if maxLength or maxRunCount is less than 0
 * @stable ICU 3.8
 */
public BidiBase(int maxLength, int maxRunCount)
{
    /* check the argument values */
    if (maxLength < 0 || maxRunCount < 0) {
        throw new IllegalArgumentException();
    }

    /* reset the object, all reference variables null, all flags false,
       all sizes 0.
       In fact, we don't need to do anything, since class members are
       initialized as zero when an instance is created.
     */
    /*
    mayAllocateText = false;
    mayAllocateRuns = false;
    orderParagraphsLTR = false;
    paraCount = 0;
    runCount = 0;
    trailingWSStart = 0;
    flags = 0;
    paraLevel = 0;
    defaultParaLevel = 0;
    direction = 0;
    */

    /* get Bidi properties */
    try {
        bdp = UBiDiProps.getSingleton();
    } catch (IOException e) {
        throw new MissingResourceException(e.getMessage(), "(BidiProps)", "");
    }

    /* allocate memory for arrays as requested */
    if (maxLength > 0) {
        getInitialDirPropsMemory(maxLength);
        getInitialLevelsMemory(maxLength);
    } else {
        mayAllocateText = true;
    }

    if (maxRunCount > 0) {
        // if maxRunCount == 1, use simpleRuns[]
        if (maxRunCount > 1) {
            getInitialRunsMemory(maxRunCount);
        }
    } else {
        mayAllocateRuns = true;
    }
}

/*
 * We are allowed to allocate memory if object==null or
 * mayAllocate==true for each array that we need.
 *
 * Assume sizeNeeded>0.
 * If object != null, then assume size > 0.
 */
private Object getMemory(String label, Object array, Class<?> arrayClass,
        boolean mayAllocate, int sizeNeeded)
{
    int len = Array.getLength(array);

    /* we have at least enough memory and must not allocate */
    if (sizeNeeded == len) {
        return array;
    }
    if (!mayAllocate) {
        /* we must not allocate */
        if (sizeNeeded <= len) {
            return array;
        }
        throw new OutOfMemoryError("Failed to allocate memory for "
                                   + label);
    }
    /* we may try to grow or shrink */
    /* FOOD FOR THOUGHT: when shrinking it should be possible to avoid
       the allocation altogether and rely on this.length */
    try {
        return Array.newInstance(arrayClass, sizeNeeded);
    } catch (Exception e) {
        throw new OutOfMemoryError("Failed to allocate memory for "
                                   + label);
    }
}

/* helper methods for each allocated array */
private void getDirPropsMemory(boolean mayAllocate, int len)
{
    Object array = getMemory("DirProps", dirPropsMemory, Byte.TYPE, mayAllocate, len);
    dirPropsMemory = (byte[]) array;
}

void getDirPropsMemory(int len)
{
    getDirPropsMemory(mayAllocateText, len);
}

private void getLevelsMemory(boolean mayAllocate, int len)
{
    Object array = getMemory("Levels", levelsMemory, Byte.TYPE, mayAllocate, len);
    levelsMemory = (byte[]) array;
}

void getLevelsMemory(int len)
{
    getLevelsMemory(mayAllocateText, len);
}

private void getRunsMemory(boolean mayAllocate, int len)
{
    Object array = getMemory("Runs", runsMemory, BidiRun.class, mayAllocate, len);
    runsMemory = (BidiRun[]) array;
}

void getRunsMemory(int len)
{
    getRunsMemory(mayAllocateRuns, len);
}

/* additional methods used by constructor - always allow allocation */
private void getInitialDirPropsMemory(int len)
{
    getDirPropsMemory(true, len);
}

private void getInitialLevelsMemory(int len)
{
    getLevelsMemory(true, len);
}

private void getInitialParasMemory(int len)
{
    Object array = getMemory("Paras", parasMemory, Integer.TYPE, true, len);
    parasMemory = (int[]) array;
}

private void getInitialRunsMemory(int len)
{
    getRunsMemory(true, len);
}

/* perform (P2)..(P3)
------------------------------------------------------- */
private void getDirProps()
{
    int i = 0, i0, i1;
    flags = 0;          /* collect all directionalities in the text */
    int uchar;
    byte dirProp;
    byte paraDirDefault = 0;   /* initialize to avoid compiler warnings */
    boolean isDefaultLevel = IsDefaultLevel(paraLevel);
    /* for inverse Bidi, the default para level is set to RTL if there is a
       strong R or AL character at either end of the text */
    lastArabicPos = -1;
    controlCount = 0;

    final int NOT_CONTEXTUAL = 0;         /* 0: not contextual paraLevel */
    final int LOOKING_FOR_STRONG = 1;     /* 1: looking for first strong char */
    final int FOUND_STRONG_CHAR = 2;      /* 2: found first strong char */

    int state;
    int paraStart = 0;                    /* index of first char in paragraph */
    byte paraDir;                         /* == CONTEXT_RTL within paragraphs
                                             starting with strong R char */
    byte lastStrongDir=0;                 /* for default level & inverse Bidi */
    int lastStrongLTR=0;                  /* for STREAMING option */

    if (isDefaultLevel) {
        paraDirDefault = ((paraLevel & 1) != 0) ? CONTEXT_RTL : 0;
        paraDir = paraDirDefault;
        lastStrongDir = paraDirDefault;
        state = LOOKING_FOR_STRONG;
    } else {
        state = NOT_CONTEXTUAL;
        paraDir = 0;
    }
    /* count paragraphs and determine the paragraph level (P2..P3) */
    /*
     * see comment on constant fields:
     * the LEVEL_DEFAULT_XXX values are designed so that
     * their low-order bit alone yields the intended default
     */

    for (i = 0; i < originalLength; /* i is incremented in the loop */) {
        i0 = i;                     /* index of first code unit */
        uchar = UTF16.charAt(text, 0, originalLength, i);
        i += Character.charCount(uchar);
        i1 = i - 1; /* index of last code unit, gets the directional property */

        dirProp = (byte)bdp.getClass(uchar);

        flags |= DirPropFlag(dirProp);
        dirProps[i1] = (byte)(dirProp | paraDir);
        if (i1 > i0) {     /* set previous code units' properties to BN */
            flags |= DirPropFlag(BN);
            do {
                dirProps[--i1] = (byte)(BN | paraDir);
            } while (i1 > i0);
        }
        if (state == LOOKING_FOR_STRONG) {
            if (dirProp == L) {
                state = FOUND_STRONG_CHAR;
                if (paraDir != 0) {
                    paraDir = 0;
                    for (i1 = paraStart; i1 < i; i1++) {
                        dirProps[i1] &= ~CONTEXT_RTL;
                    }
                }
                continue;
            }
            if (dirProp == R || dirProp == AL) {
                state = FOUND_STRONG_CHAR;
                if (paraDir == 0) {
                    paraDir = CONTEXT_RTL;
                    for (i1 = paraStart; i1 < i; i1++) {
                        dirProps[i1] |= CONTEXT_RTL;
                    }
                }
                continue;
            }
        }
        if (dirProp == L) {
            lastStrongDir = 0;
            lastStrongLTR = i;      /* i is index to next character */
        } else if (dirProp == R) {
            lastStrongDir = CONTEXT_RTL;
        } else if (dirProp == AL) {
            lastStrongDir = CONTEXT_RTL;
            lastArabicPos = i-1;
        } else if (dirProp == B) {
            if (i < originalLength) {   /* B not last char in text */
                if (!((uchar == (int)CR) && (text[i] == (int)LF))) {
                    paraCount++;
                }
                if (isDefaultLevel) {
                    state=LOOKING_FOR_STRONG;
                    paraStart = i;        /* i is index to next character */
                    paraDir = paraDirDefault;
                    lastStrongDir = paraDirDefault;
                }
            }
        }
    }
    if (isDefaultLevel) {
        paraLevel = GetParaLevelAt(0);
    }

    /* The following line does nothing new for contextual paraLevel, but is
       needed for absolute paraLevel. */
    flags |= DirPropFlagLR(paraLevel);

    if (orderParagraphsLTR && (flags & DirPropFlag(B)) != 0) {
        flags |= DirPropFlag(L);
    }
}

/* perform (X1)..(X9) ------------------------------------------------------- */

/* determine if the text is mixed-directional or single-directional */
private byte directionFromFlags() {
    /* if the text contains AN and neutrals, then some neutrals may become RTL */
    if (!((flags & MASK_RTL) != 0 ||
          ((flags & DirPropFlag(AN)) != 0 &&
           (flags & MASK_POSSIBLE_N) != 0))) {
        return Bidi.DIRECTION_LEFT_TO_RIGHT;
    } else if ((flags & MASK_LTR) == 0) {
        return Bidi.DIRECTION_RIGHT_TO_LEFT;
    } else {
        return MIXED;
    }
}

/*
 * Resolve the explicit levels as specified by explicit embedding codes.
 * Recalculate the flags to have them reflect the real properties
 * after taking the explicit embeddings into account.
* * The Bidi algorithm is designed to result in the same behavior whether embedding * levels are externally specified (from "styled text", supposedly the preferred * method) or set by explicit embedding codes (LRx, RLx, PDF) in the plain text. * That is why (X9) instructs to remove all explicit codes (and BN). * However, in a real implementation, this removal of these codes and their index * positions in the plain text is undesirable since it would result in * reallocated, reindexed text. * Instead, this implementation leaves the codes in there and just ignores them * in the subsequent processing. * In order to get the same reordering behavior, positions with a BN or an * explicit embedding code just get the same level assigned as the last "real" * character. * * Some implementations, not this one, then overwrite some of these * directionality properties at "real" same-level-run boundaries by * L or R codes so that the resolution of weak types can be performed on the * entire paragraph at once instead of having to parse it once more and * perform that resolution on same-level-runs. * This limits the scope of the implicit rules in effectively * the same way as the run limits. * * Instead, this implementation does not modify these codes. * On one hand, the paragraph has to be scanned for same-level-runs, but * on the other hand, this saves another loop to reset these codes, * or saves making and modifying a copy of dirProps[]. * * * Note that (Pn) and (Xn) changed significantly from version 4 of the Bidi algorithm. * * * Handling the stack of explicit levels (Xn): * * With the Bidi stack of explicit levels, * as pushed with each LRE, RLE, LRO, and RLO and popped with each PDF, * the explicit level must never exceed MAX_EXPLICIT_LEVEL==61. 
* * In order to have a correct push-pop semantics even in the case of overflows, * there are two overflow counters: * - countOver60 is incremented with each LRx at level 60 * - from level 60, one RLx increases the level to 61 * - countOver61 is incremented with each LRx and RLx at level 61 * * Popping levels with PDF must work in the opposite order so that level 61 * is correct at the correct point. Underflows (too many PDFs) must be checked. * * This implementation assumes that MAX_EXPLICIT_LEVEL is odd. */ private byte resolveExplicitLevels() { int i = 0; byte dirProp; byte level = GetParaLevelAt(0); byte dirct; int paraIndex = 0; /* determine if the text is mixed-directional or single-directional */ dirct = directionFromFlags(); /* we may not need to resolve any explicit levels, but for multiple paragraphs we want to loop on all chars to set the para boundaries */ if ((dirct != MIXED) && (paraCount == 1)) { /* not mixed directionality: levels don't matter - trailingWSStart will be 0 */ } else if ((paraCount == 1) && ((flags & MASK_EXPLICIT) == 0)) { /* mixed, but all characters are at the same embedding level */ /* or we are in "inverse Bidi" */ /* and we don't have contextual multiple paragraphs with some B char */ /* set all levels to the paragraph level */ for (i = 0; i < length; ++i) { levels[i] = level; } } else { /* continue to perform (Xn) */ /* (X1) level is set for all codes, embeddingLevel keeps track of the push/pop operations */ /* both variables may carry the LEVEL_OVERRIDE flag to indicate the override status */ byte embeddingLevel = level; byte newLevel; byte stackTop = 0; byte[] stack = new byte[MAX_EXPLICIT_LEVEL]; /* we never push anything >=MAX_EXPLICIT_LEVEL */ int countOver60 = 0; int countOver61 = 0; /* count overflows of explicit levels */ /* recalculate the flags */ flags = 0; for (i = 0; i < length; ++i) { dirProp = NoContextRTL(dirProps[i]); switch(dirProp) { case LRE: case LRO: /* (X3, X5) */ newLevel = (byte)((embeddingLevel+2) & 
~(INTERNAL_LEVEL_OVERRIDE | 1)); /* least greater even level */ if (newLevel <= MAX_EXPLICIT_LEVEL) { stack[stackTop] = embeddingLevel; ++stackTop; embeddingLevel = newLevel; if (dirProp == LRO) { embeddingLevel |= INTERNAL_LEVEL_OVERRIDE; } /* we don't need to set LEVEL_OVERRIDE off for LRE since this has already been done for newLevel which is the source for embeddingLevel. */ } else if ((embeddingLevel & ~INTERNAL_LEVEL_OVERRIDE) == MAX_EXPLICIT_LEVEL) { ++countOver61; } else /* (embeddingLevel & ~INTERNAL_LEVEL_OVERRIDE) == MAX_EXPLICIT_LEVEL-1 */ { ++countOver60; } flags |= DirPropFlag(BN); break; case RLE: case RLO: /* (X2, X4) */ newLevel=(byte)(((embeddingLevel & ~INTERNAL_LEVEL_OVERRIDE) + 1) | 1); /* least greater odd level */ if (newLevel<=MAX_EXPLICIT_LEVEL) { stack[stackTop] = embeddingLevel; ++stackTop; embeddingLevel = newLevel; if (dirProp == RLO) { embeddingLevel |= INTERNAL_LEVEL_OVERRIDE; } /* we don't need to set LEVEL_OVERRIDE off for RLE since this has already been done for newLevel which is the source for embeddingLevel. 
*/ } else { ++countOver61; } flags |= DirPropFlag(BN); break; case PDF: /* (X7) */ /* handle all the overflow cases first */ if (countOver61 > 0) { --countOver61; } else if (countOver60 > 0 && (embeddingLevel & ~INTERNAL_LEVEL_OVERRIDE) != MAX_EXPLICIT_LEVEL) { /* handle LRx overflows from level 60 */ --countOver60; } else if (stackTop > 0) { /* this is the pop operation; it also pops level 61 while countOver60>0 */ --stackTop; embeddingLevel = stack[stackTop]; /* } else { (underflow) */ } flags |= DirPropFlag(BN); break; case B: stackTop = 0; countOver60 = 0; countOver61 = 0; level = GetParaLevelAt(i); if ((i + 1) < length) { embeddingLevel = GetParaLevelAt(i+1); if (!((text[i] == CR) && (text[i + 1] == LF))) { paras[paraIndex++] = i+1; } } flags |= DirPropFlag(B); break; case BN: /* BN, LRE, RLE, and PDF are supposed to be removed (X9) */ /* they will get their levels set correctly in adjustWSLevels() */ flags |= DirPropFlag(BN); break; default: /* all other types get the "real" level */ if (level != embeddingLevel) { level = embeddingLevel; if ((level & INTERNAL_LEVEL_OVERRIDE) != 0) { flags |= DirPropFlagO(level) | DirPropFlagMultiRuns; } else { flags |= DirPropFlagE(level) | DirPropFlagMultiRuns; } } if ((level & INTERNAL_LEVEL_OVERRIDE) == 0) { flags |= DirPropFlag(dirProp); } break; } /* * We need to set reasonable levels even on BN codes and * explicit codes because we will later look at same-level runs (X10). */ levels[i] = level; } if ((flags & MASK_EMBEDDING) != 0) { flags |= DirPropFlagLR(paraLevel); } if (orderParagraphsLTR && (flags & DirPropFlag(B)) != 0) { flags |= DirPropFlag(L); } /* subsequently, ignore the explicit codes and BN (X9) */ /* again, determine if the text is mixed-directional or single-directional */ dirct = directionFromFlags(); } return dirct; } /* * Use a pre-specified embedding levels array: * * Adjust the directional properties for overrides (->LEVEL_OVERRIDE), * ignore all explicit codes (X9), * and check all the preset levels. 
* * Recalculate the flags to have them reflect the real properties * after taking the explicit embeddings into account. */ private byte checkExplicitLevels() { byte dirProp; int i; this.flags = 0; /* collect all directionalities in the text */ byte level; int paraIndex = 0; for (i = 0; i < length; ++i) { if (levels[i] == 0) { levels[i] = paraLevel; } if (MAX_EXPLICIT_LEVEL < (levels[i]&0x7f)) { if ((levels[i] & INTERNAL_LEVEL_OVERRIDE) != 0) { levels[i] = (byte)(paraLevel|INTERNAL_LEVEL_OVERRIDE); } else { levels[i] = paraLevel; } } level = levels[i]; dirProp = NoContextRTL(dirProps[i]); if ((level & INTERNAL_LEVEL_OVERRIDE) != 0) { /* keep the override flag in levels[i] but adjust the flags */ level &= ~INTERNAL_LEVEL_OVERRIDE; /* make the range check below simpler */ flags |= DirPropFlagO(level); } else { /* set the flags */ flags |= DirPropFlagE(level) | DirPropFlag(dirProp); } if ((level < GetParaLevelAt(i) && !((0 == level) && (dirProp == B))) || (MAX_EXPLICIT_LEVEL <level)) { /* level out of bounds */ throw new IllegalArgumentException("level " + level + " out of bounds at index " + i); } if ((dirProp == B) && ((i + 1) < length)) { if (!((text[i] == CR) && (text[i + 1] == LF))) { paras[paraIndex++] = i + 1; } } } if ((flags&MASK_EMBEDDING) != 0) { flags |= DirPropFlagLR(paraLevel); } /* determine if the text is mixed-directional or single-directional */ return directionFromFlags(); } /*********************************************************************/ /* The Properties state machine table */ /*********************************************************************/ /* */ /* All table cells are 8 bits: */ /* bits 0..4: next state */ /* bits 5..7: action to perform (if > 0) */ /* */ /* Cells may be of format "n" where n represents the next state */ /* (except for the rightmost column). */ /* Cells may also be of format "_(x,y)" where x represents an action */ /* to perform and y represents the next state. 
*/ /* */ /*********************************************************************/ /* Definitions and type for properties state tables */ /*********************************************************************/ private static final int IMPTABPROPS_COLUMNS = 14; private static final int IMPTABPROPS_RES = IMPTABPROPS_COLUMNS - 1; private static short GetStateProps(short cell) { return (short)(cell & 0x1f); } private static short GetActionProps(short cell) { return (short)(cell >> 5); } private static final short groupProp[] = /* dirProp regrouped */ { /* L R EN ES ET AN CS B S WS ON LRE LRO AL RLE RLO PDF NSM BN */ 0, 1, 2, 7, 8, 3, 9, 6, 5, 4, 4, 10, 10, 12, 10, 10, 10, 11, 10 }; private static final short _L = 0; private static final short _R = 1; private static final short _EN = 2; private static final short _AN = 3; private static final short _ON = 4; private static final short _S = 5; private static final short _B = 6; /* reduced dirProp */ /*********************************************************************/ /* */ /* PROPERTIES STATE TABLE */ /* */ /* In table impTabProps, */ /* - the ON column regroups ON and WS */ /* - the BN column regroups BN, LRE, RLE, LRO, RLO, PDF */ /* - the Res column is the reduced property assigned to a run */ /* */ /* Action 1: process current run1, init new run1 */ /* 2: init new run2 */ /* 3: process run1, process run2, init new run1 */ /* 4: process run1, set run1=run2, init new run2 */ /* */ /* Notes: */ /* 1) This table is used in resolveImplicitLevels(). */ /* 2) This table triggers actions when there is a change in the Bidi*/ /* property of incoming characters (action 1). */ /* 3) Most such property sequences are processed immediately (in */ /* fact, passed to processPropertySeq(). */ /* 4) However, numbers are assembled as one sequence. This means */ /* that undefined situations (like CS following digits, until */ /* it is known if the next char will be a digit) are held until */ /* following chars define them. 
*/ /* Example: digits followed by CS, then comes another CS or ON; */ /* the digits will be processed, then the CS assigned */ /* as the start of an ON sequence (action 3). */ /* 5) There are cases where more than one sequence must be */ /* processed, for instance digits followed by CS followed by L: */ /* the digits must be processed as one sequence, and the CS */ /* must be processed as an ON sequence, all this before starting */ /* assembling chars for the opening L sequence. */ /* */ /* */ private static final short impTabProps[][] = { /* L, R, EN, AN, ON, S, B, ES, ET, CS, BN, NSM, AL, Res */ /* 0 Init */ { 1, 2, 4, 5, 7, 15, 17, 7, 9, 7, 0, 7, 3, _ON }, /* 1 L */ { 1, 32+2, 32+4, 32+5, 32+7, 32+15, 32+17, 32+7, 32+9, 32+7, 1, 1, 32+3, _L }, /* 2 R */ { 32+1, 2, 32+4, 32+5, 32+7, 32+15, 32+17, 32+7, 32+9, 32+7, 2, 2, 32+3, _R }, /* 3 AL */ { 32+1, 32+2, 32+6, 32+6, 32+8, 32+16, 32+17, 32+8, 32+8, 32+8, 3, 3, 3, _R }, /* 4 EN */ { 32+1, 32+2, 4, 32+5, 32+7, 32+15, 32+17, 64+10, 11, 64+10, 4, 4, 32+3, _EN }, /* 5 AN */ { 32+1, 32+2, 32+4, 5, 32+7, 32+15, 32+17, 32+7, 32+9, 64+12, 5, 5, 32+3, _AN }, /* 6 AL:EN/AN */ { 32+1, 32+2, 6, 6, 32+8, 32+16, 32+17, 32+8, 32+8, 64+13, 6, 6, 32+3, _AN }, /* 7 ON */ { 32+1, 32+2, 32+4, 32+5, 7, 32+15, 32+17, 7, 64+14, 7, 7, 7, 32+3, _ON }, /* 8 AL:ON */ { 32+1, 32+2, 32+6, 32+6, 8, 32+16, 32+17, 8, 8, 8, 8, 8, 32+3, _ON }, /* 9 ET */ { 32+1, 32+2, 4, 32+5, 7, 32+15, 32+17, 7, 9, 7, 9, 9, 32+3, _ON }, /*10 EN+ES/CS */ { 96+1, 96+2, 4, 96+5, 128+7, 96+15, 96+17, 128+7,128+14, 128+7, 10, 128+7, 96+3, _EN }, /*11 EN+ET */ { 32+1, 32+2, 4, 32+5, 32+7, 32+15, 32+17, 32+7, 11, 32+7, 11, 11, 32+3, _EN }, /*12 AN+CS */ { 96+1, 96+2, 96+4, 5, 128+7, 96+15, 96+17, 128+7,128+14, 128+7, 12, 128+7, 96+3, _AN }, /*13 AL:EN/AN+CS */ { 96+1, 96+2, 6, 6, 128+8, 96+16, 96+17, 128+8, 128+8, 128+8, 13, 128+8, 96+3, _AN }, /*14 ON+ET */ { 32+1, 32+2, 128+4, 32+5, 7, 32+15, 32+17, 7, 14, 7, 14, 14, 32+3, _ON }, /*15 S */ { 32+1, 32+2, 32+4, 32+5, 
32+7, 15, 32+17, 32+7, 32+9, 32+7, 15, 32+7, 32+3, _S }, /*16 AL:S */ { 32+1, 32+2, 32+6, 32+6, 32+8, 16, 32+17, 32+8, 32+8, 32+8, 16, 32+8, 32+3, _S }, /*17 B */ { 32+1, 32+2, 32+4, 32+5, 32+7, 32+15, 17, 32+7, 32+9, 32+7, 17, 32+7, 32+3, _B } }; /*********************************************************************/ /* The levels state machine tables */ /*********************************************************************/ /* */ /* All table cells are 8 bits: */ /* bits 0..3: next state */ /* bits 4..7: action to perform (if > 0) */ /* */ /* Cells may be of format "n" where n represents the next state */ /* (except for the rightmost column). */ /* Cells may also be of format "_(x,y)" where x represents an action */ /* to perform and y represents the next state. */ /* */ /* This format limits each table to 16 states each and to 15 actions.*/ /* */ /*********************************************************************/ /* Definitions and type for levels state tables */ /*********************************************************************/ private static final int IMPTABLEVELS_COLUMNS = _B + 2; private static final int IMPTABLEVELS_RES = IMPTABLEVELS_COLUMNS - 1; private static short GetState(byte cell) { return (short)(cell & 0x0f); } private static short GetAction(byte cell) { return (short)(cell >> 4); } private static class ImpTabPair { byte[][][] imptab; short[][] impact; ImpTabPair(byte[][] table1, byte[][] table2, short[] act1, short[] act2) { imptab = new byte[][][] {table1, table2}; impact = new short[][] {act1, act2}; } } /*********************************************************************/ /* */ /* LEVELS STATE TABLES */ /* */ /* In all levels state tables, */ /* - state 0 is the initial state */ /* - the Res column is the increment to add to the text level */ /* for this property sequence. */ /* */ /* The impact arrays for each table of a pair map the local action */ /* numbers of the table to the total list of actions. 
For instance, */ /* action 2 in a given table corresponds to the action number which */ /* appears in entry [2] of the impact array for that table. */ /* The first entry of all impact arrays must be 0. */ /* */ /* Action 1: init conditional sequence */ /* 2: prepend conditional sequence to current sequence */ /* 3: set ON sequence to new level - 1 */ /* 4: init EN/AN/ON sequence */ /* 5: fix EN/AN/ON sequence followed by R */ /* 6: set previous level sequence to level 2 */ /* */ /* Notes: */ /* 1) These tables are used in processPropertySeq(). The input */ /* is property sequences as determined by resolveImplicitLevels. */ /* 2) Most such property sequences are processed immediately */ /* (levels are assigned). */ /* 3) However, some sequences cannot be assigned a final level till */ /* one or more following sequences are received. For instance, */ /* ON following an R sequence within an even-level paragraph. */ /* If the following sequence is R, the ON sequence will be */ /* assigned basic run level+1, and so will the R sequence. */ /* 4) S is generally handled like ON, since its level will be fixed */ /* to paragraph level in adjustWSLevels(). */ /* */ private static final byte impTabL_DEFAULT[][] = /* Even paragraph level */ /* In this table, conditional sequences receive the higher possible level until proven otherwise. */ { /* L, R, EN, AN, ON, S, B, Res */ /* 0 : init */ { 0, 1, 0, 2, 0, 0, 0, 0 }, /* 1 : R */ { 0, 1, 3, 3, 0x14, 0x14, 0, 1 }, /* 2 : AN */ { 0, 1, 0, 2, 0x15, 0x15, 0, 2 }, /* 3 : R+EN/AN */ { 0, 1, 3, 3, 0x14, 0x14, 0, 2 }, /* 4 : R+ON */ { 0x20, 1, 3, 3, 4, 4, 0x20, 1 }, /* 5 : AN+ON */ { 0x20, 1, 0x20, 2, 5, 5, 0x20, 1 } }; private static final byte impTabR_DEFAULT[][] = /* Odd paragraph level */ /* In this table, conditional sequences receive the lower possible level until proven otherwise. 
*/
{
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    1,    0,    2,    2,    0,    0,    0,    0 },
/* 1 : L          */ {    1,    0,    1,    3, 0x14, 0x14,    0,    1 },
/* 2 : EN/AN      */ {    1,    0,    2,    2,    0,    0,    0,    1 },
/* 3 : L+AN       */ {    1,    0,    1,    3,    5,    5,    0,    1 },
/* 4 : L+ON       */ { 0x21,    0, 0x21,    3,    4,    4,    0,    0 },
/* 5 : L+AN+ON    */ {    1,    0,    1,    3,    5,    5,    0,    0 }
};

private static final short[] impAct0 = {0,1,2,3,4,5,6};

private static final ImpTabPair impTab_DEFAULT = new ImpTabPair(
        impTabL_DEFAULT, impTabR_DEFAULT, impAct0, impAct0);

private static final byte impTabL_NUMBERS_SPECIAL[][] = { /* Even paragraph level */
/* In this table, conditional sequences receive the higher possible
   level until proven otherwise.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    0,    2,    1,    1,    0,    0,    0,    0 },
/* 1 : L+EN/AN    */ {    0,    2,    1,    1,    0,    0,    0,    2 },
/* 2 : R          */ {    0,    2,    4,    4, 0x13,    0,    0,    1 },
/* 3 : R+ON       */ { 0x20,    2,    4,    4,    3,    3, 0x20,    1 },
/* 4 : R+EN/AN    */ {    0,    2,    4,    4, 0x13, 0x13,    0,    2 }
};

private static final ImpTabPair impTab_NUMBERS_SPECIAL = new ImpTabPair(
        impTabL_NUMBERS_SPECIAL, impTabR_DEFAULT, impAct0, impAct0);

private static final byte impTabL_GROUP_NUMBERS_WITH_R[][] = {
/* In this table, EN/AN+ON sequences receive levels as if associated with R
   until proven that there is L or sor/eor on both sides. AN is handled like EN.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 init         */ {    0,    3, 0x11, 0x11,    0,    0,    0,    0 },
/* 1 EN/AN        */ { 0x20,    3,    1,    1,    2, 0x20, 0x20,    2 },
/* 2 EN/AN+ON     */ { 0x20,    3,    1,    1,    2, 0x20, 0x20,    1 },
/* 3 R            */ {    0,    3,    5,    5, 0x14,    0,    0,    1 },
/* 4 R+ON         */ { 0x20,    3,    5,    5,    4, 0x20, 0x20,    1 },
/* 5 R+EN/AN      */ {    0,    3,    5,    5, 0x14,    0,    0,    2 }
};

private static final byte impTabR_GROUP_NUMBERS_WITH_R[][] = {
/* In this table, EN/AN+ON sequences receive levels as if associated with R
   until proven that there is L on both sides. AN is handled like EN.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 init         */ {    2,    0,    1,    1,    0,    0,    0,    0 },
/* 1 EN/AN        */ {    2,    0,    1,    1,    0,    0,    0,    1 },
/* 2 L            */ {    2,    0, 0x14, 0x14, 0x13,    0,    0,    1 },
/* 3 L+ON         */ { 0x22,    0,    4,    4,    3,    0,    0,    0 },
/* 4 L+EN/AN      */ { 0x22,    0,    4,    4,    3,    0,    0,    1 }
};

private static final ImpTabPair impTab_GROUP_NUMBERS_WITH_R =
        new ImpTabPair(impTabL_GROUP_NUMBERS_WITH_R,
                       impTabR_GROUP_NUMBERS_WITH_R, impAct0, impAct0);

private static final byte impTabL_INVERSE_NUMBERS_AS_L[][] = {
/* This table is identical to the Default LTR table except that EN and AN
   are handled like L.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    0,    1,    0,    0,    0,    0,    0,    0 },
/* 1 : R          */ {    0,    1,    0,    0, 0x14, 0x14,    0,    1 },
/* 2 : AN         */ {    0,    1,    0,    0, 0x15, 0x15,    0,    2 },
/* 3 : R+EN/AN    */ {    0,    1,    0,    0, 0x14, 0x14,    0,    2 },
/* 4 : R+ON       */ { 0x20,    1, 0x20, 0x20,    4,    4, 0x20,    1 },
/* 5 : AN+ON      */ { 0x20,    1, 0x20, 0x20,    5,    5, 0x20,    1 }
};

private static final byte impTabR_INVERSE_NUMBERS_AS_L[][] = {
/* This table is identical to the Default RTL table except that EN and AN
   are handled like L.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    1,    0,    1,    1,    0,    0,    0,    0 },
/* 1 : L          */ {    1,    0,    1,    1, 0x14, 0x14,    0,    1 },
/* 2 : EN/AN      */ {    1,    0,    1,    1,    0,    0,    0,    1 },
/* 3 : L+AN       */ {    1,    0,    1,    1,    5,    5,    0,    1 },
/* 4 : L+ON       */ { 0x21,    0, 0x21, 0x21,    4,    4,    0,    0 },
/* 5 : L+AN+ON    */ {    1,    0,    1,    1,    5,    5,    0,    0 }
};

private static final ImpTabPair impTab_INVERSE_NUMBERS_AS_L = new ImpTabPair
        (impTabL_INVERSE_NUMBERS_AS_L, impTabR_INVERSE_NUMBERS_AS_L,
         impAct0, impAct0);

private static final byte impTabR_INVERSE_LIKE_DIRECT[][] = { /* Odd paragraph level */
/* In this table, conditional sequences receive the lower possible level
   until proven otherwise.
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    1,    0,    2,    2,    0,    0,    0,    0 },
/* 1 : L          */ {    1,    0,    1,    2, 0x13, 0x13,    0,    1 },
/* 2 : EN/AN      */ {    1,    0,    2,    2,    0,    0,    0,    1 },
/* 3 : L+ON       */ { 0x21, 0x30,    6,    4,    3,    3, 0x30,    0 },
/* 4 : L+ON+AN    */ { 0x21, 0x30,    6,    4,    5,    5, 0x30,    3 },
/* 5 : L+AN+ON    */ { 0x21, 0x30,    6,    4,    5,    5, 0x30,    2 },
/* 6 : L+ON+EN    */ { 0x21, 0x30,    6,    4,    3,    3, 0x30,    1 }
};

private static final short[] impAct1 = {0,1,11,12};

private static final ImpTabPair impTab_INVERSE_LIKE_DIRECT = new ImpTabPair(
        impTabL_DEFAULT, impTabR_INVERSE_LIKE_DIRECT, impAct0, impAct1);

private static final byte impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS[][] = {
/* The case handled in this table is (visually): R EN L
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    0, 0x63,    0,    1,    0,    0,    0,    0 },
/* 1 : L+AN       */ {    0, 0x63,    0,    1, 0x12, 0x30,    0,    4 },
/* 2 : L+AN+ON    */ { 0x20, 0x63, 0x20,    1,    2, 0x30, 0x20,    3 },
/* 3 : R          */ {    0, 0x63, 0x55, 0x56, 0x14, 0x30,    0,    3 },
/* 4 : R+ON       */ { 0x30, 0x43, 0x55, 0x56,    4, 0x30, 0x30,    3 },
/* 5 : R+EN       */ { 0x30, 0x43,    5, 0x56, 0x14, 0x30, 0x30,    4 },
/* 6 : R+AN       */ { 0x30, 0x43, 0x55,    6, 0x14, 0x30, 0x30,    4 }
};

private static final byte impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS[][] = {
/* The cases handled in this table are (visually): R EN L
                                                   R L AN L
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ { 0x13,    0,    1,    1,    0,    0,    0,    0 },
/* 1 : R+EN/AN    */ { 0x23,    0,    1,    1,    2, 0x40,    0,    1 },
/* 2 : R+EN/AN+ON */ { 0x23,    0,    1,    1,    2, 0x40,    0,    0 },
/* 3 : L          */ {    3,    0,    3, 0x36, 0x14, 0x40,    0,    1 },
/* 4 : L+ON       */ { 0x53, 0x40,    5, 0x36,    4, 0x40, 0x40,    0 },
/* 5 : L+ON+EN    */ { 0x53, 0x40,    5, 0x36,    4, 0x40, 0x40,    1 },
/* 6 : L+AN       */ { 0x53, 0x40,    6,    6,    4, 0x40, 0x40,    3 }
};

private static final short impAct2[] = {0,1,7,8,9,10};

private static final ImpTabPair impTab_INVERSE_LIKE_DIRECT_WITH_MARKS =
        new ImpTabPair(impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS,
                       impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS, impAct0, impAct2);

private static final ImpTabPair
impTab_INVERSE_FOR_NUMBERS_SPECIAL = new ImpTabPair(
        impTabL_NUMBERS_SPECIAL, impTabR_INVERSE_LIKE_DIRECT, impAct0, impAct1);

private static final byte impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS[][] = {
/* The case handled in this table is (visually): R EN L
*/
/*                        L,    R,   EN,   AN,   ON,    S,    B,  Res */
/* 0 : init       */ {    0, 0x62,    1,    1,    0,    0,    0,    0 },
/* 1 : L+EN/AN    */ {    0, 0x62,    1,    1,    0, 0x30,    0,    4 },
/* 2 : R          */ {    0, 0x62, 0x54, 0x54, 0x13, 0x30,    0,    3 },
/* 3 : R+ON       */ { 0x30, 0x42, 0x54, 0x54,    3, 0x30, 0x30,    3 },
/* 4 : R+EN/AN    */ { 0x30, 0x42,    4,    4, 0x13, 0x30, 0x30,    4 }
};

private static final ImpTabPair impTab_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS =
        new ImpTabPair(impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS,
                       impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS, impAct0, impAct2);

private class LevState {
    byte[][] impTab;                /* level table pointer */
    short[] impAct;                 /* action map array */
    int startON;                    /* start of ON sequence */
    int startL2EN;                  /* start of level 2 sequence */
    int lastStrongRTL;              /* index of last found R or AL */
    short state;                    /* current state */
    byte runLevel;                  /* run level before implicit solving */
}

/*------------------------------------------------------------------------*/

static final int FIRSTALLOC = 10;

/*
 * param pos:  position where to insert
 * param flag: one of LRM_BEFORE, LRM_AFTER, RLM_BEFORE, RLM_AFTER
 */
private void addPoint(int pos, int flag)
{
    Point point = new Point();

    int len = insertPoints.points.length;
    if (len == 0) {
        insertPoints.points = new Point[FIRSTALLOC];
        len = FIRSTALLOC;
    }
    if (insertPoints.size >= len) { /* no room for new point */
        Point[] savePoints = insertPoints.points;
        insertPoints.points = new Point[len * 2];
        System.arraycopy(savePoints, 0, insertPoints.points, 0, len);
    }
    point.pos = pos;
    point.flag = flag;
    insertPoints.points[insertPoints.size] = point;
    insertPoints.size++;
}

/* perform rules (Wn), (Nn), and (In) on a run of the text ------------------ */

/*
 * This implementation of the (Wn) rules applies all rules in one pass.
 * In order to do so, it needs a look-ahead of typically 1 character
 * (except for W5: sequences of ET) and keeps track of changes
 * in a rule Wp that affect a later Wq (p<q).
 *
 * The (Nn) and (In) rules are also performed in that same single loop,
 * but effectively one iteration behind for white space.
 *
 * Since all implicit rules are performed in one step, it is not necessary
 * to actually store the intermediate directional properties in dirProps[].
 */
private void processPropertySeq(LevState levState, short _prop,
        int start, int limit) {
    byte cell;
    byte[][] impTab = levState.impTab;
    short[] impAct = levState.impAct;
    short oldStateSeq,actionSeq;
    byte level, addLevel;
    int start0, k;

    start0 = start;                         /* save original start position */
    oldStateSeq = levState.state;
    cell = impTab[oldStateSeq][_prop];
    levState.state = GetState(cell);        /* isolate the new state */
    actionSeq = impAct[GetAction(cell)];    /* isolate the action */
    addLevel = impTab[levState.state][IMPTABLEVELS_RES];

    if (actionSeq != 0) {
        switch (actionSeq) {
        case 1:                         /* init ON seq */
            levState.startON = start0;
            break;

        case 2:                         /* prepend ON seq to current seq */
            start = levState.startON;
            break;

        case 3:                         /* L or S after possible relevant EN/AN */
            /* check if we had EN after R/AL */
            if (levState.startL2EN >= 0) {
                addPoint(levState.startL2EN, LRM_BEFORE);
            }
            levState.startL2EN = -1;    /* not within previous if since could also be -2 */
            /* check if we had any relevant EN/AN after R/AL */
            if ((insertPoints.points.length == 0) ||
                    (insertPoints.size <= insertPoints.confirmed)) {
                /* nothing, just clean up */
                levState.lastStrongRTL = -1;
                /* check if we have a pending conditional segment */
                level = impTab[oldStateSeq][IMPTABLEVELS_RES];
                if ((level & 1) != 0 && levState.startON > 0) { /* after ON */
                    start = levState.startON;   /* reset to basic run level */
                }
                if (_prop == _S) {              /* add LRM before S */
                    addPoint(start0, LRM_BEFORE);
                    insertPoints.confirmed = insertPoints.size;
                }
                break;
            }
            /* reset previous RTL cont to level for LTR
               text */
            for (k = levState.lastStrongRTL + 1; k < start0; k++) {
                /* reset odd level, leave runLevel+2 as is */
                levels[k] = (byte)((levels[k] - 2) & ~1);
            }
            /* mark insert points as confirmed */
            insertPoints.confirmed = insertPoints.size;
            levState.lastStrongRTL = -1;
            if (_prop == _S) {          /* add LRM before S */
                addPoint(start0, LRM_BEFORE);
                insertPoints.confirmed = insertPoints.size;
            }
            break;

        case 4:                         /* R/AL after possible relevant EN/AN */
            /* just clean up */
            if (insertPoints.points.length > 0)
                /* remove all non confirmed insert points */
                insertPoints.size = insertPoints.confirmed;
            levState.startON = -1;
            levState.startL2EN = -1;
            levState.lastStrongRTL = limit - 1;
            break;

        case 5:                         /* EN/AN after R/AL + possible cont */
            /* check for real AN */
            if ((_prop == _AN) && (NoContextRTL(dirProps[start0]) == AN)) {
                /* real AN */
                if (levState.startL2EN == -1) { /* if no relevant EN already found */
                    /* just note the rightmost digit as a strong RTL */
                    levState.lastStrongRTL = limit - 1;
                    break;
                }
                if (levState.startL2EN >= 0) {  /* after EN, no AN */
                    addPoint(levState.startL2EN, LRM_BEFORE);
                    levState.startL2EN = -2;
                }
                /* note AN */
                addPoint(start0, LRM_BEFORE);
                break;
            }
            /* if first EN/AN after R/AL */
            if (levState.startL2EN == -1) {
                levState.startL2EN = start0;
            }
            break;

        case 6:                         /* note location of latest R/AL */
            levState.lastStrongRTL = limit - 1;
            levState.startON = -1;
            break;

        case 7:                         /* L after R+ON/EN/AN */
            /* include possible adjacent number on the left */
            for (k = start0-1; k >= 0 && ((levels[k] & 1) == 0); k--) {
            }
            if (k >= 0) {
                addPoint(k, RLM_BEFORE);    /* add RLM before */
                insertPoints.confirmed = insertPoints.size; /* confirm it */
            }
            levState.startON = start0;
            break;

        case 8:                         /* AN after L */
            /* AN numbers between L text on both sides may be trouble.
            */
            /* tentatively bracket with LRMs; will be confirmed if followed by L */
            addPoint(start0, LRM_BEFORE);   /* add LRM before */
            addPoint(start0, LRM_AFTER);    /* add LRM after */
            break;

        case 9:                         /* R after L+ON/EN/AN */
            /* false alert, infirm LRMs around previous AN */
            insertPoints.size=insertPoints.confirmed;
            if (_prop == _S) {          /* add RLM before S */
                addPoint(start0, RLM_BEFORE);
                insertPoints.confirmed = insertPoints.size;
            }
            break;

        case 10:                        /* L after L+ON/AN */
            level = (byte)(levState.runLevel + addLevel);
            for (k=levState.startON; k < start0; k++) {
                if (levels[k] < level) {
                    levels[k] = level;
                }
            }
            insertPoints.confirmed = insertPoints.size;   /* confirm inserts */
            levState.startON = start0;
            break;

        case 11:                        /* L after L+ON+EN/AN/ON */
            level = levState.runLevel;
            for (k = start0-1; k >= levState.startON; k--) {
                if (levels[k] == level+3) {
                    while (levels[k] == level+3) {
                        levels[k--] -= 2;
                    }
                    while (levels[k] == level) {
                        k--;
                    }
                }
                if (levels[k] == level+2) {
                    levels[k] = level;
                    continue;
                }
                levels[k] = (byte)(level+1);
            }
            break;

        case 12:                        /* R after L+ON+EN/AN/ON */
            level = (byte)(levState.runLevel+1);
            for (k = start0-1; k >= levState.startON; k--) {
                if (levels[k] > level) {
                    levels[k] -= 2;
                }
            }
            break;

        default:                        /* we should never get here */
            throw new IllegalStateException("Internal ICU error in processPropertySeq");
        }
    }
    if ((addLevel) != 0 || (start < start0)) {
        level = (byte)(levState.runLevel + addLevel);
        for (k = start; k < limit; k++) {
            levels[k] = level;
        }
    }
}

private void resolveImplicitLevels(int start, int limit, short sor, short eor)
{
    LevState levState = new LevState();
    int i, start1, start2;
    short oldStateImp, stateImp, actionImp;
    short gprop, resProp, cell;
    short nextStrongProp = R;
    int nextStrongPos = -1;

    /* check for RTL inverse Bidi mode */
    /* FOOD FOR THOUGHT: in case of RTL inverse Bidi, it would make sense to
     * loop on the text characters from end to start.
     * This would need a different properties state table (at least different
     * actions) and different levels state tables (maybe very similar to the
     * LTR corresponding ones).
     */
    /* initialize for levels state table */
    levState.startL2EN = -1;        /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */
    levState.lastStrongRTL = -1;    /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */
    levState.state = 0;
    levState.runLevel = levels[start];
    levState.impTab = impTabPair.imptab[levState.runLevel & 1];
    levState.impAct = impTabPair.impact[levState.runLevel & 1];
    processPropertySeq(levState, sor, start, start);
    /* initialize for property state table */
    if (dirProps[start] == NSM) {
        stateImp = (short)(1 + sor);
    } else {
        stateImp = 0;
    }
    start1 = start;
    start2 = 0;

    for (i = start; i <= limit; i++) {
        if (i >= limit) {
            gprop = eor;
        } else {
            short prop, prop1;
            prop = NoContextRTL(dirProps[i]);
            gprop = groupProp[prop];
        }
        oldStateImp = stateImp;
        cell = impTabProps[oldStateImp][gprop];
        stateImp = GetStateProps(cell);     /* isolate the new state */
        actionImp = GetActionProps(cell);   /* isolate the action */
        if ((i == limit) && (actionImp == 0)) {
            /* there is an unprocessed sequence if its property == eor */
            actionImp = 1;                  /* process the last sequence */
        }
        if (actionImp != 0) {
            resProp = impTabProps[oldStateImp][IMPTABPROPS_RES];
            switch (actionImp) {
            case 1:             /* process current seq1, init new seq1 */
                processPropertySeq(levState, resProp, start1, i);
                start1 = i;
                break;
            case 2:             /* init new seq2 */
                start2 = i;
                break;
            case 3:             /* process seq1, process seq2, init new seq1 */
                processPropertySeq(levState, resProp, start1, start2);
                processPropertySeq(levState, _ON, start2, i);
                start1 = i;
                break;
            case 4:             /* process seq1, set seq1=seq2, init new seq2 */
                processPropertySeq(levState, resProp, start1, start2);
                start1 = start2;
                start2 = i;
                break;
            default:            /* we should never get here */
                throw new IllegalStateException("Internal ICU error in resolveImplicitLevels");
            }
        }
    }
    /* flush possible pending sequence, e.g.
       ON */
    processPropertySeq(levState, eor, limit, limit);
}

/* perform (L1) and (X9) ---------------------------------------------------- */

/*
 * Reset the embedding levels for some non-graphic characters (L1).
 * This method also sets appropriate levels for BN, and
 * explicit embedding types that are supposed to have been removed
 * from the paragraph in (X9).
 */
private void adjustWSLevels() {
    int i;

    if ((flags & MASK_WS) != 0) {
        int flag;
        i = trailingWSStart;
        while (i > 0) {
            /* reset a sequence of WS/BN before eop and B/S to the paragraph paraLevel */
            while (i > 0 && ((flag = DirPropFlagNC(dirProps[--i])) & MASK_WS) != 0) {
                if (orderParagraphsLTR && (flag & DirPropFlag(B)) != 0) {
                    levels[i] = 0;
                } else {
                    levels[i] = GetParaLevelAt(i);
                }
            }

            /* reset BN to the next character's paraLevel until B/S, which restarts above loop */
            /* here, i+1 is guaranteed to be <length */
            while (i > 0) {
                flag = DirPropFlagNC(dirProps[--i]);
                if ((flag & MASK_BN_EXPLICIT) != 0) {
                    levels[i] = levels[i + 1];
                } else if (orderParagraphsLTR && (flag & DirPropFlag(B)) != 0) {
                    levels[i] = 0;
                    break;
                } else if ((flag & MASK_B_S) != 0){
                    levels[i] = GetParaLevelAt(i);
                    break;
                }
            }
        }
    }
}

private int Bidi_Min(int x, int y) {
    return x < y ? x : y;
}

private int Bidi_Abs(int x) {
    return x >= 0 ? x : -x;
}

/**
* Perform the Unicode Bidi algorithm. It is defined in the
* <a href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard Annex #9</a>,
* version 13,
* also described in The Unicode Standard, Version 4.0 .<p>
*
* This method takes a piece of plain text containing one or more paragraphs,
* with or without externally specified embedding levels from <i>styled</i>
* text and computes the left-right-directionality of each character.<p>
*
* If the entire text is all of the same directionality, then
* the method may not perform all the steps described by the algorithm,
* i.e., some levels may not be the same as if all steps were performed.
* This is not relevant for unidirectional text.<br>
* For example, in pure LTR text with numbers the numbers would get
* a resolved level of 2 higher than the surrounding text according to
* the algorithm. This implementation may set all resolved levels to
* the same value in such a case.<p>
*
* The text can be composed of multiple paragraphs. Occurrence of a block
* separator in the text terminates a paragraph, and whatever comes next starts
* a new paragraph. The exception to this rule is when a Carriage Return (CR)
* is followed by a Line Feed (LF). Both CR and LF are block separators, but
* in that case, the pair of characters is considered as terminating the
* preceding paragraph, and a new paragraph will be started by a character
* coming after the LF.
*
* Although the text is passed here as a <code>String</code>, it is
* stored internally as an array of characters. Therefore the
* documentation will refer to indexes of the characters in the text.
*
* @param text contains the text that the Bidi algorithm will be performed
* on. This text can be retrieved with <code>getText()</code> or
* <code>getTextAsString</code>.
*
* @param paraLevel specifies the default level for the text;
* it is typically 0 (LTR) or 1 (RTL).
* If the method shall determine the paragraph level from the text,
* then <code>paraLevel</code> can be set to
* either <code>LEVEL_DEFAULT_LTR</code>
* or <code>LEVEL_DEFAULT_RTL</code>; if the text contains multiple
* paragraphs, the paragraph level shall be determined separately for
* each paragraph; if a paragraph does not include any strongly typed
* character, then the desired default is used (0 for LTR or 1 for RTL).
* Any other value between 0 and <code>MAX_EXPLICIT_LEVEL</code>
* is also valid, with odd levels indicating RTL.
*
* @param embeddingLevels (in) may be used to preset the embedding and override levels,
* ignoring characters like LRE and PDF in the text.
* A level overrides the directional property of its corresponding
* (same index) character if the level has the
* <code>LEVEL_OVERRIDE</code> bit set.
* Except for that bit, it must be
* <code>paraLevel<=embeddingLevels[]<=MAX_EXPLICIT_LEVEL</code>,
* with one exception: a level of zero may be specified for a
* paragraph separator even if <code>paraLevel>0</code> when multiple
* paragraphs are submitted in the same call to <code>setPara()</code>.
* <strong>Caution: </strong>A reference to this array, not a copy
* of the levels, will be stored in the <code>Bidi</code> object;
* the <code>embeddingLevels</code>
* should not be modified to avoid unexpected results on subsequent
* Bidi operations. However, the <code>setPara()</code> and
* <code>setLine()</code> methods may modify some or all of the
* levels.<br>
* <strong>Note:</strong> the embeddingLevels array must
* have one entry for each character in <code>text</code>.
*
* @throws IllegalArgumentException if the values in embeddingLevels are
* not within the allowed range
*
* @see #LEVEL_DEFAULT_LTR
* @see #LEVEL_DEFAULT_RTL
* @see #LEVEL_OVERRIDE
* @see #MAX_EXPLICIT_LEVEL
* @stable ICU 3.8
*/
void setPara(String text, byte paraLevel, byte[] embeddingLevels)
{
    if (text == null) {
        setPara(new char[0], paraLevel, embeddingLevels);
    } else {
        setPara(text.toCharArray(), paraLevel, embeddingLevels);
    }
}
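/*
 * Illustrative sketch only; this block is not part of the original source.
 * It shows how a default paragraph level is resolved as the paraLevel
 * documentation above describes: the first strong character decides, and
 * the requested default applies when no strong character exists. The
 * sketch goes through the public java.text.Bidi class and its
 * DIRECTION_DEFAULT_RIGHT_TO_LEFT flag rather than calling setPara()
 * directly. The names DefaultParaLevelSketch and baseLevelWithRtlDefault
 * are hypothetical.
 */
static final class DefaultParaLevelSketch {
    /* Returns the base level chosen when the requested default is RTL. */
    static int baseLevelWithRtlDefault(String text) {
        java.text.Bidi bidi = new java.text.Bidi(
                text, java.text.Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT);
        return bidi.getBaseLevel();
    }
}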
/**
* Perform the Unicode Bidi algorithm. It is defined in the
* <a href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard Annex #9</a>,
* version 13,
* also described in The Unicode Standard, Version 4.0 .<p>
*
* This method takes a piece of plain text containing one or more paragraphs,
* with or without externally specified embedding levels from <i>styled</i>
* text and computes the left-right-directionality of each character.<p>
*
* If the entire text is all of the same directionality, then
* the method may not perform all the steps described by the algorithm,
* i.e., some levels may not be the same as if all steps were performed.
* This is not relevant for unidirectional text.<br>
* For example, in pure LTR text with numbers the numbers would get
* a resolved level of 2 higher than the surrounding text according to
* the algorithm. This implementation may set all resolved levels to
* the same value in such a case.<p>
*
* The text can be composed of multiple paragraphs. Occurrence of a block
* separator in the text terminates a paragraph, and whatever comes next starts
* a new paragraph. The exception to this rule is when a Carriage Return (CR)
* is followed by a Line Feed (LF). Both CR and LF are block separators, but
* in that case, the pair of characters is considered as terminating the
* preceding paragraph, and a new paragraph will be started by a character
* coming after the LF.
*
* The text is stored internally as an array of characters. Therefore the
* documentation will refer to indexes of the characters in the text.
*
* @param chars contains the text that the Bidi algorithm will be performed
* on. This text can be retrieved with <code>getText()</code> or
* <code>getTextAsString</code>.
*
* @param paraLevel specifies the default level for the text;
* it is typically 0 (LTR) or 1 (RTL).
* If the method shall determine the paragraph level from the text,
* then <code>paraLevel</code> can be set to
* either <code>LEVEL_DEFAULT_LTR</code>
* or <code>LEVEL_DEFAULT_RTL</code>; if the text contains multiple
* paragraphs, the paragraph level shall be determined separately for
* each paragraph; if a paragraph does not include any strongly typed
* character, then the desired default is used (0 for LTR or 1 for RTL).
* Any other value between 0 and <code>MAX_EXPLICIT_LEVEL</code>
* is also valid, with odd levels indicating RTL.
*
* @param embeddingLevels (in) may be used to preset the embedding and
* override levels, ignoring characters like LRE and PDF in the text.
* A level overrides the directional property of its corresponding
* (same index) character if the level has the
* <code>LEVEL_OVERRIDE</code> bit set.
* Except for that bit, it must be
* <code>paraLevel<=embeddingLevels[]<=MAX_EXPLICIT_LEVEL</code>,
* with one exception: a level of zero may be specified for a
* paragraph separator even if <code>paraLevel>0</code> when multiple
* paragraphs are submitted in the same call to <code>setPara()</code>.
* <strong>Caution: </strong>A reference to this array, not a copy
* of the levels, will be stored in the <code>Bidi</code> object;
* the <code>embeddingLevels</code>
* should not be modified to avoid unexpected results on subsequent
* Bidi operations. However, the <code>setPara()</code> and
* <code>setLine()</code> methods may modify some or all of the
* levels.<br>
* <strong>Note:</strong> the embeddingLevels array must
* have one entry for each character in <code>text</code>.
*
* @throws IllegalArgumentException if the values in embeddingLevels are
* not within the allowed range
*
* @see #LEVEL_DEFAULT_LTR
* @see #LEVEL_DEFAULT_RTL
* @see #LEVEL_OVERRIDE
* @see #MAX_EXPLICIT_LEVEL
* @stable ICU 3.8
*/
public void setPara(char[] chars, byte paraLevel, byte[] embeddingLevels)
{
    /* check the argument values */
    if (paraLevel < INTERNAL_LEVEL_DEFAULT_LTR) {
        verifyRange(paraLevel, 0, MAX_EXPLICIT_LEVEL + 1);
    }
    if (chars == null) {
        chars = new char[0];
    }
    /* initialize the Bidi object */
    this.paraBidi = null;           /* mark unfinished setPara */
    this.text = chars;
    this.length = this.originalLength = this.resultLength = text.length;
    this.paraLevel = paraLevel;
    this.direction = Bidi.DIRECTION_LEFT_TO_RIGHT;
    this.paraCount = 1;
    /* Allocate zero-length arrays instead of setting to null here; then
     * checks for null in various places can be eliminated.
     */
    dirProps = new byte[0];
    levels = new byte[0];
    runs = new BidiRun[0];
    isGoodLogicalToVisualRunsMap = false;
    insertPoints.size = 0;          /* clean up from last call */
    insertPoints.confirmed = 0;     /* clean up from last call */
    /*
     * Save the original paraLevel if contextual; otherwise, set to 0.
     */
    if (IsDefaultLevel(paraLevel)) {
        defaultParaLevel = paraLevel;
    } else {
        defaultParaLevel = 0;
    }
    if (length == 0) {
        /*
         * For an empty paragraph, create a Bidi object with the paraLevel and
         * the flags and the direction set but without allocating zero-length arrays.
         * There is nothing more to do.
         */
        if (IsDefaultLevel(paraLevel)) {
            this.paraLevel &= 1;
            defaultParaLevel = 0;
        }
        if ((this.paraLevel & 1) != 0) {
            flags = DirPropFlag(R);
            direction = Bidi.DIRECTION_RIGHT_TO_LEFT;
        } else {
            flags = DirPropFlag(L);
            direction = Bidi.DIRECTION_LEFT_TO_RIGHT;
        }
        runCount = 0;
        paraCount = 0;
        paraBidi = this;            /* mark successful setPara */
        return;
    }
    runCount = -1;
    /*
     * Get the directional properties,
     * the flags bit-set, and
     * determine the paragraph level if necessary.
     */
    getDirPropsMemory(length);
    dirProps = dirPropsMemory;
    getDirProps();
    /* the processed length may have changed if OPTION_STREAMING is set */
    trailingWSStart = length;       /* the levels[] will reflect the WS run */
    /* allocate paras memory */
    if (paraCount > 1) {
        getInitialParasMemory(paraCount);
        paras = parasMemory;
        paras[paraCount - 1] = length;
    } else {
        /* initialize paras for single paragraph */
        paras = simpleParas;
        simpleParas[0] = length;
    }
    /* are explicit levels specified? */
    if (embeddingLevels == null) {
        /* no: determine explicit levels according to the (Xn) rules */
        getLevelsMemory(length);
        levels = levelsMemory;
        direction = resolveExplicitLevels();
    } else {
        /* set BN for all explicit codes, check that all levels are 0 or paraLevel..MAX_EXPLICIT_LEVEL */
        levels = embeddingLevels;
        direction = checkExplicitLevels();
    }
    /*
     * The steps after (X9) in the Bidi algorithm are performed only if
     * the paragraph text has mixed directionality!
     */
    switch (direction) {
    case Bidi.DIRECTION_LEFT_TO_RIGHT:
        /* make sure paraLevel is even */
        paraLevel = (byte)((paraLevel + 1) & ~1);
        /* all levels are implicitly at paraLevel (important for getLevels()) */
        trailingWSStart = 0;
        break;
    case Bidi.DIRECTION_RIGHT_TO_LEFT:
        /* make sure paraLevel is odd */
        paraLevel |= 1;
        /* all levels are implicitly at paraLevel (important for getLevels()) */
        trailingWSStart = 0;
        break;
    default:
        this.impTabPair = impTab_DEFAULT;
        /*
         * If there are no external levels specified and there
         * are no significant explicit level codes in the text,
         * then we can treat the entire paragraph as one run.
         * Otherwise, we need to perform the following rules on runs of
         * the text with the same embedding levels. (X10)
         * "Significant" explicit level codes are ones that actually
         * affect non-BN characters.
         * Examples for "insignificant" ones are empty embeddings
         * LRE-PDF, LRE-RLE-PDF-PDF, etc.
         */
        if (embeddingLevels == null && paraCount <= 1 &&
            (flags & DirPropFlagMultiRuns) == 0) {
            resolveImplicitLevels(0, length,
                    GetLRFromLevel(GetParaLevelAt(0)),
                    GetLRFromLevel(GetParaLevelAt(length - 1)));
        } else {
            /* sor, eor: start and end types of same-level-run */
            int start, limit = 0;
            byte level, nextLevel;
            short sor, eor;
            /* determine the first sor and set eor to it because of the loop body (sor=eor there) */
            level = GetParaLevelAt(0);
            nextLevel = levels[0];
            if (level < nextLevel) {
                eor = GetLRFromLevel(nextLevel);
            } else {
                eor = GetLRFromLevel(level);
            }
            do {
                /* determine start and limit of the run (end points just behind the run) */
                /* the values for this run's start are the same as for the previous run's end */
                start = limit;
                level = nextLevel;
                if ((start > 0) && (NoContextRTL(dirProps[start - 1]) == B)) {
                    /* except if this is a new paragraph, then set sor = para level */
                    sor = GetLRFromLevel(GetParaLevelAt(start));
                } else {
                    sor = eor;
                }
                /* search for the limit of this run */
                while (++limit < length && levels[limit] == level) {}
                /* get the correct level of the next run */
                if (limit < length) {
                    nextLevel = levels[limit];
                } else {
                    nextLevel = GetParaLevelAt(length - 1);
                }
                /* determine eor from max(level, nextLevel); sor is last run's eor */
                if ((level & ~INTERNAL_LEVEL_OVERRIDE) < (nextLevel & ~INTERNAL_LEVEL_OVERRIDE)) {
                    eor = GetLRFromLevel(nextLevel);
                } else {
                    eor = GetLRFromLevel(level);
                }
                /* if the run consists of overridden directional types, then there
                   are no implicit types to be resolved */
                if ((level & INTERNAL_LEVEL_OVERRIDE) == 0) {
                    resolveImplicitLevels(start, limit, sor, eor);
                } else {
                    /* remove the LEVEL_OVERRIDE flags */
                    do {
                        levels[start++] &= ~INTERNAL_LEVEL_OVERRIDE;
                    } while (start < limit);
                }
            } while (limit < length);
        }
        /* reset the embedding levels for some non-graphic characters (L1), (X9) */
        adjustWSLevels();
        break;
    }
    resultLength += insertPoints.size;
    paraBidi = this;                /* mark successful setPara */
}
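/*
 * Illustrative sketch only; this block is not part of the original source.
 * It restates the run-splitting loop of setPara() above in simplified form:
 * the levels array is cut into same-level runs, and sor/eor are taken from
 * the higher of the two levels that meet at each run boundary (odd means R,
 * even means L). Assumptions: a single paragraph, no override bits in the
 * levels, and the hypothetical names LevelRunSketch and describeRuns.
 */
static final class LevelRunSketch {
    static java.util.List<String> describeRuns(byte[] levels, byte paraLevel) {
        java.util.List<String> runs = new java.util.ArrayList<String>();
        int limit = 0;
        int prevLevel = paraLevel;      /* level "before" the first run */
        while (limit < levels.length) {
            int start = limit;
            int level = levels[start];
            /* search for the limit of this run */
            while (++limit < levels.length && levels[limit] == level) { }
            /* past the last run, the paragraph level takes over */
            int nextLevel = (limit < levels.length) ? levels[limit] : paraLevel;
            char sor = ((Math.max(prevLevel, level) & 1) != 0) ? 'R' : 'L';
            char eor = ((Math.max(level, nextLevel) & 1) != 0) ? 'R' : 'L';
            runs.add("[" + start + "," + limit + ") sor=" + sor + " eor=" + eor);
            prevLevel = level;
        }
        return runs;
    }
}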
/**
* Perform the Unicode Bidi algorithm on a given paragraph, as defined in the
* <a href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard Annex #9</a>,
* version 13,
* also described in The Unicode Standard, Version 4.0 .<p>
*
* This method takes a paragraph of text and computes the
* left-right-directionality of each character. The text should not
* contain any Unicode block separators.<p>
*
* The RUN_DIRECTION attribute in the text, if present, determines the base
* direction (left-to-right or right-to-left). If not present, the base
* direction is computed using the Unicode Bidirectional Algorithm,
* defaulting to left-to-right if there are no strong directional characters
* in the text. This attribute, if present, must be applied to all the text
* in the paragraph.<p>
*
* The BIDI_EMBEDDING attribute in the text, if present, represents
* embedding level information. Negative values from -1 to -62 indicate
* overrides at the absolute value of the level. Positive values from 1 to
* 62 indicate embeddings. Where values are zero or not defined, the base
* embedding level as determined by the base direction is assumed.<p>
*
* The NUMERIC_SHAPING attribute in the text, if present, converts European
* digits to other decimal digits before running the bidi algorithm. This
* attribute, if present, must be applied to all the text in the paragraph.
*
* If the entire text is all of the same directionality, then
* the method may not perform all the steps described by the algorithm,
* i.e., some levels may not be the same as if all steps were performed.
* This is not relevant for unidirectional text.<br>
* For example, in pure LTR text with numbers the numbers would get
* a resolved level of 2 higher than the surrounding text according to
* the algorithm. This implementation may set all resolved levels to
* the same value in such a case.<p>
*
* @param paragraph a paragraph of text with optional character and
* paragraph attribute information
* @stable ICU 3.8
*/
public void setPara(AttributedCharacterIterator paragraph)
{
    byte paraLvl;
    char ch = paragraph.first();
    Boolean runDirection =
        (Boolean) paragraph.getAttribute(TextAttributeConstants.RUN_DIRECTION);
    Object shaper = paragraph.getAttribute(TextAttributeConstants.NUMERIC_SHAPING);
    if (runDirection == null) {
        paraLvl = INTERNAL_LEVEL_DEFAULT_LTR;
    } else {
        paraLvl = (runDirection.equals(TextAttributeConstants.RUN_DIRECTION_LTR)) ?
                    (byte)Bidi.DIRECTION_LEFT_TO_RIGHT : (byte)Bidi.DIRECTION_RIGHT_TO_LEFT;
    }

    byte[] lvls = null;
    int len = paragraph.getEndIndex() - paragraph.getBeginIndex();
    byte[] embeddingLevels = new byte[len];
    char[] txt = new char[len];
    int i = 0;
    while (ch != AttributedCharacterIterator.DONE) {
        txt[i] = ch;
        Integer embedding =
            (Integer) paragraph.getAttribute(TextAttributeConstants.BIDI_EMBEDDING);
        if (embedding != null) {
            byte level = embedding.byteValue();
            if (level == 0) {
                /* no-op */
            } else if (level < 0) {
                lvls = embeddingLevels;
                embeddingLevels[i] = (byte)((0 - level) | INTERNAL_LEVEL_OVERRIDE);
            } else {
                lvls = embeddingLevels;
                embeddingLevels[i] = level;
            }
        }
        ch = paragraph.next();
        ++i;
    }

    if (shaper != null) {
        NumericShapings.shape(shaper, txt, 0, len);
    }
    setPara(txt, paraLvl, lvls);
}
/**
* Specify whether block separators must be allocated level zero,
* so that successive paragraphs will progress from left to right.
* This method must be called before <code>setPara()</code>.
* Paragraph separators (B) may appear in the text. Setting them to level zero
* means that all paragraph separators (including one possibly appearing
* in the last text position) are kept in the reordered text after the text
* that they follow in the source text.
* When this feature is not enabled, a paragraph separator at the last
* position of the text before reordering will go to the first position
* of the reordered text when the paragraph level is odd.
*
* @param ordarParaLTR specifies whether paragraph separators (B) must
* receive level 0, so that successive paragraphs progress from left to right.
*
* @see #setPara
* @stable ICU 3.8
*/
private void orderParagraphsLTR(boolean ordarParaLTR) {
    orderParagraphsLTR = ordarParaLTR;
}
/**
* Get the directionality of the text.
*
* @return a value of <code>LTR</code>, <code>RTL</code> or <code>MIXED</code>
* that indicates if the entire text
* represented by this object is unidirectional,
* and which direction, or if it is mixed-directional.
*
* @throws IllegalStateException if this call is not preceded by a successful
* call to <code>setPara</code> or <code>setLine</code>
*
* @see #LTR
* @see #RTL
* @see #MIXED
* @stable ICU 3.8
*/
private byte getDirection()
{
    verifyValidParaOrLine();
    return direction;
}
/**
* Get the length of the text.
*
* @return The length of the text that the <code>Bidi</code> object was
* created for.
*
* @throws IllegalStateException if this call is not preceded by a successful
* call to <code>setPara</code> or <code>setLine</code>
* @stable ICU 3.8
*/
public int getLength()
{
    verifyValidParaOrLine();
    return originalLength;
}
/* paragraphs API methods ------------------------------------------------- */
/**
* Get the paragraph level of the text.
*
* @return The paragraph level. If there are multiple paragraphs, their
* level may vary if the required paraLevel is LEVEL_DEFAULT_LTR or
* LEVEL_DEFAULT_RTL. In that case, the level of the first paragraph
* is returned.
*
* @throws IllegalStateException if this call is not preceded by a successful
* call to <code>setPara</code> or <code>setLine</code>
*
* @see #LEVEL_DEFAULT_LTR
* @see #LEVEL_DEFAULT_RTL
* @see #getParagraph
* @see #getParagraphByIndex
* @stable ICU 3.8
*/
public byte getParaLevel()
{
    verifyValidParaOrLine();
    return paraLevel;
}
/**
* Get the index of a paragraph, given a position within the text.<p>
*
* @param charIndex is the index of a character within the text, in the
* range <code>[0..getProcessedLength()-1]</code>.
*
* @return The index of the paragraph containing the specified position,
* starting from 0.
*
* @throws IllegalStateException if this call is not preceded by a successful
* call to <code>setPara</code> or <code>setLine</code>
* @throws IllegalArgumentException if charIndex is not within the legal range
*
* @see com.ibm.icu.text.BidiRun
* @see #getProcessedLength
* @stable ICU 3.8
*/
public int getParagraphIndex(int charIndex)
{
    verifyValidParaOrLine();
    BidiBase bidi = paraBidi; /* get Para object if Line object */
    verifyRange(charIndex, 0, bidi.length);
    int paraIndex;
    for (paraIndex = 0; charIndex >= bidi.paras[paraIndex]; paraIndex++) {
    }
    return paraIndex;
}
/**
* <code>setLine()</code> returns a <code>Bidi</code> object to
* contain the reordering information, especially the resolved levels,
* for all the characters in a line of text. This line of text is
* specified by referring to a <code>Bidi</code> object representing
* this information for a piece of text containing one or more paragraphs,
* and by specifying a range of indexes in this text.<p>
* In the new line object, the indexes will range from 0 to <code>limit-start-1</code>.
*
* This is used after calling <code>setPara()</code>
* for a piece of text, and after line-breaking on that text.
* It is not necessary if each paragraph is treated as a single line.<p>
*
* After line-breaking, rules (L1) and (L2) for the treatment of
* trailing WS and for reordering are performed on
* a <code>Bidi</code> object that represents a line.
*
* <strong>Important: the line
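As a rough illustration of the attributes that `setPara(AttributedCharacterIterator)` reads in the listing above, the sketch below goes through the public `java.text.Bidi` class rather than calling `BidiBase` directly. It assumes a JDK that includes `java.awt.font.TextAttribute`, and it assumes that `java.text.Bidi` hands the iterator on to this `setPara` method. The class name `BidiAttributeSketch` and its two helper methods are hypothetical names introduced only for this example.

```java
import java.awt.font.TextAttribute;
import java.text.AttributedString;
import java.text.Bidi;

public class BidiAttributeSketch {

    /** Runs the bidi analysis with an optional paragraph-wide RUN_DIRECTION. */
    static Bidi analyze(String text, Boolean runDirection) {
        AttributedString as = new AttributedString(text);
        if (runDirection != null) {
            // Applied to the whole paragraph, as the Javadoc above requires.
            as.addAttribute(TextAttribute.RUN_DIRECTION, runDirection);
        }
        return new Bidi(as.getIterator());
    }

    /** Applies a BIDI_EMBEDDING value to [start, end); a negative value means override. */
    static Bidi analyzeWithEmbedding(String text, int start, int end, int embedding) {
        AttributedString as = new AttributedString(text);
        as.addAttribute(TextAttribute.BIDI_EMBEDDING, Integer.valueOf(embedding), start, end);
        return new Bidi(as.getIterator());
    }

    public static void main(String[] args) {
        // No RUN_DIRECTION attribute: the base direction is derived from the text.
        Bidi plain = analyze("abc 123", null);
        System.out.println("default base level: " + plain.getBaseLevel());

        // Explicit right-to-left run direction on otherwise left-to-right text.
        Bidi rtl = analyze("abc 123", TextAttribute.RUN_DIRECTION_RTL);
        System.out.println("RTL base level: " + rtl.getBaseLevel()
                + ", mixed: " + rtl.isMixed());

        // Override at level 1 for the two middle characters only.
        Bidi overridden = analyzeWithEmbedding("abcdef", 2, 4, -1);
        for (int i = 0; i < overridden.getLength(); i++) {
            System.out.println("level at " + i + ": " + overridden.getLevelAt(i));
        }
    }
}
```

Going through `java.text.Bidi` keeps the example on public API; the internal `BidiBase` class is not meant to be called from application code.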
Copyright 1998-2024 Alvin Alexander, alvinalexander.com
All Rights Reserved.