The LSParser.java Java example source code
/*
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation. Oracle designates this
* particular file as subject to the "Classpath" exception as provided
* by Oracle in the LICENSE file that accompanied this code.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/
/*
* This file is available under and governed by the GNU General Public
* License version 2 only, as published by the Free Software Foundation.
* However, the following notice accompanied the original version of this
* file and, per its terms, should not be removed:
*
* Copyright (c) 2004 World Wide Web Consortium,
*
* (Massachusetts Institute of Technology, European Research Consortium for
* Informatics and Mathematics, Keio University). All Rights Reserved. This
* work is distributed under the W3C(r) Software License [1] in the hope that
* it will be useful, but WITHOUT ANY WARRANTY; without even the implied
* warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
*/
package org.w3c.dom.ls;
import org.w3c.dom.Document;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Node;
import org.w3c.dom.DOMException;
/**
* An interface to an object that is able to build, or augment, a DOM tree
* from various input sources.
* <p> <code>LSParser</code> provides an API for parsing XML and building the
* corresponding DOM document structure. An <code>LSParser</code> instance
* can be obtained by invoking the
* <code>DOMImplementationLS.createLSParser()</code> method.
* <p> As specified in [DOM Level 3 Core], when a document is first made
* available via the <code>LSParser</code>:
* <ul>
* <li> there will
* never be two adjacent nodes of type NODE_TEXT, and there will never be
* empty text nodes.
* </li>
* <li> it is expected that the <code>value</code> and
* <code>nodeValue</code> attributes of an <code>Attr</code> node initially
* return the <a href='http://www.w3.org/TR/2004/REC-xml-20040204#AVNormalize'>XML 1.0
* normalized value</a>. However, if the parameters "validate-if-schema" and
* "datatype-normalization" are set to <code>true</code>, depending on the attribute normalization
* used, the attribute values may differ from the ones obtained by the XML
* 1.0 attribute normalization. If the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-datatype-normalization'>
* datatype-normalization</a>" is set to <code>false</code>, the XML 1.0 attribute normalization is
* guaranteed to occur, and if the attributes list does not contain
* namespace declarations, the <code>attributes</code> attribute on
* <code>Element</code> node represents the property [attributes] defined in [XML Information Set].
* </li>
* </ul>
* <p> Asynchronous <code>LSParser</code> objects are expected to also
* implement the <code>events::EventTarget</code> interface so that event
* listeners can be registered on asynchronous <code>LSParser</code>
* objects.
* <p> Events supported by asynchronous <code>LSParser</code> objects are:
* <dl>
* <dt>load
* <dd>
* The <code>LSParser</code> finishes loading the document. See also the
* definition of the <code>LSLoadEvent</code> interface. </dd>
* <dt>progress
* <dd> The
* <code>LSParser</code> signals progress as data is parsed. This
* specification does not attempt to define exactly when progress events
* should be dispatched; that is intentionally left
* implementation-dependent. Here is one example of how an application might
* dispatch progress events: once the parser starts receiving data, a
* progress event is dispatched to indicate that parsing has started. From
* then on, a progress event is dispatched for every 4096 bytes of data
* that is received and processed. This is only one example, though, and
* implementations can choose to dispatch progress events at any time while
* parsing, or not dispatch them at all. See also the definition of the
* <code>LSProgressEvent</code> interface. </dd>
* </dl>
* <p >Note: All events defined in this specification use the
* namespace URI <code>"http://www.w3.org/2002/DOMLS"</code>.
* <p> While parsing an input source, errors are reported to the application
* through the error handler (<code>LSParser.domConfig</code>'s
* "error-handler" parameter). This specification does not attempt to define all possible
* errors that can occur while parsing XML, or any other markup, but some
* common error cases are defined. The types (<code>DOMError.type</code>) of
* errors and warnings defined by this specification are:
* <dl>
* <dt>
* <code>"check-character-normalization-failure" [error]</code>
* <dd> Raised if
* the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-check-character-normalization'>
* check-character-normalization</a>" is set to <code>true</code> and a string is encountered that fails normalization
* checking. </dd>
* <dt><code>"doctype-not-allowed" [fatal]</code>
* <dd> Raised if the
* configuration parameter "disallow-doctype" is set to <code>true</code>
* and a doctype is encountered. </dd>
* <dt><code>"no-input-specified" [fatal]</code>
* <dd>
* Raised when loading a document and no input is specified in the
* <code>LSInput</code> object. </dd>
* <dt>
* <code>"pi-base-uri-not-preserved" [warning]</code>
* <dd> Raised if a processing
* instruction is encountered in a location where the base URI of the
* processing instruction cannot be preserved. One example of a case where
* this warning will be raised is if the configuration parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-entities'>
* entities</a>" is set to <code>false</code> and the following XML file is parsed:
* <pre>
* &lt;!DOCTYPE root [ &lt;!ENTITY e SYSTEM 'subdir/myentity.ent' ]&gt;
* &lt;root&gt; &amp;e; &lt;/root&gt;</pre>
* And <code>subdir/myentity.ent</code>
* contains:
* <pre>&lt;one&gt; &lt;two/&gt; &lt;/one&gt; &lt;?pi 3.14159?&gt;
* &lt;more/&gt;</pre>
* </dd>
* <dt><code>"unbound-prefix-in-entity" [warning]</code>
* <dd> An
* implementation-dependent warning that may be raised if the configuration
* parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-namespaces'>
* namespaces</a>" is set to <code>true</code> and an unbound namespace prefix is
* encountered in an entity's replacement text. Raising this warning is not
* enforced since some existing parsers may not recognize unbound namespace
* prefixes in the replacement text of entities. </dd>
* <dt>
* <code>"unknown-character-denormalization" [fatal]</code>
* <dd> Raised if the
* configuration parameter "ignore-unknown-character-denormalizations" is
* set to <code>false</code> and a character is encountered for which the
* processor cannot determine the normalization properties. </dd>
* <dt>
* <code>"unsupported-encoding" [fatal]</code>
* <dd> Raised if an unsupported
* encoding is encountered. </dd>
* <dt><code>"unsupported-media-type" [fatal]</code>
* <dd>
* Raised if the configuration parameter "supported-media-types-only" is set
* to <code>true</code> and an unsupported media type is encountered. </dd>
* </dl>
* <p> In addition to raising the defined errors and warnings, implementations
* are expected to raise implementation-specific errors and warnings for any
* other error and warning cases, such as I/O errors (file not found,
* permission denied, ...), XML well-formedness errors, and so on.
* <p>See also the Document Object Model (DOM) Level 3 Load
* and Save Specification.
*/
public interface LSParser {
/**
* The <code>DOMConfiguration</code> object used when parsing an input
* source. This <code>DOMConfiguration</code> is specific to the parse
* operation. No parameter values from this <code>DOMConfiguration</code>
* object are passed automatically to the <code>DOMConfiguration</code>
* object on the <code>Document</code> that is created, or used, by the
* parse operation. The DOM application is responsible for passing any
* needed parameter values from this <code>DOMConfiguration</code>
* object to the <code>DOMConfiguration</code> object referenced by the
* <code>Document</code> object.
* <br> In addition to the parameters recognized on the
* <code>DOMConfiguration</code> interface defined in [DOM Level 3 Core],
* the <code>DOMConfiguration</code> objects for <code>LSParser</code>
* add or modify the following parameters:
* <dl>
* <dt>
* <code>"charset-overrides-xml-encoding"</code>
* <dd>
* <dl>
* <dt><code>true</code>
* <dd>[optional] (default) If a higher level protocol such as HTTP [IETF RFC 2616] provides an
* indication of the character encoding of the input stream being
* processed, that will override any encoding specified in the XML
* declaration or the Text declaration (see also section 4.3.3,
* "Character Encoding in Entities", in [<a href='http://www.w3.org/TR/2004/REC-xml-20040204'>XML 1.0</a>]).
* Explicitly setting an encoding in the <code>LSInput</code> overrides
* any encoding from the protocol. </dd>
* <dt><code>false</code>
* <dd>[required] The parser ignores any character set encoding information from
* higher-level protocols. </dd>
* </dl>
* <dt><code>"disallow-doctype"</code>
* <dd>
* <dl>
* <dt>
* <code>true</code>
* <dd>[optional] Throw a fatal "doctype-not-allowed" error if a doctype node is found while parsing the document. This is
* useful when dealing with things like SOAP envelopes where doctype
* nodes are not allowed. </dd>
* <dt><code>false</code>
* <dd>[required] (default) Allow doctype nodes in the document. </dd>
* </dl>
* <dt>
* <code>"ignore-unknown-character-denormalizations"</code>
* <dd>
* <dl>
* <dt>
* <code>true</code>
* <dd>[required] (default) If, while verifying full normalization when [XML 1.1] is
* supported, a processor encounters characters for which it cannot
* determine the normalization properties, then the processor will
* ignore any possible denormalizations caused by these characters.
* This parameter is ignored for [<a href='http://www.w3.org/TR/2004/REC-xml-20040204'>XML 1.0</a>]. </dd>
* <dt>
* <code>false</code>
* <dd>[optional] Report a fatal "unknown-character-denormalization" error if a character is encountered for which the processor cannot
* determine the normalization properties. </dd>
* </dl>
* <dt><code>"infoset"</code>
* <dd> See
* the definition of <code>DOMConfiguration</code> for a description of
* this parameter. Unlike in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>],
* this parameter will default to <code>true</code> for
* <code>LSParser</code>. </dd>
* <dt><code>"namespaces"</code>
* <dd>
* <dl>
* <dt><code>true</code>
* <dd>[required] (default) Perform the namespace processing as defined in [XML Namespaces]
* and [<a href='http://www.w3.org/TR/2004/REC-xml-names11-20040204/'>XML Namespaces 1.1</a>]. </dd>
* <dt><code>false</code>
* <dd>[optional] Do not perform the namespace processing. </dd>
* </dl>
* <dt>
* <code>"resource-resolver"</code>
* <dd>[required] A reference to a <code>LSResourceResolver</code> object, or <code>null</code>. If
* the value of this parameter is not <code>null</code> when an external resource
* (such as an external XML entity or an XML schema location) is
* encountered, the implementation will request that the
* <code>LSResourceResolver</code> referenced in this parameter resolves
* the resource. </dd>
* <dt><code>"supported-media-types-only"</code>
* <dd>
* <dl>
* <dt>
* <code>true</code>
* <dd>[optional] Check that the media type of the parsed resource is a supported media
* type. If an unsupported media type is encountered, a fatal error of
* type <b>"unsupported-media-type"</b> will be raised. The media types defined in [IETF RFC 3023] must always
* be accepted. </dd>
* <dt><code>false</code>
* <dd>[required] (default) Accept any media type. </dd>
* </dl>
* <dt><code>"validate"</code>
* <dd> See the definition of
* <code>DOMConfiguration</code> for a description of this parameter.
* Unlike in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>],
* the processing of the internal subset is always accomplished, even
* if this parameter is set to <code>false</code>. </dd>
* <dt>
* <code>"validate-if-schema"</code>
* <dd> See the definition of
* <code>DOMConfiguration</code> for a description of this parameter.
* Unlike in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>],
* the processing of the internal subset is always accomplished, even
* if this parameter is set to <code>false</code>. </dd>
* <dt>
* <code>"well-formed"</code>
* <dd> See the definition of
* <code>DOMConfiguration</code> for a description of this parameter.
* Unlike in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>],
* this parameter cannot be set to <code>false</code>. </dd>
* </dl>
*/
public DOMConfiguration getDomConfig();
/**
* When a filter is provided, the implementation will call out to the
* filter as it is constructing the DOM tree structure. The filter can
* choose to remove elements from the document being constructed, or to
* terminate the parsing early.
* <br> The filter is invoked after the operations requested by the
* <code>DOMConfiguration</code> parameters have been applied. For
* example, if "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-validate'>
* validate</a>" is set to <code>true</code>, the validation is done before invoking the
* filter.
*/
public LSParserFilter getFilter();
/**
* When a filter is provided, the implementation will call out to the
* filter as it is constructing the DOM tree structure. The filter can
* choose to remove elements from the document being constructed, or to
* terminate the parsing early.
* <br> The filter is invoked after the operations requested by the
* <code>DOMConfiguration</code> parameters have been applied. For
* example, if "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-validate'>
* validate</a>" is set to <code>true</code>, the validation is done before invoking the
* filter.
*/
public void setFilter(LSParserFilter filter);
/**
* <code>true</code> if the <code>LSParser</code> is asynchronous,
* <code>false</code> if it is synchronous.
*/
public boolean getAsync();
/**
* <code>true</code> if the <code>LSParser</code> is currently busy
* loading a document, otherwise <code>false</code>.
*/
public boolean getBusy();
/**
* Parse an XML document from a resource identified by a
* <code>LSInput</code>.
* @param input The <code>LSInput</code> from which the source of the
* document is to be read.
* @return If the <code>LSParser</code> is a synchronous
* <code>LSParser</code>, the newly created and populated
* <code>Document</code> is returned. If the <code>LSParser</code> is
* asynchronous, <code>null</code> is returned since the document
* object may not yet be constructed when this method returns.
* @exception DOMException
* INVALID_STATE_ERR: Raised if the <code>LSParser</code>'s
* <code>LSParser.busy</code> attribute is <code>true</code>.
* @exception LSException
* PARSE_ERR: Raised if the <code>LSParser</code> was unable to load
* the XML document. DOM applications should attach a
* <code>DOMErrorHandler</code> using the parameter
* "error-handler" if they wish to get details on the error.
*/
public Document parse(LSInput input)
throws DOMException, LSException;
/**
* Parse an XML document from a location identified by a URI reference [<a href='http://www.ietf.org/rfc/rfc2396.txt'>IETF RFC 2396</a>]. If the URI
* contains a fragment identifier (see section 4.1 in [<a href='http://www.ietf.org/rfc/rfc2396.txt'>IETF RFC 2396</a>]), the
* behavior is not defined by this specification; future versions of
* this specification may define the behavior.
* @param uri The location of the XML document to be read.
* @return If the <code>LSParser</code> is a synchronous
* <code>LSParser</code>, the newly created and populated
* <code>Document</code> is returned, or <code>null</code> if an error
* occurred. If the <code>LSParser</code> is asynchronous,
* <code>null</code> is returned since the document object may not yet
* be constructed when this method returns.
* @exception DOMException
* INVALID_STATE_ERR: Raised if the <code>LSParser.busy</code>
* attribute is <code>true</code>.
* @exception LSException
* PARSE_ERR: Raised if the <code>LSParser</code> was unable to load
* the XML document. DOM applications should attach a
* <code>DOMErrorHandler</code> using the parameter
* "error-handler" if they wish to get details on the error.
*/
public Document parseURI(String uri)
throws DOMException, LSException;
// ACTION_TYPES
/**
* Append the result of the parse operation as children of the context
* node. For this action to work, the context node must be an
* <code>Element</code> or a <code>DocumentFragment</code>.
*/
public static final short ACTION_APPEND_AS_CHILDREN = 1;
/**
* Replace all the children of the context node with the result of the
* parse operation. For this action to work, the context node must be an
* <code>Element</code>, a <code>Document</code>, or a
* <code>DocumentFragment</code>.
*/
public static final short ACTION_REPLACE_CHILDREN = 2;
/**
* Insert the result of the parse operation as the immediately preceding
* sibling of the context node. For this action to work, the context
* node's parent must be an <code>Element</code> or a
* <code>DocumentFragment</code>.
*/
public static final short ACTION_INSERT_BEFORE = 3;
/**
* Insert the result of the parse operation as the immediately following
* sibling of the context node. For this action to work, the context
* node's parent must be an <code>Element</code> or a
* <code>DocumentFragment</code>.
*/
public static final short ACTION_INSERT_AFTER = 4;
/**
* Replace the context node with the result of the parse operation. For
* this action to work, the context node must have a parent, and the
* parent must be an <code>Element</code> or a
* <code>DocumentFragment</code>.
*/
public static final short ACTION_REPLACE = 5;
/**
* Parse an XML fragment from a resource identified by a
* <code>LSInput</code> and insert the content into an existing document
* at the position specified with the <code>context</code> and
* <code>action</code> arguments. When parsing the input stream, the
* context node (or its parent, depending on where the result will be
* inserted) is used for resolving unbound namespace prefixes. The
* context node's <code>ownerDocument</code> node (or the node itself if
* the node is of type <code>DOCUMENT_NODE</code>) is used to resolve
* default attributes and entity references.
* <br> As the new data is inserted into the document, at least one
* mutation event is fired per new immediate child or sibling of the
* context node.
* <br> If the context node is a <code>Document</code> node and the action
* is <code>ACTION_REPLACE_CHILDREN</code>, then the document that is
* passed as the context node will be changed such that its
* <code>xmlEncoding</code>, <code>documentURI</code>,
* <code>xmlVersion</code>, <code>inputEncoding</code>,
* <code>xmlStandalone</code>, and all other such attributes are set to
* what they would be set to if the input source was parsed using
* <code>LSParser.parse()</code>.
* <br> This method is always synchronous, even if the
* <code>LSParser</code> is asynchronous (<code>LSParser.async</code> is
* <code>true</code>).
* <br> If an error occurs while parsing, the caller is notified through
* the <code>ErrorHandler</code> instance associated with the
* "error-handler" parameter of the <code>DOMConfiguration</code>.
* <br> When calling <code>parseWithContext</code>, the values of the
* following configuration parameters will be ignored and their default
* values will always be used instead: "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-validate'>
* validate</a>", "validate-if-schema", and
* "element-content-whitespace". Other parameters will be treated normally, and the parser is expected
* to call the <code>LSParserFilter</code> just as if a whole document
* was parsed.
* @param input The <code>LSInput</code> from which the source document
* is to be read. The source document must be an XML fragment, i.e.
* anything except a complete XML document (except in the case where
* the context node is of type <code>DOCUMENT_NODE</code>, and the action
* is <code>ACTION_REPLACE_CHILDREN</code>), a DOCTYPE (internal
* subset), entity declaration(s), notation declaration(s), or XML or
* text declaration(s).
* @param contextArg The node that is used as the context for the data
* that is being parsed. This node must be a <code>Document</code>
* node, a <code>DocumentFragment</code> node, or a node of a type
* that is allowed as a child of an <code>Element</code> node, e.g. it
* cannot be an <code>Attribute</code> node.
* @param action This parameter describes which action should be taken
* between the new set of nodes being inserted and the existing
* children of the context node. The set of possible actions is
* defined in <code>ACTION_TYPES</code> above.
* @return Return the node that is the result of the parse operation. If
* the result is more than one top-level node, the first one is
* returned.
* @exception DOMException
* HIERARCHY_REQUEST_ERR: Raised if the content cannot replace, be
* inserted before, after, or as a child of the context node (see also
* <code>Node.insertBefore</code> or <code>Node.replaceChild</code> in [DOM Level 3 Core]).
* <br> NOT_SUPPORTED_ERR: Raised if the <code>LSParser</code> doesn't
* support this method, or if the context node is of type
* <code>Document</code> and the DOM implementation doesn't support
* the replacement of the <code>DocumentType</code> child or
* <code>Element</code> child.
* <br> NO_MODIFICATION_ALLOWED_ERR: Raised if the context node is a
* read-only node and the content is being appended to its child list,
* or if the parent node of the context node is a read-only node and the
* content is being inserted in its child list.
* <br> INVALID_STATE_ERR: Raised if the <code>LSParser.busy</code>
* attribute is <code>true</code>.
* @exception LSException
* PARSE_ERR: Raised if the <code>LSParser</code> was unable to load
* the XML fragment. DOM applications should attach a
* <code>DOMErrorHandler</code> using the parameter
* "error-handler" if they wish to get details on the error.
*/
public Node parseWithContext(LSInput input,
Node contextArg,
short action)
throws DOMException, LSException;
/**
* Abort the loading of the document that is currently being loaded by
* the <code>LSParser</code>. If the <code>LSParser</code> is currently
* not busy, a call to this method does nothing.
*/
public void abort();
}
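The interface above only declares the parsing contract, so a short usage sketch may help show how the pieces fit together. The example below is an illustration, not part of the original file: it assumes the runtime ships a DOM implementation that supports the "LS" feature (as the JDK's built-in one does), and the class name `LSParserSketch` and helper `parseString` are hypothetical names introduced here. It obtains a synchronous `LSParser` through `DOMImplementationLS.createLSParser()`, sets the "disallow-doctype" parameter on the parser's `DOMConfiguration` only if the implementation reports it as settable, and parses an in-memory string supplied through an `LSInput`.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

public class LSParserSketch {

    // Hypothetical helper (not part of the DOM API): parses an XML string
    // with a synchronous LSParser and returns the resulting Document.
    static Document parseString(String xml) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
                (DOMImplementationLS) registry.getDOMImplementation("LS");
        if (ls == null) {
            // Assumption: an LS-capable implementation is available at runtime.
            throw new IllegalStateException("No DOM implementation supports LS");
        }

        // A synchronous parser returns the Document directly from parse().
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        // Parameters are set on the parser's own DOMConfiguration; they are
        // not copied to the configuration of the resulting Document.
        DOMConfiguration config = parser.getDomConfig();
        if (config.canSetParameter("disallow-doctype", Boolean.TRUE)) {
            config.setParameter("disallow-doctype", Boolean.TRUE);
        }

        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseString("<root><item>one</item></root>");
        System.out.println(doc.getDocumentElement().getTagName()); // prints "root"
    }
}
```

Guarding `setParameter` with `canSetParameter` keeps the sketch portable, since "disallow-doctype" is marked optional for the `true` value. `parseWithContext` is left out on purpose: the interface allows implementations to raise NOT_SUPPORTED_ERR for it, so whether it works depends on the parser in use.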