
Hibernate example source code file (Envers.xml)

This example Hibernate source code file (Envers.xml) is included in the DevDaily.com "Java Source Code Warehouse" project. The intent of this project is to help you "Learn Java by Example" TM.


The Hibernate Envers.xml source code

<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Hibernate_Development_Guide.ent">
%BOOK_ENTITIES;
]>

<chapter>
    <title>Envers</title>

    <preface>
        <title>Preface</title>
        <para>
            The aim of Hibernate Envers is to provide historical versioning of your application's entity data.  Much
            like source control management tools such as Subversion or Git, Hibernate Envers manages a notion of revisions
            of your application data through the use of audit tables.  Each transaction relates to one global revision number
            which can be used to identify groups of changes (much like a change set in source control).  Because revisions
            are global, given a revision number you can query for various entities at that revision, retrieving a
            (partial) view of the database at that revision. You can find the revision number that corresponds to a date
            and, conversely, the date at which a revision was committed.
        </para>
    </preface>

    <section>
        <title>Basics</title>

        <para>
            To audit changes that are performed on an entity, you only need two things: the
            <literal>hibernate-envers</literal> jar on the classpath and an <literal>@Audited</literal> annotation
            on the entity.
        </para>
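
        <para>
            As a minimal sketch, assuming a hypothetical <literal>Person</literal> entity (the class and its
            properties are illustrative and not part of Envers), a fully audited entity could look like this:
        </para>

        <programlisting>
<![CDATA[import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.envers.Audited;

// Hypothetical entity used only for illustration.
@Entity
@Audited // every persistent property of Person is versioned
public class Person {
    @Id
    @GeneratedValue
    private Long id;

    private String name;

    private String surname;

    // getters and setters omitted
}]]></programlisting>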

        <para>
            And that's all - you can create, modify and delete the entities as always. If you look at the generated
            schema for your entities, or at the data persisted by Hibernate, you will notice that there are no changes.
            However, for each audited entity, a new table is introduced - <literal>entity_table_AUD</literal>,
            which stores the historical data whenever you commit a transaction.
        </para>

        <para>
            Instead of annotating the whole class and auditing all properties, you can annotate
            only some persistent properties with <literal>@Audited</literal>. This will cause only
            these properties to be audited.
        </para>

        <para>
            The audit (history) of an entity can be accessed using the <literal>AuditReader</literal> interface, which
            can be obtained from an open <literal>EntityManager</literal> or <literal>Session</literal> via
            the <literal>AuditReaderFactory</literal>. See the javadocs for these classes for details on the
            functionality offered.
        </para>
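
        <para>
            The following sketch shows one way to walk the history of a single entity. It assumes the hypothetical
            audited <literal>Person</literal> entity and an open <literal>EntityManager</literal>; the class and
            method names <literal>PersonHistory</literal> and <literal>printHistory</literal> are illustrative only.
        </para>

        <programlisting>
<![CDATA[import java.util.Date;
import java.util.List;

import javax.persistence.EntityManager;

import org.hibernate.envers.AuditReader;
import org.hibernate.envers.AuditReaderFactory;

public class PersonHistory {
    public static void printHistory(EntityManager entityManager, Long personId) {
        // Obtain a reader bound to the open EntityManager.
        AuditReader reader = AuditReaderFactory.get(entityManager);

        // All revision numbers at which this Person was changed, oldest first.
        List<Number> revisions = reader.getRevisions(Person.class, personId);

        for (Number revision : revisions) {
            // State of the entity as it was at the given revision.
            Person snapshot = reader.find(Person.class, personId, revision);
            // Date at which the revision was committed.
            Date committedAt = reader.getRevisionDate(revision);
            System.out.println(revision + " @ " + committedAt + ": " + snapshot);
        }
    }
}]]></programlisting>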
    </section>

    <section>
        <title>Configuration</title>
        <para>
            It is possible to configure various aspects of Hibernate Envers behavior, such as table names, etc.
        </para>

        <table frame="topbot">
            <title>Envers Configuration Properties</title>
            <tgroup cols="3">
                <colspec colname="c1" colwidth="1*"/>
                <colspec colname="c2" colwidth="1*"/>
                <colspec colname="c3" colwidth="1*"/>

                <thead>
                    <row>
                        <entry>Property name</entry>
                        <entry>Default value</entry>
                        <entry>Description</entry>
                    </row>
                </thead>

                <tbody>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_table_prefix</property>
                        </entry>
                        <entry>
                        </entry>
                        <entry>
                            String that will be prepended to the name of an audited entity to create the name of the
                            entity, that will hold audit information.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_table_suffix</property>
                        </entry>
                        <entry>
                            _AUD
                        </entry>
                        <entry>
                            String that will be appended to the name of an audited entity to create the name of the
                            entity that will hold audit information. If you audit an entity with a table name
                            <literal>Person</literal>, in the default setting Envers will generate a
                            <literal>Person_AUD</literal> table to store historical data.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.revision_field_name</property>
                        </entry>
                        <entry>
                            REV
                        </entry>
                        <entry>
                            Name of a field in the audit entity that will hold the revision number.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.revision_type_field_name</property>
                        </entry>
                        <entry>
                            REVTYPE
                        </entry>
                        <entry>
                            Name of a field in the audit entity that will hold the type of the revision (currently,
                            this can be: add, mod, del).
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.revision_on_collection_change</property>
                        </entry>
                        <entry>
                            true
                        </entry>
                        <entry>
                            Should a revision be generated when a not-owned relation field changes (this can be either
                            a collection in a one-to-many relation, or the field using "mappedBy" attribute in a
                            one-to-one relation).
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.do_not_audit_optimistic_locking_field</property>
                        </entry>
                        <entry>
                            true
                        </entry>
                        <entry>
                            When true, properties used for optimistic locking, annotated with
                            <literal>@Version</literal>, are automatically excluded from auditing (their history won't be
                            stored; it normally doesn't make sense to store it).
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.store_data_at_delete</property>
                        </entry>
                        <entry>
                            false
                        </entry>
                        <entry>
                            Should the entity data be stored in the revision when the entity is deleted (instead of only
                            storing the id and all other properties as null). This is not normally needed, as the data is
                            present in the last-but-one revision. Sometimes, however, it is easier and more efficient to
                            access it in the last revision (then the data that the entity contained before deletion is
                            stored twice).
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.default_schema</property>
                        </entry>
                        <entry>
                            null (same schema as table being audited)
                        </entry>
                        <entry>
                            The default schema name that should be used for audit tables. Can be overridden using the
                            <literal>@AuditTable(schema="...")</literal> annotation. If not present, the schema will
                            be the same as the schema of the table being audited.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.default_catalog</property>
                        </entry>
                        <entry>
                            null (same catalog as table being audited)
                        </entry>
                        <entry>
                            The default catalog name that should be used for audit tables. Can be overridden using the
                            <literal>@AuditTable(catalog="...")</literal> annotation. If not present, the catalog will
                            be the same as the catalog of the normal tables.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_strategy</property>
                        </entry>
                        <entry>
                            org.hibernate.envers.strategy.DefaultAuditStrategy
                        </entry>
                        <entry>
                            The audit strategy that should be used when persisting audit data. The default stores only
                            the revision at which an entity was modified. An alternative, the
                            <literal>org.hibernate.envers.strategy.ValidityAuditStrategy</literal>, stores both the
                            start revision and the end revision. Together these define when an audit row was valid,
                            hence the name ValidityAuditStrategy.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_strategy_validity_end_rev_field_name</property>
                        </entry>
                        <entry>
                            REVEND
                        </entry>
                        <entry>
                            The column name that will hold the end revision number in audit entities. This property is
                            only valid if the validity audit strategy is used.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_strategy_validity_store_revend_timestamp</property>
                        </entry>
                        <entry>
                            false
                        </entry>
                        <entry>
                            Should the timestamp of the end revision be stored, until which the data was valid, in
                            addition to the end revision itself.  This is useful to be able to purge old Audit records
                            out of a relational database by using table partitioning.  Partitioning requires a column
                            that exists within the table.  This property is only evaluated if the ValidityAuditStrategy
                            is used.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.audit_strategy_validity_revend_timestamp_field_name</property>
                        </entry>
                        <entry>
                            REVEND_TSTMP
                        </entry>
                        <entry>
                            Column name of the timestamp of the end revision until which the data was valid.  Only used
                            if the ValidityAuditStrategy is used, and
                            <property>org.hibernate.envers.audit_strategy_validity_store_revend_timestamp</property>
                            evaluates to true.
                        </entry>
                    </row>
                    <row>
                        <entry>
                            <property>org.hibernate.envers.track_entities_changed_in_revision</property>
                        </entry>
                        <entry>
                            false
                        </entry>
                        <entry>
                            Should entity types that have been modified during each revision be tracked. The default
                            implementation creates a <literal>REVCHANGES</literal> table that stores entity names
                            of modified persistent objects. A single record encapsulates the revision identifier
                            (foreign key to the <literal>REVINFO</literal> table) and a string value. For more
                            information refer to <xref linkend="envers-tracking-modified-entities-revchanges"/>
                            and <xref linkend="envers-tracking-modified-entities-queries"/>.
                        </entry>
                    </row>
                </tbody>
            </tgroup>
        </table>

        <important>
            <para>
                The following configuration options have been added recently and should be regarded as experimental:
                <orderedlist>
                    <listitem>
                        org.hibernate.envers.track_entities_changed_in_revision
                    </listitem>
                </orderedlist>
            </para>
        </important>
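
        <para>
            As an illustration, the properties above can be supplied like any other persistence unit setting. The
            sketch below passes two of them programmatically; the persistence unit name
            <literal>example-unit</literal>, the class name and the chosen values are assumptions made for the example.
        </para>

        <programlisting>
<![CDATA[import java.util.HashMap;
import java.util.Map;

import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class EnversConfigExample {
    public static EntityManagerFactory create() {
        Map<String, Object> settings = new HashMap<String, Object>();
        // Audit tables become <entity table>_HISTORY instead of <entity table>_AUD.
        settings.put("org.hibernate.envers.audit_table_suffix", "_HISTORY");
        // Keep the full entity state in the DEL revision as well.
        settings.put("org.hibernate.envers.store_data_at_delete", "true");
        // "example-unit" is a hypothetical persistence unit name.
        return Persistence.createEntityManagerFactory("example-unit", settings);
    }
}]]></programlisting>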
    </section>

    <section>
        <title>Additional mapping annotations</title>

        <para>
            The name of the audit table can be set on a per-entity basis, using the
            <literal>@AuditTable</literal> annotation. It may be tedious to add this
            annotation to every audited entity, so if possible, it's better to use a prefix/suffix.
        </para>

        <para>
            If you have a mapping with secondary tables, audit tables for them will be generated in
            the same way (by adding the prefix and suffix). If you wish to override this behaviour,
            you can use the <literal>@SecondaryAuditTable</literal> and
            <literal>@SecondaryAuditTables</literal> annotations.
        </para>

        <para>
            If you'd like to override auditing behaviour of some fields/properties in an embedded component, you can use
            the <literal>@AuditOverride(s)</literal> annotation on the usage site of the component.
        </para>

        <para>
            If you want to audit a relation mapped with <literal>@OneToMany+@JoinColumn</literal>,
            please see <xref linkend="envers-mappingexceptions"/> for a description of the additional
            <literal>@AuditJoinTable</literal> annotation that you'll probably want to use.
        </para>

        <para>
            If you want to audit a relation where the target entity is not audited (that is the case, for example, with
            dictionary-like entities, which don't change and don't have to be audited), just annotate it with
            <literal>@Audited(targetAuditMode = RelationTargetAuditMode.NOT_AUDITED)</literal>. Then, when reading historic
            versions of your entity, the relation will always point to the "current" related entity.
        </para>
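
        <para>
            A minimal sketch of such a mapping follows. <literal>Customer</literal> and <literal>Country</literal>
            are hypothetical entities introduced for the example; <literal>Country</literal> stands for an unaudited,
            dictionary-like entity.
        </para>

        <programlisting>
<![CDATA[import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.ManyToOne;

import org.hibernate.envers.Audited;
import org.hibernate.envers.RelationTargetAuditMode;

// Hypothetical audited entity that references an unaudited one.
@Entity
@Audited
public class Customer {
    @Id
    private Long id;

    private String name;

    // Country has no @Audited annotation; historic versions of Customer
    // will point to the current Country row.
    @ManyToOne
    @Audited(targetAuditMode = RelationTargetAuditMode.NOT_AUDITED)
    private Country country;

    // getters and setters omitted
}]]></programlisting>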

        <para>
            If you'd like to audit properties of a superclass of an entity, which are not explicitly audited (which
            don't have the <literal>@Audited</literal> annotation on any properties or on the class),
            you can list the superclasses in the <literal>auditParents</literal> attribute of the
            <interfacename>@Audited</interfacename> annotation.
        </para>
    </section>

    <section>
        <title>Choosing an audit strategy</title>
        <para>
            After the basic configuration it is important to choose the audit strategy that will be used to persist
            and retrieve audit information. There is a trade-off between the performance of persisting and the
            performance of querying the audit information. Currently there are two audit strategies.
        </para>
        <orderedlist>
            <listitem>
                <para>
                    The default audit strategy persists the audit data together with a start revision. For each row
                    inserted, updated or deleted in an audited table, one or more rows are inserted in the audit
                    tables, together with the start revision of its validity. Rows in the audit tables are never
                    updated after insertion.  Queries of audit information use subqueries to select the applicable
                    rows in the audit tables.  These subqueries are notoriously slow and difficult to index.
                </para>
            </listitem>
            <listitem>
                <para>
                    The alternative is a validity audit strategy. This strategy stores the start-revision and the
                    end-revision of audit information. For each row inserted, updated or deleted in an audited table,
                    one or more rows are inserted in the audit tables, together with the start revision of its
                    validity. At the same time, the end-revision field of the previous audit row (if available)
                    is set to this revision.  Queries on the audit information can then use 'between start and end
                    revision' instead of subqueries as used by the default audit strategy.
                </para>
                <para>
                    The consequence of this strategy is that persisting audit information will be a bit slower,
                    because of the extra updates involved, but retrieving audit information will be a lot faster.
                    Query performance can be improved further by adding extra indexes.
                </para>
            </listitem>
        </orderedlist>
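
        <para>
            As a sketch, the validity audit strategy can be selected through the
            <property>org.hibernate.envers.audit_strategy</property> property listed in the configuration table.
            The persistence unit name <literal>example-unit</literal> and the class name are assumptions made
            for the example.
        </para>

        <programlisting>
<![CDATA[import java.util.HashMap;
import java.util.Map;

import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class ValidityStrategyExample {
    public static EntityManagerFactory create() {
        Map<String, Object> settings = new HashMap<String, Object>();
        // Store both the start and the end revision for every audit row.
        settings.put("org.hibernate.envers.audit_strategy",
                     "org.hibernate.envers.strategy.ValidityAuditStrategy");
        // Optionally store the end revision timestamp too (useful for partitioning).
        settings.put("org.hibernate.envers.audit_strategy_validity_store_revend_timestamp",
                     "true");
        // "example-unit" is a hypothetical persistence unit name.
        return Persistence.createEntityManagerFactory("example-unit", settings);
    }
}]]></programlisting>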
    </section>

    <section id="envers-revisionlog">
        <title>Revision Log</title>
        <subtitle>Logging data for revisions</subtitle>

        <para>
            When Envers starts a new revision, it creates a new <firstterm>revision entity</firstterm> which stores
            information about the revision.  By default, that includes just:
        </para>
        <orderedlist>
            <listitem>
                <firstterm>revision number</firstterm> - An integral value (<literal>int/Integer</literal> or
                <literal>long/Long</literal>).  Essentially the primary key of the revision.
            </listitem>
            <listitem>
                <firstterm>revision timestamp</firstterm> - either a <literal>long/Long</literal> or
                <classname>java.util.Date</classname> value representing the instant at which the revision was made.
                When using a <classname>java.util.Date</classname>, instead of a <literal>long/Long</literal> for
                the revision timestamp, take care not to store it to a column data type which will lose precision.
            </listitem>
        </orderedlist>

        <para>
            Envers handles this information as an entity.  By default it uses its own internal class to act as the
            entity, mapped to the <literal>REVINFO</literal> table.
            You can, however, supply your own approach to collecting this information, which might be useful to
            capture additional details such as who made a change or the IP address from which the request came.  There
            are 2 things you need to make this work.
        </para>
        <orderedlist>
            <listitem>
                <para>
                    First, you will need to tell Envers about the entity you wish to use.  Your entity must use the
                    <interfacename>@org.hibernate.envers.RevisionEntity</interfacename> annotation.  It must
                    define the 2 attributes described above annotated with
                    <interfacename>@org.hibernate.envers.RevisionNumber</interfacename> and
                    <interfacename>@org.hibernate.envers.RevisionTimestamp</interfacename>, respectively.  You can extend
                    from <classname>org.hibernate.envers.DefaultRevisionEntity</classname>, if you wish, to inherit all
                    these required behaviors.
                </para>
                <para>
                    Simply add the custom revision entity as you do your normal entities.  Envers will "find it".  Note
                    that it is an error for there to be multiple entities marked as
                    <interfacename>@org.hibernate.envers.RevisionEntity</interfacename>.
                </para>
            </listitem>
            <listitem>
                <para>
                    Second, you need to tell Envers how to create instances of your revision entity, which is handled
                    by the <methodname>newRevision</methodname> method of the
                    <interfacename>org.hibernate.envers.RevisionListener</interfacename> interface.
                </para>
                <para>
                    You tell Envers which custom <interfacename>org.hibernate.envers.RevisionListener</interfacename>
                    implementation to use by specifying it on the
                    <interfacename>@org.hibernate.envers.RevisionEntity</interfacename> annotation, using the
                    <methodname>value</methodname> attribute.
                </para>
            </listitem>
        </orderedlist>

        <para>
            An alternative method to using the <interfacename>org.hibernate.envers.RevisionListener</interfacename>
            is to instead call the <methodname>getCurrentRevision</methodname> method of the
            <interfacename>org.hibernate.envers.AuditReader</interfacename> interface to obtain the current revision,
            and fill it with desired information.  The method accepts a <literal>persist</literal> parameter indicating
            whether the revision entity should be persisted prior to returning from this method. <literal>true</literal>
            ensures that the returned entity has access to its identifier value (revision number), but the revision
            entity will be persisted regardless of whether there are any audited entities changed. <literal>false</literal>
            means that the revision number will be <literal>null</literal>, but the revision entity will be persisted
            only if some audited entities have changed.
        </para>
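
        <para>
            A minimal sketch of this alternative, assuming a custom revision entity named
            <literal>ExampleRevEntity</literal> with a <literal>username</literal> property; the helper class
            <literal>RevisionTagger</literal> is hypothetical.
        </para>

        <programlisting>
<![CDATA[import org.hibernate.envers.AuditReader;

public class RevisionTagger {
    // Call this inside the transaction whose revision should be tagged.
    public static void tagCurrentRevision(AuditReader auditReader, String username) {
        // persist = true: the revision entity is saved immediately, so its
        // revision number is available, even if no audited entity changes.
        ExampleRevEntity revision =
                auditReader.getCurrentRevision(ExampleRevEntity.class, true);
        revision.setUsername(username);
    }
}]]></programlisting>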


        <example>
            <title>Example of storing username with revision</title>

            <programlisting>
                <filename>ExampleRevEntity.java</filename>
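<![CDATA[
// A minimal sketch of a custom revision entity that stores a username.
// The username property and its accessors are assumptions for illustration.
import javax.persistence.Entity;

import org.hibernate.envers.DefaultRevisionEntity;
import org.hibernate.envers.RevisionEntity;

@Entity
@RevisionEntity(ExampleListener.class)
public class ExampleRevEntity extends DefaultRevisionEntity {
    private String username;

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }
}
]]></programlisting>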

            <programlisting>
                <filename>ExampleListener.java</filename>
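<![CDATA[
// A minimal sketch of the matching listener. CurrentUser is a hypothetical
// helper standing in for however your application exposes the logged-in user.
import org.hibernate.envers.RevisionListener;

public class ExampleListener implements RevisionListener {
    @Override
    public void newRevision(Object revisionEntity) {
        ExampleRevEntity exampleRevEntity = (ExampleRevEntity) revisionEntity;
        exampleRevEntity.setUsername(CurrentUser.getUsername());
    }
}
]]></programlisting>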

        </example>

        <section id="envers-tracking-modified-entities-revchanges">
            <title>Tracking entity names modified during revisions</title>
            <para>
                By default, entity types that have been changed in each revision are not tracked. This implies the
                necessity to query all tables storing audited data in order to retrieve changes made during
                a specified revision. Envers provides a simple mechanism that creates a <literal>REVCHANGES</literal>
                table which stores entity names of modified persistent objects. A single record encapsulates the revision
                identifier (foreign key to the <literal>REVINFO</literal> table) and a string value.
            </para>
            <para>
                Tracking of modified entity names can be enabled in three different ways:
            </para>
            <orderedlist>
                <listitem>
                    Set the <property>org.hibernate.envers.track_entities_changed_in_revision</property> parameter to
                    <literal>true</literal>. In this case
                    <classname>org.hibernate.envers.DefaultTrackingModifiedEntitiesRevisionEntity</classname> will
                    be implicitly used as the revision log entity.
                </listitem>
                <listitem>
                    Create a custom revision entity that extends the
                    <classname>org.hibernate.envers.DefaultTrackingModifiedEntitiesRevisionEntity</classname> class.
                    <programlisting>
<![CDATA[@Entity
@RevisionEntity
public class ExtendedRevisionEntity
             extends DefaultTrackingModifiedEntitiesRevisionEntity {
    ...
}]]></programlisting>
                </listitem>
                <listitem>
                    Mark an appropriate field of a custom revision entity with the
                    <interfacename>@org.hibernate.envers.ModifiedEntityNames</interfacename> annotation. The property is
                    required to be of <literal><![CDATA[Set<String>]]></literal> type.
                    <programlisting>
<![CDATA[@Entity
@RevisionEntity
public class AnnotatedTrackingRevisionEntity {
    ...

    @ElementCollection
    @JoinTable(name = "REVCHANGES", joinColumns = @JoinColumn(name = "REV"))
    @Column(name = "ENTITYNAME")
    @ModifiedEntityNames
    private Set<String> modifiedEntityNames;
    
    ...
}]]></programlisting>
                </listitem>
            </orderedlist>
            <para>
                Users who have chosen one of the approaches listed above can retrieve all entities modified in a
                specified revision by using the API described in <xref linkend="envers-tracking-modified-entities-queries"/>.
            </para>
            <para>
                Users are also allowed to implement a custom mechanism for tracking modified entity types. In this case, they
                pass their own implementation of the
                <interfacename>org.hibernate.envers.EntityTrackingRevisionListener</interfacename> interface as the value
                of the <interfacename>@org.hibernate.envers.RevisionEntity</interfacename> annotation. The
                <interfacename>EntityTrackingRevisionListener</interfacename> interface exposes one method that is notified
                whenever an audited entity instance has been added, modified or removed within the current revision boundaries.
            </para>
            
            <example>
                <title>Custom implementation of tracking entity classes modified during revisions</title>
                <programlisting>
                    <filename>CustomEntityTrackingRevisionListener.java</filename>
<![CDATA[
public class CustomEntityTrackingRevisionListener
             implements EntityTrackingRevisionListener {
    @Override
    public void entityChanged(Class entityClass, String entityName,
                              Serializable entityId, RevisionType revisionType,
                              Object revisionEntity) {
        String type = entityClass.getName();
        ((CustomTrackingRevisionEntity)revisionEntity).addModifiedEntityType(type);
    }

    @Override
    public void newRevision(Object revisionEntity) {
    }
}]]></programlisting>
                <programlisting>
                    <filename>CustomTrackingRevisionEntity.java</filename>
<![CDATA[
@Entity
@RevisionEntity(CustomEntityTrackingRevisionListener.class)
public class CustomTrackingRevisionEntity {
    @Id
    @GeneratedValue
    @RevisionNumber
    private int customId;

    @RevisionTimestamp
    private long customTimestamp;

    @OneToMany(mappedBy="revision", cascade={CascadeType.PERSIST, CascadeType.REMOVE})
    private Set<ModifiedEntityTypeEntity> modifiedEntityTypes =
                                              new HashSet<ModifiedEntityTypeEntity>();
    
    public void addModifiedEntityType(String entityClassName) {
        modifiedEntityTypes.add(new ModifiedEntityTypeEntity(this, entityClassName));
    }
    
    ...
}
]]></programlisting>
                <programlisting>
                    <filename>ModifiedEntityTypeEntity.java</filename>
<![CDATA[
@Entity
public class ModifiedEntityTypeEntity {
    @Id
    @GeneratedValue
    private Integer id;

    @ManyToOne
    private CustomTrackingRevisionEntity revision;
    
    private String entityClassName;
    
    ...
}
]]></programlisting>
            </example>
        </section>

    </section>

    <section id="envers-queries">

        <title>Queries</title>

        <para>
            You can think of historic data as having two dimensions. The first - horizontal -
            is the state of the database at a given revision. Thus, you can
            query for entities as they were at revision N. The second - vertical - are the
            revisions, at which entities changed. Hence, you can query for revisions,
            in which a given entity changed.
        </para>

        <para>
            The queries in Envers are similar to
            <ulink url="http://www.hibernate.org/hib_docs/v3/reference/en/html/querycriteria.html">Hibernate Criteria</ulink>,
            so if you are familiar with them, using Envers queries will be much easier.
        </para>

        <para>
            The main limitation of the current queries implementation is that you cannot
            traverse relations. You can only specify constraints on the ids of the
            related entities, and only on the "owning" side of the relation. This however
            will be changed in future releases.
        </para>

        <para>
            Please note that queries on the audited data will in many cases be much slower
            than corresponding queries on "live" data, as they involve correlated subselects.
        </para>

        <para>
            In the future, queries will be improved both in terms of speed and possibilities, when using the valid-time
            audit strategy, that is when storing both start and end revisions for entities. See
            <xref linkend="configuration"/>.
        </para>

        <section id="entities-at-revision">

            <title>Querying for entities of a class at a given revision</title>

            <para>
                The entry point for this type of query is:
            </para>

            <programlisting>
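<![CDATA[// A minimal sketch; auditReader (an open AuditReader), MyEntity and
// revisionNumber are placeholders assumed for illustration.
AuditQuery query = auditReader.createQuery()
    .forEntitiesAtRevision(MyEntity.class, revisionNumber);]]></programlisting>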

            <para>
                You can then specify constraints, which should be met by the entities returned, by
                adding restrictions, which can be obtained using the <literal>AuditEntity</literal>
                factory class. For example, to select only entities where the "name" property
                is equal to "John":
            </para>

            <programlisting>
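<![CDATA[// Sketch, assuming query is an AuditQuery created for an audited entity.
query.add(AuditEntity.property("name").eq("John"));]]></programlisting>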

            <para>
                And to select only entities that are related to a given entity:
            </para>

            <programlisting>
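<![CDATA[// Sketch; the "address" relation, relatedEntityInstance and relatedEntityId
// are placeholders assumed for illustration.
query.add(AuditEntity.property("address").eq(relatedEntityInstance));
// or, constraining on the id of the related entity:
query.add(AuditEntity.relatedId("address").eq(relatedEntityId));]]></programlisting>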

            <para>
                You can limit the number of results, order them, and set aggregations and projections
                (except grouping) in the usual way.
                When your query is complete, you can obtain the results by calling the
                <literal>getSingleResult()</literal> or <literal>getResultList()</literal> methods.
            </para>

            <para>
                A full query can look, for example, like this:
            </para>

            <programlisting>
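<![CDATA[// Sketch; auditReader, Person, its "surname" and "address" properties,
// addressId and revision 12 are placeholders assumed for illustration.
List personsAtAddress = auditReader.createQuery()
    .forEntitiesAtRevision(Person.class, 12)
    .addOrder(AuditEntity.property("surname").desc())
    .add(AuditEntity.relatedId("address").eq(addressId))
    .setFirstResult(4)
    .setMaxResults(2)
    .getResultList();]]></programlisting>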

        </section>

        <section id="revisions-of-entity">

            <title>Querying for revisions, at which entities of a given class changed</title>

            <para>
                The entry point for this type of query is:
            </para>

            <programlisting>
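<![CDATA[// A minimal sketch; auditReader and MyEntity are placeholders.
// The two boolean arguments are selectEntitiesOnly and selectDeletedEntities.
AuditQuery query = auditReader.createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true);]]></programlisting>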

            <para>
                You can add constraints to this query in the same way as to the previous one.
                There are some additional possibilities:
            </para>

            <orderedlist>
                <listitem>
                    <para>
                        using <literal>AuditEntity.revisionNumber()</literal> you can specify constraints, projections
                        and order on the revision number, in which the audited entity was modified
                    </para>
                </listitem>
                <listitem>
                    <para>
                        similarly, using <literal>AuditEntity.revisionProperty(propertyName) you can specify constraints,
                        projections and order on a property of the revision entity, corresponding to the revision
                        in which the audited entity was modified
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <literal>AuditEntity.revisionType() gives you access as above to the type of
                        the revision (ADD, MOD, DEL).
                    </para>
                </listitem>
            </orderedlist>

            <para>
                Using these methods,
                you can order the query results by revision number, set a projection, or constrain
                the revision number to be greater or less than a specified value. For example, the
                following query selects the smallest revision number, after revision 42, at which
                the entity of class <literal>MyEntity</literal> with id <literal>entityId</literal>
                changed:
            </para>

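            <programlisting><![CDATA[Number revision = (Number) getAuditReader().createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true)
    .setProjection(AuditEntity.revisionNumber().min())
    .add(AuditEntity.id().eq(entityId))
    .add(AuditEntity.revisionNumber().gt(42))
    .getSingleResult();]]></programlisting>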

            <para>
                The second additional feature you can use in queries for revisions is the ability
                to maximize or minimize a property. For example, if you want to select the
                revision at which the value of <literal>actualDate</literal> for a given entity
                was larger than a given value, but as small as possible:
            </para>

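            <programlisting><![CDATA[Number revision = (Number) getAuditReader().createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true)
    // We are only interested in the first revision
    .setProjection(AuditEntity.revisionNumber().min())
    .add(AuditEntity.property("actualDate").minimize()
        .add(AuditEntity.property("actualDate").gt(givenDate))
        .add(AuditEntity.id().eq(givenEntityId)))
    .getSingleResult();]]></programlisting>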

            <para>
                The <literal>minimize()</literal> and <literal>maximize()</literal> methods return a criterion
                to which you can add constraints that must be met by the entities with the
                maximized/minimized properties.
            </para>

            <para>
                You probably also noticed that there are two boolean parameters passed when
                creating the query. The first one, <literal>selectEntitiesOnly</literal>, is only valid when
                you don't set an explicit projection. If true, the result of the query will be
                a list of entities (which changed at revisions satisfying the specified
                constraints).
            </para>

            <para>
                If false, the result will be a list of three element arrays. The
                first element will be the changed entity instance. The second will be an entity
                containing revision data (if no custom entity is used, this will be an instance
                of <literal>DefaultRevisionEntity</literal>). The third will be the type of the
                revision (one of the values of the <literal>RevisionType</literal> enumeration:
                ADD, MOD, DEL).
            </para>
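
            <para>
                As a minimal sketch of how such a list of three-element arrays might be unpacked, the
                following example uses plain Java stand-ins. The classes PersonSnapshot, RevisionInfo and
                ChangeType are hypothetical placeholders for the audited entity, the revision entity and
                the revision type enumeration; they are not Envers classes.
            </para>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: in a real application these would be the audited entity,
// the revision entity and the revision type enumeration.
final class PersonSnapshot {
    final String name;
    PersonSnapshot(String name) { this.name = name; }
}

final class RevisionInfo {
    final int id;
    RevisionInfo(int id) { this.id = id; }
}

enum ChangeType { ADD, MOD, DEL }

final class RevisionRows {
    // Formats each {entity, revision entity, revision type} triple as one line.
    static List<String> describe(List<Object[]> rows) {
        List<String> lines = new ArrayList<>();
        for (Object[] row : rows) {
            PersonSnapshot entity = (PersonSnapshot) row[0];
            RevisionInfo revision = (RevisionInfo) row[1];
            ChangeType type = (ChangeType) row[2];
            lines.add("rev " + revision.id + " " + type + " " + entity.name);
        }
        return lines;
    }

    // Sample data shaped like the result list described above.
    static List<Object[]> sampleRows() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { new PersonSnapshot("John"), new RevisionInfo(1), ChangeType.ADD });
        rows.add(new Object[] { new PersonSnapshot("Johnny"), new RevisionInfo(3), ChangeType.MOD });
        return rows;
    }
}
```

            <para>
                The casts mirror the order of the array elements: entity first, revision data second,
                revision type third.
            </para>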

            <para>
                The second parameter, <literal>selectDeletedEntities</literal>, specifies whether revisions
                in which the entity was deleted should be included in the results. If so, such entities
                will have the revision type DEL and all fields, except the id, will be
                <literal>null</literal>.
            </para>

        </section>

        <section id="envers-tracking-modified-entities-queries">
            <title>Querying for entities modified in a given revision</title>
            <para>
                The basic query allows retrieving entity names and corresponding Java classes changed in a specified revision:
            </para>
            <programlisting><![CDATA[Set<Pair<String, Class>> modifiedEntityTypes = getAuditReader()
    .getCrossTypeRevisionChangesReader().findEntityTypes(revisionNumber);]]></programlisting>
            <para>
                Other queries (also accessible from <interfacename>org.hibernate.envers.CrossTypeRevisionChangesReader</interfacename>):
            </para>
            <orderedlist>
                <listitem>
                    <firstterm><![CDATA[List<Object>]]> findEntities(Number)</firstterm>
                    - Returns snapshots of all audited entities changed (added, updated or removed) in a given revision.
                    Executes <literal>n+1</literal> SQL queries, where <literal>n</literal> is the number of different entity
                    classes modified within the specified revision.
                </listitem>
                <listitem>
                    <firstterm><![CDATA[List<Object>]]> findEntities(Number, RevisionType)</firstterm>
                    - Returns snapshots of all audited entities changed (added, updated or removed) in a given revision,
                    filtered by modification type. Executes <literal>n+1</literal> SQL queries, where <literal>n</literal>
                    is the number of different entity classes modified within the specified revision.
                </listitem>
                <listitem>
                    <firstterm><![CDATA[Map<RevisionType, List<Object>>]]> findEntitiesGroupByRevisionType(Number)</firstterm>
                    - Returns a map containing lists of entity snapshots grouped by modification operation (i.e.
                    addition, update and removal). Executes <literal>3n+1</literal> SQL queries, where <literal>n</literal>
                    is the number of different entity classes modified within the specified revision.
                </listitem>
            </orderedlist>
            <para>
                Note that the methods described above can be used only when the default mechanism for
                tracking changed entity names is enabled (see <xref linkend="envers-tracking-modified-entities-revchanges"/>).
            </para>
        </section>

    </section>

    <section>
        <title>Conditional auditing</title>
        <para>
            Envers persists audit data in reaction to various Hibernate events (e.g. post update, post insert, and
            so on), using a series of event listeners from the <literal>org.hibernate.envers.event</literal>
            package. By default, if the Envers jar is in the classpath, the event listeners are auto-registered with
            Hibernate.
        </para>
        <para>
            Conditional auditing can be implemented by overriding some of the Envers event listeners.
            To use customized Envers event listeners, the following steps are needed:
            <orderedlist>
            <listitem>
                Turn off automatic Envers event listener registration by setting the
                <literal>hibernate.listeners.envers.autoRegister</literal>
                Hibernate property to <literal>false</literal>.
            </listitem>
            <listitem>
                Create subclasses of the appropriate event listeners. For example, if you want to conditionally audit
                entity insertions, extend the
                <literal>org.hibernate.envers.event.EnversPostInsertEventListenerImpl</literal>
                class. Place the conditional-auditing logic in the subclasses, and call the super method only if
                auditing should be performed.
            </listitem>
            <listitem>
                Create your own implementation of <literal>org.hibernate.integrator.spi.Integrator</literal>,
                similar to <literal>org.hibernate.envers.event.EnversIntegrator</literal>. Use your event listener
                classes instead of the default ones.
            </listitem>
            <listitem>
                For the integrator to be used automatically when Hibernate starts up, you will need to add a
                <literal>META-INF/services/org.hibernate.integrator.spi.Integrator</literal> file to your jar.
                The file should contain the fully qualified name of the class implementing the interface.
            </listitem>
            </orderedlist>
        </para>
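
        <para>
            The subclassing step can be sketched in plain Java. The example below only illustrates the
            "call super if auditing should be performed" pattern: StandInPostInsertListener is a
            hypothetical stand-in for the Envers listener base class, and the condition in shouldAudit
            is an arbitrary assumption chosen for illustration.
        </para>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an Envers post-insert listener; a real subclass
// would extend the Envers listener class named in the steps above.
class StandInPostInsertListener {
    final List<Object> audited = new ArrayList<>();

    void onPostInsert(Object entity) {
        audited.add(entity); // stands in for writing the audit row
    }
}

// Audits only entities accepted by shouldAudit(); all others are skipped.
class ConditionalPostInsertListener extends StandInPostInsertListener {
    @Override
    void onPostInsert(Object entity) {
        if (shouldAudit(entity)) {
            super.onPostInsert(entity); // delegate to the default auditing logic
        }
    }

    // Illustrative condition (an assumption): skip String values starting with "tmp-".
    boolean shouldAudit(Object entity) {
        return !(entity instanceof String && ((String) entity).startsWith("tmp-"));
    }
}
```

        <para>
            Keeping the condition in a separate method makes it possible to reuse the same decision in
            the listeners for other event types.
        </para>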
    </section>

    <section>
        <title>Understanding the Envers Schema</title>

        <para>
            For each audited entity (that is, for each entity containing at least one audited field), an audit table is
            created.  By default, the audit table's name is created by adding a "_AUD" suffix to the original table name,
            but this can be overridden by specifying a different suffix/prefix in the configuration or per-entity using
            the <interfacename>@org.hibernate.envers.AuditTable</interfacename> annotation.
        </para>

        <orderedlist>
            <title>Audit table columns</title>
            <listitem>
                <para>
                    id of the original entity (this can be more than one column in the case of composite primary keys)
                </para>
            </listitem>
            <listitem>
                <para>
                    revision number - an integer.  Matches the revision number in the revision entity table.
                </para>
            </listitem>
            <listitem>
                <para>
                    revision type - a small integer
                </para>
            </listitem>
            <listitem>
                <para>
                    audited fields from the original entity
                </para>
            </listitem>
        </orderedlist>

        <para>
            The primary key of the audit table is the combination of the original id of the entity and the revision
            number - there can be at most one historic entry for a given entity instance at a given revision.
        </para>

        <para>
            The current entity data is stored in the original table and in the audit table.  This is a duplication of
            data; however, as this solution makes the query system much more powerful, and as memory is cheap, hopefully
            this won't be a major drawback for the users.  A row in the audit table with entity id ID, revision N and
            data D means: entity with id ID has data D from revision N upwards.  Hence, if we want to find an entity at
            revision M, we have to search for a row in the audit table which has a revision number smaller than or equal
            to M, but as large as possible. If no such row is found, or a row with a "deleted" marker is found, it means
            that the entity didn't exist at that revision.
        </para>
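
        <para>
            The lookup rule described above can be sketched with an in-memory list instead of a database
            table. This is only an illustration of the rule, not how Envers implements it: AuditRow and
            AuditLookup are hypothetical names, and the sample data is made up.
        </para>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for one audit table row (not an Envers class).
final class AuditRow {
    static final int ADD = 0;
    static final int MOD = 1;
    static final int DEL = 2;

    final long entityId;
    final int revision;
    final int revisionType;
    final String data; // null for DEL rows, which only act as a marker

    AuditRow(long entityId, int revision, int revisionType, String data) {
        this.entityId = entityId;
        this.revision = revision;
        this.revisionType = revisionType;
        this.data = data;
    }
}

final class AuditLookup {
    // Returns the data of the entity as of revision m, or empty if it did not exist then.
    static Optional<String> findAtRevision(List<AuditRow> rows, long entityId, int m) {
        AuditRow best = null;
        for (AuditRow row : rows) {
            // Candidate rows have a revision number <= m; keep the largest one.
            if (row.entityId == entityId && row.revision <= m
                    && (best == null || row.revision > best.revision)) {
                best = row;
            }
        }
        if (best == null || best.revisionType == AuditRow.DEL) {
            return Optional.empty(); // no row found, or a "deleted" marker
        }
        return Optional.of(best.data);
    }

    // Made-up history: added at revision 2, modified at 5, deleted at 9.
    static List<AuditRow> sampleRows() {
        return Arrays.asList(
            new AuditRow(1L, 2, AuditRow.ADD, "v1"),
            new AuditRow(1L, 5, AuditRow.MOD, "v2"),
            new AuditRow(1L, 9, AuditRow.DEL, null));
    }
}
```

        <para>
            With this sample history, a lookup at revision 4 finds the row written at revision 2, while
            lookups at revision 1 or revision 9 report that the entity did not exist.
        </para>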

        <para>
            The "revision type" field can currently have three values: 0, 1, 2, which means ADD, MOD and DEL,
            respectively. A row with a revision of type DEL will only contain the id of the entity and no data (all
            fields NULL), as it only serves as a marker saying "this entity was deleted at that revision".
        </para>

        <para>
            Additionally, there is a <term>revision entity table</term> which contains the information about the
            global revision.  By default the generated table is named <database class="table">REVINFO</database> and
            contains just 2 columns: <database class="field">ID</database> and <database class="field">TIMESTAMP</database>.
            A row is inserted into this table on each new revision, that is, on each commit of a transaction that
            changes audited data.  The name of this table can be configured; renaming its columns and adding
            additional columns is discussed in <xref linkend="envers-revisionlog"/>.
        </para>

        <para>
            While global revisions are a good way to provide correct auditing of relations, some people have pointed out
            that this may be a bottleneck in systems, where data is very often modified.  One viable solution is to
            introduce an option to have an entity "locally revisioned", that is revisions would be created for it
            independently.  This wouldn't enable correct versioning of relations, but also wouldn't require the
            <database class="table">REVINFO</database> table.  Another possibility is to introduce a notion of
            "revisioning groups": groups of entities which share revision numbering.  Each such group would have to
            consist of one or more strongly connected components of the graph induced by relations between entities.
            Your opinions on the subject are very welcome on the forum! :)
        </para>

    </section>

    <section id="envers-generateschema">
        <title>Generating schema with Ant</title>

        <para>
            If you'd like to generate the database schema file with the Hibernate Tools Ant task,
            you'll probably notice that the generated file doesn't contain definitions of audit
            tables. To also generate the audit tables, you simply need to use
            <literal>org.hibernate.tool.ant.EnversHibernateToolTask</literal> instead of the usual
            <literal>org.hibernate.tool.ant.HibernateToolTask</literal>. The former class extends
            the latter, and only adds generation of the version entities. So you can use the task
            just as you used to.
        </para>

        <para>
            For example:
        </para>

        <programlisting>

        <para>
            Will generate the following schema:
        </para>

        <programlisting>
    </section>


    <section id="envers-mappingexceptions">
        <title>Mapping exceptions</title>

        <section>

            <title>What isn't and will not be supported</title>

            <para>
                Bags (the corresponding Java type is <literal>List</literal>) are not supported, as they can contain
                non-unique elements. The reason is that persisting, for example, a bag of <literal>String</literal>s
                violates a principle of relational databases: that each table is a set of tuples. In the case of bags
                (which require a join table), if there is a duplicate element, the two
                tuples corresponding to the elements will be the same. Hibernate allows this;
                however, Envers (or more precisely, the database connector) will throw an exception
                when trying to persist two identical elements, because of a unique constraint violation.
            </para>
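
            <para>
                The effect of the set-of-tuples rule can be sketched with a Java <literal>Set</literal>
                standing in for the audit join table. The tuple layout (revision, owner id, element) and the
                class name BagAuditSketch are assumptions made for illustration only; a real database would
                report a unique constraint violation where this sketch records a rejected element.
            </para>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: a table is a set of tuples, so a second identical tuple cannot be stored.
final class BagAuditSketch {
    // Returns the elements whose (revision, ownerId, element) tuple was already present.
    static List<String> rejectedDuplicates(int revision, long ownerId, List<String> bag) {
        Set<List<Object>> table = new HashSet<>();
        List<String> rejected = new ArrayList<>();
        for (String element : bag) {
            List<Object> tuple = Arrays.<Object>asList(revision, ownerId, element);
            if (!table.add(tuple)) {
                // Set.add returns false for a duplicate tuple.
                rejected.add(element);
            }
        }
        return rejected;
    }
}
```

            <para>
                A bag holding "a", "b", "a" produces two identical tuples for "a", so the second one is
                rejected; a collection without duplicates is stored without problems.
            </para>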

            <para>
                There are at least two ways out if you need bag semantics:
            </para>

            <orderedlist>
                <listitem>
                    <para>
                        use an indexed collection, with the <literal>@IndexColumn</literal> annotation, or
                    </para>
                </listitem>
                <listitem>
                    <para>
                        provide a unique id for your elements with the <literal>@CollectionId</literal> annotation.
                    </para>
                </listitem>
            </orderedlist>

        </section>

        <section>

            <title>What isn't and will be supported</title>

            <orderedlist>
                <listitem>
                    <para>
                        collections of components
                    </para>
                </listitem>
            </orderedlist>

        </section>

        <section>

            <title>@OneToMany+@JoinColumn</title>

            <para>
                When a collection is mapped using these two annotations, Hibernate doesn't
                generate a join table. Envers, however, has to do this, so that when you read the
                revisions in which the related entity has changed, you don't get false results.
            </para>
            <para>
                To be able to name the additional join table, there is a special annotation:
                <literal>@AuditJoinTable</literal>, which has similar semantics to JPA's
                <literal>@JoinTable</literal>.
            </para>

            <para>
                One special case is relations mapped with <literal>@OneToMany</literal>+<literal>@JoinColumn</literal> on
                the one side, and <literal>@ManyToOne</literal>+<literal>@JoinColumn(insertable=false, updatable=false)</literal>
                on the many side.
                Such relations are in fact bidirectional, but the owning side is the collection (see also
                <ulink url="http://docs.jboss.org/hibernate/stable/annotations/reference/en/html_single/#entity-hibspec-collection-extratype">here</ulink>).
            </para>
            <para>
                To properly audit such relations with Envers, you can use the <literal>@AuditMappedBy</literal> annotation.
                It enables you to specify the reverse property (using the <literal>mappedBy</literal> element). In case
                of indexed collections, the index column must also be mapped in the referenced entity (using
                <literal>@Column(insertable=false, updatable=false)</literal>), and specified using
                <literal>positionMappedBy</literal>. This annotation will affect only the way
                Envers works. Please note that the annotation is experimental and may change in the future.
            </para>

        </section>
    </section>

    <section id="envers-partitioning">
        <title>Advanced: Audit table partitioning</title>

        <section id="envers-partitioning-benefits">

            <title>Benefits of audit table partitioning</title>

            <para>
                Because audit tables tend to grow indefinitely, they can quickly become very large. When the audit tables have grown
                to a certain limit (varying per RDBMS and/or operating system) it makes sense to start using table partitioning.
                SQL table partitioning offers a lot of advantages including, but certainly not limited to:
                <orderedlist>
                    <listitem>
                        <para>
                            Improved query performance by selectively moving rows to various partitions (or even purging old rows)
                        </para>
                    </listitem>
                    <listitem>
                        <para>
                            Faster data loads, index creation, etc.
                        </para>
                    </listitem>
                </orderedlist>
            </para>

        </section>

        <section id="envers-partitioning-columns">

            <title>Suitable columns for audit table partitioning</title>
            <para>
                Generally SQL tables must be partitioned on a column that exists within the table. As a rule it makes sense to use
                either the <emphasis>end revision</emphasis> or the <emphasis>end revision timestamp</emphasis> column for
                partitioning of audit tables.
                <note>
                    <para>
                        End revision information is not available for the default AuditStrategy.
                    </para>

                    <para>
                        Therefore the following Envers configuration options are required:
                    </para>
                    <para>
                        <literal>org.hibernate.envers.audit_strategy</literal> =
                        <literal>org.hibernate.envers.strategy.ValidityAuditStrategy</literal>
                    </para>
                    <para>
                        <literal>org.hibernate.envers.audit_strategy_validity_store_revend_timestamp</literal> =
                        <literal>true</literal>
                    </para>

                    <para>
                        Optionally, you can also override the default values of the following properties:
                    </para>
                    <para>
                        <literal>org.hibernate.envers.audit_strategy_validity_end_rev_field_name</literal>
                    </para>
                    <para>
                        <literal>org.hibernate.envers.audit_strategy_validity_revend_timestamp_field_name</literal>
                    </para>

                    <para>
                        For more information, see <xref linkend="configuration"/>.
                    </para>
                </note>
            </para>
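
            <para>
                As a sketch, the two required options could be collected in a
                <literal>java.util.Properties</literal> object. The class and method names below are
                hypothetical, and how the properties are handed to Hibernate (for example through a
                configuration file or a configuration object) depends on your setup and is not shown.
            </para>

```java
import java.util.Properties;

// Hypothetical helper that gathers the Envers options named in the note above.
final class EnversPartitioningConfig {
    static Properties validityStrategyProperties() {
        Properties props = new Properties();
        // Switch from the default audit strategy to the validity audit strategy.
        props.setProperty("org.hibernate.envers.audit_strategy",
                "org.hibernate.envers.strategy.ValidityAuditStrategy");
        // Also store the end revision timestamp, so it can be used for partitioning.
        props.setProperty(
                "org.hibernate.envers.audit_strategy_validity_store_revend_timestamp",
                "true");
        return props;
    }
}
```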

            <para>
                The reason why the end revision information should be used for audit table partitioning is based on the assumption that
                audit tables should be partitioned on an 'increasing level of interestingness', like so:
            </para>

            <para>
                <orderedlist>
                    <listitem>
                        <para>
                            A couple of partitions with audit data that is not very (or no longer) interesting.
                            This can be stored on slow media, and perhaps even be purged eventually.
                        </para>
                    </listitem>
                    <listitem>
                        <para>
                            Some partitions for audit data that is potentially interesting.
                        </para>
                    </listitem>
                    <listitem>
                        <para>
                            One partition for audit data that is most likely to be interesting.
                            This should be stored on the fastest media, both for reading and writing.
                        </para>
                    </listitem>
                </orderedlist>
            </para>


        </section>

        <section id="envers-partitioning-example">

            <title>Audit table partitioning example</title>
            <para>
                In order to determine a suitable column for the 'increasing level of interestingness',
                consider a simplified example of a salary registration for an unnamed agency.
            </para>

            <para>
                Currently, the salary table contains the following rows for a certain person X:

                <table frame="topbot">
                    <title>Salaries table</title>
                    <tgroup cols="2">
                        <colspec colname="c1" colwidth="1*"/>
                        <colspec colname="c2" colwidth="1*"/>
                        <thead>
                            <row>
                                <entry>Year</entry>
                                <entry>Salary (USD)</entry>
                            </row>
                        </thead>
                        <tbody>
                            <row>
                                <entry>2006</entry>
                                <entry>3300</entry>
                            </row>
                            <row>
                                <entry>2007</entry>
                                <entry>3500</entry>
                            </row>
                            <row>
                                <entry>2008</entry>
                                <entry>4000</entry>
                            </row>
                            <row>
                                <entry>2009</entry>
                                <entry>4500</entry>
                            </row>
                        </tbody>
                    </tgroup>
                </table>
            </para>

            <para>
                The salary for the current fiscal year (2010) is unknown. The agency requires that all changes in registered
                salaries for a fiscal year are recorded (i.e. an audit trail). The rationale behind this is that decisions
                made at a certain date are based on the registered salary at that time. And at any time it must be possible
                to reproduce the reason why a certain decision was made at a certain date.
            </para>

            <para>
                The following audit information is available, sorted in order of occurrence:

                <table frame="topbot">
                    <title>Salaries - audit table</title>
                    <tgroup cols="5">
                        <colspec colname="c1" colwidth="1*"/>
                        <colspec colname="c2" colwidth="1*"/>
                        <colspec colname="c3" colwidth="1*"/>
                        <colspec colname="c4" colwidth="1*"/>
                        <colspec colname="c5" colwidth="1*"/>
                        <thead>
                            <row>
                                <entry>Year</entry>
                                <entry>Revision type</entry>
                                <entry>Revision timestamp</entry>
                                <entry>Salary (USD)</entry>
                                <entry>End revision timestamp</entry>
                            </row>
                        </thead>
                        <tbody>
                            <row>
                                <entry>2006</entry>
                                <entry>ADD</entry>
                                <entry>2007-04-01</entry>
                                <entry>3300</entry>
                                <entry>null</entry>
                            </row>
                            <row>
                                <entry>2007</entry>
                                <entry>ADD</entry>
                                <entry>2008-04-01</entry>
                                <entry>35</entry>
                                <entry>2008-04-02</entry>
                            </row>
                            <row>
                                <entry>2007</entry>
                                <entry>MOD</entry>
                                <entry>2008-04-02</entry>
                                <entry>3500</entry>
                                <entry>null</entry>
                            </row>
                            <row>
                                <entry>2008</entry>
                                <entry>ADD</entry>
                                <entry>2009-04-01</entry>
                                <entry>3700</entry>
                                <entry>2009-07-01</entry>
                            </row>
                            <row>
                                <entry>2008</entry>
                                <entry>MOD</entry>
                                <entry>2009-07-01</entry>
                                <entry>4100</entry>
                                <entry>2010-02-01</entry>
                            </row>
                            <row>
                                <entry>2008</entry>
                                <entry>MOD</entry>
                                <entry>2010-02-01</entry>
                                <entry>4000</entry>
                                <entry>null</entry>
                            </row>
                            <row>
                                <entry>2009</entry>
                                <entry>ADD</entry>
                                <entry>2010-04-01</entry>
                                <entry>4500</entry>
                                <entry>null</entry>
                            </row>
                        </tbody>
                    </tgroup>
                </table>
            </para>

            <section id="envers-partitioning-example-column">

                <title>Determining a suitable partitioning column</title>
                <para>
                    To partition this data, the 'level of interestingness' must be defined.
                    Consider the following:
                    <orderedlist>
                        <listitem>
                            <para>
                                For fiscal year 2006 there is only one revision. It has the oldest <emphasis>revision timestamp</emphasis>
                                of all audit rows, but should still be regarded as interesting because it is the latest modification
                                for this fiscal year in the salary table; its <emphasis>end revision timestamp</emphasis> is null.
                            </para>
                            <para>
                                Also note that it would be very unfortunate if in 2011 there would be an update of the salary for fiscal
                                year 2006 (which is possible until at least 10 years after the fiscal year) and the audit
                                information would have been moved to a slow disk (based on the age of the
                                <emphasis>revision timestamp</emphasis>). Remember that in this case Envers will have to update
                                the <emphasis>end revision timestamp</emphasis> of the most recent audit row.
                            </para>
                        </listitem>
                        <listitem>
                            <para>
                                There are two revisions in the salary of fiscal year 2007 which both have nearly the same
                                <emphasis>revision timestamp</emphasis> and a different <emphasis>end revision timestamp</emphasis>.
                                At first sight it is evident that the first revision was a mistake and probably uninteresting.
                                The only interesting revision for 2007 is the one with <emphasis>end revision timestamp</emphasis> null.
                            </para>
                        </listitem>
                    </orderedlist>

                    Based on the above, it is evident that only the <emphasis>end revision timestamp</emphasis> is suitable for
                    audit table partitioning. The <emphasis>revision timestamp</emphasis> is not suitable.
                </para>
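
                <para>
                    The claim that the interesting rows are those with a null end revision timestamp can be
                    checked against the example data. The following minimal sketch uses the hypothetical classes
                    SalaryAuditRow and SalaryAudit, with dates kept as plain strings for brevity; it filters the
                    audit rows from the table above and recovers exactly the current salaries table.
                </para>

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for one row of the "Salaries - audit table" example.
final class SalaryAuditRow {
    final int fiscalYear;
    final String revisionType;
    final String revisionTimestamp;
    final int salary;
    final String endRevisionTimestamp; // null while the row is still current

    SalaryAuditRow(int fiscalYear, String revisionType, String revisionTimestamp,
                   int salary, String endRevisionTimestamp) {
        this.fiscalYear = fiscalYear;
        this.revisionType = revisionType;
        this.revisionTimestamp = revisionTimestamp;
        this.salary = salary;
        this.endRevisionTimestamp = endRevisionTimestamp;
    }
}

final class SalaryAudit {
    // The seven audit rows from the example, in order of occurrence.
    static List<SalaryAuditRow> exampleRows() {
        return Arrays.asList(
            new SalaryAuditRow(2006, "ADD", "2007-04-01", 3300, null),
            new SalaryAuditRow(2007, "ADD", "2008-04-01", 35, "2008-04-02"),
            new SalaryAuditRow(2007, "MOD", "2008-04-02", 3500, null),
            new SalaryAuditRow(2008, "ADD", "2009-04-01", 3700, "2009-07-01"),
            new SalaryAuditRow(2008, "MOD", "2009-07-01", 4100, "2010-02-01"),
            new SalaryAuditRow(2008, "MOD", "2010-02-01", 4000, null),
            new SalaryAuditRow(2009, "ADD", "2010-04-01", 4500, null));
    }

    // Keeps only rows that have not been superseded (end revision timestamp is null).
    static Map<Integer, Integer> currentSalaries(List<SalaryAuditRow> rows) {
        Map<Integer, Integer> current = new LinkedHashMap<>();
        for (SalaryAuditRow row : rows) {
            if (row.endRevisionTimestamp == null) {
                current.put(row.fiscalYear, row.salary);
            }
        }
        return current;
    }
}
```

                <para>
                    Filtering on the revision timestamp instead would not give this result: the oldest row
                    (fiscal year 2006) is still current, even though its revision timestamp is the earliest.
                </para>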

            </section>

            <section id="envers-partitioning-example-scheme">

                <title>Determining a suitable partitioning scheme</title>
                <para>
                    A possible partitioning scheme for the salary table would be as follows:
                    <orderedlist>
                        <listitem>
                            <para>
                                <emphasis>end revision timestamp</emphasis> year = 2008
                            </para>
                            <para>
                                This partition contains audit data that is not very (or no longer) interesting.
                            </para>
                        </listitem>
                        <listitem>
                            <para>
                                <emphasis>end revision timestamp</emphasis> year = 2009
                            </para>
                            <para>
                                This partition contains audit data that is potentially interesting.
                            </para>
                        </listitem>
                        <listitem>
                            <para>
                                <emphasis>end revision timestamp</emphasis> year >= 2010 or null
                            </para>
                            <para>
                                This partition contains the most interesting audit data.
                            </para>
                        </listitem>
                    </orderedlist>
                </para>

                <para>
                    This partitioning scheme also covers the potential problem of the update of the
                    <emphasis>end revision timestamp</emphasis>, which occurs if a row in the audited table is modified.
                    Even though Envers will update the <emphasis>end revision timestamp</emphasis> of the audit row to
                    the system date at the instant of modification, the audit row will remain in the same partition
                    (the 'extension bucket').
                </para>

                <para>
                    And sometime in 2011, the last partition (or 'extension bucket') is split into two new partitions:
                    <orderedlist>
                        <listitem>
                            <para>
                                <emphasis>end revision timestamp</emphasis> year = 2010
                            </para>
                            <para>
                                This partition contains audit data that is potentially interesting (in 2011).
                            </para>
                        </listitem>
                        <listitem>
                            <para>
                                <emphasis>end revision timestamp</emphasis> year >= 2011 or null
                            </para>
                            <para>
                                This partition contains the most interesting audit data and is the new 'extension bucket'.
                            </para>
                        </listitem>
                    </orderedlist>
                </para>
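
                <para>
                    The scheme and its later split can be sketched as a small function that maps an end revision
                    timestamp to a partition. The class name, the partition labels and the extensionStartYear
                    parameter are assumptions made for illustration; actual partitioning is defined in the
                    database, not in Java code.
                </para>

```java
import java.time.LocalDate;

// Hypothetical helper mirroring the partitioning scheme described above.
final class AuditPartitioning {
    // extensionStartYear is the first year covered by the 'extension bucket'
    // (2010 in the initial scheme, 2011 after the split).
    static String partitionFor(LocalDate endRevisionTimestamp, int extensionStartYear) {
        if (endRevisionTimestamp == null
                || endRevisionTimestamp.getYear() >= extensionStartYear) {
            return "extension"; // null or recent end revision: most interesting data
        }
        return "year-" + endRevisionTimestamp.getYear(); // one partition per earlier year
    }
}
```

                <para>
                    With the initial scheme, a current row (null) and a row that ended on 2010-02-01 both land in
                    the extension bucket, so setting the end revision timestamp during 2010 does not move the row.
                    After the split in 2011, the row that ended in 2010 belongs to its own yearly partition.
                </para>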

            </section>

        </section>
    </section>

    <section id="envers-links">
        <title>Envers links</title>

        <orderedlist>
            <listitem>
                <para>
                    <ulink url="http://hibernate.org">Hibernate main page</ulink>
                </para>
            </listitem>
            <listitem>
                <para>
                    <ulink url="http://community.jboss.org/en/envers?view=discussions">Forum</ulink>
                </para>
            </listitem>
            <listitem>
                <para>
                    <ulink url="http://opensource.atlassian.com/projects/hibernate/browse/HHH">JIRA issue tracker</ulink>
                    (when adding issues concerning Envers, be sure to select the "envers" component!)
                </para>
            </listitem>
            <listitem>
                <para>
                    <ulink url="irc://irc.freenode.net:6667/envers">IRC channel</ulink>
                </para>
            </listitem>
            <listitem>
                <para>
                    <ulink url="http://www.jboss.org/feeds/view/envers">Envers Blog</ulink>
                </para>
            </listitem>
            <listitem>
                <para>
                    <ulink url="https://community.jboss.org/wiki/EnversFAQ">FAQ</ulink>
                </para>
            </listitem>
        </orderedlist>

    </section>

</chapter>

Copyright 1998-2024 Alvin Alexander, alvinalexander.com
All Rights Reserved.
