
Java example source code file (GradientNormalization.java)

This example Java source code file (GradientNormalization.java) is included in the alvinalexander.com "Java Source Code Warehouse" project. The intent of this project is to help you "Learn Java by Example"™.

Learn more about this Java project at its project page.

Java tags/keywords

clipelementwiseabsolutevalue, clipl2perlayer, clipl2perparamtype, gradientnormalization, none, renormalizel2perlayer, renormalizel2perparamtype

The GradientNormalization.java Java example source code

package org.deeplearning4j.nn.conf;

/** Gradient normalization strategies. These are applied to raw gradients, before the gradients are passed to the
 * updater (SGD, RMSProp, Momentum, etc.)<br>
 * <p>None = no gradient normalization (default)</p>
 *
 * <p>RenormalizeL2PerLayer = rescale gradients by dividing by the L2 norm of all gradients for the layer.</p>
 *
 * <p>RenormalizeL2PerParamType = rescale gradients by dividing by the L2 norm of the gradients, separately for
 * each type of parameter within the layer.<br>
 * This differs from RenormalizeL2PerLayer in that each parameter type (weight, bias, etc.) is normalized separately.<br>
 * For example, in an MLP/FeedForward network (where G is the gradient vector), the output is as follows:
 * <ul style="list-style-type:none">
 *   <li>GOut_weight = G_weight / l2(G_weight)</li>
 *   <li>GOut_bias = G_bias / l2(G_bias)</li>
 * </ul>
 * </p>
 *
 * <p>ClipElementWiseAbsoluteValue = clip the gradients on a per-element basis.<br>
 * For each gradient g, set g <- sign(g) * min(maxAllowedValue, |g|).<br>
 * See: Mikolov (2012) (thesis),
 * <a href="http://www.fit.vutbr.cz/~imikolov/rnnlm/thesis.pdf">http://www.fit.vutbr.cz/~imikolov/rnnlm/thesis.pdf</a>,
 * in the context of learning recurrent neural networks.<br>
 * The threshold for clipping can be set in the Layer configuration, using gradientNormalizationThreshold(double threshold)
 * </p>
 *
 * <p>ClipL2PerLayer = conditional renormalization. Somewhat similar to RenormalizeL2PerLayer, this strategy
 * scales the gradients <i>if and only if</i> the L2 norm of the gradients (for the entire layer) exceeds a specified
 * threshold. Specifically, if G is the gradient vector for the layer, then:
 * <ul style="list-style-type:none">
 *   <li>GOut = G     if l2Norm(G) < threshold (i.e., no change)</li>
 *   <li>GOut = threshold * G / l2Norm(G)     otherwise</li>
 * </ul>
 * Thus, the L2 norm of the scaled gradients will not exceed the specified threshold, though it may be smaller.<br>
 * See: Pascanu, Mikolov, Bengio (2012), <i>On the difficulty of training Recurrent Neural Networks</i>,
 * <a href="http://arxiv.org/abs/1211.5063">http://arxiv.org/abs/1211.5063</a><br>
 * The threshold for clipping can be set in the Layer configuration, using gradientNormalizationThreshold(double threshold)
 * </p>
 *
 * <p>ClipL2PerParamType = conditional renormalization. Very similar to ClipL2PerLayer, but instead of clipping
 * per layer, clipping is done on each parameter type separately.<br>
 * For example, in a recurrent neural network, input weight gradients, recurrent weight gradients, and bias gradients are all
 * clipped separately. Thus if one set of gradients is very large, it may be clipped while leaving the other gradients
 * unmodified.<br>
 * The threshold for clipping can be set in the Layer configuration, using gradientNormalizationThreshold(double threshold)</p>
 *
 * @author Alex Black
 */
public enum GradientNormalization {
    None,
    RenormalizeL2PerLayer,
    RenormalizeL2PerParamType,
    ClipElementWiseAbsoluteValue,
    ClipL2PerLayer,
    ClipL2PerParamType
}
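The strategies described in the Javadoc are just arithmetic on the gradient vector, so they are easy to see in isolation. The following self-contained sketch (not the actual deeplearning4j implementation; the class and method names here are illustrative) re-implements three of the strategies on a plain double[]:

```java
import java.util.Arrays;

/** Illustrative re-implementation of three gradient normalization rules. */
public class GradientNormalizationDemo {

    static double l2Norm(double[] g) {
        double sum = 0.0;
        for (double v : g) sum += v * v;
        return Math.sqrt(sum);
    }

    /** RenormalizeL2PerLayer: GOut = G / l2(G), applied unconditionally. */
    static double[] renormalizeL2(double[] g) {
        double norm = l2Norm(g);
        return Arrays.stream(g).map(v -> v / norm).toArray();
    }

    /** ClipElementWiseAbsoluteValue: g <- sign(g) * min(threshold, |g|). */
    static double[] clipElementWise(double[] g, double threshold) {
        return Arrays.stream(g)
                .map(v -> Math.signum(v) * Math.min(threshold, Math.abs(v)))
                .toArray();
    }

    /** ClipL2PerLayer: rescale only if the layer's L2 norm exceeds the threshold. */
    static double[] clipL2(double[] g, double threshold) {
        double norm = l2Norm(g);
        if (norm <= threshold) return g.clone();   // no change
        return Arrays.stream(g).map(v -> threshold * v / norm).toArray();
    }

    public static void main(String[] args) {
        double[] g = {3.0, 4.0};                                      // l2 norm = 5
        System.out.println(Arrays.toString(renormalizeL2(g)));        // [0.6, 0.8]
        System.out.println(Arrays.toString(clipElementWise(g, 3.5))); // [3.0, 3.5]
        System.out.println(Arrays.toString(clipL2(g, 1.0)));          // rescaled to norm 1.0
        System.out.println(Arrays.toString(clipL2(g, 10.0)));         // [3.0, 4.0] (unchanged)
    }
}
```

In deeplearning4j itself, a strategy from this enum is selected on the layer or network configuration builder, with the clipping threshold supplied via the gradientNormalizationThreshold(double) method mentioned in the Javadoc above.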


Copyright 1998-2019 Alvin Alexander, alvinalexander.com
All Rights Reserved.

A percentage of advertising revenue from
pages under the /java/jwarehouse URI on this website is
paid back to open source projects.