<td valign="bottom" align="center" style=" font-size:10pt;">Y </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">1 <td valign="bottom" align="center" style=" font-size:10pt;">34.234064369 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">2 <td valign="bottom" align="center" style=" font-size:10pt;">68.2681162306108 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">3 <td valign="bottom" align="center" style=" font-size:10pt;">118.615899084602 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">4 <td valign="bottom" align="center" style=" font-size:10pt;">184.138197238557 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">5 <td valign="bottom" align="center" style=" font-size:10pt;">266.599877916276 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">6 <td valign="bottom" align="center" style=" font-size:10pt;">364.147735251579 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">7 <td valign="bottom" align="center" style=" font-size:10pt;">478.019226091914 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">8 <td valign="bottom" align="center" style=" font-size:10pt;">608.140949270688 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">9 <td valign="bottom" align="center" style=" font-size:10pt;">754.598868667148 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">10 <td valign="bottom" align="center" style=" font-size:10pt;">916.128818085883 </tr> </table> <p> First we need to implement the interface <a href="../apidocs/org/apache/commons/math3/analysis/DifferentiableMultivariateVectorFunction.html">DifferentiableMultivariateVectorFunction. This requires the implementation of the method signatures: </p> <ul> <li>MultivariateMatrixFunction jacobian() <li>double[] value(double[] point) </ul> <p> We'll tackle the implementation of the <code>MultivariateMatrixFunction jacobian() method first. You may wish to familiarize yourself with what a Jacobian Matrix is. In this case the Jacobian is the partial derivative of the function with respect to the parameters a, b and c. These derivatives are computed as follows: <ul> <li>d(ax2 + bx + c)/da = x2 <li>d(ax2 + bx + c)/db = x <li>d(ax2 + bx + c)/dc = 1 </ul> </p> <p> For a quadratic which has three variables the Jacobian Matrix will have three columns, one for each variable, and the number of rows will equal the number of rows in our data set, which in this case is ten. 
So for example for <tt>[a = 1, b = 1, c = 1], the Jacobian Matrix is (excluding the first column which shows the value of x): </p> <table cellspacing="0" cellpadding="3"> <tr> <td valign="bottom" align="left" style=" font-size:10pt;">x <td valign="bottom" align="left" style=" font-size:10pt;">d(ax2 + bx + c)/da <td valign="bottom" align="left" style=" font-size:10pt;">d(ax2 + bx + c)/db <td valign="bottom" align="left" style=" font-size:10pt;">d(ax2 + bx + c)/dc </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">1 <td valign="bottom" align="center" style=" font-size:10pt;">1 <td valign="bottom" align="center" style=" font-size:10pt;">1 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">2 <td valign="bottom" align="center" style=" font-size:10pt;">4 <td valign="bottom" align="center" style=" font-size:10pt;">2 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">3 <td valign="bottom" align="center" style=" font-size:10pt;">9 <td valign="bottom" align="center" style=" font-size:10pt;">3 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">4 <td valign="bottom" align="center" style=" font-size:10pt;">16 <td valign="bottom" align="center" style=" font-size:10pt;">4 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">5 <td valign="bottom" align="center" style=" font-size:10pt;">25 <td valign="bottom" align="center" style=" font-size:10pt;">5 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">6 <td valign="bottom" align="center" style=" font-size:10pt;">36 <td valign="bottom" align="center" style=" font-size:10pt;">6 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">7 <td valign="bottom" align="center" style=" font-size:10pt;">49 <td valign="bottom" align="center" style=" font-size:10pt;">7 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">8 <td valign="bottom" align="center" style=" font-size:10pt;">64 <td valign="bottom" align="center" style=" font-size:10pt;">8 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">9 <td valign="bottom" align="center" style=" font-size:10pt;">81 <td valign="bottom" align="center" style=" font-size:10pt;">9 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> <tr> <td valign="bottom" align="center" style=" font-size:10pt;">10 <td valign="bottom" align="center" style=" font-size:10pt;">100 <td valign="bottom" align="center" style=" font-size:10pt;">10 <td valign="bottom" align="center" style=" font-size:10pt;">1 </tr> </table> <p> The implementation of the <code>MultivariateMatrixFunction jacobian() for this problem looks like this (The x parameter is an ArrayList containing the independent values of the data set): </p> <source> private double[][] jacobian(double[] variables) { double[][] jacobian = new double[x.size()][3]; for (int i = 0; i < jacobian.length; ++i) { jacobian[i][0] = x.get(i) * x.get(i); jacobian[i][1] = x.get(i); jacobian[i][2] = 1.0; } return jacobian; } public 
MultivariateMatrixFunction jacobian() { return new MultivariateMatrixFunction() { private static final long serialVersionUID = -8673650298627399464L; public double[][] value(double[] point) { return jacobian(point); } }; } </source> <p> Note that if for some reason the derivative of the objective function with respect to its variables is difficult to obtain, <a href="http://en.wikipedia.org/wiki/Numerical_differentiation">Numerical differentiation can be used. </p> <p> The implementation of the <code>double[] value(double[] point) method, which returns a <code>double array containing the values the objective function returns per given independent value and the current set of variables or parameters, can be seen below: </p> <source> public double[] value(double[] variables) { double[] values = new double[x.size()]; for (int i = 0; i < values.length; ++i) { values[i] = (variables[0] * x.get(i) + variables[1]) * x.get(i) + variables[2]; } return values; } </source> <p> Below is the the class containing all the implementation details (Taken from the Apache Commons Math <b>org.apache.commons.math3.optimization.general.LevenbergMarquardtOptimizerTest): </p> <source> private static class QuadraticProblem implements DifferentiableMultivariateVectorFunction, Serializable { private static final long serialVersionUID = 7072187082052755854L; private List<Double> x; private List<Double> y; public QuadraticProblem() { x = new ArrayList<Double>(); y = new ArrayList<Double>(); } public void addPoint(double x, double y) { this.x.add(x); this.y.add(y); } public double[] calculateTarget() { double[] target = new double[y.size()]; for (int i = 0; i < y.size(); i++) { target[i] = y.get(i).doubleValue(); } return target; } private double[][] jacobian(double[] variables) { double[][] jacobian = new double[x.size()][3]; for (int i = 0; i < jacobian.length; ++i) { jacobian[i][0] = x.get(i) * x.get(i); jacobian[i][1] = x.get(i); jacobian[i][2] = 1.0; } return jacobian; } public double[] value(double[] variables) { double[] values = new double[x.size()]; for (int i = 0; i < values.length; ++i) { values[i] = (variables[0] * x.get(i) + variables[1]) * x.get(i) + variables[2]; } return values; } public MultivariateMatrixFunction jacobian() { return new MultivariateMatrixFunction() { private static final long serialVersionUID = -8673650298627399464L; public double[][] value(double[] point) { return jacobian(point); } }; } } </source> <p> The below code shows how to go about using the above class and a LevenbergMarquardtOptimizer instance to produce an optimal set of quadratic curve fitting parameters: </p> <source> QuadraticProblem problem = new QuadraticProblem(); problem.addPoint(1, 34.234064369); problem.addPoint(2, 68.2681162306); problem.addPoint(3, 118.6158990846); problem.addPoint(4, 184.1381972386); problem.addPoint(5, 266.5998779163); problem.addPoint(6, 364.1477352516); problem.addPoint(7, 478.0192260919); problem.addPoint(8, 608.1409492707); problem.addPoint(9, 754.5988686671); problem.addPoint(10, 916.1288180859); LevenbergMarquardtOptimizer optimizer = new LevenbergMarquardtOptimizer(); final double[] weights = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }; final double[] initialSolution = {1, 1, 1}; PointVectorValuePair optimum = optimizer.optimize(100, problem, problem.calculateTarget(), weights, initialSolution); final double[] optimalValues = optimum.getPoint(); System.out.println("A: " + optimalValues[0]); System.out.println("B: " + optimalValues[1]); System.out.println("C: " + optimalValues[2]); </source> <p> If 
you run the above sample you will see the following printed by the console: </p> <pre> A: 7.998832172372726 B: 10.001841530162448 C: 16.324008168386605 </pre> </dd> <p> In addition to least squares solving, the <a href="../apidocs/org/apache/commons/math3/optimization/general/NonLinearConjugateGradientOptimizer.html"> NonLinearConjugateGradientOptimizer</a> class provides a non-linear conjugate gradient algorithm to optimize <a href="../apidocs/org/apache/commons/math3/analysis/DifferentiableMultivariateFunction.html"> DifferentiableMultivariateFunction</a>. Both the Fletcher-Reeves and the Polak-Ribière search direction update methods are supported. It is also possible to set up a preconditioner or to change the line-search algorithm of the inner loop if desired (the default one is a Brent solver). </p> <p> The <a href="../apidocs/org/apache/commons/math3/optimization/direct/PowellOptimizer.html"> PowellOptimizer</a> provides an optimization method for non-differentiable functions. </p> </subsection> </section> </body> </document>

Other Java examples (source code examples)

Here is a short list of links related to this Java optimization.xml source code file:

Java example source code file (optimization.xml)

This example Java source code file (optimization.xml) is included in the alvinalexander.com "Java Source Code Warehouse" project. The intent of this project is to help you "Learn Java by Example" TM.

Learn more about this Java project at its project page.

Java - Java tags/keywords

apache, arraylist, differentiablemultivariatevectorfunction, jacobian, levenbergmarquardtoptimizer, license, matrix, multivariatematrixfunction, quadraticproblem, the, these, this

The optimization.xml Java example source code

<?xml version="1.0"?>

<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
  -->

<?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
<document url="optimization.html">

  <properties>
    <title>The Commons Math User Guide - Optimization</title>
  </properties>

  <body>
    <section name="12 Optimization">
      <p>The contents of this section currently describe deprecated classes.
        Please refer to the new <a href="../apidocs/org/apache/commons/math3/optim/package-summary.html">API
        description</a>.
      </p>
      <p>Least squares optimizers are no longer in this package; they have been moved
        to a dedicated least-squares sub-package described in the <a
        href="./leastsquares.html">least squares section</a>.
      </p>

      <subsection name="12.1 Overview" href="overview">
        <p>
          The optimization package provides algorithms to optimize (i.e. either minimize
          or maximize) some objective or cost function. The package is split into several
          sub-packages dedicated to different kinds of functions or algorithms:
          <ul>
            <li>the univariate package handles univariate scalar functions,</li>
            <li>the linear package handles multivariate vector linear functions
                with linear constraints,</li>
            <li>the direct package handles multivariate scalar functions
                using direct search methods (i.e. not using derivatives),</li>
            <li>the general package handles multivariate scalar or vector functions
                using derivatives,</li>
            <li>the fitting package handles curve fitting by univariate real functions.</li>
          </ul>
        </p>
        <p>
        The top level optimization package provides common interfaces for the optimization
        algorithms provided in sub-packages. The main interfaces define optimizers
        and convergence checkers. The functions that are optimized by the algorithms provided
        by this package and its sub-packages are a subset of the ones defined in the
        <code>analysis</code> package, namely the real and vector valued functions. These
        functions are called objective functions here. When the goal is to minimize, the
        functions are often called cost functions, but that name is not used in this package.
        </p>
        <p>
        The type of goal, i.e. minimization or maximization, is defined by the enumerated
        <a href="../apidocs/org/apache/commons/math3/optimization/GoalType.html">
        GoalType</a> which has only two values: MAXIMIZE and MINIMIZE.
        </p>
        <p>
        Optimizers are the algorithms that will either minimize or maximize the objective
        function by changing its input variables set until an optimal set is found. There
        are only four interfaces defining the common behavior of optimizers, one for each
        supported type of objective function:
        <ul>
          <li><a href="../apidocs/org/apache/commons/math3/optimization/univariate/UnivariateOptimizer.html">
              UnivariateOptimizer</a> for univariate real functions,</li>
          <li><a href="../apidocs/org/apache/commons/math3/optimization/MultivariateOptimizer.html">
              MultivariateOptimizer</a> for multivariate scalar functions,</li>
          <li><a href="../apidocs/org/apache/commons/math3/optimization/DifferentiableMultivariateOptimizer.html">
              DifferentiableMultivariateOptimizer</a> for differentiable multivariate scalar functions,</li>
          <li><a href="../apidocs/org/apache/commons/math3/optimization/DifferentiableMultivariateVectorOptimizer.html">
              DifferentiableMultivariateVectorOptimizer</a> for differentiable multivariate vector functions.</li>
        </ul>
        </p>

        <p>
        Although there are only four types of supported optimizers, it is possible to
        optimize a non-differentiable <a
        href="../apidocs/org/apache/commons/math3/analysis/MultivariateVectorFunction.html">
        multivariate vectorial function</a> by converting it to a non-differentiable
        multivariate real function with the <a
        href="../apidocs/org/apache/commons/math3/optimization/LeastSquaresConverter.html">
        LeastSquaresConverter</a> helper class; the converted function can then be
        optimized with any implementation of the MultivariateOptimizer interface.
        </p>
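        <p>
          As a brief sketch of this conversion (an illustration assuming the deprecated
          <code>LeastSquaresConverter</code> and <code>direct</code> package classes, not
          an excerpt from the original guide), a two-component vectorial function can be
          wrapped into a scalar sum-of-squared-residuals function and minimized with a
          derivative-free multivariate optimizer:
        </p>
        <source>
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.analysis.MultivariateVectorFunction;
import org.apache.commons.math3.optimization.GoalType;
import org.apache.commons.math3.optimization.LeastSquaresConverter;
import org.apache.commons.math3.optimization.PointValuePair;
import org.apache.commons.math3.optimization.direct.NelderMeadSimplex;
import org.apache.commons.math3.optimization.direct.SimplexOptimizer;

// A vectorial function whose two components should match the observations {1, 2}.
MultivariateVectorFunction vectorial = new MultivariateVectorFunction() {
    public double[] value(double[] p) {
        return new double[] { p[0] + p[1], p[0] - p[1] };
    }
};

// Wrap it as a scalar function returning the sum of squared residuals.
MultivariateFunction scalar =
    new LeastSquaresConverter(vectorial, new double[] { 1, 2 });

// Minimize the residuals with a derivative-free multivariate optimizer.
SimplexOptimizer optimizer = new SimplexOptimizer(1e-10, 1e-12);
optimizer.setSimplex(new NelderMeadSimplex(2));
PointValuePair optimum =
    optimizer.optimize(200, scalar, GoalType.MINIMIZE, new double[] { 0, 0 });
// Expected solution: p[0] = 1.5, p[1] = -0.5.
        </source>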
        <p>
          The usage of these algorithms is very similar to the usage of the root-finding
          algorithms explained in the analysis package. The main difference is that the
          <code>solve</code> methods of the root-finding algorithms are replaced by
          <code>optimize</code> methods.
        </p>
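        <p>
          As a quick illustration of this <code>optimize</code> pattern (a minimal sketch,
          not part of the original guide, assuming the deprecated
          <code>optimization.univariate</code> API), minimizing a simple parabola with a
          <code>BrentOptimizer</code> looks like this:
        </p>
        <source>
import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.optimization.GoalType;
import org.apache.commons.math3.optimization.univariate.BrentOptimizer;
import org.apache.commons.math3.optimization.univariate.UnivariatePointValuePair;

// Objective function: a parabola with its minimum at x = 2.
UnivariateFunction f = new UnivariateFunction() {
    public double value(double x) {
        return (x - 2) * (x - 2) + 1;
    }
};

// Relative and absolute convergence tolerances.
BrentOptimizer optimizer = new BrentOptimizer(1e-10, 1e-12);

// Search the interval [0, 5], allowing at most 100 function evaluations.
UnivariatePointValuePair optimum =
    optimizer.optimize(100, f, GoalType.MINIMIZE, 0, 5);

System.out.println("x = " + optimum.getPoint() + ", f(x) = " + optimum.getValue());
        </source>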
      </subsection>
      <subsection name="12.3 Linear Programming" href="linear">
        <p>
          This package provides an implementation of George Dantzig's simplex algorithm
          for solving linear optimization problems with linear equality and inequality
          constraints.
        </p>
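        <p>
          A small sketch of the simplex solver (this example is not part of the original
          guide and assumes the deprecated <code>optimization.linear</code> classes):
          maximize 2x + 3y subject to x + y &lt;= 10 and x - y &gt;= 0 with non-negative
          variables.
        </p>
        <source>
import java.util.ArrayList;
import java.util.Collection;
import org.apache.commons.math3.optimization.GoalType;
import org.apache.commons.math3.optimization.PointValuePair;
import org.apache.commons.math3.optimization.linear.LinearConstraint;
import org.apache.commons.math3.optimization.linear.LinearObjectiveFunction;
import org.apache.commons.math3.optimization.linear.Relationship;
import org.apache.commons.math3.optimization.linear.SimplexSolver;

// Objective: maximize 2x + 3y (the trailing 0 is a constant term).
LinearObjectiveFunction f = new LinearObjectiveFunction(new double[] { 2, 3 }, 0);

Collection<LinearConstraint> constraints = new ArrayList<LinearConstraint>();
constraints.add(new LinearConstraint(new double[] { 1, 1 }, Relationship.LEQ, 10)); // x + y <= 10
constraints.add(new LinearConstraint(new double[] { 1, -1 }, Relationship.GEQ, 0)); // x - y >= 0

SimplexSolver solver = new SimplexSolver();
// The final flag restricts the solution to non-negative variables.
PointValuePair solution = solver.optimize(f, constraints, GoalType.MAXIMIZE, true);
System.out.println("x = " + solution.getPoint()[0] + ", y = " + solution.getPoint()[1]);
        </source>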
      </subsection>
      <subsection name="12.4 Direct Methods" href="direct">
        <p>
          Direct search methods only use cost function values; they don't
          need derivatives and don't try to compute approximations of the
          derivatives either. According to a 1996 paper by Margaret H. Wright
          (<a href="http://cm.bell-labs.com/cm/cs/doc/96/4-02.ps.gz">Direct
          Search Methods: Once Scorned, Now Respectable</a>), they are used
          when either the computation of the derivative is impossible (noisy
          functions, unpredictable discontinuities) or difficult (complexity,
          computation cost). In the first case, rather than an optimum, a
          <em>not too bad</em> point is desired. In the latter case, an
          optimum is desired but cannot be reasonably found. In all cases
          direct search methods can be useful.
        </p>
        <p>
          Simplex-based direct search methods are based on comparison of
          the cost function values at the vertices of a simplex (which is a
          set of n+1 points in dimension n) that is updated by the
          algorithm's steps.
        </p>
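        <p>
          As an illustrative sketch (not from the original guide; it assumes the
          deprecated <code>optimization.direct</code> classes), a Nelder-Mead search
          over a two-dimensional function uses a simplex of n+1 = 3 vertices:
        </p>
        <source>
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.optimization.GoalType;
import org.apache.commons.math3.optimization.PointValuePair;
import org.apache.commons.math3.optimization.direct.NelderMeadSimplex;
import org.apache.commons.math3.optimization.direct.SimplexOptimizer;

// A smooth 2-D bowl with its minimum at (3, -1); no derivatives are used.
MultivariateFunction f = new MultivariateFunction() {
    public double value(double[] p) {
        double dx = p[0] - 3;
        double dy = p[1] + 1;
        return dx * dx + dy * dy;
    }
};

SimplexOptimizer optimizer = new SimplexOptimizer(1e-10, 1e-12);
optimizer.setSimplex(new NelderMeadSimplex(2)); // 2 variables -> 3 vertices

PointValuePair optimum =
    optimizer.optimize(200, f, GoalType.MINIMIZE, new double[] { 0, 0 });
System.out.println("minimum at (" + optimum.getPoint()[0]
    + ", " + optimum.getPoint()[1] + ")");
        </source>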
        <p>
          The instances can be built either in single-start or in
          multi-start mode. Multi-start is a traditional way to try to avoid
          being trapped in a local minimum and miss the global minimum of a
          function. It can also be used to verify the convergence of an
          algorithm. In multi-start mode, the <code>minimizes</code> method
          returns the best minimum found after all starts, and the <code>getMinima</code>
          method can be used to retrieve all minima from all starts (including the one
          already provided by the <code>minimizes</code> method).
        </p>
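        <p>
          A hedged sketch of multi-start usage (not part of the original guide; it assumes
          the deprecated <code>MultivariateMultiStartOptimizer</code> wrapper, where the
          corresponding calls are named <code>optimize</code> and <code>getOptima</code>,
          together with the random vector generators from the <code>random</code> package):
        </p>
        <source>
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.optimization.GoalType;
import org.apache.commons.math3.optimization.MultivariateMultiStartOptimizer;
import org.apache.commons.math3.optimization.PointValuePair;
import org.apache.commons.math3.optimization.direct.NelderMeadSimplex;
import org.apache.commons.math3.optimization.direct.SimplexOptimizer;
import org.apache.commons.math3.random.GaussianRandomGenerator;
import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.UncorrelatedRandomVectorGenerator;

// A function with several local minima, so a single start may get trapped.
MultivariateFunction f = new MultivariateFunction() {
    public double value(double[] p) {
        return Math.sin(3 * p[0]) + 0.1 * p[0] * p[0] + 0.1 * p[1] * p[1];
    }
};

SimplexOptimizer underlying = new SimplexOptimizer(1e-10, 1e-12);
underlying.setSimplex(new NelderMeadSimplex(2));

// Random start points drawn around the origin with unit standard deviation.
JDKRandomGenerator rng = new JDKRandomGenerator();
rng.setSeed(42);
UncorrelatedRandomVectorGenerator starts =
    new UncorrelatedRandomVectorGenerator(new double[] { 0, 0 },
                                          new double[] { 1, 1 },
                                          new GaussianRandomGenerator(rng));

// Run 5 independent starts; optimize(...) returns the best minimum found.
MultivariateMultiStartOptimizer optimizer =
    new MultivariateMultiStartOptimizer(underlying, 5, starts);
PointValuePair best =
    optimizer.optimize(1000, f, GoalType.MINIMIZE, new double[] { 0, 0 });
PointValuePair[] allMinima = optimizer.getOptima(); // every start's result, best first
        </source>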
        <p>
          The <code>direct</code> package provides four solvers:
          <ul>
            <li>the classical Nelder-Mead method,</li>
            <li>Virginia Torczon's multi-directional method,</li>
            <li>Nikolaus Hansen's covariance matrix adaptation evolution strategy (CMA-ES),</li>
            <li>Mike Powell's BOBYQA method.</li>
          </ul>
        </p>
      </subsection>
      <subsection name="12.5 General Case" href="general">
        <p>
          The general package deals with non-linear vectorial optimization problems when
          the partial derivatives of the objective function are available. One important
          class of estimation problems is weighted least squares problems, which consist
          in finding the values of some free parameters p<sub>k</sub> such that a cost function
          J = sum(w<sub>i</sub>(target<sub>i</sub> - model<sub>i</sub>)<sup>2</sup>) is
          minimized. The various (target<sub>i</sub> - model<sub>i</sub>(p<sub>k</sub>))
          terms are called residuals. They represent the deviation between a set of
          target values target<sub>i</sub> and theoretical values computed from
          models model<sub>i</sub> depending on free parameters p<sub>k</sub>.
          The w<sub>i</sub> factors are weights. One classical use case is when the
          target values are experimental observations or measurements.
        </p>
        <p>
          Solving a least-squares problem is finding the free parameters p<sub>k</sub>
          of the theoretical models such that they are close to the target values, i.e.
          when the residuals are small.
        </p>
        <p>
          Two optimizers are available in the general package, both devoted to least-squares
          problems. The first one is based on the <a
          href="../apidocs/org/apache/commons/math3/optimization/general/GaussNewtonOptimizer.html">
          Gauss-Newton</a> method. The second one is based on the <a
          href="../apidocs/org/apache/commons/math3/optimization/general/LevenbergMarquardtOptimizer.html">
          Levenberg-Marquardt</a> method. In order to solve a vectorial optimization problem,
          the user must provide it as an object implementing the
          DifferentiableMultivariateVectorFunction interface. The object is passed to the
          <code>optimize</code> method of the optimizer, along with the target and weight arrays,
          thus allowing the optimizer to compute the residuals at will. The last parameter to the
          <code>optimize</code> method is the point from which the optimizer will start its
          search for the optimal point.
        </p>

    <dl>
      <dt>Quadratic Problem Example</dt>
      <dd>

        We are looking for the best parameters [a, b, c] for the quadratic function
        <b>f(x) = a x<sup>2</sup> + b x + c</b>.
        The data set below was generated using [a = 8, b = 10, c = 16].
        A random number between zero and one was added to each calculated y value.

    <table cellspacing="0" cellpadding="3">
<tr>
<td  valign="bottom"  align="center"  style=" font-size:10pt;">X