| [201199] | 1 | /*
 | 
|---|
 | 2 |  * Project: MoleCuilder
 | 
|---|
 | 3 |  * Description: creates and alters molecular systems
 | 
|---|
 | 4 |  * Copyright (C)  2010 University of Bonn. All rights reserved.
 | 
|---|
 | 5 |  * Please see the LICENSE file or "Copyright notice" in builder.cpp for details.
 | 
|---|
 | 6 |  */
 | 
|---|
 | 7 | 
 | 
|---|
 | 8 | /**
 | 
|---|
 | 9 |  * \file potentials.dox
 | 
|---|
 | 10 |  *
 | 
|---|
 | 11 |  * Created on: Nov 28, 2012
 | 
|---|
 | 12 |  *    Author: heber
 | 
|---|
 | 13 |  */
 | 
|---|
 | 14 | 
 | 
|---|
 | 15 | /** \page potentials Empirical Potentials and FunctionModels
 | 
|---|
 | 16 |  *
 | 
|---|
 | 17 |  * On this page we explain what is meant with the Potentials sub folder.
 | 
|---|
 | 18 |  *
 | 
|---|
 | 19 |  * First, we are based on fragmenting a molecular system, i.e. dissecting its
 | 
|---|
 | 20 |  * bond structure into connected subgraphs, calculating the energies of the
 | 
|---|
 | 21 |  * fragments (ab-initio) and summing up to a good approximation of the total
 | 
|---|
| [caece4] | 22 |  * energy of the whole system, \sa fragmentation.
 | 
|---|
| [201199] | 23 |  * Second, having calculated these energies, there quickly comes up the thought
 | 
|---|
 | 24 |  * that one actually calculates quite similar systems all time and if one could
 | 
|---|
 | 25 |  * not cache results in an intelligent (i.e. interpolating) fashion ...
 | 
|---|
 | 26 |  *
 | 
|---|
 | 27 |  * That's where so-called empirical potentials come into play. They are
 | 
|---|
 | 28 |  * functions depending on a number of "fitted" parameters and the variable
 | 
|---|
 | 29 |  * distances within a molecular fragment (i.e. the bond lengths) in order to
 | 
|---|
 | 30 |  * give a value for the total energy without the need to solve a complex
 | 
|---|
| [caece4] | 31 |  * ab-initio model (essentially, not solving the electronic Schrödinger equation
 | 
|---|
 | 32 |  * anymore).
 | 
|---|
| [201199] | 33 |  *
 | 
|---|
 | 34 |  * Empirical potentials have been thought of by fellows such as Lennard-Jones,
 | 
|---|
 | 35 |  * Morse, Tersoff, Stillinger and Weber, etc. And in their honor, the
 | 
|---|
 | 36 |  * potential form is named after its inventor. Hence, we speak e.g. of a
 | 
|---|
 | 37 |  * Lennard-Jones potential.
 | 
|---|
 | 38 |  *
 | 
|---|
 | 39 |  * So, what we have to do in order to cache results is the following procedure:
 | 
|---|
 | 40 |  * -# gather similar fragments
 | 
|---|
 | 41 |  * -# perform a fit procedure to obtain the parameters for the empirical
 | 
|---|
 | 42 |  *    potential
 | 
|---|
 | 43 |  * -# evaluate the potential instead of an ab-initio calculation
 | 
|---|
 | 44 |  *
 | 
|---|
 | 45 |  * The terms we use, model the classes that are implemented:
 | 
|---|
 | 46 |  * -# EmpiricalPotential: Contains the interface to a function that can be
 | 
|---|
 | 47 |  *    evaluated given a number of arguments_t, i.e. distances. Also, one might
 | 
|---|
 | 48 |  *    want to evaluate derivatives.
 | 
|---|
 | 49 |  * -# FunctionModel: Is a function that can be fitted, i.e. that has internal
 | 
|---|
 | 50 |  *    parameters to be set and got.
 | 
|---|
 | 51 |  * -# argument_t: The Argument stores not only the distance but also the index
 | 
|---|
 | 52 |  *    pair of the associated atoms and also their charges, to let the potential
 | 
|---|
 | 53 |  *    check on validity.
 | 
|---|
 | 54 |  * -# SerializablePotential: Eventually, one wants to store to or parse from
 | 
|---|
 | 55 |  *    a file all potential parameters. This functionality is encoded in this
 | 
|---|
 | 56 |  *    class.
 | 
|---|
 | 57 |  * -# HomologyGraph: "Similar" fragments in our case have to have the same bond
 | 
|---|
 | 58 |  *    graph. It is stored in the HomologyGraph that acts as representative
 | 
|---|
 | 59 |  * -# HomologyContainer: This container combines, in multimap fashion, all
 | 
|---|
 | 60 |  *    similar fragments with their energies together, with the HomologyGraph
 | 
|---|
 | 61 |  *    as their "key".
 | 
|---|
 | 62 |  * -# TrainingData: Here, we combine InputVector and OutputVector that contain
 | 
|---|
 | 63 |  *    the set of distances required for the FunctionModel (e.g. only a single
 | 
|---|
 | 64 |  *    distance/argument for a pair potential, three for an angle potential,
 | 
|---|
 | 65 |  *    etc.) and also the expected OutputVector. This in combination with the
 | 
|---|
 | 66 |  *    FunctionModel is the basis for the non-linear regression used for the
 | 
|---|
 | 67 |  *    fitting procedure.
 | 
|---|
 | 68 |  * -# Extractors: These set of functions yield the set of distances from a
 | 
|---|
 | 69 |  *    given fragment that is stored in the HomologyContainer.
 | 
|---|
 | 70 |  * -# FunctionApproximation: Contains the interface to the levmar package where
 | 
|---|
 | 71 |  *    the Levenberg-Marquardt (Newton + Trust region) algorithm is used to
 | 
|---|
 | 72 |  *    perform the fit.
 | 
|---|
 | 73 |  *
 | 
|---|
| [48d20d] | 74 |  * \section potentials-howto-use Howto use the potentials
 | 
|---|
| [201199] | 75 |  *
 | 
|---|
 | 76 |  *  We just give a brief run-down in terms of code on how to use the potentials.
 | 
|---|
| [48d20d] | 77 |  *  Here, we just describe what to do in order to perform the fitting. This is
 | 
|---|
 | 78 |  *  basically what is implemented in FragmentationFitPotentialAction.
 | 
|---|
| [201199] | 79 |  *
 | 
|---|
 | 80 |  *  \code
 | 
|---|
 | 81 |  *  // we need the homology container and the representative graph we want to
 | 
|---|
 | 82 |  *  // fit to.
 | 
|---|
 | 83 |  *  HomologyContainer homologies;
 | 
|---|
 | 84 |  *  const HomologyGraph graph = getSomeGraph(homologies);
 | 
|---|
 | 85 |  *  Fragment::charges_t h2o;
 | 
|---|
 | 86 |  *  h2o += 8,1,1;
 | 
|---|
 | 87 |  *  // TrainingData needs so called Extractors to get the required distances
 | 
|---|
| [caece4] | 88 |  *  // from the stored fragment. These functions are bound via boost::bind.
 | 
|---|
| [201199] | 89 |  *  TrainingData AngleData(
 | 
|---|
 | 90 |  *      boost::bind(&Extractors::gatherDistancesFromFragment,
 | 
|---|
 | 91 |  *          boost::bind(&Fragment::getPositions, _1),
 | 
|---|
 | 92 |  *          boost::bind(&Fragment::getCharges, _1),
 | 
|---|
 | 93 |  *          boost::cref(h2o),
 | 
|---|
 | 94 |  *          _2)
 | 
|---|
 | 95 |  *      );
 | 
|---|
 | 96 |  *  // now we extract the distances and energies and store them
 | 
|---|
 | 97 |  *  AngleData(homologies.getHomologousGraphs(graph));
 | 
|---|
 | 98 |  *  // give ParticleTypes of this potential to make it unique
 | 
|---|
 | 99 |  *  PairPotential_Angle::ParticleTypes_t types =
 | 
|---|
 | 100 |  *        boost::assign::list_of<PairPotential_Angle::ParticleType_t>
 | 
|---|
 | 101 |  *        (8)(1)(1)
 | 
|---|
 | 102 |  *      ;
 | 
|---|
 | 103 |  *  PairPotential_Angle angle(types);
 | 
|---|
 | 104 |  *  // give initial parameter
 | 
|---|
 | 105 |  *  FunctionModel::parameters_t params(PairPotential_Angle::MAXPARAMS, 0.);
 | 
|---|
| [48d20d] | 106 |  *  // ... set some potential-specific initial parameters in params struct
 | 
|---|
| [201199] | 107 |  *  angle.setParameters(params);
 | 
|---|
 | 108 |  *
 | 
|---|
 | 109 |  *  // use the potential as a FunctionModel along with prepared TrainingData
 | 
|---|
 | 110 |  *  FunctionModel &model = angle;
 | 
|---|
 | 111 |  *  FunctionApproximation approximator(AngleData, model);
 | 
|---|
 | 112 |  *  approximator(FunctionApproximation::ParameterDerivative);
 | 
|---|
 | 113 |  *
 | 
|---|
 | 114 |  *  // obtain resulting parameters and check remaining L_2 and L_max error
 | 
|---|
 | 115 |  *  angleparams = model.getParameters();
 | 
|---|
 | 116 |  *  LOG(1, "INFO: L2sum = " << AngleData(model)
 | 
|---|
 | 117 |  *      << ", LMax = " << AngleData(model) << ".");
 | 
|---|
 | 118 |  *  \endcode
 | 
|---|
 | 119 |  *
 | 
|---|
 | 120 |  *  The evaluation of the fitted potential is then trivial, e.g.
 | 
|---|
 | 121 |  *  \code
 | 
|---|
 | 122 |  *  // constructed someplace
 | 
|---|
 | 123 |  *  PairPotential_Angle angle(...);
 | 
|---|
 | 124 |  *
 | 
|---|
 | 125 |  *  // evaluate
 | 
|---|
 | 126 |  *  FunctionModel::arguments_t args;
 | 
|---|
| [48d20d] | 127 |  *  // .. initialise args to the desired distances
 | 
|---|
| [201199] | 128 |  *  const double value = angle(args)[0]; // output is a vector!
 | 
|---|
 | 129 |  *  \endcode
 | 
|---|
 | 130 |  *
 | 
|---|
| [48d20d] | 131 |  * \section potentials-stability-of-fit note in stability of fit
 | 
|---|
 | 132 |  *
 | 
|---|
 | 133 |  *  As we always start from random initial parameters (within a certain sensible
 | 
|---|
| [0932c2] | 134 |  *  range at least), the non-linear fit does not always converge. Note that the
 | 
|---|
 | 135 |  *  random values are drawn from the defined distribution and the uniform distributionm
 | 
|---|
 | 136 |  *  engine is obtained from the currently set, see \ref randomnumbers. Hence, you
 | 
|---|
 | 137 |  *  can manipulate both in order to get different results or to set the seed such that
 | 
|---|
 | 138 |  *  some "randomly" drawn value always work well (e.g. for testing). 
 | 
|---|
 | 139 |  *  
 | 
|---|
 | 140 |  *  In any case, For this case the FragmentationFitPotentialAction has the option 
 | 
|---|
 | 141 |  *  "take-best-of" to allow for multiple fits where the best (in terms of l2 error) 
 | 
|---|
 | 142 |  *  is taken eventually. Furthermore, you can use the "set-threshold" option to
 | 
|---|
 | 143 |  *  stop restarting the fit procedure first when the L2 error has dropped below the
 | 
|---|
 | 144 |  *  given threshold.
 | 
|---|
| [48d20d] | 145 |  *
 | 
|---|
 | 146 |  * \section potentials-howto-add Howto add new potentials
 | 
|---|
 | 147 |  *
 | 
|---|
 | 148 |  *  Adding a new potential requires the following:
 | 
|---|
 | 149 |  *  -# Add the new modules to Potentials/Specifics
 | 
|---|
 | 150 |  *  -# Add a unit test on the potential in Potentials/Specifics/unittests
 | 
|---|
 | 151 |  *  -# Give the potential a type name and add it to PotentialTypes.def. Note
 | 
|---|
 | 152 |  *     that the name must not contain white space.
 | 
|---|
 | 153 |  *  -# Add the potential name as case to PotentialFactory such that it knows
 | 
|---|
 | 154 |  *     how to instantiate your new potential when requested.
 | 
|---|
| [0932c2] | 155 |  *  -# Remember to use the the RandomNumberGenerator for getting random starting
 | 
|---|
 | 156 |  *     values!
 | 
|---|
| [48d20d] | 157 |  *
 | 
|---|
 | 158 |  * PotentialTypes.def contains a boost::preprocessor sequence of all
 | 
|---|
 | 159 |  * potential names. PotentialFactory uses this sequence to build its enum to
 | 
|---|
 | 160 |  * type map and inverse which the user sees when specifying the potential to
 | 
|---|
 | 161 |  * fit via PotentialTypeValidator.
 | 
|---|
 | 162 |  *
 | 
|---|
 | 163 |  *
 | 
|---|
 | 164 |  * \date 2013-04-09
 | 
|---|
| [201199] | 165 |  */
 | 
|---|