source: src/FunctionApproximation/TrainingData.hpp@ 83956e

Last change on this file since 83956e was e1fe7e, checked in by Frederik Heber <heber@…>, 11 years ago

FunctionModel now uses list_of_arguments to split sequence of subsets of distances.

  • this fixes ambiguities with the set of distances: Imagine the distances within a water molecule as OH (A) and HH (B). We may then have a sequence of argument_t such as AABAAB. With the current implementation of CompoundPotential::splitUpArgumentsByModels() we would always choose the latter (and more complex) model. Hence, we would make two calls to TriplePotential_Angle, instead of two calls to PairPotential_Harmonic for A, one to PairPotential_Harmonic for B, and one to TriplePotential_Angle for AAB.
  • now, the new list looks like A,A,B,AAB, where each tuple of distances can be uniquely associated with a specific potential.
  • changed signatures of EmpiricalPotential::operator(), ::derivative(), ::parameter_derivative(). This involved changing all of the current specific potentials and CompoundPotential.
  • TrainingData must discern between the InputVector_t (just all distances) and the FilteredInputVector_t (tuples of subsets of distances).
  • FunctionApproximation now has list_of_arguments_t as parameter to evaluate() and evaluate_derivative().
  • DOCU: docu change in TrainingData.
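The regrouping described in the commit message can be illustrated with a minimal sketch. It assumes simplified stand-ins: `Distance`, `Arguments`, `ListOfArguments`, `splitWater` and `signature` are hypothetical names chosen for this example, not the project's argument_t, list_of_arguments_t or filter implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for argument_t and list_of_arguments_t.
struct Distance {
  char type;     // 'A' for an O-H distance, 'B' for the H-H distance
  double value;  // the distance itself
};
typedef std::vector<Distance> Arguments;
typedef std::vector<Arguments> ListOfArguments;

// Regroups the three distances of one water molecule, given in the
// order A, A, B, into the tuples A, A, B, AAB.
ListOfArguments splitWater(const Arguments &all)
{
  assert(all.size() == 3);
  ListOfArguments list;
  // one single-distance tuple per pair potential
  for (std::size_t i = 0; i < all.size(); ++i)
    list.push_back(Arguments(1, all[i]));
  // one three-distance tuple for the angle potential
  list.push_back(all);
  return list;
}

// Returns the type signature of each tuple, separated by commas.
std::string signature(const ListOfArguments &list)
{
  std::string result;
  for (std::size_t i = 0; i < list.size(); ++i) {
    if (i != 0)
      result += ",";
    for (std::size_t j = 0; j < list[i].size(); ++j)
      result += list[i][j].type;
  }
  return result;
}
```

With this layout each tuple carries exactly the distances one potential needs, so no guessing by tuple length or position is required when the tuples are handed to the potentials.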
/*
 * TrainingData.hpp
 *
 *  Created on: 15.10.2012
 *      Author: heber
 */

#ifndef TRAININGDATA_HPP_
#define TRAININGDATA_HPP_

// include config.h
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <cstddef>
#include <iosfwd>
#include <map>
#include <utility>
#include <vector>
#include <boost/function.hpp>

#include "Fragmentation/Homology/HomologyContainer.hpp"
#include "FunctionApproximation/FunctionApproximation.hpp"
#include "FunctionApproximation/FunctionModel.hpp"

/** This class encapsulates the training data for a given potential function
 * to learn.
 *
 * The data is added piece-wise by calling operator() with a range of
 * homologous fragments.
 *
 * TrainingData::operator() takes the set of all possible pair-wise distances
 * (InputVector_t) and transforms it via the given filter into a list of subsets
 * of distances (FilteredInputVector_t) that can be fed to the model.
 *
 */
class TrainingData
{
public:
  //!> typedef for the range of fragments within the HomologyContainer to look at
  typedef std::pair<
      HomologyContainer::const_iterator,
      HomologyContainer::const_iterator> range_t;
  //!> Training tuple input vector (all pair-wise distances)
  typedef FunctionApproximation::inputs_t InputVector_t;
  //!> Training tuple filtered input vector (tuples of subsets of distances)
  typedef FunctionApproximation::filtered_inputs_t FilteredInputVector_t;
  //!> Training tuple output vector
  typedef FunctionApproximation::outputs_t OutputVector_t;
  //!> Typedef for a table with columns of all distances and the energy
  typedef std::vector< std::vector<double> > DistanceEnergyTable_t;
  //!> Typedef for a map from L2 error to configuration index, one entry per fragment.
  typedef std::multimap< double, size_t > L2ErrorConfigurationIndexMap_t;

public:
  /** Constructor for class TrainingData.
   *
   * \param _filter functor that extracts the training input data from a fragment
   */
  explicit TrainingData(const FunctionModel::filter_t &_filter) :
      filter(_filter)
  {}

  /** Destructor for class TrainingData.
   *
   */
  ~TrainingData()
  {}

  /** Goes through the given \a range of homologous fragments and calls
   * TrainingData::filter on each in order to gather the distances and
   * the energy value, which are stored internally.
   *
   * \param range given range within a HomologyContainer of homologous fragments
   */
  void operator()(const range_t &range);

  /** Getter for const access to internal training data inputs.
   *
   * \return const ref to the filtered training inputs
   */
  const FilteredInputVector_t& getTrainingInputs() const {
    return ArgumentVector;
  }

  /** Getter for const access to internal list of all pair-wise distances.
   *
   * \return const ref to all arguments
   */
  const InputVector_t& getAllArguments() const {
    return DistanceVector;
  }

  /** Getter for const access to internal training data outputs.
   *
   * \return const ref to the training outputs
   */
  const OutputVector_t& getTrainingOutputs() const {
    return EnergyVector;
  }

  /** Returns the average of each component over all OutputVectors.
   *
   * This is useful for initializing the offset of the potential.
   *
   * \return average output vector
   */
  const FunctionModel::results_t getTrainingOutputAverage() const;

  /** Calculate the L2 error of a given \a model against the stored training data.
   *
   * \param model model whose L2 error to calculate
   * \return sum of squared differences at training tuples
   */
  const double getL2Error(const FunctionModel &model) const;

  /** Calculate the Lmax error of a given \a model against the stored training data.
   *
   * \param model model whose Lmax error to calculate
   * \return maximum difference over all training tuples
   */
  const double getLMaxError(const FunctionModel &model) const;

  /** Calculate the L2 error of a given \a model for each fragment in \a range.
   *
   * \param model model whose L2 error to calculate
   * \param range given range within a HomologyContainer of homologous fragments
   * \return map with L2 error per configuration
   */
  const L2ErrorConfigurationIndexMap_t getWorstFragmentMap(
      const FunctionModel &model,
      const range_t &range) const;

  /** Creates a table of columns with all distances and the energy.
   *
   * \return array with first columns containing distances, last column energy
   */
  const DistanceEnergyTable_t getDistanceEnergyTable() const;

private:
  // prohibit use of default constructor, as we always require an extraction functor.
  TrainingData();

private:
  //!> all pair-wise distances per training tuple
  InputVector_t DistanceVector;
  //!> energy values per training tuple
  OutputVector_t EnergyVector;
  //!> list of all filtered arguments over all tuples
  FilteredInputVector_t ArgumentVector;
  //!> function to be used for training input data extraction from a fragment
  const FunctionModel::filter_t filter;
};

// print training data for debugging
std::ostream &operator<<(std::ostream &out, const TrainingData &data);

#endif /* TRAININGDATA_HPP_ */
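
The doc comments of getL2Error(), getLMaxError() and getTrainingOutputAverage() describe three simple reductions over the training outputs. The following is a minimal sketch of those reductions as the comments state them, not the implementation in TrainingData.cpp. `Output`, `Outputs`, `l2Error`, `lMaxError` and `average` are hypothetical names, and the sketch assumes the model's predictions have already been evaluated into a container of the same shape as the stored outputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: one output vector per training tuple.
typedef std::vector<double> Output;
typedef std::vector<Output> Outputs;

// Sum of squared component-wise differences over all training tuples.
double l2Error(const Outputs &predicted, const Outputs &expected)
{
  assert(predicted.size() == expected.size());
  double sum = 0.;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    assert(predicted[i].size() == expected[i].size());
    for (std::size_t j = 0; j < expected[i].size(); ++j) {
      const double diff = predicted[i][j] - expected[i][j];
      sum += diff * diff;
    }
  }
  return sum;
}

// Largest absolute component-wise difference over all training tuples.
double lMaxError(const Outputs &predicted, const Outputs &expected)
{
  assert(predicted.size() == expected.size());
  double maximum = 0.;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    assert(predicted[i].size() == expected[i].size());
    for (std::size_t j = 0; j < expected[i].size(); ++j)
      maximum = std::max(maximum, std::fabs(predicted[i][j] - expected[i][j]));
  }
  return maximum;
}

// Component-wise average over all output vectors (equal lengths assumed).
Output average(const Outputs &outputs)
{
  assert(!outputs.empty());
  Output avg(outputs[0].size(), 0.);
  for (std::size_t i = 0; i < outputs.size(); ++i) {
    assert(outputs[i].size() == avg.size());
    for (std::size_t j = 0; j < avg.size(); ++j)
      avg[j] += outputs[i][j];
  }
  for (std::size_t j = 0; j < avg.size(); ++j)
    avg[j] /= static_cast<double>(outputs.size());
  return avg;
}
```

Whether the actual getL2Error() takes a square root or normalizes by the number of tuples cannot be told from the header. The sketch follows the wording "sum of squared differences" literally.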
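
L2ErrorConfigurationIndexMap_t is a std::multimap keyed by the error, which keeps entries sorted by error and allows several configurations to share the same error value. A short sketch of how such a map could be filled and queried follows. `ErrorIndexMap`, `sortByError` and `worstIndex` are hypothetical helpers for illustration, and the per-configuration errors are assumed to be given.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Same shape as L2ErrorConfigurationIndexMap_t: error -> configuration index.
typedef std::multimap<double, std::size_t> ErrorIndexMap;

// Hypothetical helper: files each configuration's index under its error.
ErrorIndexMap sortByError(const std::vector<double> &errors)
{
  ErrorIndexMap result;
  for (std::size_t i = 0; i < errors.size(); ++i)
    result.insert(std::make_pair(errors[i], i));
  return result;
}

// Index of the configuration with the largest error, i.e. the last entry
// of the map, since a multimap iterates in ascending key order.
std::size_t worstIndex(const ErrorIndexMap &map)
{
  assert(!map.empty());
  return map.rbegin()->second;
}
```

Iterating the map from rbegin() lists the configurations from worst to best fit, which is presumably the use the name getWorstFragmentMap() hints at.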