Changeset f5ea10


Timestamp:
Jun 22, 2018, 7:26:09 AM
Author:
Frederik Heber <frederik.heber@…>
Branches:
Candidate_v1.6.1, ChemicalSpaceEvaluator
Children:
99c705
Parents:
8d56a6
git-author:
Frederik Heber <frederik.heber@…> (09/26/17 22:30:01)
git-committer:
Frederik Heber <frederik.heber@…> (06/22/18 07:26:09)
Message:

Added Graph6Reader, extended BoostGraphCreator, added ChemicalSpaceEvaluatorAction.

  • made generateAllInducedConnectedSubgraphs visible in the Extractors namespace.
  • TESTS: the new option "graph6" contains a digit, so moltest_check.py now scans option names for digits as well as letters.
  • DOCU: added evaluate-chemical-space to the userguide.
Files:
5 added
10 edited

  • doc/userguide/userguide.xml

    r8d56a6 rf5ea10  
    12681268          to both other atoms serves as rotation joint).</para>
    12691269        </section>
     1270        <section xml:id="bond.evaluate-chemical-space">
     1271          <title xml:id="bond.evaluate-chemical-space.title">Evaluate the Chemical Space</title>
     1272          <para>Imagine that we are given a graph consisting of nodes and edges.
     1273          As discussed at length for the adjacency graph, any graph whose nodes
     1274          are assigned a chemical element and whose edges are given a bond degree
     1275          effectively represents a molecule of covalently bonded atoms.</para>
     1276          <para>This is the notion behind constructing parts of the chemical space that
     1277          encompasses all molecules that either already exist or could be devised,
     1278          i.e. that are stable.</para>
     1279          <para>One method of creating points in this chemical space, i.e. stable
     1280          molecules, is presented in the article by Hamaekers et al. 2017, where
     1281          they follow the well-known <emphasis>octet rule</emphasis>.</para>
     1282          <para>This action implements essentially the same method. Given
     1283          an arbitrary graph, encoded as a graph6 string such as those produced by
     1284          the <link xlink:href="http://users.cecs.anu.edu.au/~bdm/nauty/">nauty</link>
     1285          toolset (Brendan McKay and Adolfo Piperno), and a set of chemical
     1286          elements, the action creates every possible molecule using the graph
     1287          and the given elements for each of its nodes, varying the bond degrees
     1288          as long as the octet rule is fulfilled.</para>
     1289          <programlisting>--evaluate-chemical-space \
     1290  --graph6 "B`" --elements C C</programlisting>
     1291        <para>The graph6 string "B`" represents the simplest graph consisting
     1292        of just two nodes, connected by a single edge. Designating each node
     1293        to be a carbon atom, this action will then produce the hydrocarbons
     1294        C2H6, C2H4, and C2H2.</para>
     1295        <para>At the moment, the molecules are not created; their signatures
     1296        are only used to look up the fragments in the internal homology container,
     1297        see <link linkend="fragmentation">Fragmentation</link> and
     1298        <link linkend="homology">Homologies</link>. Instead of atomic
     1299        coordinates, approximate energies are given that allow one to estimate
     1300        whether the molecule candidate is stable or not. Note that this requires
     1301        that a suitable homology container file containing all necessary
     1302        fragments has been loaded first.</para>
     1303        <note>The graph is automatically saturated with hydrogens so as to obey
     1304        the octet rule; hydrogen should therefore not be given in the list of elements.</note>
     1305        <para>nauty can produce all possible graphs with a fixed number
     1306        of nodes; these can simply be fed into <productname>MoleCuilder</productname>
     1307        to produce all associated molecules.</para>
     1308        </section>
    12701309      </section>
    12711310      <section xml:id="molecule">
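
The hydrogen counts in the userguide example (C2H6, C2H4, C2H2) follow from simple valence bookkeeping. The following is a minimal sketch of that arithmetic only, not the code of ChemicalSpaceEvaluatorAction: the valence table, the function names, and the restriction to a single edge are assumptions made for illustration.

```python
# Assumed number of covalent bonds each element forms under the octet rule.
VALENCE = {"C": 4, "N": 3, "O": 2}


def hydrogens_needed(elements, bonds):
    """Return the number of hydrogens that saturate the graph.

    elements: element symbol per node, e.g. ["C", "C"]
    bonds:    dict mapping a node pair (i, j) to its bond degree
    Returns None if some node would exceed its valence.
    """
    free = [VALENCE[symbol] for symbol in elements]
    for (i, j), degree in bonds.items():
        free[i] -= degree
        free[j] -= degree
    if min(free) < 0:
        return None
    return sum(free)


def single_edge_candidates(elements, max_degree=3):
    """Try every bond degree on the single edge (0, 1) of a two-node graph."""
    candidates = []
    for degree in range(1, max_degree + 1):
        hydrogens = hydrogens_needed(elements, {(0, 1): degree})
        if hydrogens is not None:
            candidates.append((degree, hydrogens))
    return candidates


# Two carbons: single, double and triple bond give 6, 4 and 2 hydrogens.
print(single_edge_candidates(["C", "C"]))  # [(1, 6), (2, 4), (3, 2)]
```

With elements C and O the same sketch yields only the single and double bond, since a triple bond would exceed the assumed valence of oxygen.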
  • src/Actions/GlobalListOfActions.hpp

    r8d56a6 rf5ea10  
    7575  (GeometryPositionToVector) \
    7676  (GeometryRemove) \
     77  (GraphChemicalSpaceEvaluator) \
    7778  (GraphUpdateMolecules) \
    7879  (GraphCorrectBondDegree) \
  • src/Actions/Makefile.am

    r8d56a6 rf5ea10  
    312312
    313313GRAPHACTIONSOURCE = \
     314  Actions/GraphAction/ChemicalSpaceEvaluatorAction.cpp \
    314315  Actions/GraphAction/CorrectBondDegreeAction.cpp \
    315316  Actions/GraphAction/CreateAdjacencyAction.cpp \
     
    319320  Actions/GraphAction/UpdateMoleculesAction.cpp
    320321GRAPHACTIONHEADER = \
     322  Actions/GraphAction/ChemicalSpaceEvaluatorAction.hpp \
    321323  Actions/GraphAction/CorrectBondDegreeAction.hpp \
    322324  Actions/GraphAction/CreateAdjacencyAction.hpp \
     
    326328  Actions/GraphAction/UpdateMoleculesAction.hpp
    327329GRAPHACTIONDEFS = \
     330  Actions/GraphAction/ChemicalSpaceEvaluatorAction.def \
    328331  Actions/GraphAction/CorrectBondDegreeAction.def \
    329332  Actions/GraphAction/CreateAdjacencyAction.def \
  • src/FunctionApproximation/Extractors.cpp

    r8d56a6 rf5ea10  
    3434#include <config.h>
    3535#endif
    36 
    37 #include <boost/graph/adjacency_list.hpp>
    38 #include <boost/graph/breadth_first_search.hpp>
    39 #include <boost/graph/subgraph.hpp>
    4036
    4137//#include "CodePatterns/MemDebug.hpp"
     
    4844#include <vector>
    4945#include <boost/assign.hpp>
    50 #include <boost/bimap.hpp>
    51 #include <boost/bimap/set_of.hpp>
    52 #include <boost/bimap/multiset_of.hpp>
    5346#include <boost/bind.hpp>
    5447#include <boost/foreach.hpp>
     
    6760
    6861using namespace boost::assign;
     62
     63using namespace Extractors;
    6964
    7065FunctionModel::arguments_t
     
    186181}
    187182
    188 typedef size_t level_t;
    189 typedef size_t node_t;
    190 typedef std::multimap< level_t, node_t > nodes_per_level_t;
    191 typedef std::set<node_t> nodes_t;
    192 typedef std::set<nodes_t> set_of_nodes_t;
    193 typedef boost::property_map < boost::adjacency_list <>, boost::vertex_index_t >::type index_map_t;
    194 
    195 typedef boost::bimap<
    196     boost::bimaps::set_of< size_t >,
    197     boost::bimaps::multiset_of< Extractors::ParticleType_t >
    198 > type_index_lookup_t;
    199 
    200 typedef std::set<node_t> set_type;
    201 typedef std::set<set_type> powerset_type;
    202 
    203 typedef boost::adjacency_list < boost::vecS, boost::vecS, boost::undirectedS,
    204     boost::no_property, boost::no_property > UndirectedGraph;
    205 typedef boost::subgraph< UndirectedGraph > UndirectedSubgraph;
    206 
    207 typedef std::map< node_t, std::pair<Extractors::ParticleType_t, size_t> > node_FragmentNode_map_t;
    208 
    209 typedef std::map< argument_t::indices_t, size_t> argument_placement_map_t;
    210 
    211 typedef std::map<size_t, size_t> argindex_to_nodeindex_t;
    212 
    213183void insertIntoNodeFragmentMap(
    214184    node_FragmentNode_map_t &_node_FragmentNode_map,
     
    334304 * \param index_map with indices per \a graph' vertex
    335305 */
    336 static void generateAllInducedConnectedSubgraphs(
     306void Extractors::generateAllInducedConnectedSubgraphs(
    337307    const size_t N,
    338308    const level_t level,
     
    438408}
    439409
    440 static HomologyGraph createHomologyGraphFromNodes(
     410HomologyGraph Extractors::createHomologyGraphFromNodes(
    441411    const nodes_t &nodes,
    442412    const type_index_lookup_t &type_index_lookup,
     
    501471
    502472  return HomologyGraph(graph_nodes, graph_edges);
    503 }
    504 
    505 /**
    506  * I have no idea why this is so complicated with BGL ...
    507  *
    508  * This is taken from the book "The Boost Graph Library: User Guide and Reference Manual, Portable Documents",
    509  * chapter "Basic Graph Algorithms", example on calculating the bacon number.
    510  */
    511 template <typename DistanceMap>
    512 class distance_recorder : public boost::default_bfs_visitor
    513 {
    514 public:
    515   distance_recorder(DistanceMap dist) : d(dist) {}
    516 
    517   template <typename Edge, typename Graph>
    518   void tree_edge(Edge e, const Graph &g) const {
    519     typename boost::graph_traits<Graph>::vertex_descriptor u = source(e,g), v = target(e,g);
    520     d[v] = d[u] + 1;
    521   }
    522 
    523 private:
    524   DistanceMap d;
    525 };
    526 
    527 template <typename DistanceMap>
    528 distance_recorder<DistanceMap> record_distance(DistanceMap d)
    529 {
    530   return distance_recorder<DistanceMap>(d);
    531473}
    532474
  • src/FunctionApproximation/Extractors.hpp

    r8d56a6 rf5ea10  
    1414#endif
    1515
     16#include <boost/bimap.hpp>
     17#include <boost/bimap/set_of.hpp>
     18#include <boost/bimap/multiset_of.hpp>
     19#include <boost/graph/adjacency_list.hpp>
     20#include <boost/graph/breadth_first_search.hpp>
     21#include <boost/graph/subgraph.hpp>
    1622#include <boost/function.hpp>
     23
     24#include <map>
     25#include <set>
    1726
    1827#include "Fragmentation/EdgesPerFragment.hpp"
     
    5160  typedef std::vector<ParticleType_t> ParticleTypes_t;
    5261
     62  typedef size_t level_t;
     63  typedef size_t node_t;
     64  typedef std::multimap< level_t, node_t > nodes_per_level_t;
     65  typedef std::set<node_t> nodes_t;
     66  typedef std::set<nodes_t> set_of_nodes_t;
     67
     68  typedef boost::bimap<
     69      boost::bimaps::set_of< size_t >,
     70      boost::bimaps::multiset_of< Extractors::ParticleType_t >
     71  > type_index_lookup_t;
     72
     73  typedef std::set<node_t> set_type;
     74  typedef std::set<set_type> powerset_type;
     75
     76  typedef boost::adjacency_list < boost::vecS, boost::vecS, boost::undirectedS,
     77      boost::property<boost::vertex_name_t, atomId_t>,
     78      boost::property<boost::vertex_color_t, boost::default_color_type> /* needed for limited-depth DFS,
     79      otherwise the property_map gets full size of graph */
     80      > UndirectedGraph;
     81  typedef boost::subgraph< UndirectedGraph > UndirectedSubgraph;
     82
     83  typedef boost::property_map < UndirectedGraph, boost::vertex_index_t >::type index_map_t;
     84
     85  typedef std::map< node_t, std::pair<Extractors::ParticleType_t, size_t> > node_FragmentNode_map_t;
     86
     87  typedef std::map< argument_t::indices_t, size_t> argument_placement_map_t;
     88
     89  typedef std::map<size_t, size_t> argindex_to_nodeindex_t;
     90
     91  /**
     92   * I have no idea why this is so complicated with BGL ...
     93   *
     94   * This is taken from the book "The Boost Graph Library: User Guide and Reference Manual, Portable Documents",
     95   * chapter "Basic Graph Algorithms", example on calculating the bacon number.
     96   */
     97  template <typename DistanceMap>
     98  class distance_recorder : public boost::default_bfs_visitor
     99  {
     100  public:
     101    distance_recorder(DistanceMap dist) : d(dist) {}
     102
     103    template <typename Edge, typename Graph>
     104    void tree_edge(Edge e, const Graph &g) const {
     105      typename boost::graph_traits<Graph>::vertex_descriptor u = source(e,g), v = target(e,g);
     106      d[v] = d[u] + 1;
     107    }
     108
     109  private:
     110    DistanceMap d;
     111  };
     112
     113  template <typename DistanceMap>
     114  distance_recorder<DistanceMap> record_distance(DistanceMap d)
     115  {
     116    return distance_recorder<DistanceMap>(d);
     117  }
     118
     119  HomologyGraph createHomologyGraphFromNodes(
     120      const nodes_t &nodes,
     121      const type_index_lookup_t &type_index_lookup,
     122      const UndirectedGraph &graph,
     123      const index_map_t &index_map
     124      );
     125
     126  void generateAllInducedConnectedSubgraphs(
     127      const size_t N,
     128      const level_t level,
     129      const nodes_t &nodes,
     130      set_of_nodes_t &set_of_nodes,
     131      const nodes_per_level_t &nodes_per_level,
     132      const UndirectedGraph &graph,
     133      const std::vector<size_t> &_distance,
     134      const index_map_t &index_map);
     135
    53136  /** Namespace for some internal helper functions.
    54137   *
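
The distance_recorder visitor that moves into the header sets d[v] = d[u] + 1 on every tree edge of a breadth-first search. As a rough illustration of what that computes, here is a plain-Python sketch without BGL; the adjacency-dict representation and the function name are assumptions and do not appear in the changeset.

```python
from collections import deque


def bfs_distances(adjacency, root):
    """Return the number of edges on a shortest path from root to each reachable node."""
    distance = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in distance:
                # (u, v) is a tree edge: v is discovered for the first time via u.
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance


# Node 1 is bonded to nodes 0, 2 and 3.
adjacency = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
print(bfs_distances(adjacency, 0))  # {0: 0, 1: 1, 2: 2, 3: 2}
```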
  • src/Graph/BoostGraphCreator.cpp

    r8d56a6 rf5ea10  
    4343
    4444#include "Atom/atom.hpp"
     45#include "Graph/Graph6Reader.hpp"
    4546#include "molecule.hpp"
    4647
     
    9495}
    9596
     97void BoostGraphCreator::createFromGraph6String(
     98    const std::string &_graph_string)
     99{
     100  Graph6Reader reader;
     101  {
     102    std::stringstream inputstream(_graph_string);
     103    reader(inputstream);
     104  }
     105
     106  graph = UndirectedGraph();
     107
     108  // add nodes
     109  for(int numbers = 0; numbers < reader.get_num_nodes(); ++numbers) {
     110    const atomId_t atomid = numbers;
     111    Vertex v = boost::add_vertex(atomid, graph);
     112    const atomId_t vertexname = boost::get(boost::get(boost::vertex_name, graph), v);
     113    const nodeId_t vertexindex = boost::get(boost::get(boost::vertex_index, graph), v);
     114    LOG(2, "DEBUG: Adding node " << vertexindex << " associated to atom #" << vertexname);
     115    ASSERT( vertexname == atomid,
     116        "BoostGraphCreator::createFromGraph6String() - atomid "+toString(atomid)
     117        +" is not name of vertex "+toString(vertexname)+".");
     118    atomids_nodeids.insert( std::make_pair(vertexname, vertexindex) );
     119  }
     120
     121  // add edges
     122  const Graph6Reader::edges_t &edges = reader.get_edges();
     123  for(Graph6Reader::edges_t::const_iterator iter = edges.begin();
     124      iter != edges.end(); ++iter) {
     125    // graph6 contains only upper triangle of adjacency matrix, hence add only once
     126    const nodeId_t leftnodeid = iter->first;
     127    const nodeId_t rightnodeid = iter->second;
     128    boost::add_edge(leftnodeid, rightnodeid, graph);
     129  }
     130}
     131
    96132BoostGraphCreator::nodeId_t BoostGraphCreator::getNodeId(
    97133    const atomId_t &_atomid) const
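
Graph6Reader itself is among the added files and is not shown in this view. For orientation, the sketch below decodes a graph6 string as laid out in the published format description that accompanies nauty: one size byte, then the upper triangle of the adjacency matrix packed six bits per printable character. It handles only graphs with at most 62 nodes, the function name is hypothetical, and it makes no claim about how Graph6Reader treats any particular input.

```python
def decode_graph6(text):
    """Decode a graph6 string (single-byte size form only) into (num_nodes, edges)."""
    data = [ord(char) - 63 for char in text.strip()]
    if not data or any(value < 0 or value > 63 for value in data):
        raise ValueError("not a graph6 string")
    num_nodes = data[0]
    if num_nodes > 62:
        raise ValueError("multi-byte size encodings are not handled in this sketch")
    # Each character carries six bits, most significant bit first.
    bits = [(value >> shift) & 1 for value in data[1:] for shift in range(5, -1, -1)]
    if len(bits) < num_nodes * (num_nodes - 1) // 2:
        raise ValueError("too few characters for the adjacency bits")
    edges = []
    k = 0
    # The upper triangle is stored column by column: (0,1), (0,2), (1,2), (0,3), ...
    for j in range(1, num_nodes):
        for i in range(j):
            if bits[k]:
                edges.append((i, j))
            k += 1
    return num_nodes, edges


print(decode_graph6("A_"))  # (2, [(0, 1)])
print(decode_graph6("Bw"))  # (3, [(0, 1), (0, 2), (1, 2)])
```

Because only the upper triangle is stored, each edge appears exactly once, which matches the comment in createFromGraph6String about adding every edge only once.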
  • src/Graph/BoostGraphCreator.hpp

    r8d56a6 rf5ea10  
    9090      const predicate_t &_pred);
    9191
     92  /** Creates the boost::graph from the defining graph6 string where the atom
     93   * nodes map is simply the identity.
     94   *
     95   * \param _graph_string graph6 string defining the graph
     96   */
     97  void createFromGraph6String(
     98      const std::string &_graph_string);
     99
    92100  /** Getter for the created graph.
    93101   *
  • src/Graph/Makefile.am

    r8d56a6 rf5ea10  
    1010        Graph/ConnectedSubgraph.cpp \
    1111        Graph/CyclicStructureAnalysis.cpp \
    12         Graph/DepthFirstSearchAnalysis.cpp
    13                                  
     12        Graph/DepthFirstSearchAnalysis.cpp \
     13        Graph/Graph6Reader.cpp
     14
    1415GRAPHHEADER = \
    1516        Graph/AdjacencyList.hpp \
     
    2324        Graph/CyclicStructureAnalysis.hpp \
    2425        Graph/DepthFirstSearchAnalysis.hpp \
     26        Graph/Graph6Reader.hpp \
    2527        Graph/ListOfLocalAtoms.hpp
    2628
  • tests/Python/AllActions/moltest_check.py

    r8d56a6 rf5ea10  
    2424def CheckParameters(docstring):
    2525    result = 0
    26     params = re.findall(r'\(str\)([-_a-zA-Z]*)', docstring)
     26    params = re.findall(r'\(str\)([-_a-zA-Z0-9]*)', docstring)
    2727
    2828    for param in params:
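
The effect of the widened character class can be checked in isolation. The two patterns below are copied from the diff; the docstring is a made-up stand-in, since the real docstring format is not part of this changeset.

```python
import re

OLD_PATTERN = r'\(str\)([-_a-zA-Z]*)'
NEW_PATTERN = r'\(str\)([-_a-zA-Z0-9]*)'

# Hypothetical docstring fragment containing two string-typed options.
docstring = "(str)graph6, (str)fragment-path"

old_params = re.findall(OLD_PATTERN, docstring)
new_params = re.findall(NEW_PATTERN, docstring)

print(old_params)  # ['graph', 'fragment-path']
print(new_params)  # ['graph6', 'fragment-path']
```

The old pattern stops at the first digit, so the option name is cut down to "graph" and cannot be matched against the "graph6" entry in options.dat.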
  • tests/Python/AllActions/options.dat

    r8d56a6 rf5ea10  
    9292fragment_path   "test/"
    9393fragment_prefix "BondFragment"
     94graph6  "B`"
    9495grid_level      "5"
    9596help    "help"