Maximum common induced subgraph

Source: Wikipedia, the free encyclopedia.

In graph theory and theoretical computer science, a maximum common induced subgraph of two graphs G and H is a graph that is an induced subgraph of both G and H, and that has as many vertices as possible.

Finding this graph is

NP-hard
. In the associated
NP-complete.[1] It is a generalization of the induced subgraph isomorphism problem
, which arises when k equals the number of vertices in the smaller of G and H, so that this entire graph must appear as an induced subgraph of the other graph.

Based on

polynomial time
on -vertex graphs, always finds a solution within a factor of of optimal, for any .
[3]

One possible solution for this problem is to build a modular product graph of G and H. In this graph, the largest clique corresponds to a maximum common induced subgraph of G and H. Therefore, algorithms for finding maximum cliques can be used to find the maximum common induced subgraph.[4] Moreover, a modified maximum-clique algorithm can be used to find a maximum common connected subgraph.[5]

The McSplit algorithm (along with its McSplit↓ variant) is a forward checking algorithm that does not use the clique encoding, but uses a compact data structure to keep track of the vertices in graph H to which each vertex in graph G may be mapped. Both versions of the McSplit algorithm outperform the clique encoding for many graph classes.[6] A more efficient implementation of McSplit is McSplitDAL+PR, which combines a Reinforcement Learning agent with some heuristic scores computed with the PageRank algorithm.[7]

Maximum common induced subgraph algorithms have a long tradition in cheminformatics and pharmacophore mapping.[8]

See also

References