Maximum common induced subgraph

In graph theory and theoretical computer science, a maximum common induced subgraph of two graphs G and H is a graph that is an induced subgraph of both G and H, and that has as many vertices as possible.

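Finding this graph is NP-hard. In the associated decision problem, the input is two graphs G and H and a number k, and the problem is to decide whether G and H have a common induced subgraph with at least k vertices. This problem is NP-complete.^[1] It is a generalization of the induced subgraph isomorphism problem, which arises when k equals the number of vertices in the smaller of G and H, so that this entire graph must appear as an induced subgraph of the other graph.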

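Based on hardness of approximation results for the maximum independent set problem, the maximum common induced subgraph problem is also hard to approximate.^[2] In particular, unless P = NP, there is no approximation algorithm that, in polynomial time on n-vertex graphs, always finds a solution within a factor of $n^{1-\epsilon}$ of optimal, for any $\epsilon > 0$.^[3]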

One approach to this problem is to build the modular product graph of G and H. Its vertices are the pairs (u, v) with u in G and v in H, and two pairs are adjacent when they differ in both coordinates and the two G-vertices are adjacent exactly when the two H-vertices are. In this graph, the largest clique corresponds to a maximum common induced subgraph of G and H. Therefore, algorithms for finding maximum cliques can be used to find the maximum common induced subgraph.^[4] Moreover, a modified maximum-clique algorithm can be used to find a maximum common connected subgraph.^[5]
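The clique encoding can be illustrated with a minimal Python sketch. It assumes simple undirected graphs stored as dictionaries from each vertex to its set of neighbours. The function names are hypothetical, and the clique search is a plain Bron-Kerbosch variant with pivoting chosen for brevity, not any specific published solver.

```python
from itertools import product


def modular_product(g, h):
    """Build the modular product of g and h (dict: vertex -> set of neighbours)."""
    nodes = list(product(g, h))
    adj = {node: set() for node in nodes}
    for (u, v) in nodes:
        for (x, y) in nodes:
            # Adjacent when both coordinates differ and adjacency agrees.
            if u != x and v != y and ((x in g[u]) == (y in h[v])):
                adj[(u, v)].add((x, y))
    return adj


def max_clique(adj):
    """Return one maximum clique using Bron-Kerbosch with pivoting and a size bound."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if len(r) > len(best):
            best = set(r)  # r is always a clique, so it is a valid candidate
        if not p or len(r) + len(p) <= len(best):
            return  # nothing left to add, or this branch cannot beat the best
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for n in list(p - adj[pivot]):
            expand(r | {n}, p & adj[n], x & adj[n])
            p.remove(n)
            x.add(n)

    expand(set(), set(adj), set())
    return best


def max_common_induced_subgraph(g, h):
    """Return a dict mapping vertices of g to vertices of h."""
    return dict(max_clique(modular_product(g, h)))


# Example: a path on four vertices and a cycle on five vertices.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
mapping = max_common_induced_subgraph(path4, cycle5)
print(len(mapping), mapping)
```

The product graph has |V(G)| · |V(H)| vertices, so this sketch is only practical for small inputs; the returned dictionary lists which vertex of H each matched vertex of G corresponds to.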

The McSplit algorithm (along with its McSplit↓ variant) is a forward checking algorithm that does not use the clique encoding, but uses a compact data structure to keep track of the vertices in graph H to which each vertex in graph G may be mapped. Both versions of the McSplit algorithm outperform the clique encoding for many graph classes.^[6] A more efficient variant of McSplit is McSplitDAL+PR, which combines a reinforcement learning agent with heuristic scores computed with the PageRank algorithm.^[7]
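The partitioning idea can be sketched in Python as follows. This is a simplified illustration in the spirit of McSplit, not the published implementation: the function name `mcsplit_sketch`, the list-based label classes, and the selection heuristics (smallest class first, then highest-degree vertex) are assumptions made for this example. Graphs are dictionaries from each vertex to its set of neighbours.

```python
def mcsplit_sketch(g, h):
    """Branch and bound over label classes; returns a dict from g-vertices to h-vertices."""
    best = {}

    def search(mapping, classes):
        nonlocal best
        if len(mapping) > len(best):
            best = dict(mapping)
        # Each class (a, b) can contribute at most min(|a|, |b|) further matches.
        bound = len(mapping) + sum(min(len(a), len(b)) for a, b in classes)
        if bound <= len(best):
            return
        # Assumed heuristic: branch on the class with the smallest larger side.
        idx = min(range(len(classes)),
                  key=lambda i: max(len(classes[i][0]), len(classes[i][1])))
        left, right = classes[idx]
        v = max(left, key=lambda x: len(g[x]))  # assumed: highest degree first
        for w in right:
            # Split every class by adjacency to v (in g) and to w (in h).
            new_classes = []
            for a, b in classes:
                a_rest = [x for x in a if x != v]
                b_rest = [y for y in b if y != w]
                for adjacent in (True, False):
                    a2 = [x for x in a_rest if (x in g[v]) == adjacent]
                    b2 = [y for y in b_rest if (y in h[w]) == adjacent]
                    if a2 and b2:
                        new_classes.append((a2, b2))
            mapping[v] = w
            search(mapping, new_classes)
            del mapping[v]
        # Also consider solutions in which v is left unmatched.
        rest = [x for x in left if x != v]
        remaining = classes[:idx] + ([(rest, right)] if rest else []) + classes[idx + 1:]
        search(mapping, remaining)

    if g and h:
        search({}, [(list(g), list(h))])
    return best


# Example: a path on four vertices and a cycle on five vertices.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(mcsplit_sketch(path4, cycle5))
```

Every vertex in the G-side of a class has the same adjacency pattern to the already matched vertices as every vertex in the H-side, so any pairing within a class is consistent, and the sum of the smaller side sizes gives the upper bound used for pruning.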

Applications

Maximum common induced subgraph algorithms form the basis for both graph differencing and graph alignment. Graph differencing identifies and highlights differences between two graphs by pinpointing changes, additions, or deletions. Graph alignment involves finding correspondences between the vertices and edges of two graphs to identify similar structures.

Maximum common induced subgraph algorithms have a long tradition in bioinformatics, cheminformatics,^[8]^[9] pharmacophore mapping,^[10] pattern recognition,^[11] computer vision, code analysis, compilers, and model checking.

The problem is also particularly useful in software engineering and model-based systems engineering, where software code and engineering models (e.g., Simulink, UML diagrams) are represented as graph data structures. Graph differencing can be used to detect changes between different versions of software code and models for change auditing, debugging, version control and collaborative team development.

References

[1] ISBN 0-7167-1045-5. A1.4: GT48, pg. 202.

[2] ISBN 978-3-540-55210-9.

[3] S2CID 5713815, ECCC TR05-100.

[4] doi:10.1016/0020-0190(76)90049-1.

[5] S2CID 215812381.

[6] ISBN 9780999241103.

[7] ISBN 978-989-758-665-1.

[8] ISSN 1573-7470.

[9] ISSN 1759-0876.

[10] S2CID 5202419.

[11] ISSN 0218-0014.
