dedensify

dedensify(G, threshold, prefix=None, copy=True)[source]

Compresses neighborhoods around high-degree nodes

Reduces the number of edges to high-degree nodes by adding compressor nodes that summarize multiple edges of the same type to high-degree nodes (nodes with a degree greater than a given threshold). Dedensification also has the added benefit of reducing the number of edges around high-degree nodes. The implementation currently supports graphs with a single edge type.

Parameters
G: graph

A networkx graph

threshold: int

Minimum degree threshold of a node to be considered a high degree node. The threshold must be greater than or equal to 2.

prefix: str or None, optional (default: None)

An optional prefix for denoting compressor nodes

copy: bool, optional (default: True)

Indicates if dedensification should be done inplace

Returns
dedensified networkx graph(graph, set)

2-tuple of the dedensified graph and set of compressor nodes

Notes

According to the algorithm in [1], removes edges in a graph by compressing/decompressing the neighborhoods around high degree nodes by adding compressor nodes that summarize multiple edges of the same type to high-degree nodes. Dedensification will only add a compressor node when doing so will reduce the total number of edges in the given graph. This implementation currently supports graphs with a single edge type.

References

1

Maccioni, A., & Abadi, D. J. (2016, August). Scalable pattern matching over compressed graphs via dedensification. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1755-1764). http://www.cs.umd.edu/~abadi/papers/graph-dedense.pdf

Examples

Dedensification will only add compressor nodes when doing so would result in fewer edges:

>>> original_graph = nx.DiGraph()
>>> original_graph.add_nodes_from(
...     ["1", "2", "3", "4", "5", "6", "A", "B", "C"]
... )
>>> original_graph.add_edges_from(
...     [
...         ("1", "C"), ("1", "B"),
...         ("2", "C"), ("2", "B"), ("2", "A"),
...         ("3", "B"), ("3", "A"), ("3", "6"),
...         ("4", "C"), ("4", "B"), ("4", "A"),
...         ("5", "B"), ("5", "A"),
...         ("6", "5"),
...         ("A", "6")
...     ]
... )
>>> c_graph, c_nodes = nx.dedensify(original_graph, threshold=2)
>>> original_graph.number_of_edges()
15
>>> c_graph.number_of_edges()
14

A dedensified, directed graph can be “densified” to reconstruct the original graph:

>>> original_graph = nx.DiGraph()
>>> original_graph.add_nodes_from(
...     ["1", "2", "3", "4", "5", "6", "A", "B", "C"]
... )
>>> original_graph.add_edges_from(
...     [
...         ("1", "C"), ("1", "B"),
...         ("2", "C"), ("2", "B"), ("2", "A"),
...         ("3", "B"), ("3", "A"), ("3", "6"),
...         ("4", "C"), ("4", "B"), ("4", "A"),
...         ("5", "B"), ("5", "A"),
...         ("6", "5"),
...         ("A", "6")
...     ]
... )
>>> c_graph, c_nodes = nx.dedensify(original_graph, threshold=2)
>>> # re-densifies the compressed graph into the original graph
>>> for c_node in c_nodes:
...     all_neighbors = set(nx.all_neighbors(c_graph, c_node))
...     out_neighbors = set(c_graph.neighbors(c_node))
...     for out_neighbor in out_neighbors:
...         c_graph.remove_edge(c_node, out_neighbor)
...     in_neighbors = all_neighbors - out_neighbors
...     for in_neighbor in in_neighbors:
...         c_graph.remove_edge(in_neighbor, c_node)
...         for out_neighbor in out_neighbors:
...             c_graph.add_edge(in_neighbor, out_neighbor)
...     c_graph.remove_node(c_node)
...
>>> nx.is_isomorphic(original_graph, c_graph)
True