butterflies#
- butterflies(G, nodes=None)[source]#
Count the number of butterflies for each node in a bipartite graph.
A butterfly is a complete bipartite subgraph K_{2,2} — four nodes (two from each partition) with all four cross-edges present. It is the bipartite analogue of a triangle in unipartite graphs.
A1 A2 | \ / | | X | | / \ | B1 B2
Equivalently, a butterfly is a 4-cycle (C_4) in which alternating nodes belong to different partitions of the bipartite graph. This structure is also called a square in the physics and complex-networks literature [3], and a four-cycle in the sociology literature [4]. The name butterfly is standard in the data-mining and bipartite-network literature [1] [2].
- Parameters:
- GNetworkX graph
An undirected bipartite graph.
- nodesiterable of nodes, optional (default: all nodes)
Return butterfly counts only for these nodes. The computation always uses the full graph;
nodesonly filters the returned dictionary (same convention astriangles()). WhenNone(default), counts for all nodes are returned. Nodes not present inGare silently ignored.
- Returns:
- butterfliesdict
A dictionary keyed by node to the number of butterflies that node participates in. Each butterfly is counted once per participating node, so:
sum(butterflies(G).values()) == 4 * total_butterfly_count
See also
robins_alexander_clusteringgraph-level bipartite clustering coefficient whose numerator is
4 * total butterfly count.networkx.algorithms.cluster.square_clusteringper-node square clustering coefficient for general graphs.
Notes
This algorithm uses wedge enumerate to count butterflies. The wedge enumeration method uses the vertex-priority of BFC-VP [2] : nodes are ranked by degree, neighbor lists are pre-sorted by ascending rank, and for each start node
uonly neighbors with lower rank are processed as middle and end nodes, with early termination on the sorted lists. This gives time complexity \(O\!\bigl(\sum_{(u,v)\in E} \min(d(u), d(v))\bigr) = O(\alpha m)\) where \(\alpha\) is the arboricity, i.e. the minimum number of forests into which \(E\) can be partitioned, and \(m\) the number of edges.The per-node attribution (distributing butterfly credits to start, middle, and end nodes) is an extension of BFC-VP not present in the original paper, which computes only a global count.
References
[1]Sanei-Mehri, S. V., Sariyuce, A. E., & Tirthapura, S. (2018). Butterfly counting in bipartite networks. Proceedings of the 24th ACM SIGKDD, 2150–2159. https://doi.org/10.1145/3219819.3220097
[2] (1,2)Wang, K., Lin, X., Qin, L., Zhang, W., & Zhang, Y. (2023). Accelerated butterfly counting with vertex priority on bipartite graphs. The VLDB Journal, 32, 257–281. https://doi.org/10.1007/s00778-022-00746-0
[3]Lind, P. G., Gonzalez, M. C., & Herrmann, H. J. (2005). Cycles and clustering in bipartite networks. Physical Review E, 72, 056127. https://doi.org/10.1103/PhysRevE.72.056127
[4]Robins, G. and M. Alexander (2004). Small worlds among interlocking directors: Network structure and distance in bipartite graphs. Computational & Mathematical Organization Theory 10(1), 69–94. https://doi.org/10.1023/B:CMOT.0000032580.12184.c0
Examples
A single K_{2,2} contains exactly one butterfly, and each of its four nodes participates in that butterfly:
>>> G = nx.complete_bipartite_graph(2, 2) >>> nx.bipartite.butterflies(G) {0: 1, 1: 1, 2: 1, 3: 1}
The total number of butterflies is the sum divided by 4:
>>> bt = nx.bipartite.butterflies(G) >>> sum(bt.values()) // 4 1
K_{3,3} contains nine butterflies; every node participates in six:
>>> G2 = nx.complete_bipartite_graph(3, 3) >>> bt2 = nx.bipartite.butterflies(G2) >>> sum(bt2.values()) // 4 9
Nodes not in any butterfly receive count zero:
>>> G = nx.complete_bipartite_graph(2, 2) >>> G.add_edge(0, 5) >>> nx.bipartite.butterflies(G) {0: 1, 1: 1, 2: 1, 3: 1, 5: 0}