This documents the development version of NetworkX. Documentation for the current release can be found here.
simrank_similarity(G, source=None, target=None, importance_factor=0.9, max_iterations=100, tolerance=0.0001)¶
Returns the SimRank similarity of nodes in the graph
SimRank is a similarity metric that says “two objects are considered to be similar if they are referenced by similar objects.” .
The pseudo-code definition from the paper is:
def simrank(G, u, v): in_neighbors_u = G.predecessors(u) in_neighbors_v = G.predecessors(v) scale = C / (len(in_neighbors_u) * len(in_neighbors_v)) return scale * sum(simrank(G, w, x) for w, x in product(in_neighbors_u, in_neighbors_v))
Gis the graph,
uis the source,
vis the target, and
Cis a float decay or importance factor between 0 and 1.
The SimRank algorithm for determining node similarity is defined in .
- GNetworkX graph
A NetworkX graph
If this is specified, the returned dictionary maps each node
vin the graph to the similarity between
targetare specified, the similarity value between
targetis returned. If
targetis specified but
sourceis not, this argument is ignored.
The relative importance of indirect neighbors with respect to direct neighbors.
Maximum number of iterations.
Error tolerance used to check convergence. When an iteration of the algorithm finds that no similarity value changes more than this amount, the algorithm halts.
- similaritydictionary or float
None, this returns a dictionary of dictionaries, where keys are node pairs and value are similarity of the pair of nodes.
targetis, this returns a dictionary mapping node to the similarity of
sourceand that node.
None, this returns the similarity value for the given pair of nodes.
G. Jeh and J. Widom. “SimRank: a measure of structural-context similarity”, In KDD’02: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 538–543. ACM Press, 2002.
If the nodes of the graph are numbered from zero to n - 1, where n is the number of nodes in the graph, you can create a SimRank matrix from the return value of this function where the node numbers are the row and column indices of the matrix:
>>> import numpy as np >>> G = nx.cycle_graph(4) >>> sim = nx.simrank_similarity(G) >>> lol = [[sim[u][v] for v in sorted(sim[u])] for u in sorted(sim)] >>> sim_array = np.array(lol)