I am fairly new to Dash and am trying to work with Dash and possibly Cytoscape, which looks very useful. I am trying to use data from a csv file which is formatted similar to this:
Here the column headers and the row index are the node names, a value of one or more indicates an edge between the two corresponding nodes, and the number gives the edge weight. I can create a graph quite easily from this type of data with NetworkX and Dash.
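For reference, a minimal sketch of how a matrix like this can be loaded into NetworkX. The node names and weights below are made up for illustration, and io.StringIO stands in for the real csv file.

```python
import io

import networkx as nx
import pandas as pd

# Hypothetical stand-in for the csv file: row and column labels are node
# names, and a non-zero cell is the weight of the edge between them.
csv_text = """,A,B,C
A,0,2,0
B,2,0,1
C,0,1,0
"""

# index_col=0 turns the first column into the row labels, so the frame is
# a square adjacency matrix with matching index and columns.
adj = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Zero cells are treated as "no edge"; every other cell becomes the
# 'weight' attribute of the corresponding edge.
G = nx.from_pandas_adjacency(adj)

print(sorted(G.edges(data="weight")))
```

from_pandas_adjacency expects the index and the columns to hold the same labels, which is why the first column is read as the index rather than as data.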
My questions are the following:
I’ve read around the forums and GitHub issues and noticed there is no direct compatibility between NetworkX objects and Cytoscape. To do this I’d have to convert manually, using something like nx.readwrite.json_graph.cytoscape_data(). Can anybody who has done this provide additional detail on how to fully get there?
I managed to plot my networks using Dash before I learned about Dash Cytoscape. My end goal is to make a simple website that displays these visualizations (with thousands of nodes) in an interactive, user-friendly manner. What exactly does Dash Cytoscape add to the picture, and is it crucial, or should I ultimately not use Cytoscape at all?
I’m fairly new to all of this, so it may all sound like beginner stuff, but I’m really looking forward to learning about it!
I am fairly new to Dash-Cytoscape as well, but I have managed to make this work.
The trick is to get everything you need into your elements list before worrying about Cytoscape at all. The big kicker is that converting a NetworkX graph to Cytoscape format is not exactly solid. Some of the variables will not make it in there and some dictionary keys do not line up properly. So it won’t just work out of the box. But if you are willing to do some munging, you can make it happen.
So suppose you have some graph, G, created with some command like G=nx.Graph(). The steps you need to do are:
1. Create your graph.
2. Get the (x, y) positions of each of the nodes via nx.nx_agraph.graphviz_layout().
3. Convert the graph to Cytoscape format via cy = nx.readwrite.json_graph.cytoscape_data(G).
4. Add the dictionary key label to the nodes list of cy. (You can just rename one of the keys to label if you want.)
5. Add the positions you got from (2) to each entry in the nodes portion of cy, as a position key (with x and y values) that sits next to data.
6. Take the results of (3)-(5) and write them to a list, like elements_ls.
7. Create your cyto.Cytoscape() in your app.layout and make sure you set elements=elements_ls.
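The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: nx.spring_layout is swapped in for graphviz_layout so the sketch runs without pygraphviz, the function names graph_to_elements and build_app and the scale factor are made up here, and the nodes are assumed to carry no custom 'id' attribute.

```python
import networkx as nx


def graph_to_elements(G, scale=500):
    """Turn a NetworkX graph into a flat Dash Cytoscape elements list."""
    # Step 2: node positions as a dict keyed by node. spring_layout is an
    # assumption here; graphviz_layout also returns a dict keyed by node.
    pos = nx.spring_layout(G, seed=0)

    # Step 3: convert to Cytoscape JSON.
    cy = nx.readwrite.json_graph.cytoscape_data(G)
    nodes = cy["elements"]["nodes"]
    edges = cy["elements"]["edges"]

    for node in nodes:
        data = node["data"]
        # Step 4: Dash Cytoscape examples display 'label', NetworkX writes 'name'.
        data["label"] = data["name"]
        # Step 5: 'value' holds the original node object, which keys pos.
        x, y = pos[data["value"]]
        node["position"] = {"x": float(x) * scale, "y": float(y) * scale}

    for edge in edges:
        # Node ids are written as strings, but source/target keep the
        # original node objects, so cast them to match the ids.
        edge["data"]["source"] = str(edge["data"]["source"])
        edge["data"]["target"] = str(edge["data"]["target"])

    # Step 6: one flat list of nodes followed by edges.
    return nodes + edges


def build_app(elements):
    """Step 7: wrap the elements in a minimal Dash app (hypothetical helper)."""
    # Imported lazily so graph_to_elements can be used without Dash installed.
    from dash import Dash, html
    import dash_cytoscape as cyto

    app = Dash(__name__)
    app.layout = html.Div([
        cyto.Cytoscape(
            id="network",
            elements=elements,
            layout={"name": "preset"},  # keep the positions set in step 5
            style={"width": "100%", "height": "600px"},
        )
    ])
    return app
```

Calling build_app(graph_to_elements(G)) returns the app, which can then be served in the usual way. The preset layout matters: with another layout name, Cytoscape recomputes the positions and ignores the ones from step 5.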
Thanks so much for the help, it worked nicely! I did have a little trouble renaming (or adding) one of the keys to ‘label’, since I’ve never worked with such a convoluted JSON dict before. I kind of hacked my way through it, but was wondering whether you found a simple way to rename such a key?
That is great, thank you for the question. If you don't mind, could you paste the full sample code corresponding to your sample data (assuming that it is in a csv file)?
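One possible way to do the rename, as a small sketch: each node entry produced by cytoscape_data wraps its attributes in a 'data' dict, so the key can be moved with dict.pop. The helper name rename_node_key is made up for this example, and the dict below is a hand-written stand-in with the same shape as the converter's output.

```python
def rename_node_key(cy, old, new):
    """Rename one key inside the 'data' dict of every node, in place."""
    for node in cy["elements"]["nodes"]:
        data = node["data"]
        if old in data:
            # pop removes the old key and returns its value in one step.
            data[new] = data.pop(old)
    return cy


# Hand-written stand-in for the output of cytoscape_data().
cy = {
    "elements": {
        "nodes": [{"data": {"id": "a", "value": "a", "name": "a"}}],
        "edges": [],
    }
}

rename_node_key(cy, "name", "label")
print(cy["elements"]["nodes"][0]["data"])
# prints {'id': 'a', 'value': 'a', 'label': 'a'}
```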
This is for a weighted graph built from a SNP distance matrix. A symmetric matrix contains the number of SNP differences from reference for X samples. Data is read in from a tab-separated file and inverted element-wise (each distance x becomes 1/x, with 0 mapped to 1) to become an adjacency matrix for the NetworkX import function from_pandas_adjacency(), and a force-directed Fruchterman-Reingold layout is calculated. Edges are trimmed to avoid mirrored source/target duplicates and self-loops. The 2000 multiplier for node position was chosen manually to improve graph display and reduce node overlap; an automatic method would be preferable but is not implemented.
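import os

import networkx as nx
import pandas as pd


def genNetworkCyto(species, ST, date):
    # genDistName() is a helper defined elsewhere (not shown here).
    path = date + '/' + genDistName(species, ST)
    if not (os.path.exists(path) and os.path.isfile(path)):
        print("species/st not in selected date")
        return None

    # Symmetric SNP distance matrix; row and column labels are sample names.
    A = pd.read_csv(path, sep='\t', index_col=0, header=0)

    def inverse(x):
        # Smaller distance gives a larger weight; 0 maps to 1 to avoid
        # dividing by zero.
        if x == 0:
            return 1
        else:
            return 1/x

    # applymap is deprecated since pandas 2.1, where DataFrame.map replaces it.
    Ai = A.applymap(inverse)
    G = nx.from_pandas_adjacency(Ai)
    pos = nx.fruchterman_reingold_layout(G, iterations=2000, threshold=1e-10)

    nodes = [
        {
            'data': {'id': node, 'label': node},
            # Layout coordinates are small, so scale them up for display.
            'position': {'x': 2000*pos[node][0], 'y': 2000*pos[node][1]},
            'locked': True
        }
        for node in G.nodes
    ]

    edges = []
    for col in A:
        # Series.iteritems() was removed in pandas 2.0; items() replaces it.
        for row, value in A[col].items():
            # Skip self-loops and pairs already added in the other direction.
            if {'data': {'source': row, 'target': col}} not in edges and row != col:
                edges.append({'data': {'source': col, 'target': row}})

    # Weights are added afterwards so the membership test above compares
    # against dicts that do not yet contain a weight.
    for edge in edges:
        edge['data']['weight'] = A.loc[edge['data']['source'], edge['data']['target']]

    elements = nodes + edges
    return elements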
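The `not in edges` check in the code above scans the whole edge list for every cell, which can get slow with thousands of samples. As a possible alternative, here is a sketch that walks each unordered pair exactly once with itertools.combinations. The function name upper_triangle_edges is made up for this example, and it assumes A is square with identical row and column labels.

```python
from itertools import combinations

import pandas as pd


def upper_triangle_edges(A):
    """Build one Cytoscape edge per unordered pair of labels in A."""
    labels = list(A.index)
    edges = []
    # combinations yields each pair once, earlier label first, so there
    # are no self-loops and no mirrored duplicates to filter out.
    for source, target in combinations(labels, 2):
        edges.append({
            'data': {
                'source': source,
                'target': target,
                # Cast to a plain float so the value does not depend on
                # how NumPy scalars get serialized.
                'weight': float(A.loc[source, target]),
            }
        })
    return edges
```

For a symmetric matrix this should give the same source/target pairs, in the same order, as the nested loop, with the weight set in the same pass.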