graphistry.layout package
Subpackages
Submodules
graphistry.compute.ComputeMixin module
-
class graphistry.compute.ComputeMixin.ComputeMixin(*args, **kwargs)
Bases: object
-
chain(*args, **kwargs)
Experimental: Chain a list of operations.
Return the subgraph of matches according to the list of node and edge matchers.
If any matchers are named, add a correspondingly named boolean-valued column to the output.
- Parameters
ops (List[ASTobject]) – Various node and edge matchers
- Returns
Plotter
- Return type
Example: Find nodes of some type
    from graphistry.ast import n

    people_nodes_df = g.chain([ n({"type": "person"}) ])._nodes
Example: Find 2-hop edge sequences with some attribute
    from graphistry.ast import e_forward

    g_2_hops = g.chain([ e_forward({"interesting": True}, hops=2) ])
    g_2_hops.plot()
Example: Find any node 1-2 hops out from another node, and label each hop
    from graphistry.ast import n, e_undirected

    g_2_hops = g.chain([
        n({g._node: "a"}),
        e_undirected(name="hop1"),
        e_undirected(name="hop2")
    ])
    print('# first-hop edges:', len(g_2_hops._edges[ g_2_hops._edges.hop1 == True ]))
Example: Transaction nodes between two kinds of risky nodes
    from graphistry.ast import n, e_forward, e_reverse

    g_risky = g.chain([
        n({"risk1": True}),
        e_forward(to_fixed=True),
        n({"type": "transaction"}, name="hit"),
        e_reverse(to_fixed=True),
        n({"risk2": True})
    ])
    print('# hits:', len(g_risky._nodes[ g_risky._nodes.hit ]))
-
collapse(node, attribute, column, self_edges=False, unwrap=False, verbose=False)
Topology-aware collapse by the given column attribute, starting at node.
Traverses the directed graph from the start node, node, and collapses clusters of nodes that share the same property so that topology is preserved.
- Parameters
node (Union[str, int]) – start node to begin traversal
attribute (Union[str, int]) – the given attribute to collapse over within column
column (Union[str, int]) – the column of nodes DataFrame that contains attribute to collapse over
self_edges (bool) –
unwrap (bool) –
verbose (bool) –
- Returns
A new Graphistry instance with nodes and edges DataFrame containing collapsed nodes and edges given by column attribute. The nodes and edges DataFrames contain six new columns collapse_{node | edges} and final_{node | edges}, while the original (node, src, dst) columns are left untouched.
- Return type
Plottable
-
drop_nodes(nodes)
Return g with any nodes and edges involving the given node id series removed.
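The effect of drop_nodes can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it assumes node ids in a list and edges as (source, destination) tuples, and the helper name drop_nodes_sketch is hypothetical.

```python
def drop_nodes_sketch(nodes, edges, drop_ids):
    """Remove the given ids and every edge that touches one of them.

    Hypothetical helper for illustration only; graphistry itself works on
    dataframes rather than Python lists.
    """
    drop = set(drop_ids)
    kept_nodes = [n for n in nodes if n not in drop]
    # An edge survives only if neither endpoint was dropped.
    kept_edges = [(s, d) for (s, d) in edges if s not in drop and d not in drop]
    return kept_nodes, kept_edges


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
kept_nodes, kept_edges = drop_nodes_sketch(nodes, edges, ["b"])
print(kept_nodes)  # ['a', 'c']
print(kept_edges)  # [('a', 'c')]
```

Dropping "b" also removes both edges incident to it, which is the "nodes/edges involving" part of the description.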
-
filter_edges_by_dict(*args, **kwargs)
Filter edges to those that match all values in filter_dict.
-
filter_nodes_by_dict(*args, **kwargs)
Filter nodes to those that match all values in filter_dict.
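The "match all values" rule shared by the two dict filters can be sketched as follows. This is an illustration under stated assumptions, not the library's code: rows are modeled as plain dicts, matching is exact equality, a row missing a key is treated as a non-match, and filter_by_dict_sketch is a hypothetical name.

```python
def filter_by_dict_sketch(rows, filter_dict):
    """Keep only rows whose values equal every key/value pair in filter_dict.

    Hypothetical helper: rows are plain dicts standing in for dataframe rows.
    """
    return [
        row for row in rows
        if all(key in row and row[key] == value for key, value in filter_dict.items())
    ]


node_rows = [
    {"id": "a", "type": "person", "risk": True},
    {"id": "b", "type": "person", "risk": False},
    {"id": "c", "type": "transaction", "risk": True},
]
matches = filter_by_dict_sketch(node_rows, {"type": "person", "risk": True})
print([row["id"] for row in matches])  # ['a']
```

An empty filter_dict keeps every row in this sketch, because there is no condition left to fail.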
-
get_degrees(col='degree', degree_in='degree_in', degree_out='degree_out')
Decorate the nodes table with degree info.
Edges must be dataframe-like: pandas, cudf, …
Parameters determine generated column names
Warning: Self-cycles are currently double-counted. This may change.
Example: Generate degree columns
    import pandas as pd
    import graphistry

    edges = pd.DataFrame({'s': ['a','b','c','d'], 'd': ['c','c','e','e']})
    g = graphistry.edges(edges, 's', 'd')
    print(g._nodes)  # None
    g2 = g.get_degrees()
    print(g2._nodes)  # pd.DataFrame with 'id', 'degree', 'degree_in', 'degree_out'
- Parameters
col (str) –
degree_in (str) –
degree_out (str) –
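How the three degree columns relate can be sketched in plain Python. This is a minimal illustration that assumes degree is the sum of in-degree and out-degree, which is one way the double-counting of self-cycles noted above can arise; degrees_sketch is a hypothetical helper and not part of graphistry.

```python
from collections import Counter


def degrees_sketch(edges):
    """Return {node_id: {'degree_in', 'degree_out', 'degree'}} for (src, dst) edges.

    Assumption for illustration: degree = degree_in + degree_out.
    """
    degree_in = Counter(dst for _, dst in edges)
    degree_out = Counter(src for src, _ in edges)
    node_ids = sorted(set(degree_in) | set(degree_out))
    return {
        node_id: {
            "degree_in": degree_in[node_id],
            "degree_out": degree_out[node_id],
            "degree": degree_in[node_id] + degree_out[node_id],
        }
        for node_id in node_ids
    }


# Same edge list as the example above: s = a, b, c, d and d = c, c, e, e.
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
print(degrees_sketch(edges)["c"])  # {'degree_in': 2, 'degree_out': 1, 'degree': 3}

# A single self-cycle contributes once as incoming and once as outgoing.
print(degrees_sketch([("x", "x")])["x"]["degree"])  # 2
```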
-
get_indegrees(col='degree_in')
See get_degrees.
- Parameters
col (str) –
-
get_outdegrees(col='degree_out')
See get_degrees.
- Parameters
col (str) –
-
get_topological_levels(level_col='level', allow_cycles=True, warn_cycles=True, remove_self_loops=True)
Label nodes on column level_col based on topological sort depth.
Supports pandas and cudf, using parallelism within each level computation.
Options:
* allow_cycles: if False and a cycle is detected, raise a ValueError; otherwise break the cycle by picking a lowest-in-degree node
* warn_cycles: if True and a cycle is detected, proceed with a warning
* remove_self_loops: preprocess by removing self-cycles, which avoids the allow_cycles=False and warn_cycles=True messages
Example:
    import pandas as pd
    import graphistry

    edges_df = pd.DataFrame({'s': ['a', 'b', 'c', 'd'], 'd': ['b', 'c', 'e', 'e']})
    g = graphistry.edges(edges_df, 's', 'd')
    g2 = g.get_topological_levels()
    g2._nodes.info()  # pd.DataFrame with 'id', 'level'
- Parameters
level_col (str) –
allow_cycles (bool) –
warn_cycles (bool) –
remove_self_loops (bool) –
- Return type
Plottable
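The level-by-level labeling described above can be sketched in plain Python. This is an illustrative reading of the documented options, not the library's implementation: it peels off nodes with no remaining incoming edges one level at a time, and when a cycle leaves no such node it either raises or picks a lowest-in-degree node. The name topological_levels_sketch and the alphabetical tie-break are assumptions.

```python
def topological_levels_sketch(edges, allow_cycles=True, remove_self_loops=True):
    """Return {node_id: level} for a list of (src, dst) edges.

    Hypothetical helper; ties among lowest-in-degree nodes are broken
    alphabetically here, which is an arbitrary choice for the sketch.
    """
    remaining = {node for edge in edges for node in edge}
    if remove_self_loops:
        edges = [(s, d) for s, d in edges if s != d]
    levels = {}
    level = 0
    while remaining:
        # In-degrees counted only over edges between still-unlabeled nodes.
        in_degree = {node: 0 for node in remaining}
        for s, d in edges:
            if s in remaining and d in remaining:
                in_degree[d] += 1
        roots = [node for node in remaining if in_degree[node] == 0]
        if not roots:
            if not allow_cycles:
                raise ValueError("cycle detected")
            roots = [min(sorted(remaining), key=lambda node: in_degree[node])]
        for node in roots:
            levels[node] = level
        remaining -= set(roots)
        level += 1
    return levels


edges = [("a", "b"), ("b", "c"), ("c", "e"), ("d", "e")]
levels = topological_levels_sketch(edges)
print(sorted(levels.items()))
# [('a', 0), ('b', 1), ('c', 2), ('d', 0), ('e', 3)]
```

With the two-node cycle [("a", "b"), ("b", "a")], this sketch raises ValueError when allow_cycles=False and otherwise labels "a" as level 0 and "b" as level 1.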
-
hop(*args, **kwargs)
Given a graph and some source nodes, return the subgraph of all paths within k hops from the sources.
g: Plotter
nodes: dataframe with id column matching g._node. None signifies all nodes (default).
hops: how many hops to consider, if any bound (default 1)
to_fixed_point: keep hopping until no new nodes are found (ignores hops)
direction: 'forward', 'reverse', 'undirected'
edge_match: dict of kv-pairs to exact match (see also: filter_edges_by_dict)
source_node_match: dict of kv-pairs to match nodes before hopping
destination_node_match: dict of kv-pairs to match nodes after hopping (including intermediate)
return_as_wave_front: only return the nodes/edges reached, ignoring past ones (primarily for internal use)
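The interplay of hops, to_fixed_point, and direction can be sketched as a frontier expansion in plain Python. This is a simplified illustration, not the library's implementation: it ignores the match dictionaries and return_as_wave_front, models edges as (source, destination) tuples, and hop_sketch is a hypothetical name.

```python
def hop_sketch(edges, sources, hops=1, to_fixed_point=False, direction="forward"):
    """Return (nodes, edges) reachable from sources within the hop bound.

    Hypothetical helper covering only the hops, to_fixed_point, and
    direction parameters described above.
    """
    if direction not in ("forward", "reverse", "undirected"):
        raise ValueError("direction must be 'forward', 'reverse', or 'undirected'")
    seen_nodes = set(sources)
    seen_edges = set()
    frontier = set(sources)
    steps = 0
    # With to_fixed_point=True the hop bound is ignored and expansion
    # stops only when a step reaches no new nodes.
    while frontier and (to_fixed_point or steps < hops):
        reached = set()
        for s, d in edges:
            if direction != "reverse" and s in frontier:
                seen_edges.add((s, d))
                reached.add(d)
            if direction != "forward" and d in frontier:
                seen_edges.add((s, d))
                reached.add(s)
        frontier = reached - seen_nodes
        seen_nodes |= reached
        steps += 1
    return seen_nodes, seen_edges


edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(sorted(hop_sketch(edges, {"a"}, hops=2)[0]))                 # ['a', 'b', 'c']
print(sorted(hop_sketch(edges, {"a"}, to_fixed_point=True)[0]))    # ['a', 'b', 'c', 'd']
print(sorted(hop_sketch(edges, {"d"}, direction="reverse")[0]))    # ['c', 'd']
```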
-
materialize_nodes(reuse=True)
Generate g._nodes based on g._edges.
Uses g._node for the node id column if it exists, else 'id'.
Edges must be dataframe-like: cudf, pandas, …
When reuse=True and g._nodes is not None, the existing g._nodes is used.
Example: Generate nodes
    import pandas as pd
    import graphistry

    edges = pd.DataFrame({'s': ['a','b','c','d'], 'd': ['c','c','e','e']})
    g = graphistry.edges(edges, 's', 'd')
    print(g._nodes)  # None
    g2 = g.materialize_nodes()
    print(g2._nodes)  # pd.DataFrame
- Parameters
reuse (bool) –
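Deriving a node table from an edge list, together with the reuse rule, can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: edges are (source, destination) tuples, the node table is a list of dicts keyed by 'id', the output order (sources first, then destinations) is an arbitrary choice, and materialize_nodes_sketch is a hypothetical name.

```python
def materialize_nodes_sketch(edges, nodes=None, reuse=True):
    """Build a node table from edge endpoints unless an existing one is reused.

    Hypothetical helper: 'nodes' stands in for g._nodes and 'edges' for g._edges.
    """
    if reuse and nodes is not None:
        return nodes
    endpoint_ids = [s for s, _ in edges] + [d for _, d in edges]
    seen = set()
    table = []
    for node_id in endpoint_ids:
        if node_id not in seen:  # keep the first occurrence of each id
            seen.add(node_id)
            table.append({"id": node_id})
    return table


edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
print([row["id"] for row in materialize_nodes_sketch(edges)])
# ['a', 'b', 'c', 'd', 'e']

existing = [{"id": "a", "label": "kept"}]
print(materialize_nodes_sketch(edges, nodes=existing) is existing)  # True
```

Passing reuse=False in this sketch rebuilds the table from the edges even when an existing node table is supplied.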
-