This module contains a Python implementation of the Krackhardt hierarchy score.
Introduction
Biological signaling can be modeled as a directed network, where nodes represent genes/proteins and edges represent signaling interactions.
The hierarchy of such a network can be quantified using various metrics, including the Krackhardt hierarchy score. This score measures the degree to which the network exhibits a perfect hierarchy, with higher scores indicating greater hierarchy.
In R the sna package presents methods to compute graph hierarchy including Krackhardt’s score (defined as \(A/MaxV\) where \(A\) is the number of unordered pairs of points in Dr that are asymmetrically linked), and there are other hierarchy scores implemented in Python such as Flow Hierarchy ScoreLuo and Magee 2011. However, despite its utility, there is currently no native implementation of the Krackhardt hierarchy score in Python.
Krackhardt hierarchy score
As defined in Krackhardt, David. (1994). Graph Theoretical Dimensions of Informal Organization. Computational Organization Theory. 89.
The graph hierarchy condition states that in a digraph D, for each pair of points where one (Pi) can reach another (Pj), the second (Pj) can't reach the first (Pi).
For example, in a formal organization chart a high lvl employee can reach through the chain of command her subordinate's subordinate. If the formal organization is working "properly", this lower lvl employee can't simultaneously reach the high lvl employee.
To measure the degree of hierarchy of digraph D, a new digraph Dr must be created. Dr is defined as the reachability digraph of D. Each point in D exists in Dr; moreover, the line (Pi,Pj) exists in Dr if and only if Pi can reach Pj in D. If D is graph hierarchic, then Dr will have no symmetric lines in it (i.e. if the line (Pi,Pj) exists in Dr then the line (Pj,Pi) does not).
The degree of hierarchy then is defined as:
\[
Graph Hierarchy = 1 - [V/MaxV]
\] Where \(V\) is the number of unordered pairs of points in Dr that are symmetrically linked , and \(MaxV\) the number of unordered pairs of points in Dr where Pi is linked to Pj or viceversa.
for i in rpy2.situation.iter_info(): # Print Rpy2 infoprint(i)pandas2ri.activate()
rpy2 version:
3.5.11
Python version:
3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:39:40) [Clang 15.0.7 ]
Looking for R's HOME:
Environment variable R_HOME: /Users/ferran/miniconda3/envs/pykrack/lib/R
Calling `R RHOME`: /Users/ferran/miniconda3/envs/pykrack/lib/R
Environment variable R_LIBS_USER: None
R's additions to LD_LIBRARY_PATH:
/usr/local/lib/R/library/stats/libs/:/usr/local/lib/R/library/stats/libs/
R version:
In the PATH: R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Loading R library from rpy2: OK
Additional directories to load R packages from:
None
C extension compilation:
include:
['/Users/ferran/miniconda3/envs/pykrack/lib/R/include']
libraries:
['R', 'pcre2-8', 'lzma', 'bz2', 'z', 'dl', 'm', 'iconv', 'icuuc', 'icui18n']
library_dirs:
['/Users/ferran/miniconda3/envs/pykrack/lib', '/Users/ferran/miniconda3/envs/pykrack/lib/R/lib', '/Users/ferran/miniconda3/envs/pykrack/lib']
extra_compile_args:
['-std=c99']
extra_link_args:
['-fopenmp', '-Wl,-dead_strip_dylibs', '-Wl,-pie', '-Wl,-headerpad_max_install_names', '-Wl,-dead_strip_dylibs', '-Wl,-rpath,/Users/ferran/miniconda3/envs/pykrack/lib']
Directory for the R shared library:
lib
CFFI extension type
Environment variable: RPY2_CFFI_MODE
Value: CFFI_MODE.ANY
ABI: PRESENT
API: PRESENT
Type of hierarchy metric to compute. Accepted types are: ‘pykrack’ for this module’s implementation of the Krackhardt score. ‘rsnakrack’ for the sna implementation in R. ‘hierarchy_flow’ for the Luo and Magee 2011 as implemented in the NetworkX package.
#Graph from OmniPath#using the following link: https://omnipathdb.org/interactions?genesymbols=yes&fields=type&license=academic&directed=0#Gets us all interactions, no matter type nor undirected/vs directedomnidata = pd.read_csv("https://omnipathdb.org/interactions?genesymbols=yes&fields=type&license=academic&directed=1", sep="\t")G4 = nx.from_pandas_edgelist(omnidata, source="source_genesymbol", target="target_genesymbol",edge_attr="type", create_using=nx.DiGraph)