Uses Leiden clustering with modularity optimization for community detection. Leiden is the default because the expected PPI data is not inherently hierarchical, so communities are found by optimizing modularity on the graph topology. The expected PPI data comes from genes/proteins of interest selected through gene set enrichment analysis or differential expression analysis. Clustering by terms is not possible, because a gene can belong to multiple terms. Leiden also scales well to large graphs, gives consistent clustering outcomes, and provides some inherent guarantees from its method, e.g. locally optimal assignment.
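As a rough illustration of the clustering step described above, the following minimal sketch runs Leiden with a modularity objective directly through igraph. It is not the internal code of get_ppigraph(): the toy interaction table, the variable names, and the seed are assumptions made for this example only.

```r
library(igraph)

# Toy PPI table in the 'from'/'to' layout; these interactions are made up
# for illustration and are not taken from the package's example data.
ppi <- data.frame(
  from = c("A", "A", "B", "D", "D", "E", "C"),
  to   = c("B", "C", "C", "E", "F", "F", "D")
)

# Build an undirected graph from the edge list.
g <- graph_from_data_frame(ppi, directed = FALSE)

# Leiden community detection on modularity. cluster_leiden() defaults to
# the CPM objective, so the objective is set explicitly here.
set.seed(42)
communities <- cluster_leiden(g, objective_function = "modularity")

# Cluster ID per vertex, and the modularity of the resulting partition.
cluster_ids <- membership(communities)
partition_modularity <- modularity(g, cluster_ids)

print(cluster_ids)
print(partition_modularity)
```

A vector such as cluster_ids has the shape of what the vertex_clustering argument describes (numeric cluster IDs), although whether get_ppigraph() expects it in vertex order is an assumption that should be checked against the package source.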

get_ppigraph(ppi_data, vertex_clustering = NULL)

Arguments

ppi_data

data frame of protein-protein interactions (PPI), with aliases/IDs in columns 'from' and 'to'

vertex_clustering

NULL (default), else a numeric vector of cluster IDs

Value

igraph object of PPI data

References

Traag, V.A., Waltman, L. & van Eck, N.J. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 9, 5233 (2019). https://doi.org/10.1038/s41598-019-41695-z

Examples

get_ppigraph(
  get(load(system.file("extdata", "example_ppi_data.rda", package = "goatea")))
)
#> IGRAPH b4a29c8 UNW- 6 15 -- 
#> + attr: central gene (g/c), modularity (g/n), transitivity (g/n),
#> | assortattivity (g/n), mean distance (g/n), edge density (g/n), degree
#> | centralization (g/n), betweenness centralization (g/n), closeness
#> | centralization (g/n), eigen centralization (g/n), name (v/c), cluster
#> | (v/n), degree (v/n), betweenness (v/n), closeness (v/n), knn (v/n),
#> | diversity (v/n), id (v/c), combined_score (e/n), from (e/c), to
#> | (e/c), edge_betweenness (e/n), weight (e/n)
#> + edges from b4a29c8 (vertex names):
#>  [1] TP53--MYC   TP53--BRCA1 TP53--SOX2  TP53--MTOR  TP53--EGFR  EGFR--MTOR 
#>  [7] EGFR--SOX2  EGFR--BRCA1 EGFR--MYC   SOX2--MYC   SOX2--BRCA1 SOX2--MTOR 
#> + ... omitted several edges