Title: | Personalized Network-Based Anti-Cancer Therapy Evaluation |
---|---|
Description: | Identification of the most appropriate pharmacotherapy for each patient based on genomic alterations is a major challenge in personalized oncology. 'PANACEA' is a collection of personalized anti-cancer drug prioritization approaches utilizing network methods. The methods utilize personalized "driverness" scores from 'driveR' to rank drugs, mapping these onto a protein-protein interaction network. The "distance-based" method scores each drug based on these scores and distances between drugs and genes to rank given drugs. The "RWR" method propagates these scores via a random-walk with restart framework to rank the drugs. The methods are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2023. PANACEA: network-based methods for pharmacotherapy prioritization in personalized oncology. Bioinformatics <doi:10.1093/bioinformatics/btad022>. |
Authors: | Ege Ulgen [aut, cre, cph] |
Maintainer: | Ege Ulgen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.1.9000 |
Built: | 2025-03-02 04:34:58 UTC |
Source: | https://github.com/egeulgen/panacea |
Add Drugs as Nodes
add_drugs_as_nodes(W_mat, drug_target_interactions, edge_weight = 1000)
add_drugs_as_nodes(W_mat, drug_target_interactions, edge_weight = 1000)
W_mat |
adjacency matrix for the chosen PIN |
drug_target_interactions |
data frame containing (processed) drugs and target genes |
edge_weight |
edge weight for drug-target gene interaction (default = 1000) |
adjacency matrix with the drugs added as nodes
Turn Adjacency List into Adjacency Matrix
adj_list2mat(adj_list)
adj_list2mat(adj_list)
adj_list |
Adjacency list |
Adjacency matrix
Convert Input Gene Symbols to Alias
convert2alias(input_genes, target_genes)
convert2alias(input_genes, target_genes)
input_genes |
vector of input genes |
target_genes |
vector of target genes |
vector of converted gene symbols (if any alias in target genes)
Data frame containing drug-gene interactions from expert-curated sources (CancerCommons, CGI, ChemblInteractions, CIViC, ClearityFoundationBiomarkers, ClearityFoundationClinicalTrial, COSMIC, DoCM, MyCancerGenome, MyCancerGenomeClinicalTrial, TALC, TdgClinicalTrial, TEND) from DGIdb.
DGIdb_interactions_df
DGIdb_interactions_df
a data frame containing 11323 rows and 2 variables:
Drug name
HGNC gene symbol for the interacting gene
Data frame containing 'driveR' results for a lung adenocarcinoma case.
example_driveR_res
example_driveR_res
a data frame containing 106 rows and 3 variables:
HGNC gene symbol
'driverness' probability
driveR's prediction for whether the gene is a 'driver' or 'non-driver'
Vector containing 'PANACEA' "distance-based" results for a lung adenocarcinoma case. Names are drug names, values are scores
example_scores_dist
example_scores_dist
named vector of 1423 values
Vector containing 'PANACEA' "RWR" results for a lung adenocarcinoma case. Names are drug names, values are scores
example_scores_RWR
example_scores_RWR
named vector of 1423 values
Graph Laplacian Normalization
Laplacian.norm(W)
Laplacian.norm(W)
W |
square symmetric adjacency matrix |
normalized adjacency matrix
Network Propagation (Random-walk with Restart)
network_propagation(prior_vec, W_prime, alpha, max.iter = 1000, eps = 1e-04)
network_propagation(prior_vec, W_prime, alpha, max.iter = 1000, eps = 1e-04)
prior_vec |
vector of prior knowledge on selected genes (names are gene symbols) |
W_prime |
(Laplacian-normalized, symmetric) adjacency matrix |
alpha |
restart parameter, controlling trade-off between prior information and network smoothing |
max.iter |
maximum allowed number of iterations (default = 1000) |
eps |
epsilon value to assess the L2 norm of the difference between iterations (default = 1e-4) |
Implementing RWR following the following publications: Cowen L, Ideker T, Raphael BJ, Sharan R. Network propagation: a universal amplifier of genetic associations. Nat Rev Genet. 2017 Sep;18(9):551–62. Shnaps O, Perry E, Silverbush D, Sharan R. Inference of personalized drug targets via network propagation. Pac Symp Biocomput. 2016;21:156–67.
vector of propagation values
Identification of the most appropriate pharmacotherapy for each patient based
on genomic alterations is a major challenge in personalized oncology.
PANACEA
is a collection of personalized anti-cancer drug prioritization
approaches utilizing network methods. The methods utilize personalized
"driverness" scores from 'driveR' to rank drugs, mapping these onto a
protein-protein interaction network (PIN). The "distance-based" method scores
each drug based on these scores and distances between drugs and genes to rank
given drugs. The "RWR" method propagates these scores via a random-walk with
restart framework to rank the drugs.
Maintainer: Ege Ulgen [email protected] (ORCID) [copyright holder]
score_drugs
for the wrapper function for scoring of
drugs via network-based methods
Process Drug-Target Interactions
process_drug_target_interactions( drug_target_interactions, PIN_genes, drug_name_col = "drug_name", target_col = "gene_name" )
process_drug_target_interactions( drug_target_interactions, PIN_genes, drug_name_col = "drug_name", target_col = "gene_name" )
drug_target_interactions |
data frame containing drugs and target genes |
PIN_genes |
gene symbols for the chosen PIN |
drug_name_col |
name of the column containing drug names (default = "drug_name") |
target_col |
name of the column containing drug targets (default = "converted_target_gene") |
processed drug-target interactions. Processing involves converting symbols missing in the PIN, merging drugs that have the same target gene(s)
Scoring of Drugs via Network-based Methods
score_drugs(driveR_res, drug_interactions_df, W_mat, method, ...)
score_drugs(driveR_res, drug_interactions_df, W_mat, method, ...)
driveR_res |
data frame of driveR results |
drug_interactions_df |
data frame of drug-gene interactions |
W_mat |
adjacency matrix for the PIN |
method |
scoring method (one of 'distance-based' or 'RWR') |
... |
additional arguments for |
This is the wrapper function for the two proposed methods for personalized scoring of drugs for individual cancer samples via network-based methods. The available methods are 'distance-based' and 'RWR'. For the 'distance-based' method, the score between a gene (g) and drug (d) is formulated as:
where driver(g) is the driverness probability of gene g, as predicted by 'driveR' and d(g, d) is the distance withing the PIN between gene g and drug d. The final score of the drug d is then the average of the scores between each altered gene and d:
For the 'RWR' method, a random-walk with restart framework is used to propagate the driverness probabilities.
By default DGIdb_interactions_df
is used as the
drug_interactions_df
.
If the W_mat
argument is not supplied, the built-in STRNG data
STRING_adj_df
is used to generate W_mat
.
vector of scores per drug.
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:25, ] res <- score_drugs( driveR_res = toy_data, drug_interactions_df = toy_interactions, # leave blank for default W_mat = toy_W_mat, # leave blank for default method = "distance-based", verbose = FALSE )
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:25, ] res <- score_drugs( driveR_res = toy_data, drug_interactions_df = toy_interactions, # leave blank for default W_mat = toy_W_mat, # leave blank for default method = "distance-based", verbose = FALSE )
Distance-based Scoring of Drugs
score_drugs_distance_based( driveR_res, drug_interactions_df, W_mat, driver_prob_cutoff = 0.05, drug_name_col = "drug_name", target_col = "gene_name", verbose = TRUE )
score_drugs_distance_based( driveR_res, drug_interactions_df, W_mat, driver_prob_cutoff = 0.05, drug_name_col = "drug_name", target_col = "gene_name", verbose = TRUE )
driveR_res |
data frame of driveR results |
drug_interactions_df |
data frame of drug-gene interactions |
W_mat |
adjacency matrix for the PIN |
driver_prob_cutoff |
cut-off value for 'driverness_prob' (default = 0.05) |
drug_name_col |
for 'drug_interactions_df', the column name containing drug names/identifiers |
target_col |
for 'drug_interactions_df', the column name containing target gene symbols |
verbose |
boolean to control verbosity (default = |
vector of scores per drug. Drugs with the same target gene(s) are merged
(via process_drug_target_interactions
)
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:100, ] res <- score_drugs_distance_based( driveR_res = toy_data, drug_interactions_df = toy_interactions, W_mat = toy_W_mat, verbose = FALSE )
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:100, ] res <- score_drugs_distance_based( driveR_res = toy_data, drug_interactions_df = toy_interactions, W_mat = toy_W_mat, verbose = FALSE )
RWR-based Scoring of Drugs
score_drugs_RWR_based( driveR_res, drug_interactions_df, W_mat, alpha = 0.05, max.iter = 1000, eps = 1e-04, drug_name_col = "drug_name", target_col = "gene_name", verbose = TRUE )
score_drugs_RWR_based( driveR_res, drug_interactions_df, W_mat, alpha = 0.05, max.iter = 1000, eps = 1e-04, drug_name_col = "drug_name", target_col = "gene_name", verbose = TRUE )
driveR_res |
data frame of driveR results |
drug_interactions_df |
data frame of drug-gene interactions |
W_mat |
adjacency matrix for the PIN |
alpha |
restart parameter, controlling trade-off between prior information and network smoothing |
max.iter |
maximum allowed number of iterations (default = 1000) |
eps |
epsilon value to assess the L2 norm of the difference between iterations (default = 1e-4) |
drug_name_col |
for 'drug_interactions_df', the column name containing drug names/identifiers |
target_col |
for 'drug_interactions_df', the column name containing target gene symbols |
verbose |
boolean to control verbosity (default = |
vector of scores per drug. Drugs with the same target gene(s) are merged
(via process_drug_target_interactions
)
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:100, ] res <- score_drugs_RWR_based( driveR_res = toy_data, drug_interactions_df = toy_interactions, W_mat = toy_W_mat, verbose = FALSE )
toy_data <- data.frame( gene_symbol = c("TP53", "EGFR", "KDR", "ATM"), driverness_prob = c(0.94, 0.92, 0.84, 0.72) ) toy_interactions <- DGIdb_interactions_df[1:100, ] res <- score_drugs_RWR_based( driveR_res = toy_data, drug_interactions_df = toy_interactions, W_mat = toy_W_mat, verbose = FALSE )
Data frame of adjacency list for STRING v11.5 interactions with combined score > 700 (high confidence)
STRING_adj_df
STRING_adj_df
a data frame with 887797 rows and 3 variables:
Interactor 1
Interactor 2
edge weight(combined score)
Symmetric matrix containing example adjacency data
toy_W_mat
toy_W_mat
matrix of 84 rows and 84 columns