Proteogenomics involves the integrative analysis of genomic, transcriptomic, proteomic and post-translational modification (PTM) data produced by next-generation sequencing and mass spectrometry-based proteomics. Several publications by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and others have highlighted the impact of proteogenomics in enabling deeper insight into the biology of cancer and identification of potential drug targets 1–4. To encapsulate the complex data processing required for proteogenomics, and to provide a simple interface for deploying a range of data analysis algorithms, we have developed PANOPLY, a cloud-based platform for automated and reproducible proteogenomic data analysis.

A wide array of algorithms has been implemented, and we highlight the application of PANOPLY to the analysis of cancer proteogenomic data. PANOPLY uses state-of-the-art statistical and machine learning algorithms to transform multi-omic data from cancer samples into biologically meaningful and interpretable results. PANOPLY provides a comprehensive collection of proteogenomic data analysis methods including sample QC (sample quality evaluation using profile plots and tumor purity scores 1, identification of sample swaps, etc.), association analysis, RNA and copy number correlation (to proteome), connectivity map (CMAP) analysis 5, outlier analysis using BlackSheep 6, PTM Signature Enrichment Analysis (PTM-SEA) 7, Gene Set Enrichment Analysis (GSEA) 8 and single-sample GSEA 7, consensus clustering, and multi-omic clustering using non-negative matrix factorization (NMF) (Figure 1A). Most analysis modules include a report generation task that outputs an interactive HTML report summarizing results from the analysis (Figure 1B). A complete list of PANOPLY task modules with documentation can be found at the PANOPLY wiki.

The types of liquid chromatography-tandem mass spectrometry (LC-MS/MS) proteomics data amenable to analysis by PANOPLY include label-free and isobaric mass tag label-based approaches, such as iTRAQ and TMT, for profiling the proteome and multiple PTM-omes, including phospho-, acetyl- and ubiquitylomes. PANOPLY can analyze LC-MS/MS data input as (log-transformed) intensities or ratios.

Figure 1. Overview of PANOPLY architecture and the various tasks that constitute the complete workflow. Tasks can also be run independently, or combined into custom workflows, including new tasks added by users.

(A) Inputs to PANOPLY consist of (i) externally characterized genomics and proteomics data (in gct format), (ii) sample phenotypes and annotations (in csv format) and (iii) parameter settings (in yaml format). PANOPLY modules are grouped into Data Preparation Modules (green box), Data Analysis Modules (blue box) and Report Modules (red box). Data preparation modules perform quality checks on input data followed by optional normalization and filtering for proteomics data. Analysis-ready data tables are then used as inputs to the data analysis modules. Results from the data analysis modules are summarized in interactive reports generated by the appropriate report modules.

(B) A sampling of reports from PANOPLY for the tutorial BRCA dataset 1 that can be explored in a web browser. The full suite of reports (html) derived from the tutorial dataset can be accessed and explored online, with an accompanying interpretation summary.

Multi-omics NMF clustering: Heatmap (left) depicts multi-omics expression patterns across three NMF clusters derived from integrative clustering of copy-number alterations (CNA), RNA, protein and phosphosite data. Annotation tracks depict mutation status of key BRCA genes (TP53, GATA3, PIK3CA), hormone receptor status (ER, PR, HER2) and PAM50 transcriptional subtypes. Right heatmap shows activity scores of cancer hallmark pathways for the three NMF clusters.

Association analysis: Interactive volcano plot illustrating differentially abundant signatures of cancer hallmark pathways between tumors with GATA3 mutations (right side) and GATA3 wild-type tumors (left side).

RNA correlation analysis: Histogram of all (blue) and significant (red, FDR < 0.05) gene-level Pearson correlations between RNA and protein data. A browsable table summarizes the results.

CNA correlation analysis: Pairwise gene-level correlations between CNA (x-axis in both panels) and RNA (y-axis, left panel) and proteome (y-axis, right panel). A table summarizing high-level results and an interactive plot of correlation (y-axis) against rank (x-axis) enable further exploration of the results.
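The RNA correlation analysis described for Figure 1B reports gene-level Pearson correlations between RNA and protein data and flags those with FDR < 0.05. The text does not say how PANOPLY computes p-values or controls the FDR, so the following is only a minimal sketch of the general idea, not PANOPLY's implementation. It assumes a Fisher z-transform (normal approximation) for the p-value and the Benjamini-Hochberg procedure for the FDR; the function names (`pearson_r`, `benjamini_hochberg`, `gene_level_correlations`) and the toy values are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = math.fsum(x) / n, math.fsum(y) / n
    sxy = math.fsum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = math.fsum((a - mx) ** 2 for a in x)
    syy = math.fsum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def approx_pvalue(r, n):
    """Two-sided p-value via the Fisher z-transform (an assumed approximation)."""
    if n < 4:
        raise ValueError("need at least 4 samples")
    # Clamp so atanh stays finite for perfectly correlated profiles.
    r = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def gene_level_correlations(rna, protein, fdr_cutoff=0.05):
    """Correlate RNA and protein profiles for genes present in both tables."""
    genes = sorted(set(rna) & set(protein))
    rs = [pearson_r(rna[g], protein[g]) for g in genes]
    ps = [approx_pvalue(r, len(rna[g])) for g, r in zip(genes, rs)]
    qs = benjamini_hochberg(ps)
    return {
        g: {"r": r, "p": p, "fdr": q, "significant": q < fdr_cutoff}
        for g, r, p, q in zip(genes, rs, ps, qs)
    }


# Toy data (made-up numbers): gene -> values across the same eight samples.
rna = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "GENE_B": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],
}
protein = {
    "GENE_A": [1.1, 2.3, 2.9, 4.2, 4.8, 6.1, 7.3, 7.9],
    "GENE_B": [5.0, 3.0, 6.0, 2.0, 7.0, 1.0, 8.0, 4.0],
}

results = gene_level_correlations(rna, protein)
for gene, row in results.items():
    print(gene, round(row["r"], 3), row["significant"])
```

In this sketch, the histogram in the report would correspond to the distribution of the `r` values, with the `significant` flag marking the subset that passes the FDR cutoff. A real analysis would read the profiles from the gct inputs and would likely use an exact t-based p-value rather than the normal approximation used here.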