Although mutations in the gene encoding the RNA splicing factor SF3B1 are frequent in multiple cancers, their functional effects and therapeutic dependencies are poorly understood. Here, we characterize 98 tumors and 12 isogenic cell lines harboring SF3B1 hotspot mutations, identifying hundreds of cryptic 3′ splice sites common and specific to different cancer types. Regulatory network analysis revealed that the most common SF3B1 mutation activates MYC via effects conserved across human and mouse cells. SF3B1 mutations promote decay of transcripts encoding the protein phosphatase 2A (PP2A) subunit PPP2R5A, increasing MYC S62 and BCL2 S70 phosphorylation, which in turn promote MYC protein stability and impair apoptosis, respectively. Genetic PPP2R5A restoration or pharmacologic PP2A activation impaired SF3B1-mutant tumorigenesis, elucidating a therapeutic approach to aberrant splicing by mutant SF3B1.
Here, we show that mutations in SF3B1, the most commonly mutated splicing factor gene across cancers, alter splicing of a specific subunit of the PP2A serine/threonine phosphatase complex to confer post-translational MYC and BCL2 activation, which can be targeted therapeutically with an FDA-approved drug.
Since the identification of the first cases of the coronavirus in December 2019 in Wuhan, China, there has been a significant amount of confusion regarding the origin and spread of the so-called 'coronavirus', officially named SARS-CoV-2, and the cause of the disease COVID-19. Conflicting messages from the media and officials across different countries and organizations, the abundance of disparate sources of information, unfounded conspiracy theories on the origins of the newly emerging virus and the inconsistent public health measures across different countries, have all served to increase the level of anxiety in the population. Where did the virus come from? How is it transmitted? How does it cause disease? Is it like flu? What is a pandemic? What can we do to stop its spread? Written by a leading expert, this concise and accessible introduction provides answers to the most common questions surrounding coronavirus for a general audience.
Topological Data Analysis for Genomics and Evolution, from Cambridge University Press, explores biology in the age of Big Data. A technical revolution has transformed the field, and extracting meaningful information from large biological data sets is now a central methodological challenge. Algebraic topology is a well-established branch of pure mathematics that studies qualitative descriptors of the shape of geometric objects. It aims to reduce comparisons of shape to a comparison of algebraic invariants, such as numbers, which are typically easier to work with. Topological data analysis is a rapidly developing subfield that leverages the tools of algebraic topology to provide robust multiscale analysis of data sets. This book introduces the central ideas and techniques of topological data analysis and its specific applications to biology, including the evolution of viruses, bacteria and humans, genomics of cancer, and single cell characterization of developmental processes. Bridging two disciplines, the book is for researchers and graduate students in genomics and evolutionary biology as well as mathematicians interested in applied topology.
Topological Data Analysis for Genomics and Evolution: Topology in Biology
The human leukocyte antigen (HLA) locus plays a critical role in tissue compatibility and regulates the host response to many diseases, including cancers and autoimmune disorders. Recent improvements in the quality and accessibility of next-generation sequencing have made HLA typing from standard short-read data practical. However, this task remains challenging given the high level of polymorphism and homology between HLA genes. HLA typing from RNA sequencing is further complicated by post-transcriptional modifications and bias due to amplification.
To address this, the Rabadan Lab developed arcasHLA, a fast and accurate in silico tool that infers HLA genotypes from RNA sequencing data. Our tool outperforms established tools on the gold-standard benchmark dataset for HLA typing in terms of both accuracy and speed, with an accuracy rate of 100% at two-field resolution for class I genes, and over 99.7% for class II. Furthermore, we evaluate the performance of our tool on a new biological dataset of 447 single-end total RNA samples from nasopharyngeal swabs, and establish the applicability of arcasHLA in metatranscriptome studies.
Immune checkpoint inhibitors have been successful across several tumor types; however, their efficacy has been uncommon and unpredictable in glioblastomas (GBM), where <10% of patients show long-term responses. To understand the molecular determinants of immunotherapeutic response in GBM, we longitudinally profiled 66 patients, including 17 long-term responders, during standard therapy and after treatment with PD-1 inhibitors (nivolumab or pembrolizumab). Genomic and transcriptomic analysis revealed a significant enrichment of PTEN mutations associated with immunosuppressive expression signatures in non-responders, and an enrichment of MAPK pathway alterations (PTPN11, BRAF) in responders. Responsive tumors were also associated with branched patterns of evolution from the elimination of neoepitopes as well as with differences in T cell clonal diversity and tumor microenvironment profiles. Our study shows that clinical response to anti-PD-1 immunotherapy in GBM is associated with specific molecular alterations, immune expression signatures, and immune infiltration that reflect the tumor’s clonal evolution during treatment.
The top figure shows brain MRIs of two patients treated with nivolumab, one of whom showed disease progression following 2 months of treatment (left, NU 7) while the other showed stable disease without progression after 17 months of treatment (right, NU 11). The bottom figure is a Kaplan–Meier curve comparing overall survival of patients who responded to anti-PD-1 therapy (n = 13) with those who did not respond (n = 12).
Outcomes of anticancer therapy vary dramatically among patients due to diverse genetic and molecular backgrounds, highlighting extensive intertumoral heterogeneity. The fundamental tenet of precision oncology is that molecular characterization of tumors should guide optimal patient-tailored therapy. Towards this goal, we have established a compilation of pharmacological landscapes of 462 patient-derived tumor cells (PDCs) across 14 cancer types, together with genomic and transcriptomic profiling in 385 of these tumors. Compared with the traditional long-term cultured cancer cell line models, PDCs recapitulate the molecular properties and biology of the diseases more precisely. Here, we provide insights into dynamic pharmacogenomic associations, including molecular determinants that elicit therapeutic resistance to EGFR inhibitors, and the potential repurposing of ibrutinib (currently used in hematological malignancies) for EGFR-specific therapy in gliomas. Lastly, we present a potential implementation of PDC-derived drug sensitivities for the prediction of clinical response to targeted therapeutics using retrospective clinical studies.
Pharmacogenomic landscape of patient-derived tumor cells informs precision oncology therapy
Analyzing large-scale, multi-experiment studies requires scientists to test each experimental outcome for statistical significance and then assess the results as a whole. We present Black Box FDR (BB-FDR), an empirical-Bayes method for analyzing multi-experiment studies when many covariates are gathered per experiment. BB-FDR learns a series of black box predictive models to boost power and control the false discovery rate (FDR) at two stages of study analysis. In Stage 1, it uses a deep neural network prior to report which experiments yielded significant outcomes. In Stage 2, a separate black box model of each covariate is used to select features that have significant predictive power across all experiments. In benchmarks, BB-FDR outperforms competing state-of-the-art methods in both stages of analysis. We apply BB-FDR to two real studies on cancer drug efficacy. For both studies, BB-FDR increases the proportion of significant outcomes discovered and selects variables that reveal key genomic drivers of drug sensitivity and resistance in cancer.
Black Box FDR
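For readers unfamiliar with false discovery rate control, the classical Benjamini-Hochberg step-up procedure gives a feel for what "controlling the FDR" across many experiments means. The sketch below is not BB-FDR and uses no covariates or neural network prior; it is only the standard covariate-free baseline, and the function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical Benjamini-Hochberg step-up procedure (NOT the BB-FDR method).

    Returns a list of booleans, one per input p-value, marking the
    hypotheses rejected at target false discovery rate `alpha`.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten experiments (illustrative only).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_vals, alpha=0.05))
```

With these toy values only the two smallest p-values are rejected. Methods such as BB-FDR aim to recover more discoveries at the same FDR by using per-experiment covariates, which this baseline ignores.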
We developed a computational framework to reconstruct the non-coding transcriptome from cross-sectional RNA-Seq, integrating somatic copy number alterations (SCNA), common germline variants associated with pancreatic ductal adenocarcinoma (PDA) risk, and clinical outcome. We generated a catalogue of PDA-associated lncRNAs. We showed that lncRNAs define molecular subtypes with biological and clinical significance. We identified lncRNAs in genomic regions with SCNA and single nucleotide polymorphisms associated with lifetime risk of PDA and associated with clinical outcome using genomic and clinical data in PDA. We found that loss of LINC00673 regulates the epithelial differentiation state in PDA cells, increases migratory capacity in vitro and in vivo, and results in loss of epithelial and gain of mesenchymal markers, both in vitro and in tumour samples. This finding is further reflected in poor clinical outcome in low LINC00673 tumours. We expect that the collection of PDA-associated lncRNAs will aid in the design of targeted therapies and may contribute to the development of improved diagnostic tools for PDA. The recent clinical approval of the first antisense therapy for human disease provides a viable, practical approach for leveraging this new understanding of cancer biology.
Comprehensive characterisation of compartment-specific long non-coding RNAs associated with pancreatic ductal adenocarcinoma
Dissecting the pathogenesis of classical Hodgkin lymphoma (cHL), a common cancer in young adults, remains challenging because of the rarity of tumor cells in involved tissues (usually lower than 5%). Here, we analyzed the coding genome of cHL by microdissecting tumor and normal cells from 34 patient biopsies for a total of ∼50 000 singly isolated lymphoma cells. We uncovered several recurrently mutated genes, namely, STAT6 (32% of cases), GNA13 (24%), XPO1 (18%), and ITPKB (16%), and document the functional role of mutant STAT6 in sustaining tumor cell viability. Mutations of STAT6 genetically and functionally cooperated with disruption of SOCS1, a JAK-STAT pathway inhibitor, to promote cHL growth. Overall, 87% of cases showed dysregulation of the JAK-STAT pathway by genetic alterations in multiple genes (also including STAT3, STAT5B, JAK1, JAK2, and PTPN1), attesting to the pivotal role of this pathway in cHL pathogenesis and highlighting its potential as a new therapeutic target in this disease.
The Handbook of Discrete and Computational Geometry is intended as a reference book fully accessible to nonspecialists as well as specialists, covering all major aspects of both fields. The book offers the most important results and methods in discrete and computational geometry to those who use them in their work, both in the academic world (as researchers in mathematics and computer science) and in the professional world (as practitioners in fields as diverse as operations research, molecular biology, and robotics). Discrete geometry has contributed significantly to the growth of discrete mathematics in recent years. This has been fueled partly by the advent of powerful computers and by the recent explosion of activity in the relatively young field of computational geometry. This synthesis between discrete and computational geometry lies at the heart of this Handbook. A growing list of application fields includes combinatorial optimization, computer-aided design, computer graphics, crystallography, data analysis, error-correcting codes, geographic information systems, motion planning, operations research, pattern recognition, robotics, solid modeling, and tomography.
Geometry and topology of genomic data.
Precision medicine in cancer proposes that genomic characterization of tumors can inform personalized targeted therapies. However, this proposition is complicated by spatial and temporal heterogeneity. Here we study genomic and expression profiles across 127 multisector or longitudinal specimens from 52 individuals with glioblastoma (GBM). Using bulk and single-cell data, we find that samples from the same tumor mass share genomic and expression signatures, whereas geographically separated, multifocal tumors and/or long-term recurrent tumors are seeded from different clones. Chemical screening of patient-derived glioma cells (PDCs) shows that therapeutic response is associated with genetic similarity, and multifocal tumors that are enriched with PIK3CA mutations have a heterogeneous drug-response pattern. We show that targeting truncal events is more efficacious than targeting private events in reducing the tumor burden. In summary, this work demonstrates that evolutionary inference from integrated genomic analysis in multisector biopsies can inform targeted therapeutic interventions for patients with GBM.
Protein synthesis in eukaryotes is regulated by diverse reprogramming mechanisms that expand the coding capacity of individual genes. One such mechanism is programmed ribosomal frameshifting (PRF). In this work, efficient PRF stimulatory RNA elements were discovered by in vitro selection, and then ligand-responsive switches were constructed by coupling PRF stimulatory elements to RNA aptamers using rational design and directed evolution. Motif discovery was enabled by the methodological novelty of deep sequencing an initially randomized library of RNA sharing a certain pseudoknot scaffold that had undergone multiple rounds of in vitro selection for PRF. This approach led to a rich characterization of precise pseudoknot geometries that can facilitate translation reprogramming, an area with great potential for synthetic biology.
Reprogramming eukaryotic translation with ligand-responsive synthetic RNA switches.
Glioblastoma (GBM) is the most common and most aggressive brain tumor in adults. Current treatment involves surgery, radiotherapy, and chemotherapy with alkylating agents. Despite intensive treatment, GBM always recurs, and the recurrent tumor is typically resistant to therapy, leading to death. To understand how GBM evolves under therapy, we have analyzed longitudinal genomic/transcriptomic data from 114 patients, and uncovered the evolutionary landscape of GBM. Importantly, we found that 63% of patients experience expression-based subtype changes, 15% of tumors present hypermutation at relapse in highly expressed genes, and 11% of recurrent tumors harbor mutations in LTBP4, which encodes a protein binding to TGF-β.
Clonal evolution of glioblastoma under therapy.
Population-based recombination maps capture the recombination history of populations using genomic data and are a valuable tool in the study of human recombination. We have developed fast statistical estimators of the recombination rate based on topological summaries. Compared to standard linkage-based estimators, topology-based estimators can deal with a larger number of segregating sites and genomes without incurring excessive computational costs. Applying these estimators to phased genotype data of 647 human individuals, we have produced high-resolution, genome-wide maps of human recombination, which have uncovered several novel associations. Specific transcription factor binding sites are frequently associated with recombination. These include binding sites of MLL complexes, which play prominent regulatory roles in germ cell development and early embryogenesis. Additionally, some repeat-derived loci, coding families of transposable elements that are expressed during embryogenesis, are also enriched for recombination.
Topological Data Analysis Generates High-Resolution, Genome-wide Maps of Human Recombination
The Human Genome Project has shown that only a small fraction (<2%) of the human genome can be transcribed into mRNA that is further translated into protein, and the vast majority of the mammalian genome might express non-coding RNA (ncRNA). Although a number of long non-coding RNAs (lncRNAs) have recently been shown to play significant roles in the regulation of gene expression or protein activity in critical signaling pathways, the total number of ncRNAs and the fraction of functional ncRNAs within the mammalian genome remain unknown. To reveal the landscape of ncRNA expression and, specifically, to capture the expression of transient RNAs, we have developed an RNA-seq Analysis pipeline of Transcriptome Reconstruction and Annotation to Identify Novel non-coding RNAs from exosome-deficient cells (ATRAIN).
RNA-Exosome-Regulated Long Non-coding RNA Transcription Controls Super-Enhancer Activity
"If germline genetic variation in Mendelian loci predisposes bearers to common cancers, the same loci may harbour cancer-associated somatic variation. Compilations of clinical records spanning over 100 million patients provide an unprecedented opportunity to assess clinical associations between Mendelian diseases and cancers. We systematically compare these comorbidities against recurrent somatic mutations from more than 5,000 patients across many cancers. Using multiple measures of genetic similarity, we show that a Mendelian disease and comorbid cancer indeed have genetic alterations of significant functional similarity."
"The first-ever systematic study of the genomes of patients with ALK-negative anaplastic large cell lymphoma (ALCL), a particularly aggressive form of non-Hodgkin’s lymphoma (NHL), shows that many cases of the disease are driven by alterations in the JAK/STAT3 cell signaling pathway. The study also demonstrates, in mice implanted with human-derived ALCL tumors, that the disease can be inhibited by compounds that target this pathway, raising hopes that more effective treatments might soon be developed."
A graph representing the sequence of genomic alterations in chronic lymphocytic leukemia (CLL). Each node represents a mutation, with arrows indicating temporal relationships between them. The size of the nodes indicates the number of patients in the study who exhibited the alteration, while the thickness of the lines shows how often the temporal relationships between nodes were seen. The method the researchers used enabled them to identify multiple, distinct evolutionary patterns in CLL.
Tumor evolutionary directed graphs and the history of chronic lymphocytic leukemia.
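The node sizes and edge thicknesses described in this caption can be thought of as simple counts over patients. The following is a minimal sketch of that bookkeeping under strong simplifying assumptions: each patient is a list of time points, each time point is the set of alterations observed, and an edge A -> B is counted when A is first seen strictly before B in the same patient. The function name, the input layout, and the labels "A", "B", "C" are hypothetical, and this is not the published method.

```python
from collections import Counter


def build_order_graph(patients):
    """Count nodes and directed edges from longitudinal alteration calls.

    patients: dict mapping a patient id to a list of sets, where the i-th
    set holds the alterations observed at the i-th time point.
    Returns (node_counts, edge_counts): how many patients carry each
    alteration, and how many patients show each "earlier -> later" order.
    """
    node_counts = Counter()
    edge_counts = Counter()
    for timepoints in patients.values():
        # Time point at which each alteration is first observed.
        first_seen = {}
        for t, alterations in enumerate(timepoints):
            for alt in alterations:
                first_seen.setdefault(alt, t)
        node_counts.update(first_seen.keys())
        # Directed edge a -> b when a strictly precedes b in this patient.
        for a, ta in first_seen.items():
            for b, tb in first_seen.items():
                if ta < tb:
                    edge_counts[(a, b)] += 1
    return node_counts, edge_counts


# Toy input with placeholder alteration labels (not real data).
toy = {
    "p1": [{"A"}, {"A", "B"}],
    "p2": [{"A"}, {"A", "B", "C"}],
    "p3": [{"B"}, {"B", "C"}],
}
nodes, edges = build_order_graph(toy)
print(dict(nodes))
print(dict(edges))
```

In a drawing, `nodes` would set the node sizes and `edges` the arrow thicknesses. Alterations first seen at the same time point get no edge here, which is one of several places where a real analysis would need more care.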
Activation-induced cytidine deaminase (AID) is an enzyme that generates mutations and translocations in mature B cells to produce antibody diversity by targeting immunoglobulin loci, but “off-targets” of AID also lead to cancer. The mechanism by which AID finds its targets is still unclear. By conditionally knocking out Exosc3, a protein in the RNA exosome complex, we have identified a novel type of noncoding RNA, xTSS-RNA, which is most strongly expressed at genes that accumulate AID-mediated somatic mutations and/or are frequent translocation partners of DNA double-stranded breaks generated at the immunoglobulin heavy chain (IgH), indicating a role of this noncoding RNA in the AID targeting mechanism.
Topological network representation of S. aureus genome profiles. Color corresponds to enrichment in mecA, an antibiotic resistance gene.
Tumor evolutionary modes visualized in PΣ3.
Persistent homology computes topological invariants from point cloud data. Recent work has focused on developing statistical methods for data analysis in this framework. We show that, in certain models, parametric inference can be performed using statistics defined on the computed invariants. We develop this idea with a model from population genetics, the coalescent with recombination. We apply our model to an influenza dataset, identifying two scales of topological structure which have a distinct biological interpretation.
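To make "topological invariants from point cloud data" concrete, the sketch below computes only the simplest case: dimension-0 persistence (connected components) of a Vietoris-Rips filtration, using a union-find structure over edges sorted by length. Every component is born at scale 0 and dies at the edge length that merges it into another. The function name and the four sample points are illustrative assumptions. This does not implement the coalescent-based inference described above, and higher-dimensional invariants need a dedicated library.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components in a Vietoris-Rips filtration.

    points: sequence of coordinate tuples. Every component is born at
    scale 0; the returned list holds the n - 1 finite death scales in
    ascending order (one component never dies).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # This edge merges two components: one of them dies at scale d.
            parent[ri] = rj
            deaths.append(d)
    return deaths


# Two tight pairs of points placed far apart (toy data).
cloud = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(h0_persistence(cloud))
```

For this toy cloud the death scales fall into two groups, near 1 (points merging within each pair) and at 10 (the two pairs merging). That is a small-scale analogue of reading distinct scales of structure off a persistence diagram.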
"Recent genomic studies have made it clear that evolution does not only proceed in a 'vertical' pattern in which one organism inherits genomic information from the organisms from which it descends (figure A). Scientists now understand that genomic evolution can also be 'horizontal'; that is, genomic information can be transferred between organisms or evolutionarily similar groups of organisms that are in parallel lineages (figure B), such as in cases of species hybridization in eukaryotes, lateral gene transfer in bacteria, recombination and reassortment in viruses, viral integration in eukaryotes, and fusion of genomes of symbiotic species. These observations suggest that phylogenetic trees have limitations in their ability to characterize evolution at the molecular level and that another model is needed that can integrate both vertical and horizontal evolution."
---CU Systems Biology News