Research Highlights


Comprehensive characterisation of compartment-specific long non-coding RNAs associated with pancreatic ductal adenocarcinoma

We developed a computational framework to reconstruct the non-coding transcriptome from cross-sectional RNA-Seq, integrating somatic copy number alterations (SCNA), common germline variants associated with PDA risk, and clinical outcome. We generated a catalogue of PDA-associated lncRNAs and showed that lncRNAs define molecular subtypes with biological and clinical significance. Using genomic and clinical data in PDA, we identified lncRNAs in genomic regions with SCNA and single nucleotide polymorphisms associated with lifetime risk of PDA and with clinical outcome. We found that loss of LINC00673 regulates the epithelial differentiation state in PDA cells, increases migratory capacity in vitro and in vivo, and results in loss of epithelial and gain of mesenchymal markers, both in vitro and in tumour samples. This finding is further reflected in the poor clinical outcome of tumours with low LINC00673 expression. We expect that the collection of PDA-associated lncRNAs will aid in the design of targeted therapies and may contribute to the development of improved diagnostic tools for PDA. The recent clinical approval of the first antisense therapy for human disease provides a viable, practical approach for leveraging this new understanding of cancer biology.

Comprehensive characterisation of compartment-specific long non-coding RNAs associated with pancreatic ductal adenocarcinoma
Luis Arnes, Zhaoqi Liu, Jiguang Wang, Hans Carlo Maurer, Irina Sagalovskiy, Marta Sanchez-Martin, Nikhil Bommakanti, Diana C Garofalo, Dina A Balderes, Lori Sussel, Kenneth P Olive, Raul Rabadan
Gut. Published Online First: 10 February 2018. doi: 10.1136/gutjnl-2017-314353

Geometry and topology of genomic data.

The Handbook of Discrete and Computational Geometry is intended as a reference book fully accessible to nonspecialists as well as specialists, covering all major aspects of both fields. The book offers the most important results and methods in discrete and computational geometry to those who use them in their work, both in the academic world (researchers in mathematics and computer science) and in the professional world (practitioners in fields as diverse as operations research, molecular biology, and robotics). Discrete geometry has contributed significantly to the growth of discrete mathematics in recent years. This growth has been fueled partly by the advent of powerful computers and by the recent explosion of activity in the relatively young field of computational geometry. The synthesis between discrete and computational geometry lies at the heart of this Handbook. A growing list of application fields includes combinatorial optimization, computer-aided design, computer graphics, crystallography, data analysis, error-correcting codes, geographic information systems, motion planning, operations research, pattern recognition, robotics, solid modeling, and tomography.

Geometry and topology of genomic data.
A.J. Blumberg and R. Rabadan.
Chapter 64 in J.E. Goodman, J. O'Rourke, and C.D. Toth, editors,
Handbook of Discrete and Computational Geometry, 2nd edition,
CRC Press, Boca Raton, FL, 2017, pp. 1735-1773.


Spatiotemporal genomic architecture informs precision oncology in glioblastoma

Precision medicine in cancer proposes that genomic characterization of tumors can inform personalized targeted therapies. However, this proposition is complicated by spatial and temporal heterogeneity. Here we study genomic and expression profiles across 127 multisector or longitudinal specimens from 52 individuals with glioblastoma (GBM). Using bulk and single-cell data, we find that samples from the same tumor mass share genomic and expression signatures, whereas geographically separated, multifocal tumors and/or long-term recurrent tumors are seeded from different clones. Chemical screening of patient-derived glioma cells (PDCs) shows that therapeutic response is associated with genetic similarity, and multifocal tumors that are enriched with PIK3CA mutations have a heterogeneous drug-response pattern. We show that targeting truncal events is more efficacious than targeting private events in reducing the tumor burden. In summary, this work demonstrates that evolutionary inference from integrated genomic analysis in multisector biopsies can inform targeted therapeutic interventions for patients with GBM.

Spatiotemporal genomic architecture informs precision oncology in glioblastoma
Lee JK, Wang J, Sa JK, Ladewig E, Lee HO, Lee IH, Kang HJ, Rosenbloom DS, Camara PG, Liu Z, van Nieuwenhuizen P, Jung SW, Choi SW, Kim J, Chen A, Kim KT, Shin S, Seo YJ, Oh JM, Shin YJ, Park CK, Kong DS, Seol HJ, Blumberg A, Lee JI, Iavarone A, Park WY, Rabadan R, Nam DH.
Nat Genet. 2017 Apr. doi: 10.1038/ng.3806.

Reprogramming eukaryotic translation with ligand-responsive synthetic RNA switches

Protein synthesis in eukaryotes is regulated by diverse reprogramming mechanisms that expand the coding capacity of individual genes. One such mechanism is programmed ribosomal frameshifting (PRF). In this work, efficient PRF stimulatory RNA elements were discovered by in vitro selection, and then ligand-responsive switches were constructed by coupling PRF stimulatory elements to RNA aptamers using rational design and directed evolution. Motif discovery was enabled by the methodological novelty of deep sequencing an initially randomized library of RNA sharing a certain pseudoknot scaffold that had undergone multiple rounds of in vitro selection for PRF. This approach led to a rich characterization of precise pseudoknot geometries that can facilitate translation reprogramming, an area with great potential for synthetic biology.

Reprogramming eukaryotic translation with ligand-responsive synthetic RNA switches.
Anzalone AV, Lin AJ, Zairis S, Rabadan R, Cornish VW.
Nat Methods. 2016 Mar 21. doi: 10.1038/nmeth.3807.


Evolutionary history of deadly brain tumor


Glioblastoma (GBM) is the most common and most aggressive brain tumor in adults. Current treatment involves surgery, radiotherapy, and chemotherapy with alkylating agents. Despite intensive treatment, GBM almost invariably recurs, and the recurrent tumor is typically resistant to therapy, leading to death. To understand how GBM evolves under therapy, we analyzed longitudinal genomic and transcriptomic data from 114 patients and uncovered the evolutionary landscape of GBM. Importantly, we found that 63% of patients experience expression-based subtype changes, 15% of tumors present hypermutation at relapse in highly expressed genes, and 11% of recurrent tumors harbor mutations in LTBP4, which encodes a protein that binds to TGF-β.

Clonal evolution of glioblastoma under therapy.
Jiguang Wang, Emanuela Cazzato, Erik Ladewig, Veronique Frattini, Daniel I S Rosenbloom, Sakellarios Zairis, Francesco Abate, Zhaoqi Liu, Oliver Elliott, Yong-Jae Shin, Jin-Ku Lee, In-Hee Lee, Woong-Yang Park, Marica Eoli, Andrew J Blumberg, Anna Lasorella, Do-Hyun Nam, Gaetano Finocchiaro, Antonio Iavarone, Raul Rabadan.
Nature Genetics 2016 June 6. doi: 10.1038/ng.3590.

Topological data analysis captures recombination from large genomic samples

Population-based recombination maps capture the recombination history of populations using genomic data and are a valuable tool in the study of human recombination. We have developed fast statistical estimators of the recombination rate based on topological summaries. Compared to standard linkage-based estimators, topology-based estimators can deal with a larger number of segregating sites and genomes without incurring excessive computational costs. Applying these estimators to phased genotype data of 647 human individuals, we have produced high-resolution, genome-wide maps of human recombination, which have uncovered several novel associations. Specific transcription factor binding sites are frequently associated with recombination. These include binding sites of MLL complexes, which play prominent regulatory roles in germ cell development and early embryogenesis. Additionally, some repeat-derived loci, coding families of transposable elements that are expressed during embryogenesis, are also enriched for recombination.

Topological Data Analysis Generates High-Resolution, Genome-wide Maps of Human Recombination
Pablo G. Camara, Daniel I.S. Rosenbloom, Kevin J. Emmett, Arnold J. Levine, Raul Rabadan.
Cell Systems 2016 June. doi: 10.1016/j.cels.2016.05.008.



Identifying Novel Noncoding RNAs

The Human Genome Project has shown that only a small fraction (<2%) of the human genome is transcribed into mRNA that is then translated into protein, and that the vast majority of the mammalian genome might express non-coding RNA (ncRNA). Although a number of long non-coding RNAs (lncRNAs) have recently been shown to play significant roles in the regulation of gene expression or protein activity in critical signaling pathways, the total number of ncRNAs and the fraction of functional ncRNAs within the mammalian genome are still unknown. To reveal the landscape of ncRNA expression and, specifically, to capture the expression of transient RNAs, we have developed an RNA-seq Analysis pipeline of Transcriptome Reconstruction and Annotation to Identify Novel non-coding RNAs from exosome-deficient cells (ATRAIN).

RNA-Exosome-Regulated Long Non-coding RNA Transcription Controls Super-Enhancer Activity
Evangelos Pefanis*, Jiguang Wang*, Gerson Rothschild*, Junghyun Lim*, et al.
Cell 2015 May; doi: 10.1016/j.cell.2015.04.034
*These authors have contributed equally to this work.

Connections between Mendelian Diseases and Cancer

"If germline genetic variation in Mendelian loci predisposes bearers to common cancers, the same loci may harbour cancer-associated somatic variation. Compilations of clinical records spanning over 100 million patients provide an unprecedented opportunity to assess clinical associations between Mendelian diseases and cancers. We systematically compare these comorbidities against recurrent somatic mutations from more than 5,000 patients across many cancers. Using multiple measures of genetic similarity, we show that a Mendelian disease and comorbid cancer indeed have genetic alterations of significant functional similarity."
---Nature Communications

Genetic similarity between cancers and comorbid Mendelian diseases identifies candidate driver genes.
Rachel Melamed, Kevin Emmett, Chioma Madubata, et al.
Nature Communications 2015 April 30; doi:10.1038/ncomms8033.



Non-Hodgkin’s Lymphoma

"The first-ever systematic study of the genomes of patients with ALK-negative anaplastic large cell lymphoma (ALCL), a particularly aggressive form of non-Hodgkin’s lymphoma (NHL), shows that many cases of the disease are driven by alterations in the JAK/STAT3 cell signaling pathway. The study also demonstrates, in mice implanted with human-derived ALCL tumors, that the disease can be inhibited by compounds that target this pathway, raising hopes that more effective treatments might soon be developed."
---CUMC Newsroom

Convergent Mutations and Kinase Fusions Lead to Oncogenic STAT3 Activation in Anaplastic Large Cell Lymphoma.
Ramona Crescenzo*, Francesco Abate*, Elena Lasorsa*, et al.
Cancer Cell 2015 April 13; doi:10.1016/j.ccell.2015.03.006.
* These authors have contributed equally to this work.

Chronic Lymphocytic Leukemia

A graph representing the sequence of genomic alterations in chronic lymphocytic leukemia (CLL). Each node represents a mutation, with arrows indicating temporal relationships between them. The size of each node indicates the number of patients in the study who exhibited the alteration, while the thickness of each line shows how often the temporal relationship between the two nodes was seen. This method enabled the researchers to identify multiple, distinct evolutionary patterns in CLL.

Tumor evolutionary directed graphs and the history of chronic lymphocytic leukemia.
Jiguang Wang*, Hossein Khiabanian*, Davide Rossi*, et al.
eLife 2014 Dec 11; doi: 10.7554/eLife.02869.
* These authors have contributed equally to this work.
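
The graph described above (nodes sized by patient count, arrows weighted by how often an ordering is seen) can be illustrated with a minimal sketch. It assumes that the temporal order of alterations within each patient is already known; inferring that order from sequencing data is the hard part and is not shown here. The function name `build_evolution_graph`, the patient identifiers, and the mutation labels are hypothetical placeholders, not names or data from the study.

```python
from collections import Counter


def build_evolution_graph(patient_orders):
    """Aggregate per-patient mutation orderings into a weighted directed graph.

    patient_orders maps a patient id to a list of mutation labels, ordered
    from earliest to latest (assumed to be known in advance).
    Returns (node_counts, edge_counts):
      node_counts[m]      = number of patients carrying mutation m
      edge_counts[(a, b)] = number of patients in which a preceded b
    """
    node_counts = Counter()
    edge_counts = Counter()
    for order in patient_orders.values():
        # Drop repeated labels while keeping the first occurrence order.
        seen = list(dict.fromkeys(order))
        node_counts.update(seen)
        for i, early in enumerate(seen):
            for late in seen[i + 1:]:
                edge_counts[(early, late)] += 1
    return node_counts, edge_counts


# Toy input with placeholder labels (not real patient data).
toy_orders = {
    "p1": ["mutA", "mutB"],
    "p2": ["mutA", "mutB", "mutC"],
    "p3": ["mutB", "mutA"],
}
node_counts, edge_counts = build_evolution_graph(toy_orders)
for (early, late), weight in sorted(edge_counts.items()):
    print(f"{early} -> {late}: seen in {weight} patient(s)")
```

In a drawing of this graph, `node_counts` would set node size and `edge_counts` would set arrow thickness. Conflicting arrows, such as both `mutA -> mutB` and `mutB -> mutA` in the toy input, are the kind of signal that would point to more than one evolutionary pattern.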


Identifying Novel Noncoding RNAs

Activation-induced cytidine deaminase (AID) is an enzyme that generates mutations and translocations in mature B cells to produce antibody diversity by targeting immunoglobulin loci, but “off-target” activity of AID can also lead to cancer. How AID finds its targets is still unclear. By conditionally knocking out Exosc3, a protein in the RNA exosome complex, we have identified a novel type of noncoding RNA, xTSS-RNA, which is most strongly expressed at genes that accumulate AID-mediated somatic mutations and/or are frequent translocation partners of DNA double-stranded breaks generated at the immunoglobulin heavy chain (IgH), indicating a role for this noncoding RNA in the AID targeting mechanism.

Noncoding RNA transcription targets AID to divergently transcribed loci in B cells
Evangelos Pefanis*, Jiguang Wang*, Gerson Rothschild*, Junghyun Lim, Jaime Chao, Raul Rabadan#, Aris N. Economides, Uttiya Basu#.
Nature 2014 Oct 16;514(7522):389-93. doi: 10.1038/nature13580.
*These authors have contributed equally to this work. #Corresponding authors.

Bacterial Evolution

Topological network representation of S. aureus genome profiles. Color corresponds to enrichment in mecA, an antibiotic resistance gene.

Characterizing Scales of Genetic Recombination and Antibiotic Resistance in Pathogenic Bacteria Using Topological Data Analysis.
Kevin Emmett and Raul Rabadan.
Lecture Notes in Computer Science 2014. Volume 8609, pp 540-551.


Tumor Evolution

Tumor evolutionary modes visualized in PΣ3.

  • A: frozen evolution
  • B: branched evolution
  • C: divergent evolution
  • D: linear evolution
  • E: somatic hypermutation

Moduli Spaces of Phylogenetic Trees Describing Tumor Evolutionary Patterns.
Sakellarios Zairis, Hossein Khiabanian, Andrew Blumberg, and Raul Rabadan.
Lecture Notes in Computer Science 2014. Volume 8609, pp 528-539. doi: 10.1007/978-3-319-09891-3_48.
arXiv:1410.0980 [full version].


Population Genetics

Persistent homology computes topological invariants from point cloud data. Recent work has focused on developing statistical methods for data analysis in this framework. We show that, in certain models, parametric inference can be performed using statistics defined on the computed invariants. We develop this idea with a model from population genetics, the coalescent with recombination. We apply our model to an influenza dataset, identifying two scales of topological structure which have a distinct biological interpretation.

Parametric Inference using Persistence Diagrams: A Case Study in Population Genetics.
K Emmett, D Rosenbloom, P Camara, R Rabadan.
International Conference on Machine Learning (ICML) Workshop on Topological Methods in Machine Learning June 2014. arXiv:1406.4582
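
To make "topological invariants from point cloud data" concrete, here is a minimal sketch of the simplest case: 0-dimensional persistent homology, which tracks connected components as a distance threshold grows. It is an illustration only and is not the estimator from the paper. The function name `h0_persistence` and the toy points are assumptions made for this example, and the convention that two points are joined once the threshold reaches their distance is also a choice made here. Higher-dimensional features, such as loops, generally need a dedicated library.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the scales at which connected components merge (H0 deaths).

    Every point starts as its own component, born at scale 0. Two points
    are joined once the scale reaches their Euclidean distance. Each merge
    kills one component; the last surviving component never dies, so it
    has no entry in the returned list.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(dist)
    return deaths


# Toy point cloud: two tight pairs that sit far apart.
toy_points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_persistence(toy_points)
print(deaths)
```

For the toy input, two components die at a small scale (within each pair) and one dies at a much larger scale (when the pairs join). Summary statistics of such lists of birth and death scales are the kind of quantity on which inference can then be performed.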

A Topological Approach to Modeling Evolution

"Recent genomic studies have made it clear that evolution does not only proceed in a 'vertical' pattern in which one organism inherits genomic information from the organisms from which it descends (figure A). Scientists now understand that genomic evolution can also be 'horizontal'; that is, genomic information can be transferred between organisms or evolutionarily similar groups of organisms that are in parallel lineages (figure B), such as in cases of species hybridization in eukaryotes, lateral gene transfer in bacteria, recombination and reassortment in viruses, viral integration in eukaryotes, and fusion of genomes of symbiotic species. These observations suggest that phylogenetic trees have limitations in their ability to characterize evolution at the molecular level and that another model is needed that can integrate both vertical and horizontal evolution."
---CU Systems Biology News

Topology of viral evolution.
Chan JM, Carlsson G, Rabadan R.
Proc Natl Acad Sci USA 2013 Oct 29. doi: 10.1073/pnas.1313480110.