
Synopsis Track 7: NGS in Real Time @pharma_BI 2018 CHI’s BioIT World conference & Expo, May 15 – 17, 2018, Boston, MA – Seaport World Trade Center
http://www.bio-itworldexpo.com/
LPBI Group will cover Track 7: NGS in Real Time
Aviva Lev-Ari, PhD, RN will be in attendance
- e-Scientific Publishing: The Competitive Advantage of a Powerhouse for Curation of Scientific Findings and Methodology Development for e-Scientific Publishing – LPBI Group, A Case in Point – Updated on 4/2/2018
2018 Plenary Keynote Speakers
Executive Vice President and Chief Medical Officer, Liberty BioSecurity

Founder, TCB Analytics
Vice President, Data Sciences, Genomics, and Bioinformatics, Alexion Pharmaceuticals, Inc.
Vice President, Biostatistics, Merck Research Laboratories (Retired)
Chief Data Science Officer, H3 Biomedicine
Join the Community at Bio-IT World Conference & Expo
TUESDAY, MAY 15
7:00 am Workshop Registration Open and Morning Coffee
8:00 – 11:30 Recommended Morning Pre-Conference Workshops*
W4. Introduction to Scalable and Reproducible RNA-Seq Data Processing, Analysis, and Result Reporting Using AWS, R, knitr, and LaTeX
12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*
W13. Leveraging Cloud Technologies to Enable Large-Scale Integration of Human Genome and Clinical Outcomes Data
* Separate registration required.
2:00 – 6:30 Main Conference Registration Open
4:00 PLENARY KEYNOTE SESSION
5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing
WEDNESDAY, MAY 16
7:00 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION
9:45 Coffee Break in the Exhibit Hall with Poster Viewing
LARGE-SCALE RNA-SEQ AND GENE EXPRESSION VARIABILITY
10:50 Chairperson’s Remarks
11:00 KEYNOTE PRESENTATION: RNA-Seq X: Look Back and Look Ahead
Shanrong Zhao, PhD, Director, Computational Biology and Bioinformatics, Pfizer, Inc.
Since Dr. Mortazavi published his groundbreaking research entitled “Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq” in Nature Methods in 2008, RNA-seq has evolved rapidly and revolutionized biological research, drug development and clinical diagnostics. 2018 is the 10-year anniversary of RNA-seq, and it’s the right time to look back and look forward.
11:30 LCA: A Robust and Scalable Algorithm to Reveal Subtle Diversity in Large-Scale Single-Cell RNA Sequencing Data
Xiang Chen, PhD, Assistant Member, Department of Computational Biology, St. Jude Children’s Research Hospital
We developed Latent Cellular Analysis (LCA), a machine-learning-based single-cell RNA sequencing (scRNA-seq) analytical pipeline that combines similarity measurement by latent cellular states with a graph-based clustering algorithm featuring a dual-space model search for both the optimal number of subpopulations and the informative cellular states distinguishing them. LCA has proved to be robust, accurate and powerful in comparison with multiple state-of-the-art computational methods on large-scale real and simulated scRNA-seq data.
12:00 pm Presentation to be Announced
12:15 RSEQREP: An Open-Source Cloud-Enabled Framework for Reproducible RNA-Seq Data Processing, Analysis & Result Reporting
Johannes Goll, Director, Bioinformatics, The Emmes Corporation
RSEQREP (RNA-Seq Reports) is a new open-source cloud-enabled framework that allows researchers to execute start-to-end RNA-Seq analysis to characterize transcriptomics changes in human cells following treatment. It outputs dynamically generated reports using R and LaTeX. We provide results for a published RNA-Seq study to characterize transcriptomics changes following influenza vaccination.
12:30 Session Break
12:40 Luncheon Presentation I: Querying of 100k Genomes Using Google Cloud
Hákon Gudbjartsson, PhD, Chief Informatics Officer, WuXi NextCODE
Hákon Gudbjartsson will demonstrate the power of the GOR database in real time. GORdb is used to organize, mine and share massive genome datasets, providing a global architecture for the largest precision medicine efforts worldwide. It’s designed to enable fast, computationally-efficient use of sequence data, and allows for the query and application of data in the context of reference sets.
1:10 Luncheon Presentation II (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own
1:40 Session Break
OPTIMIZING GENE BASES WITH CODON USAGE
1:50 Chairperson’s Remarks
Leonard Lipovich, PhD, Associate Professor with Tenure, Center for Molecular Medicine and Genetics, Wayne State University
1:55 Analysis of Codon Optimized Therapeutic Proteins Using Ribosome Profiling
Chava Kimchi-Sarfaty, PhD, Research Chemist, Principal Investigator, OTAT Acting Deputy Associate Director for Research, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, FDA | CBER | OTAT
Codon optimization is a genetic engineering technique used to improve the yield of recombinant therapeutic proteins. Although it is used ubiquitously to increase protein expression, codon optimization requires widespread substitution of synonymous codons across the native expression sequence. This degree of genetic manipulation can carry consequences, including altered conformation of the recombinant product. These unforeseen modifications can affect protein function and health outcomes, and are of high regulatory importance. To study these techniques, we have used ribosome profiling, a technique used to characterize the translation pattern of the ribosome across the mRNA transcript. In this technique, actively translating ribosomes are cross-linked to mRNA, and nuclease digestion then removes mRNA not protected by a ribosome, generating short mRNA fragments (called “ribosome footprints”). These fragments are sequenced and aligned to generate a differential coverage map across portions of the transcript, which provides insight into the relative translation efficiency in a given area of the transcript. We have analyzed the ribosome profiling data for relationships to codon usage. By identifying regions of differential ribosome profiling patterns between wild-type and codon-optimized transcripts, we aim to create a method of selecting regions to leave unmodified, allowing recombinant proteins to benefit from increased expression while maintaining the integrity and safety of the protein product.
Codon optimization as a technique relies heavily on accurate codon usage statistics for the organism in question, which identify rare codons to be replaced with common codons for an increase in translation efficiency. However, previous databases containing this information were either outdated or limited in scope. To address this gap in knowledge, we constructed a new database containing codon usage tables for all the species in GenBank and RefSeq. We designed a program in Python to download, parse, and organize all the sequence data available in these two repositories, and designed in JavaScript a public web portal for querying the new database. The new HIVE-CUTs database contains substantially more organisms and coding sequence data and is a dramatic improvement upon prior databases. This tool will aid in the effective implementation of codon optimization techniques and other areas of recombinant protein design.
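To make the idea of a codon usage table concrete, here is a minimal sketch of how codon frequencies per thousand could be tallied from coding sequences. It is not the HIVE-CUTs code: the function names `codon_usage` and `rare_codons`, the pooled counting, and the rarity threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
ALL_CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def codon_usage(cds_list):
    """Return codon frequency per 1000 codons, pooled over coding sequences."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set(BASES):  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in ALL_CODONS}
    return {codon: 1000.0 * counts[codon] / total for codon in ALL_CODONS}


def rare_codons(usage, threshold_per_thousand=10.0):
    """List codons below a frequency cutoff (the cutoff is a hypothetical choice)."""
    return sorted(c for c, freq in usage.items() if freq < threshold_per_thousand)


if __name__ == "__main__":
    usage = codon_usage(["ATGGCCGCCTAA"])
    print(usage["GCC"], rare_codons(usage)[:5])
```

A real table would be built per species from annotated CDS records; the sketch assumes each input string is already an in-frame coding sequence.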
- FDA approved 2011-2017 – Therapeutic proteins by drug class, e.g., Oncology, Hematology
- Gene therapy effort is primarily in cancer
- Factor 9 and Hemophilia B
- Codon optimization to increase protein expression – TECHNIQUES:
- Transcription
- mRNA
- Codon Usage Database – Tables open to the public – taxonomy ID 9606 (Homo sapiens)
- Evaluation of codon-optimized Factor 9 – altered protein structure, binding affinity, protein folding
- Ribosome profile: protected profile of 20-22 or 24-27 nucleotides [PCR Sequence]; Sites: A, P, E
- Analysis of ribosome profiling data: correlations for F9, ACTB and GAPDH – each against Codon Optimizer
- In silico: translation kinetics based solely on calculated codon usage frequency
- CONCLUSION:
- Ribosome profiling data do not correlate with codon optimization
- Genetically engineered therapeutics benefit from codon optimization
2:25 Multidimensional Global Proteogenomics Identifies Persistent Ribosomal In-Frame Mis-Translation of Stop Codons as Amino Acids in Multiple Open Reading Frames from a Human Breast Cancer Long Non-Coding RNA
Leonard Lipovich, PhD, Associate Professor with Tenure, Center for Molecular Medicine and Genetics, Wayne State University
Two-thirds of the ~60,000 human genes (www.gencodegenes.org) do not encode known proteins, and aside from long non-coding RNA (lncRNA) genes with recently characterized functions, the possibility that these poorly understood genes’ transcripts serve as de-facto unconventional messenger RNAs has not been formally excluded. Our group was the first to use direct evidence from protein mass spectrometry, preceding efforts that employed indirect evidence from ribosome profiling, to demonstrate that specific lncRNAs are recurrently and nonrandomly translated in human cells (Bánfai et al 2012, Genome Research 22:1646-1657). In our current study, we integrated RNAseq, ribosome profiling, and mass spectrometry to globally assess lncRNA translation in human estrogen receptor alpha positive MCF7 breast cancer cells. We identified 27 peptides, mapping to multiple sense-strand open reading frames (ORFs) of the lncRNA gene MMP24-AS1, united by a novel and highly unconventional property: the existence of these peptides can only be explained by stop-to-nonstop in-frame replacements of specific UAG and UGA (but not UAA) stop codons by amino acids. This result, validated by the absence of any genomic mutations, polymorphisms, and RNA editing events in genomic and cDNA targeted resequencing, represents an unprecedented apparent gene-specific violation of the Genetic Code in human breast cancer cells, and hints at a new mechanism enhancing the combinatorial complexity of the cancer proteome.
[Note 1: This work has been funded in its entirety by the NIH Director’s New Innovator Award 1DP2-CA196375 to LL.]
[Note 2: This project encompasses collaborations. A full listing of co-authors will be shown during the talk.]
- LncRNA
- ENCODE –
- WG 6-frame translation + mass spectrometric data [mass spectrometry] = empirical redefinition of the genomic sequencing field
- MMP24 maps – Breast Cancer
- Mechanism: Mistranslation – Translation infidelity: why are only UAG and UGA, and never UAA, reference-genome stop codons affected?
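The 6-frame translation mentioned in the notes can be sketched as follows. This is only an illustration using the standard genetic code, not the speaker's pipeline; the function names are assumptions. In a proteogenomic search, mass spectrometry peptides would be matched against frames like these, and a peptide that spans a `*` position is what would flag a stop codon read as an amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' is a stop, 'X' is unknown."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    frames = {}
    for strand, template in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAG").items():
        print(name, protein)
```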
2:55 CO-PRESENTATION: Workflow Optimization for NGS Discovery – How to Drive BIX Insights
Jack DiGiovanna, PhD, General Manager, NGS Applications and Services, Seven Bridges Genomics (->2009) 250 CS
Isaac M. Neuhaus, PhD, Director, Computational Genomics, Bristol Myers Squibb
- Predict immuno-oncology outcomes
- Biomarkers
- Microsatellite Instability (MSI) – short tandem repeats of 1 to 6 base pairs: detection of MSI mutations in somatic variants, profiling
- Whole-exome gene data
- Companion diagnostics
- Colorectal adenocarcinomas with MSI status
- Validating predictions
- Tumor heterogeneity: clones have different potentials to metastasize
- Heterogeneity (purity) workflow & validation – Variant Allele Frequency
- MSIsensor score: benchmarking MSIsensor vs tumor purity
- Clinical MSI data
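Two quantities from these notes lend themselves to a small sketch: locating short tandem repeats with a 1 to 6 bp unit, and computing a variant allele frequency from read counts. The code below is an assumption-laden illustration, not MSIsensor and not the presenters' workflow; MSI callers compare repeat-length distributions between tumor and normal samples, which this sketch does not attempt. The function names and the `min_repeats` default are hypothetical.

```python
import re


def find_microsatellites(seq, min_repeats=3, max_unit=6):
    """Return (start, unit, repeat_count) for tandem repeats with a 1..max_unit bp unit."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # Lazy unit match so the shortest repeating unit is reported first.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        return 0.0
    return alt_reads / total_reads


if __name__ == "__main__":
    print(find_microsatellites("GGTCACACACATTG"))
    print(variant_allele_frequency(15, 60))
```

In a heterogeneous tumor, a low variant allele frequency may reflect either a subclonal variant or low tumor purity, which is why the notes pair the two topics.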
3:25 Refreshment Break in the Exhibit Hall with Poster Viewing
NGS DATA ANALYSIS, INTEGRATION, INTERPRETATION, AND VISUALIZATION
4:00 Variant Query Tool: Drag & Drop for a Scalable, Server-Less, Web UI to Querying Annotated Variants
William Van Etten, Senior Scientific Consultant, BioTeam
It’s a challenge to build an environment that provides real-time querying of reads and annotated variants for genomics research; doing so requires significant human and computational resources. Whether for tens or thousands of genomes, the barrier to entry can be high for the biologist/geneticist, who might not also be a computer scientist. BioTeam has developed a simple tool that leverages several AWS services (S3, Athena, Lambda, Cognito, IAM, CloudWatch) to enable a biologist/geneticist to drag & drop VCF and BAM files onto an S3 bucket, then point their web browser at this bucket, to obtain a scalable, server-less web UI for querying the reads and annotated variants within these files. We aim to demonstrate, explain, and promote what we’ve learned from this proof-of-concept software development in the hope that others might benefit from our experience.
- Amazon Athena API Introduced – Variant Query Tool – Server-less
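The BioTeam tool runs its queries through Amazon Athena; the sketch below does not reproduce that. It only illustrates, in plain Python, the kind of region query over VCF records that such a UI exposes. The record layout and the function names `parse_vcf` and `query_region` are assumptions made for this example.

```python
def parse_vcf(lines):
    """Yield one dict per VCF data line, using the first eight fixed columns."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the header row, and blank lines
        chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
        info_map = {}
        for field in info.split(";"):
            if not field or field == ".":
                continue
            key, _, value = field.partition("=")
            info_map[key] = value if value else True  # flags carry no value
        yield {
            "chrom": chrom,
            "pos": int(pos),
            "id": vid,
            "ref": ref,
            "alt": alt.split(","),
            "qual": None if qual == "." else float(qual),
            "filter": flt,
            "info": info_map,
        }


def query_region(records, chrom, start, end):
    """Return records on chrom whose 1-based position lies in [start, end]."""
    return [r for r in records if r["chrom"] == chrom and start <= r["pos"] <= end]


if __name__ == "__main__":
    sample = ["#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
              "chr1\t100\trs1\tA\tG\t50\tPASS\tDP=14;DB"]
    print(query_region(parse_vcf(sample), "chr1", 1, 1000))
```

A server-less design pushes this filtering into a managed query engine over files in object storage, so that no database server has to be provisioned per project.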
4:30 Building a GXP Validated Platform for NGS Analysis Pipelines
Anthony Rowe, PhD, Business Technology Leader, R&D IT, Janssen R&D LLC
As NGS applications approach the clinic, the bioinformatics pipelines used to analyze the data have to be validated to demonstrate their correctness. This talk will present Janssen’s approach to deploying validated NGS applications, with a specific focus on microbiome metagenomics.
- IBD: Ulcerative Colitis (UC), Crohn’s Disease (CD)
- $11 billion – $28 billion – Cost burden in health care systems
- J&J and Vedanta announced a collaboration DNAnexus
- Microbiome-based approach for IBD: gut dysbiosis, beneficial microbes (symbionts), pathogenesis
- Janssen developing a new platform for disease therapy
- How to analyze microbiome data as a drug?
- Stelara Biotherapeutics – 27 microbes
- PK PD of antigen therapy
- Biotherapeutic product like VE202 –
- Can VE202 be detected in stool?
- Real Time NGS – The Clinical Novel NGS for Clinical Trials – Emerging Science meets Regulated Science
- Emerging Science: Novel NGS Informatics
- Tool Box for Microbiome: Biomathematicians carried workflow to Clinical Trials
- Computational workflow: Step 1: alignment 1,2,3,4
- 7 samples, 4 tools: run time, cost, result quality
- Quality Control for Clinical NGS Platform: Manufacturing, Software QA
- Current system overview: sequencing vendor partnerships
- Establish scientific landscape and a structured drug development process
5:00 LIMS or ELN, Which Do You Need? [Electronic Lab Notebook]
Kevin Cramer, CEO, Sapio Sciences (->2007)
Both Biotech and Pharma need Laboratory Information Management (LIMS) and Electronic Lab Notebook (ELN) capabilities. Sapio has eliminated the barriers between these two product areas by leveraging its more than a decade of unique experience offering both LIMS and ELN solutions and combining the key features of each solution into one best-of-breed product: Exemplar ELN Pro.
- Configurable data model for LIMS Platform
- LIMS 1.0 Configure Data Model
- LIMS 2.0 Workflow engine for tracking complex processes
- NGS – 2008 Celexa,
- ELN – 2015 Exemplar supports ad hoc experimentation, clean-sheet design
- Create RQS for samples
- Assign processes
- Track progress
- Register samples
- Register plates, Aliquoting, Define Storage
- Graphical Assignment
- register consumables
- ELN: Spreadsheets, Office Integration, Drag & Drop experiment items, Curve fitting, R Stat, Charting/visualization
- Author/witness/Reviewer/Approver Accept/reject attach instrument design
- ELN, LIMS, ELN Pro[fessional]
- ELN Pro: ELN plus LIMS properties
- Exemplar ELN Pro: storage management
Sapio Sciences – Exemplar ELN Pro
- Integrate collaboration, charting tools
- global repository
- Chemaxon Integration
- Prebuilt NGS pipelines out of the box
5:15 Sponsored Presentation (Opportunity Available)
5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing
7:00 – 10:00 Bio-IT World After Hours @Lawn on D
THURSDAY, MAY 17
7:30 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION & AWARDS PROGRAM
9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced
APPLICATION OF NGS TO ONCOLOGY, IMMUNOLOGY, DIAGNOSTICS, AND THERAPEUTIC DEVELOPMENT
10:30 Chairperson’s Remarks, Bruce Press, EVP Seven Bridges Genomics
10:40 Instantiating a Single Point of Truth for Genomic Reference Data
David Herzig, Scientist, Research Informatics, Roche Pharmaceuticals
This talk will exemplify how expression and mutation data were made actionable by consolidating a scattered landscape of genomic reference data into a real SPoT.
- Common System Landscapes
- Data Sources
- Silo Solutions
- API
- Consumers: Bioinformatics, Data Science, JBrowser
- Single Source of Truth
- Moving away from Silo solutions
- Roche Data Commons
Physical HW
File System & Workflow
Single Point of Truth SSOT
Integration Data Mart
UI
All data goes through the Data Storage API, both as input to and as output from the data storage
- Requirements: Functional
- API – R Interface, Python
- Support Arvados Platform (https://curoverse.com)
Evaluation
- inhouse development
- customization of open source
- Ensembl
- Genomics Reference Data, many species
- Multi-species DB [stable ID]; a MySQL DB is used
- API & SW: REST, Tools, Web Code
- Modules – Variation, Funcgen, other features; CORE – genome annotation
- SOLUTION – 150 hours = Homo sapiens Variation is the most time consuming; Ensembl REST API Endpoints
- Customization: NCBI: Download, unzip data – Import data – Ensembl Perl script goes into SPoT (SSoT)
- Into CORE – only species of interest; Loading Log – UPDATE META TABLE (Roche Data from Bioinformatics Dept)
- SSOT & Arvados – 2 updates a year, 5 versions are available in parallel: Portal Page, latest version: http://genomes.roce.com:3091 http://genomes.roche.com/latest
- USE CASES: Comparative Genomics
11:10 A Network-Based Approach to Understanding Drug Toxicity
Yue Webster, PhD, Principal Research Scientist, Informatics Capabilities, Research IT, Eli Lilly and Company
Despite investment in toxicogenomics, nonclinical safety studies are still used to predict clinical liabilities for new drug candidates. Network-based approaches for genomic analysis help overcome challenges with whole-genome transcriptional profiling using limited numbers of treatments for phenotypes of interest. Herein, we apply co-expression network analysis to safety assessment using rat liver gene expression data to define 415 modules, exhibiting unique transcriptional control, organized in a visual representation of the transcriptome. Compared to gene-level analysis alone, the network approach identifies significantly more phenotype-gene associations, including established and novel biomarkers of liver injury.
- Phase III clinical trials fail due to drug toxicity – TXG-MAP
- Food preservative BHA
- Antibiotic for TB patients: Trecator – like Tunicamycin
- Blood thinner – Ticlid
- n-dimensional problem space for toxicity: gene expression complexity, dynamic complexity, pathophysiology complexity
- TRANSLATION: from animal to human clinical trials – failure of a clinical trial equals failure of the TRANSLATION
- Modules in DNA and RNA
- Protein structure
- Reduce dimensionality of the information space
- Changes of patterns – risk assessment of the confidence in translation
- Gene vs system-level view – Tunicamycin – image TXG-MAP = unsupervised approach to convert a table of data into ONE image
- How to build TXG-MAP: genotype and phenotype – for predictions of untargeted effects, recommendations
- Data input
- Training set – DrugMatrix
- Algorithm – 415 co-expression modules
- Interpretation (Gene Ontology)
- Use TXG-MAP for adaptive response: measure changes in biological processes using eigengene scores
- TXG-MAP to compare treatments: apoptosis post treatment with Tunicamycin; red = induced expression, green = suppressed expression
- Hypertrophy caused by antibiotic Tunicamycin
- Building TXG-MAP – for sharing with the scientific community across species, to be used in translational research for preservation across cell lines, across species, and for translation to humans
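The eigengene score mentioned in the notes is commonly defined as the first principal component of a module's standardized expression profiles. The sketch below computes it with plain power iteration; it is an illustration under that assumption, not the TXG-MAP implementation, and the function names and iteration count are hypothetical.

```python
import math


def zscore(values):
    """Standardize one gene across samples; return None if it does not vary."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return None
    return [(v - mean) / sd for v in values]


def module_eigengene(expression, iterations=200):
    """First principal component across samples for a genes-by-samples module."""
    z = [row for row in (zscore(gene) for gene in expression) if row is not None]
    if not z:
        raise ValueError("module has no variable genes")
    n_samples = len(z[0])
    # Start from the first gene's profile: an all-ones start would be
    # orthogonal to every mean-centered row and collapse to zero.
    start_norm = math.sqrt(sum(x * x for x in z[0]))
    v = [x / start_norm for x in z[0]]
    for _ in range(iterations):
        # Apply Z^T Z to v without forming the sample-by-sample matrix.
        proj = [sum(a * b for a, b in zip(row, v)) for row in z]
        w = [sum(p * row[j] for p, row in zip(proj, z)) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Orient the eigengene so it correlates positively with the mean profile.
    mean_profile = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v


if __name__ == "__main__":
    print(module_eigengene([[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 5, 7]]))
```

Summarizing each of the 415 modules by one such score per treatment is what reduces a genes-by-treatments table to a compact, comparable map.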
11:40 Michael Rusch, Dir Bioinformatics, St. Jude Cloud
- 2017: Genomic test is ORDERABLE; >400 patients as of May 2018
- 300 approved access requests globally – PCGP Data Sharing – 8 attempts to download have failed once
- Solution: 2015 Cloud: secure, sustainable, expandable
- SW development partner DNAnexus on Microsoft Azure in 2017, St. Jude Cloud
- 3000 pediatric cancer survivors – Optimize therapy to improve quality of life
- Simple data access procedure: data securely into private cloud
- Gene fusions – Turnaround time challenge – Assay – 42 days – Leukemia RNA-Seq workflow: 15 days to get seq done
- Cost $5-$10 per sample; running 5-8 days seq; manual review and reporting 20 minutes
- Most runs completed in 12 hours
- Variant annotation, pathogenicity: Germline Mutations and Pediatric Cancer NEJM, Journal Pediatric Oncology
- PeCan PIE – Pathogenicity Information Exchange (PIE) for SNV/Indel classification
- Results overview, Variant page: Gene info, ProteinPaint, Gene info
- Damage prediction algorithm – ACMG classification tool: Variant page
- 100 registered users
- 425 jobs
- 340,000 variants
- Visualization: ProteinPaint, PCGP Mutation: somatic and germline pathogenic and likely pathogenic variants
11:40 Sponsored Presentation (Opportunity Available)
12:10 pm Session Break
12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own
1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing
DATA MINING FOR DISEASE CLASSIFICATION – CITYVIEW I
1:55 Chairperson’s Remarks
John Methot, Director, Health Informatics Architecture, Dana-Farber Cancer Institute
2:00 Disease Classification in the Era of Data-Intensive Medicine
Kanix Wang, PhD, Research Professional, Booth School of Business, Institute for Genomics & Systems Biology, University of Chicago
We used insurance claims for over one-third of the U.S. population to create a subset of 128,989 families (481,657 unique individuals). Using these data, we estimated the heritability and familial environmental patterns of 149 diseases. We then computed the environmental and genetic disease classifications for a set of 29 complex diseases after inferring their pairwise genetic and environmental correlations.
2:30 Enviro-Geno-Pheno State Approach and State-Based Biomarkers for Differentiation, Prognosis, Subtypes, and Staging
Lei Xu, PhD, Director, Centre for Cognitive Machines and Computational Health; Zhiyuan Chair Professor, Department of Computer Science and Engineering, Shanghai Jiao Tong University
In the joint space of geno-measures, pheno-measures, and enviro-measures, one point represents a bio-system behavior and a subset of points that locate adjacently and share a common system status represents a ‘state’. The system is characterized by such states learned from samples. This enviro-geno-pheno state is considered a biomarker, indicating ‘health/normal’ versus ‘risk/abnormal’ together with its associated enviro-geno-pheno condition.
3:00 PANEL DISCUSSION: Can We Improve Breast Cancer Patient Outcomes through Artificial Intelligence?
Maya Said, ScD, President & CEO, Outcomes4me, Inc. (Moderator)
Panelists:
Regina Barzilay, PhD, MacArthur Fellow and Delta Electronics Professor, Massachusetts Institute of Technology (MIT) Department of Electrical Engineering and Computer Science; Member, Computer Science and Artificial Intelligence Laboratory, MIT
Kevin Hughes, MD, Co-Director, Avon Breast Evaluation Program, Massachusetts General Hospital; Associate Professor of Surgery, Harvard Medical School; Medical Director, Bermuda Cancer Genetics Risk Assessment Clinic
Newly diagnosed cancer patients attempting to understand their treatment options face the overwhelming task of filtering an information deluge, much of which is irrelevant, outdated and occasionally inaccurate. Additionally, matching their diagnosis to best-in-class treatments or potential clinical trials, while simultaneously learning to navigate an extremely complex healthcare system is daunting, even for the most highly trained physicians. We will explore various platforms aimed at improving patient outcomes by leveraging technology to help educate, track, and connect patients with personalized resources while simultaneously working to improve the care continuum and the development of new treatments. We will explore the nexus of healthcare networks and their IT systems, clinical decision-making and delivery, R&D, and patients, for whom we all create our innovation solutions. Attendees will be interested to understand how various groups are working to increase value across the entire system by bringing laboratory, clinical and pharmaceutical science, real-world evidence and patient-reported data together with technology and artificial intelligence to solve health challenges. These approaches offer the opportunity to generate deeper insights into how therapies perform in the real world and harness that understanding to improve efficiency, effectiveness, value, and ultimately, patient care.
- Targeted therapy in breast cancer more than in other diseases
4:00 Conference Adjourns
SOURCE
http://www.bio-itworldexpo.com/next-gen-sequencing-informatics/