disease variants | Leaders in Pharmaceutical Business Intelligence Group, LLC, Doing Business As LPBI Group, Newton, MA

Posts Tagged ‘disease variants’

Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line

Posted in Cancer - General, CANCER BIOLOGY & Innovations in Cancer Therapy, Cancer Genomics, Genomic Testing: Methodology for Diagnosis, Next Generation Sequencing (NGS), Single Cell Genomics, tagged 10X Genomics Pacific Biosciences, breast cancer, cancer variants, chromosomal abberation, disease variants, fusion genes, gene fusions, long read sequencing, mutational spectrum, next gen sequencing (NGS), oncogenes, sequencing methodology, Whole genome sequencing on August 14, 2019

Reporter: Stephen J. Williams, PhD

In a Genome Research report by Marie Nattestad et al. [1], the SK-BR-3 breast cancer cell line was sequenced using a long-read, single-molecule sequencing protocol in order to develop one of the most detailed maps of structural variations in a cancer genome to date. The authors detected over 20,000 variants with this sequencing modality, most of which would have been missed by short-read sequencing. In addition, a complex sequence of nested duplications and translocations was found surrounding the ERBB2 (HER2) locus, while full-length transcriptomic analysis revealed novel gene fusions within the nested genomic variants. The authors suggest that combining long-read genome and transcriptome sequencing results in more comprehensive coverage of tumor gene variants and “sheds new light on the complex mechanisms involved in cancer genome evolution.”

Genomic instability is a hallmark of cancer [2], which leads to numerous genetic variations such as:

Copy number variations
Chromosomal alterations
Gene fusions
Deletions
Gene duplications
Insertions
Translocations

Efforts such as The Cancer Genome Atlas [3] and the International Cancer Genome Consortium (2010) use short-read sequencing technology to detect and analyze thousands of commonly occurring mutations. However, short-read technology has high false-positive and false-negative rates (as high as 50% [4]) for detecting less common structural variations. In addition, short reads cannot detect variations in close proximity to each other or on the same molecule, and therefore underestimate the number of variants.

Methods: The authors used a long-read sequencing technology from Pacific Biosciences (SMRT) to analyze the mutational and structural variation in the SK-BR-3 breast cancer cell line. A split-read and within-read mapping approach was used to detect variants of different types and sizes. In general, long reads have better alignment qualities than short reads, resulting in higher quality mapping. Transcriptomic analysis was performed using Iso-Seq.

Results: Using the SMRT long-read sequencing technology from Pacific Biosciences, the authors obtained 71.9× (71.9-fold) sequencing coverage of the SK-BR-3 genome with an average read length of 9.8 kb.

A few notes:

Highly amplified regions were found around the locus spanning the ERBB2 oncogene (33.6 copies), the MYC locus (38 copies), the EGFR locus (7 copies), and BCAS1 (16.8 copies)
The locus 8q24.12, which contains the SNTB1 gene, had the most amplifications at 69.2 copies
Long-read sequencing showed more insertions than deletions, which suggests that the lengths of low-complexity regions are underestimated in the human reference genome
Found 1,493 long-range variants, 603 of which were between different chromosomes
Using Iso-Seq in conjunction with the long-read platform, they detected 1,692,379 isoforms (93%) mapping to the reference genome and 53 putative gene fusions (for 39 of which they found genomic evidence)

A table modified from the paper on the gene fusions is given below:

Table 1. Gene fusions with RNA evidence from Iso-Seq and DNA evidence from SMRT DNA sequencing where the genomic path is found using SplitThreader from Sniffles variant calls. Note link in table is GeneCard for each gene.

#	Gene 1	Gene 2	SplitThreader path: distance (bp)	SplitThreader path: number of variants	SplitThreader path: chromosomes in path	Previously observed in references
1	KLHDC2	SNTB1	9837	3	14|17|8	Asmann et al. (2011) as only a 2-hop fusion
2	CYTH1	EIF3H	8654	2	17|8	Edgren et al. (2011); Kim and Salzberg (2011); RNA only, not observed as 2-hop
3	CPNE1	PREX1	1777	2	20	Found and validated as 2-hop by Chen et al. (2013)
4	GSDMB	TATDN1	0	1	17|8	Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011)
5	LINC00536	PVT1	0	1	8	No
6	MTBP	SAMD12	0	1	8	Validated by Edgren et al. (2011)
7	LRRFIP2	SUMF1	0	1	3	Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011)
8	FBXL7	TRIO	0	1	5	No
9	ATAD5	TLK2	0	1	17	No
10	DHX35	ITCH	0	1	20	Validated by Edgren et al. (2011)
11	LMCD1-AS1	MECOM	0	1	3	No
12	PHF20	RP4-723E3.1	0	1	20	No
13	RAD51B	SEMA6D	0	1	14|15	No
14	STAU1	TOX2	0	1	20	No
15	TBC1D31	ZNF704	0	1	8	Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011); Chen et al. (2013)

SplitThreader found two different paths for the RAD51B-SEMA6D gene fusion and for the LINC00536-PVT1 gene fusion. Number of Iso-Seq reads refers to full-length HQ-filtered reads. Alignments of SMRT DNA sequence reads supporting each of these gene fusions are shown in Supplemental Note S2.
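As a rough illustration of how Table 1 can be read, the following Python sketch tallies the fusions by whether the SplitThreader path uses more than one variant, whether it crosses chromosomes, and whether the fusion was previously reported. The rows are transcribed from the table above; the `Fusion` record, its field names, and the `summarize` helper are hypothetical and are not part of SplitThreader or the paper.

```python
from collections import namedtuple

# Hypothetical record type; the values are transcribed from Table 1 above.
Fusion = namedtuple(
    "Fusion", "gene1 gene2 distance_bp n_variants chromosomes previously_reported"
)

FUSIONS = [
    Fusion("KLHDC2", "SNTB1", 9837, 3, "14|17|8", True),
    Fusion("CYTH1", "EIF3H", 8654, 2, "17|8", True),
    Fusion("CPNE1", "PREX1", 1777, 2, "20", True),
    Fusion("GSDMB", "TATDN1", 0, 1, "17|8", True),
    Fusion("LINC00536", "PVT1", 0, 1, "8", False),
    Fusion("MTBP", "SAMD12", 0, 1, "8", True),
    Fusion("LRRFIP2", "SUMF1", 0, 1, "3", True),
    Fusion("FBXL7", "TRIO", 0, 1, "5", False),
    Fusion("ATAD5", "TLK2", 0, 1, "17", False),
    Fusion("DHX35", "ITCH", 0, 1, "20", True),
    Fusion("LMCD1-AS1", "MECOM", 0, 1, "3", False),
    Fusion("PHF20", "RP4-723E3.1", 0, 1, "20", False),
    Fusion("RAD51B", "SEMA6D", 0, 1, "14|15", False),
    Fusion("STAU1", "TOX2", 0, 1, "20", False),
    Fusion("TBC1D31", "ZNF704", 0, 1, "8", True),
]


def summarize(fusions):
    """Count fusions by path complexity, chromosome span, and novelty."""
    return {
        "total": len(fusions),
        # Paths that chain more than one variant call together.
        "multi_variant_path": sum(1 for f in fusions if f.n_variants > 1),
        # Paths listing more than one chromosome (separated by '|').
        "inter_chromosomal": sum(1 for f in fusions if "|" in f.chromosomes),
        # Rows marked "No" in the "Previously observed" column.
        "not_previously_reported": sum(
            1 for f in fusions if not f.previously_reported
        ),
    }


summary = summarize(FUSIONS)
print(summary)
```

Only the three fusions at the top of the table require a path through more than one variant; the remaining twelve are explained by a single variant call.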

References

1. Nattestad M, Goodwin S, Ng K, Baslan T, Sedlazeck FJ, Rescheneder P, Garvin T, Fang H, Gurtowski J, Hutton E et al: Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line. Genome Research 2018, 28(8):1126-1135.
2. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70.
3. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA et al: Mutational landscape and significance across 12 major cancer types. Nature 2013, 502(7471):333-339.
4. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH et al: An integrated map of structural variation in 2,504 human genomes. Nature 2015, 526(7571):75-81.

Bioinformatics Tool Review: Genome Variant Analysis Tools

Posted in Big Data, BioIT: BioInformatics, BioIT: BioInformatics, NGS, Clinical & Translational, Pharmaceutical R&D Informatics, Clinical Genomics, Cancer Informatics, Cancer Informatics, Clinical Genomics, Computational Biology/Systems and Bioinformatics, Genome Biology, Personalized and Precision Medicine & Genomic Research, Uncategorized, tagged bioinformatic tools, Bioinformatics, Bioinformatics methodologies, disease variants, DNA, Ensembl, European Bioinformatics Institute, genetic variants, National Center for Biotechnology Information, NCBI, next gen sequencing (NGS), NGS tests, SNP, Whole genome sequencing on October 23, 2018

Curator: Stephen J. Williams, Ph.D.

Updated 02/07/2021

Updated 11/15/2018

The following post will be an ongoing curation of reviews of gene variant bioinformatic software.

The Ensembl Variant Effect Predictor.

McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F.

Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4.

Author information

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. wm2@ebi.ac.uk.

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. fiona@ebi.ac.uk.

Abstract

The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

Rare diseases can be difficult to diagnose due to low incidence and incomplete penetrance of implicated alleles; however, variant analysis of whole genome sequencing (WGS) data can identify the underlying genetic events responsible for the disease (Nature, 2015). A large cohort is nevertheless required for many WGS association studies in order to produce enough statistical power for interpretation (see post and here). To this end, major sequencing projects have been initiated worldwide, including:

Iceland
UK (100,000 Genome Project)
USA
Genome 10K

A more thorough curation of sequencing projects can be seen in the following post:

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

Although sequencing costs have fallen dramatically over the years, the cost of determining the functional consequences of such variants remains high: thorough basic research studies must be conducted to validate the interpretation of variant data with respect to the underlying disease, as only a small fraction of variants from a genome sequencing project will encode a functional protein. Correct annotation of sequences and variants, and identification of the correct corresponding reference genes or transcripts in GENCODE or RefSeq, respectively, pose considerable challenges to the proper identification of sequenced variants as potential functional variants.

To this effect, the authors developed the Ensembl Variant Effect Predictor (VEP), which is a software suite that performs annotations and analysis of most types of genomic variation in coding and non-coding regions of the genome.

Summary of Features

Annotation: VEP can annotate two broad categories of genomic variants
- Sequence variants with specific and defined changes: indels, base substitutions, SNVs, tandem repeats
- Larger structural variants > 50 nucleotides
Species and assembly/genomic database support: VEP can analyze data from any species with an assembled genome sequence and an annotated gene set. VEP supports genome assemblies such as GRCh38, sequences in FASTA format, and transcripts from RefSeq as well as user-derived sequences
Transcript Annotation: VEP includes a wide variety of gene and transcript related information including NCBI Gene ID, Gene Symbol, Transcript ID, NCBI RefSeq ID, exon/intron information, and cross reference to other databases such as UniProt
Protein Annotation: Protein-related fields include Protein ID, RefSeq ID, SwissProt, UniParc ID, reference codons and amino acids, SIFT pathogenicity score, protein domains
Noncoding Annotation: VEP reports variants in noncoding regions including genomic regulatory regions, intronic regions, and transcription factor binding motifs. Data from ENCODE, BLUEPRINT, and the NIH Epigenomics Roadmap are used for primary annotation. Plugins for the Perl code are also available to link other databases which annotate noncoding sequence features.
Frequency, phenotype, and citation annotation: VEP searches Ensembl databases containing a large amount of germline variant information and checks variants against the dbSNP single nucleotide polymorphism database. VEP integrates with mutational databases such as COSMIC and the Human Gene Mutation Database, and with structural and copy number variants from the Database of Genomic Variants. Allele frequencies are reported from 1000 Genomes and NHLBI, and VEP integrates with PubMed for literature annotation. Phenotype information is from OMIM, Orphanet, and GWAS, and clinical information on variants is from ClinVar.
Flexible Input and Output Formats: VEP supports the input data format called “variant call format” or VCF, a standard in next-gen sequencing. VEP can also process variant identifiers from other database formats. Output formats are tab-delimited and give the user choices in presentation of results (HTML or text based)
Choice of user interface
- Online tool (VEP Web): simple point and click; incorporates Instant VEP Functionality and copy and paste features. Results can be stored online in cloud storage on Ensembl.
- VEP script: VEP is available as a downloadable Perl script (see below for link) and can process large amounts of data rapidly. This interface is highly flexible, with the ability to integrate multiple plugins available from Ensembl and GitHub. The ability to alter the Perl code and add plugins and code functions allows the user to modify any feature of VEP.
- VEP REST API: provides robust computational access from any programming language and returns basic variant annotation. Can make use of external plugins.
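To make the REST interface more concrete, here is a minimal Python sketch of how a single variant in HGVS notation might be submitted to the public Ensembl REST server and how the most severe predicted consequence could be read from the JSON reply. It assumes the `vep/:species/hgvs/:hgvs_notation` endpoint at `https://rest.ensembl.org` and the `input` and `most_severe_consequence` fields of its JSON output; the helper names are hypothetical, and the current Ensembl REST documentation should be checked before relying on any of this.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the public Ensembl REST server and its VEP HGVS endpoint.
ENSEMBL_REST = "https://rest.ensembl.org"


def vep_hgvs_url(hgvs, species="human"):
    """Build the request URL for one variant given in HGVS notation."""
    # Percent-encode characters such as '>' while keeping ':' readable.
    return f"{ENSEMBL_REST}/vep/{species}/hgvs/{quote(hgvs, safe=':')}"


def fetch_vep(hgvs, species="human", timeout=30):
    """Query the server (needs network access) and return the parsed JSON."""
    request = Request(
        vep_hgvs_url(hgvs, species),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def most_severe(records):
    """Return (input, most severe consequence) pairs from a VEP JSON reply."""
    return [(r.get("input"), r.get("most_severe_consequence")) for r in records]


# Example usage (not run here because it requires network access):
#   records = fetch_vep("ENST00000366667:c.803C>T")
#   print(most_severe(records))
```

Keeping the URL construction and the response parsing in separate functions means both can be checked offline, with only `fetch_vep` depending on the remote service.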

Watch Video on VEP Instructional Webinar: https://youtu.be/7Fs7MHfXjWk

Watch Video on VEP Web Version training on How to Analyze Your Sequence in VEP

Availability of data and materials

The dataset supporting the conclusions of this article is available from Illumina’s Platinum Genomes [93] and using the Ensembl release 75 gene set. Pre-built data sets are available for all Ensembl and Ensembl Genomes species [94]. They can also be downloaded automatically during set up whilst installing the VEP.

Project name: Ensembl Variant Effect Predictor
Project home page: http://www.ensembl.org/vep
Archived version: https://github.com/Ensembl/ensembl-tools/archive/release/83.zip
Zenodo deposit: https://zenodo.org/record/50492#.Vx9TJ5MrKEI
Operating system: platform independent
Programming language: Perl
Other requirements: Perl 5.10 or higher and the DBI and DBD::mysql modules
License: Apache 2.0
Any restrictions to use by non-academics: none.

References

Large-scale discovery of novel genetic causes of developmental disorders.

Deciphering Developmental Disorders Study.

Nature. 2015 Mar 12;519(7542):223-8. doi: 10.1038/nature14135. PMID:25533962

Updated 11/15/2018

Research Points to Caution in Use of Variant Effect Prediction Bioinformatic Tools

Although we have the ability to use high throughput sequencing to identify allelic variants occurring in rare disease, correlation of these variants with the underlying disease is often difficult due to a few concerns:

For rare sporadic diseases, classical gene/variant association studies have proven difficult to perform (Meyts et al. 2016)
As Whole Exome Sequencing (WES) returns a considerable number of variants, how to differentiate the normal allelic variation found in the human population from disease-causing pathogenic alleles
For rare diseases, pathogenic allele frequencies are generally low

Therefore, for these rare pathogenic alleles, the use of bioinformatics tools in order to predict the resulting changes in gene function may provide insight into disease etiology when validation of these allelic changes might be experimentally difficult.

In a 2017 Genes & Immunity paper, Line Lykke Andersen and Rune Hartmann tested the reliability of various bioinformatic software tools in predicting the functional consequence of variants of six different genes involved in interferon induction and of sixteen allelic variants of the IFNLR1 gene. These variants were found in cohorts of patients presenting with herpes simplex encephalitis (HSE). Most of the adult population is seropositive for Herpes Simplex Virus (HSV); however, only a minor fraction (1 in 250,000 individuals per year) of HSV-infected individuals will develop HSE (Hjalmarsson et al., 2007). It has been suggested that HSE occurs in individuals with rare primary immunodeficiencies caused by gene defects affecting innate immunity through reduced production of interferons (IFN) (Zhang et al., Lim et al.).

References

Meyts I, Bosch B, Bolze A, Boisson B, Itan Y, Belkadi A, et al. Exome and genome sequencing for inborn errors of immunity. J Allergy Clin Immunol. 2016;138:957–69.

Hjalmarsson A, Blomqvist P, Skoldenberg B. Herpes simplex encephalitis in Sweden, 1990-2001: incidence, morbidity, and mortality. Clin Infect Dis. 2007;45:875–80.

Zhang SY, Jouanguy E, Ugolini S, Smahi A, Elain G, Romero P, et al. TLR3 deficiency in patients with herpes simplex encephalitis. Science. 2007;317:1522–7.

Lim HK, Seppanen M, Hautala T, Ciancanelli MJ, Itan Y, Lafaille FG, et al. TLR3 deficiency in herpes simplex encephalitis: high allelic heterogeneity and recurrence risk. Neurology. 2014;83:1888–97.

Genes Immun. 2017 Dec 4. doi: 10.1038/s41435-017-0002-z.

Frequently used bioinformatics tools overestimate the damaging effect of allelic variants.

Andersen LL, Terczyńska-Dyla E, Mørk N, Scavenius C, Enghild JJ, Höning K, Hornung V, Christiansen M, Mogensen TH, Hartmann R.

Abstract

We selected two sets of naturally occurring human missense allelic variants within innate immune genes. The first set represented eleven non-synonymous variants in six different genes involved in interferon (IFN) induction, present in a cohort of patients suffering from herpes simplex encephalitis (HSE) and the second set represented sixteen allelic variants of the IFNLR1 gene. We recreated the variants in vitro and tested their effect on protein function in a HEK293T cell based assay. We then used an array of 14 available bioinformatics tools to predict the effect of these variants upon protein function. To our surprise two of the most commonly used tools, CADD and SIFT, produced a high rate of false positives, whereas SNPs&GO exhibited the lowest rate of false positives in our test. As the problem in our test in general was false positive variants, inclusion of mutation significance cutoff (MSC) did not improve accuracy.

Methodology

Identification of rare variants
Genomes of nineteen Dutch patients with a history of HSE were sequenced by WES. Novel HSE-causing variants were identified by filtering for single nucleotide polymorphisms (SNPs) that had a frequency below 1% in the NHLBI Exome Sequencing Project Exome Variant Server and the 1000 Genomes Project and were present within 204 genes involved in the immune response to HSV.
Identified variants (204) were manually evaluated for involvement in IFN induction based on IDBase and KEGG pathway database analysis.
In-silico predictions: Variants were classified by the in silico variant pathogenicity prediction programs SIFT, Mutation Assessor, FATHMM, PROVEAN, SNAP2, PolyPhen2, PhD-SNP, SNPs&GO, FATHMM-MKL, MutationTaster2, PredictSNP, Condel, Meta-SNP, and CADD. Each program returned prediction scores measuring the likelihood of a variant being either ‘deleterious’ or ‘neutral’. Prediction accuracy was measured as

ACC = (true positive+true negative)/(true positive+true negative+false positive+false negative)
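The accuracy formula above is simple enough to check by hand; the short Python sketch below applies it to two rows of Table 2 of this post (SIFT and CADD on the HSE variants, uniform cutoff). The function and variable names are illustrative only and do not come from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# Counts taken from Table 2 (HSE variants, uniform cutoff).
sift_hse = accuracy(tp=1, tn=4, fp=4, fn=0)
cadd_hse = accuracy(tp=1, tn=2, fp=6, fn=0)

print(f"SIFT: {sift_hse:.2f}")  # prints "SIFT: 0.56"
print(f"CADD: {cadd_hse:.2f}")  # prints "CADD: 0.33"
```

Both values match the ACC column reported for these tools in the table.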

Validation of prediction software/tools

In order to validate the predictive value of the software, HEK293T cells deficient in IRF3, MAVS, and IKKε/TBK1 were cotransfected with the nine variants of the aforementioned genes and a luciferase reporter under control of the IFN-β promoter, and luciferase activity was measured as an indicator of IFN signaling function. Western blots were performed to confirm the expression of the constructs.

Results


Table 2. Summary of the bioinformatic predictions
	HSE variants						IFNLR1 variants
Tool	TN	TP	FN	FP	Total	ACC	TN	TP	FN	FP	Total	ACC	Overall ACC
Uniform cutoff
SIFT	4	1	0	4	9	0.56	8	1	0	7	16	0.56	0.56
Mutation assessor	6	1	0	2	9	0.78	9	1	0	6	16	0.63	0.68
FATHMM	7	1	0	1	9	0.89							0.89
PROVEAN	8	1	0	0	9	1.00	11	1	0	4	16	0.75	0.84
SNAP2	5	1	0	3	9	0.67	8	0	1	7	16	0.50	0.56
PolyPhen2	6	1	0	2	9	0.78	12	1	0	3	16	0.81	0.80
PhD-SNP	7	1	0	1	9	0.89	11	1	0	4	16	0.75	0.80
SNPs&GO	8	1	0	0	9	1.00	14	1	0	1	16	0.94	0.96
FATHMM MKL	4	1	0	4	9	0.56	13	0	1	2	16	0.81	0.72
MutationTaster2	4	0	1	4	9	0.44	14	0	1	1	16	0.88	0.72
PredictSNP	6	1	0	2	9	0.78	11	1	0	4	16	0.75	0.76
Condel	6	1	0	2	9	0.78							0.78
Meta-SNP	8	1	0	0	9	1.00	11	1	0	4	16	0.75	0.84
CADD	2	1	0	6	9	0.33	8	0	1	7	16	0.50	0.44
MSC 95% cutoff
SIFT	5	1	0	3	9	0.67	8	1	0	8	16	0.50	0.56
PolyPhen2	6	1	0	2	9	0.78	13	1	0	3	16	0.81	0.80
CADD	4	1	0	4	9	0.56	7	0	1	9	16	0.44	0.48

Note: TN: true negative, TP: true positive, FN: false negative, FP: false positive, ACC: accuracy
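The abstract's point about false positives can be made concrete with the counts in Table 2. The sketch below computes a false positive rate, FP / (FP + TN), for a few tools on the HSE variant set under the uniform cutoff. This rate is not reported in the post itself; it is derived here only for illustration, and the dictionary and function names are hypothetical.

```python
# (TN, FP) counts for the HSE variants under the uniform cutoff, from Table 2.
HSE_UNIFORM = {
    "SIFT": (4, 4),
    "CADD": (2, 6),
    "PROVEAN": (8, 0),
    "SNPs&GO": (8, 0),
}


def false_positive_rate(tn, fp):
    """Fraction of functionally neutral variants that were called deleterious."""
    negatives = tn + fp
    if negatives == 0:
        raise ValueError("no true-negative or false-positive calls to score")
    return fp / negatives


rates = {tool: false_positive_rate(tn, fp) for tool, (tn, fp) in HSE_UNIFORM.items()}

# List the tools from the highest to the lowest false positive rate.
for tool, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{tool}: {rate:.2f}")
```

On these counts CADD and SIFT mislabel most or half of the neutral HSE variants, while PROVEAN and SNPs&GO mislabel none, which is consistent with the authors' conclusion.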

Functional testing (data obtained from reporter construct experiments) was considered the correct outcome.

Three prediction tools (PROVEAN, SNPs&GO, and Meta-SNP) correctly predicted the effect of all nine variants tested.

Updated 02/07/2021

InMeRF: prediction of pathogenicity of missense variants by individual modeling for each amino acid substitution

Jun-Ichi Takeda, Kentaro Nanatsue, Ryosuke Yamagishi, Mikako Ito, Nobuhiko Haga, Hiromi Hirata, Tomoo Ogi, Kinji Ohno in NAR Genomics and Bioinformatics. 2020 May 26;2(2):lqaa038. doi: 10.1093/nargab/lqaa038. eCollection 2020 Jun.

PMID: 33543123
PMCID: PMC7671370
DOI: 10.1093/nargab/lqaa038

Abstract

In predicting the pathogenicity of a nonsynonymous single-nucleotide variant (nsSNV), a radical change in amino acid properties is prone to be classified as being pathogenic. However, not all such nsSNVs are associated with human diseases. We generated random forest (RF) models individually for each amino acid substitution to differentiate pathogenic nsSNVs in the Human Gene Mutation Database and common nsSNVs in dbSNP. We named a set of our models ‘Individual Meta RF’ (InMeRF). Ten-fold cross-validation of InMeRF showed that the areas under the curves (AUCs) of receiver operating characteristic (ROC) and precision-recall curves were on average 0.941 and 0.957, respectively. To compare InMeRF with seven other tools, the eight tools were generated using the same training dataset, and were compared using the same three testing datasets. ROC-AUCs of InMeRF were ranked first in the eight tools. We applied InMeRF to 155 pathogenic and 125 common nsSNVs in seven major genes causing congenital myasthenic syndromes, as well as in VANGL1 causing spina bifida, and found that the sensitivity and specificity of InMeRF were 0.942 and 0.848, respectively. We made the InMeRF web service, and also made genome-wide InMeRF scores available online (https://www.med.nagoya-u.ac.jp/neurogenetics/InMeRF/).

Source: https://pubmed.ncbi.nlm.nih.gov/33543123/

ADDRESS: A database of disease-associated human variants incorporating protein structure and folding stabilities

Jaie Woodard, Chengxin Zhang, Yang Zhang in J Mol Biol. 2021 Feb 1;166840. doi: 10.1016/j.jmb.2021.166840.

PMID: 33539887
DOI: 10.1016/j.jmb.2021.166840

Abstract

Numerous human diseases are caused by mutations in genomic sequences. Since amino acid changes affect protein function through mechanisms often predictable from protein structure, the integration of structural and sequence data enables us to estimate with greater accuracy whether and how a given mutation will lead to disease. Publicly available annotated databases enable hypothesis assessment and benchmarking of prediction tools. However, the results are often presented as summary statistics or black box predictors, without providing full descriptive information. We developed a new semi-manually curated human variant database presenting information on the protein contact-map, sequence-to-structure mapping, amino acid identity change, and stability prediction for the popular UniProt database. We found that the profiles of pathogenic and benign missense polymorphisms can be effectively deduced using decision trees and comparative analyses based on the presented dataset. The database is made publicly available through https://zhanglab.ccmb.med.umich.edu/ADDRESS.

Source: https://pubmed.ncbi.nlm.nih.gov/33539887/

PopDel identifies medium-size deletions simultaneously in tens of thousands of genomes

Sebastian Niehus, Hákon Jónsson, Janina Schönberger, Eythór Björnsson, Doruk Beyter, Hannes P Eggertsson, Patrick Sulem, Kári Stefánsson, Bjarni V Halldórsson, Birte Kehr

in Nature Communications. 2021 Feb 1;12(1):730. doi: 10.1038/s41467-020-20850-5.

PMID: 33526789
DOI: 10.1038/s41467-020-20850-5

Abstract

Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jointly, eliminating the need for subsequent variant merging. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel’s running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enables routine scans for deletions in large-scale sequencing studies.

Source: https://pubmed.ncbi.nlm.nih.gov/33526789/

Other articles related to Genomics and Bioinformatics on this online Open Access Journal Include:

Finding the Genetic Links in Common Disease: Caveats of Whole Genome Sequencing Studies

Large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes

US Personalized Cancer Genome Sequencing Market Outlook 2018 –

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

Social Behavior Traits Embedded in Gene Expression

Posted in Autism Spectrum Disorders, Behavioral Genetics, Best evidence, Biomarkers & Medical Diagnostics, Biomedical Measurement Science, Chemical Biology and its relations to Metabolic Disease, Child and Adolescent Psychiatry, Mutant Gene Expression, Neurological Diseases, Proteomics, Schizophrenia, Severe Autism, Social Development, tagged chronic disease, disease variants, FOXP2, genetic mutations, language, language skills, protein-protein interaction, Schizophrenia, Severe Autism, social behavior, TPR1 on September 29, 2014

Reviewer and Curator: Larry H. Bernstein, MD, FCAP

This article is a special piece on the anniversary of my brother’s death. It is special in the unique discoveries on social interaction and genetic mutations expressed early or in adolescence, with severe disabilities to the individual, and with a challenge to the families affected. The first is about the genetic classification of schizophrenia variants. The second is about a severe autism variant with insights into the protein expression in language development.

Schizophrenia is Actually 8 Genetic Disorders

09/15/2014 http://www.biosciencetechnology.com/news/2014/09/schizophrenia-actually-8-genetic-disorders

DNA variations matching schizophrenia symptoms

Igor Zwir, Ph.D., one of the senior investigators, helped match precise DNA variations in people with and without schizophrenia to symptoms in individual patients. (Source: WUSTL/Robert Boston)

New research shows that schizophrenia isn’t a single disease but

a group of eight genetically distinct disorders,
each with its own set of symptoms.

The finding could be a first step toward improved diagnosis and treatment for the debilitating psychiatric illness.

The research at Washington University School of Medicine in St. Louis is reported online in The American Journal of Psychiatry.

About 80 percent of the risk for schizophrenia is known to be inherited, but scientists have struggled to identify specific genes for the condition. Now, in a novel approach analyzing genetic influences on more than 4,000 people with schizophrenia, the research team has identified distinct gene clusters that contribute to

eight different classes of schizophrenia.

“Genes don’t operate by themselves,” said C. Robert Cloninger, one of the study’s senior investigators. “They function in concert much like an orchestra, and to understand how they’re working, you have to know

not just who the members of the orchestra are
but how they interact.”

Cloninger, the Wallace Renard Professor of Psychiatry and Genetics, and his colleagues matched precise DNA variations

in people with and without schizophrenia
to symptoms in individual patients.

In all, the researchers analyzed nearly 700,000 sites within the genome where a single unit of DNA is changed, often referred to as a single nucleotide polymorphism (SNP). They looked at SNPs in 4,200 people with schizophrenia and 3,800 healthy controls,

learning how individual genetic variations
interacted with each other to produce the illness.

In some patients with hallucinations or delusions, for example, the researchers

matched distinct genetic features to patients’ symptoms,
demonstrating that specific genetic variations interacted
to create a 95 percent certainty of schizophrenia.

In another group, they found that

disorganized speech and behavior were specifically associated with
a set of DNA variations that carried a 100 percent risk of schizophrenia.

“What we’ve done here, after a decade of frustration in the field of psychiatric genetics, is identify the way genes interact with each other, how

the ‘orchestra’ is either harmonious and leads to health, or
disorganized in ways that lead to distinct classes of schizophrenia,” Cloninger said.

Although individual genes have only weak and inconsistent associations with schizophrenia, groups of interacting gene clusters

create an extremely high and consistent risk of illness,
on the order of 70 to 100 percent.

That makes it almost impossible for people with those genetic variations to avoid the condition. In all, the researchers identified

42 clusters of genetic variations that
dramatically increased the risk of schizophrenia.

“In the past, scientists had been looking for associations between individual genes and schizophrenia,” explained Dragan Svrakic, a co-investigator and a professor of psychiatry at Washington University. “When one study would identify an association, no one else could replicate it. What was missing was the idea that

these genes don’t act independently.
They work in concert to disrupt the brain’s structure and function,
and that results in the illness.”

Svrakic said it was only when the research team

was able to organize the genetic variations and
the patients’ symptoms into groups that
they could see that particular clusters of DNA variations
acted together to cause specific types of symptoms.

Then they divided patients according to the type and severity of their symptoms, such as

different types of hallucinations or delusions, and other symptoms, such as
lack of initiative,
problems organizing thoughts or a
lack of connection between emotions and thoughts.

The results indicated that those symptom profiles describe

eight qualitatively distinct disorders
based on underlying genetic conditions.

The investigators also replicated their findings in two additional DNA databases of people with schizophrenia, an indicator that

identifying the gene variations that are working together
is a valid avenue to explore for improving diagnosis and treatment.

By identifying groups of genetic variations and
matching them to symptoms in individual patients,
it soon may be possible to target treatments
to specific pathways that cause problems,

according to co-investigator Igor Zwir, research associate in psychiatry at Washington University and associate professor in the Department of Computer Science and Artificial Intelligence at the University of Granada, Spain.

And Cloninger added it may be possible to use the same approach

to better understand how genes work together
to cause other common but complex disorders.

“People have been looking at genes to get a better handle on

heart disease,
hypertension and
diabetes, and

it’s been a real disappointment,” he said. “Most of the variability in the severity of disease has not been explained, but we were able to find that

different sets of genetic variations
were leading to distinct clinical syndromes.

So I think this really could change the way people approach understanding the causes of complex diseases.”

Autism Caused by Spontaneous Mutations in Key Brain Gene

09/18/2014 –

TBR1 protein configurations

Mutations in the TBR1 gene affect the location of the TBR1 protein in human cells. In normal cells the TBR1 protein, shown in red, is found alongside the DNA, shown in blue. In contrast, the mutant TBR1 protein is found throughout the cell. (Source: Radboud University)

Spontaneous mutations in the brain gene TBR1 disrupt the function of the encoded protein in children with severe autism. In addition, there is a direct link between TBR1 and FOXP2, a well-known language-related protein. These are the main findings of an article by Pelagia Deriziotis and colleagues at the Max Planck Institute for Psycholinguistics in Nijmegen, published in Nature Communications.

Autism is a disorder of brain development which

leads to difficulties with social interaction and communication.

Gene mutations underlying disorders such as autism can change

the shape of protein molecules and
stop them from working properly during brain development.

Inherited genetic variants put some individuals at risk for autism. Research in recent years has shown that

severe autism can result from
de novo mutations found in the child but not in either parent.

Scientists have sequenced the DNA code of thousands of unrelated children with severe autism and found a handful of genes involving independent de novo mutations. One of these genes is

TBR1, a key gene in brain development.

Strong impact on protein function

In their study, Deriziotis and colleagues from the MPI’s Language and Genetics Department and the University of Washington investigated the

effects of autism risk mutations on TBR1 protein function.

They used several cutting-edge techniques to examine how these mutations affect the way the TBR1 protein works, using human cells grown in the laboratory.

“We directly compared de novo and inherited mutations, and found that the de novo mutations had much more

dramatic effects on TBR1 protein function,”

said Deriziotis. “This is a really striking confirmation of the strong impact that de novo mutations can have on early brain development.”

Social network for proteins

Since the human brain depends on many different genes and proteins working together, the researchers were interested in

identifying proteins that interact with TBR1.

They discovered that

TBR1 directly interacts with FOXP2,
an important protein in speech and language disorders, and that
pathogenic mutations affecting either of these proteins
abolish the mutual interaction.

FOXP2 is one of the few proteins to have been clearly implicated in speech and language disorders.

Simon Fisher, professor of Language and Genetics at Radboud University and director at the MPI, said that they are building

a picture of the neurogenetic pathways contributing to human traits
by coupling data from genome screening with functional analysis in the lab.

Source: Radboud University


Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line

Other articles on Cancer Genome Sequencing in this Open Access Journal Include:

International Cancer Genome Consortium Website has 71 Committed Cancer Genome Projects Ongoing

Loss of Gene Islands May Promote a Cancer Genome’s Evolution: A new Hypothesis on Oncogenesis

Identifying Aggressive Breast Cancers by Interpreting the Mathematical Patterns in the Cancer Genome

CancerBase.org – The Global HUB for Diagnoses, Genomes, Pathology Images: A Real-time Diagnosis and Therapy Mapping Service for Cancer Patients – Anonymized Medical Records accessible to


Bioinformatics Tool Review: Genome Variant Analysis Tools