
Posts Tagged ‘disease variants’


Bioinformatics Tool Review: Genome Variant Analysis Tools

Curator: Stephen J. Williams, Ph.D.

 

The following post will be an ongoing curation of reviews of gene variant bioinformatic software.

 

The Ensembl Variant Effect Predictor.

McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F.

Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4.

Author information

1. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. wm2@ebi.ac.uk.

2. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

3. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. fiona@ebi.ac.uk.

Abstract

The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

 

Rare diseases can be difficult to diagnose because of the low incidence and incomplete penetrance of the implicated alleles; however, variant analysis of whole genome sequencing (WGS) data can identify the underlying genetic events responsible for the disease (Nature, 2015).  Many WGS association studies nevertheless require a large cohort to produce enough statistical power for interpretation (see post and here).  To this end, major sequencing projects have been initiated worldwide.

A more thorough curation of sequencing projects can be seen in the following post:

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

 

Although sequencing costs have fallen dramatically over the years, the cost of determining the functional consequences of the resulting variants remains high. Thorough basic research studies must be conducted to validate the interpretation of variant data with respect to the underlying disease, and only a small fraction of the variants from a genome sequencing project fall within sequence that encodes a functional protein.  Correct annotation of sequences and variants, and identification of the correct corresponding reference genes or transcripts in GENCODE or RefSeq, pose considerable challenges to recognizing sequenced variants as potential functional variants.

To this end, the authors developed the Ensembl Variant Effect Predictor (VEP), a software suite that annotates and analyzes most types of genomic variation in coding and non-coding regions of the genome.

Summary of Features

  • Annotation: VEP can annotate two broad categories of genomic variants
    • Sequence variants with specific and defined changes: indels, base substitutions, SNVs, tandem repeats
    • Larger structural variants > 50 nucleotides
  • Species and assembly/genomic database support: VEP can analyze data from any species with an assembled genome sequence and an annotated gene set. It supports genome assemblies such as the latest GRCh38, sequence supplied in FASTA format, and transcripts from RefSeq as well as user-derived sequences
  • Transcript Annotation: VEP includes a wide variety of gene and transcript related information including NCBI Gene ID, Gene Symbol, Transcript ID, NCBI RefSeq ID, exon/intron information, and cross reference to other databases such as UniProt
  • Protein Annotation: Protein-related fields include Protein ID, RefSeq ID, SwissProt, UniParc ID, reference codons and amino acids, SIFT pathogenicity score, protein domains
  • Noncoding Annotation: VEP reports variants in noncoding regions including genomic regulatory regions, intronic regions, and transcription factor binding motifs. Data from ENCODE, BLUEPRINT, and the NIH Roadmap Epigenomics project are used for primary annotation.  Plugins for the Perl code are also available to link other databases that annotate noncoding sequence features.
  • Frequency, phenotype, and citation annotation: VEP searches Ensembl databases containing a large amount of germline variant information and checks variants against the dbSNP single nucleotide polymorphism database. VEP integrates with mutation databases such as COSMIC and the Human Gene Mutation Database, and with structural and copy number variants from the Database of Genomic Variants.  Allele frequencies are reported from 1000 Genomes and NHLBI, and VEP integrates with PubMed for literature annotation.  Phenotype information comes from OMIM, Orphanet, and GWAS, and clinical information on variants comes from ClinVar.
  • Flexible Input and Output Formats: VEP supports the input data format called “variant call format” or VCF, a standard in next-gen sequencing. VEP can also process variant identifiers from other database formats.  Output formats are tab-delimited and give the user choices in the presentation of results (HTML or text based)
  • Choice of user interface
    • Online tool (VEP Web): simple point and click; incorporates Instant VEP Functionality and copy and paste features. Results can be stored online in cloud storage on Ensembl.
    • VEP script: VEP is available as a downloadable Perl script (see below for link) and can process large amounts of data rapidly. This interface is highly flexible, with the ability to integrate multiple plugins available from Ensembl and GitHub.  Because users can alter the Perl code and add plugins and code functions, any feature of VEP can be modified.
    • VEP REST API: provides robust programmatic access from any programming language and returns basic variant annotation. Can make use of external plugins.
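
The VCF input format and the REST interface listed above can be sketched together in Python. This is a minimal illustration under stated assumptions, not code from the VEP authors. The function names `parse_vcf_line` and `vep_id_url` are hypothetical, and the sample line uses a placeholder identifier and made-up coordinates rather than a real variant. The `vep/:species/id/:id` endpoint on rest.ensembl.org is assumed from the public Ensembl REST documentation and should be checked against the current release before use.

```python
from urllib.parse import quote

# The eight fixed, tab-separated columns of a VCF data line (VCF 4.x).
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

# Assumption: public Ensembl REST server, as given in the Ensembl REST docs.
ENSEMBL_REST = "https://rest.ensembl.org"


def parse_vcf_line(line):
    """Split one VCF data line into a dict keyed by the fixed column names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("expected at least 8 tab-separated columns")
    # zip stops at the eight fixed columns; genotype columns are ignored here.
    record = dict(zip(VCF_COLUMNS, fields))
    record["POS"] = int(record["POS"])        # 1-based position
    record["ALT"] = record["ALT"].split(",")  # several ALT alleles are comma-separated
    return record


def vep_id_url(variant_id, species="human"):
    """Build the REST URL that asks VEP to annotate a known variant identifier."""
    return f"{ENSEMBL_REST}/vep/{quote(species)}/id/{quote(variant_id)}"


if __name__ == "__main__":
    # Placeholder record: identifier and coordinates are invented for illustration.
    sample = "1\t100\trs0000001\tA\tG,T\t.\tPASS\t."
    rec = parse_vcf_line(sample)
    print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"])
    print(vep_id_url(rec["ID"]))
```

The network request itself is left out so that the sketch runs offline. With a real identifier, fetching the URL with a JSON content type would be expected to return the basic annotation described for the REST interface.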

 

 

Watch Video on VEP Instructional Webinar: https://youtu.be/7Fs7MHfXjWk

Watch Video on VEP Web Version training: How to Analyze Your Sequence in VEP

 

 

Availability of data and materials

The dataset supporting the conclusions of this article is available from Illumina’s Platinum Genomes [93] and using the Ensembl release 75 gene set. Pre-built data sets are available for all Ensembl and Ensembl Genomes species [94]. They can also be downloaded automatically during set up whilst installing the VEP.

 

References

Deciphering Developmental Disorders Study. Large-scale discovery of novel genetic causes of developmental disorders. Nature. 2015 Mar 12;519(7542):223-8. doi: 10.1038/nature14135. PMID: 25533962

Other articles related to Genomics and Bioinformatics in this online Open Access Journal include:

Finding the Genetic Links in Common Disease: Caveats of Whole Genome Sequencing Studies

 

Large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes

 

US Personalized Cancer Genome Sequencing Market Outlook 2018 –

 

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

 

 



Social Behavior Traits Embedded in Gene Expression

Reviewer and Curator: Larry H. Bernstein, MD, FCAP

 


This article is a special piece on the anniversary of my brother’s death. It is special for its unique discoveries on social interaction and on genetic mutations expressed early in life or in adolescence, which bring severe disabilities to the individual and a challenge to the affected families.  The first is about the genetic classification of schizophrenia variants.  The second is about a severe autism variant with insights into protein expression in language development.

Schizophrenia is Actually 8 Genetic Disorders

09/15/2014    http://www.biosciencetechnology.com/news/2014/09/schizophrenia-actually-8-genetic-disorders

DNA variations matching schizophrenia symptoms


 

Igor Zwir, Ph.D., one of the senior investigators, helped match precise DNA variations in people with and without schizophrenia to symptoms in individual patients. (Source: WUSTL/Robert Boston)

 New research shows that schizophrenia isn’t a single disease but

  •  a group of eight genetically distinct disorders,
  • each with its own set of symptoms.

The finding could be a first step toward improved diagnosis and treatment for the debilitating psychiatric illness.

The research at Washington University School of Medicine in St. Louis is reported online in The American Journal of Psychiatry.

About 80 percent of the risk for schizophrenia is known to be inherited, but scientists have struggled to identify specific genes for the condition. Now, in a novel approach analyzing genetic influences on more than 4,000 people with schizophrenia, the research team has identified distinct gene clusters that contribute to

  • eight different classes of schizophrenia.

“Genes don’t operate by themselves,” said C. Robert Cloninger, one of the study’s senior investigators. “They function in concert much like an orchestra, and to understand how they’re working, you have to know

  • not just who the members of the orchestra are
  • but how they interact.”

Cloninger, the Wallace Renard Professor of Psychiatry and Genetics, and his colleagues matched precise DNA variations

  • in people with and without schizophrenia
  • to symptoms in individual patients.

In all, the researchers analyzed nearly 700,000 sites within the genome where a single unit of DNA is changed, often referred to as a single nucleotide polymorphism (SNP). They looked at SNPs in 4,200 people with schizophrenia and 3,800 healthy controls,

  • learning how individual genetic variations
  • interacted with each other to produce the illness.

In some patients with hallucinations or delusions, for example, the researchers

  • matched distinct genetic features to patients’ symptoms,
  • demonstrating that specific genetic variations interacted
  • to create a 95 percent certainty of schizophrenia.

In another group, they found that

  • disorganized speech and behavior were specifically associated with
  • a set of DNA variations that carried a 100 percent risk of schizophrenia.

Cloninger said, “What we’ve done here, after a decade of frustration in the field of psychiatric genetics, is identify the way genes interact with each other, how

  • the ‘orchestra’ is either harmonious and leads to health, or
  • disorganized in ways that lead to distinct classes of schizophrenia.”

Although individual genes have only weak and inconsistent associations with schizophrenia, groups of interacting gene clusters

  • create an extremely high and consistent risk of illness,
  • on the order of 70 to 100 percent.

That makes it almost impossible for people with those genetic variations to avoid the condition. In all, the researchers identified

  • 42 clusters of genetic variations that
  • dramatically increased the risk of schizophrenia.

“In the past, scientists had been looking for associations between individual genes and schizophrenia,” explained Dragan Svrakic, a co-investigator and a professor of psychiatry at Washington University. “When one study would identify an association, no one else could replicate it. What was missing was the idea that

  • these genes don’t act independently.
  • They work in concert to disrupt the brain’s structure and function,
  • and that results in the illness.”

Svrakic said it was only when the research team

  1. was able to organize the genetic variations and
  2. the patients’ symptoms into groups that
  3. they could see that particular clusters of DNA variations
  4. acted together to cause specific types of symptoms.

Then they divided patients according to the type and severity of their symptoms, such as

  1. different types of hallucinations or delusions, and other symptoms, such as
  2. lack of initiative,
  3. problems organizing thoughts or a
  4. lack of connection between emotions and thoughts.

The results indicated that those symptom profiles describe

  • eight qualitatively distinct disorders
  • based on underlying genetic conditions.

The investigators also replicated their findings in two additional DNA databases of people with schizophrenia, an indicator that

  • identifying the gene variations that are working together
  • is a valid avenue to explore for improving diagnosis and treatment.
  1. By identifying groups of genetic variations and
  2. matching them to symptoms in individual patients,
  3. it soon may be possible to target treatments
  4. to specific pathways that cause problems,

according to co-investigator Igor Zwir, research associate in psychiatry at Washington University and associate professor in the Department of Computer Science and Artificial Intelligence at the University of Granada, Spain.

And Cloninger added it may be possible to use the same approach

  • to better understand how genes work together
  • to cause other common but complex disorders.

“People have been looking at genes to get a better handle on

  1. heart disease,
  2. hypertension and
  3. diabetes, and

it’s been a real disappointment,” he said. “Most of the variability in the severity of disease has not been explained, but we were able to find that

  • different sets of genetic variations
  • were leading to distinct clinical syndromes.

So I think this really could change the way people approach understanding the causes of complex diseases.”

Autism Caused by Spontaneous Mutations in Key Brain Gene

09/18/2014 –

TBR1 protein configurations


Mutations in the TBR1 gene affect the location of the TBR1 protein in human cells. In normal cells the TBR1 protein, shown in red, is found alongside the DNA, shown in blue. In contrast, the mutant TBR1 protein is found throughout the cell. (Source: Radboud University)

Spontaneous mutations in the brain gene TBR1 disrupt the function of the encoded protein in children with severe autism. In addition, there is a direct link between TBR1 and FOXP2, a well-known language-related protein. These are the main findings of an article by Pelagia Deriziotis and colleagues at the Max Planck Institute for Psycholinguistics in Nijmegen, published in Nature Communications.

Autism is a disorder of brain development which

  • leads to difficulties with social interaction and communication.

In disorders such as autism, gene mutations can change

  • the shape of protein molecules and
  • stop them from working properly during brain development.

Inherited genetic variants put some individuals at risk for autism.  Research in recent years has shown that

  • severe autism can result from
  • (germ-line?) mutations expressed in a child, not in either parent.

Scientists have sequenced the DNA code of thousands of unrelated children with severe autism and found a handful of genes carrying independent de novo mutations. One of these genes is

  • TBR1, a key gene in brain development.

Strong impact on protein function

In their study, Deriziotis and colleagues from the MPI’s Language and Genetics Department and the University of Washington investigated the

  • effects of autism risk mutations on TBR1 protein function.

They used several cutting-edge techniques to examine how these mutations affect the way the TBR1 protein works, using human cells grown in the laboratory.

“We directly compared de novo and inherited mutations, and found that the de novo mutations had much more

  • dramatic effects on TBR1 protein function,”

said Deriziotis. “This is a really striking confirmation of the strong impact that de novo mutations can have on early brain development.”

Social network for proteins

Since the human brain depends on many different genes and proteins working together, the researchers were interested in

  • identifying proteins that interact with TBR1.

They discovered that

  • TBR1 directly interacts with FOXP2,
  • an important protein in speech and language disorders,
  • pathogenic mutations affecting either of these proteins
    • abolish the mutual interaction.

FOXP2 is one of the few proteins to have been clearly implicated in speech and language disorders.

Simon Fisher, professor of Language and Genetics at Radboud University and director at the MPI, said that they are building

  • a picture of the neurogenetic pathways contributing to human traits
  • by coupling data from genome screening with functional analysis in the lab

Source: Radboud University
