
A Nonlinear Methodology to Explain Complexity of the Genome and Bioinformatic Information, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)

A Nonlinear Methodology to Explain Complexity of the Genome and Bioinformatic Information

Reporter: Stephen J. Williams, Ph.D.

Multifractal bioinformatics: A proposal to the nonlinear interpretation of genome

The following is an open access article by Pedro Moreno on a methodology to analyze genetic information across species, and in particular the evolutionary trends of complex genomes, using a nonlinear analytic approach based on fractal geometry, coined “Nonlinear Bioinformatics”.  This fractal approach stems from the complex nature of higher eukaryotic genomes, including mosaicism and multiple interspersed genomic elements such as intronic regions, noncoding regions, and mobile elements like transposons.  Although seemingly random, these elements have a repetitive character. The complexity of DNA regulation, structure, and genomic variation may be best understood by developing algorithms based on fractal analysis, which can model the regionalized and repetitive variability and structure within complex genomes by elucidating the individual components that contribute to the overall complex structure. By contrast, a “linear” or “reductionist” approach that examines only individual coding regions does not take into consideration the factors that produce this genetic complexity and diversity.

Indeed, many other attempts to describe the complexities of DNA as a fractal geometric pattern have been made.  In the paper “Fractals and Hidden Symmetries in DNA“, Carlo Cattani uses fractal analysis to construct a simple geometric pattern of the influenza A virus by modeling the primary sequence of its viral genome, namely the bases A, G, C, and T. The main conclusions,

fractal shapes and symmetries in DNA sequences and DNA walks have been shown and compared with random and deterministic complex series. DNA sequences are structured in such a way that there exists some fractal behavior which can be observed both on the correlation matrix and on the DNA walks. Wavelet analysis confirms by a symmetrical clustering of wavelet coefficients the existence of scale symmetries.

suggest that, at least, the influenza viral genome structure can be analyzed into its basic components by fractal geometry.
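
The “DNA walk” mentioned in the quoted passage can be illustrated with a minimal sketch (in Python; this is a generic illustration, not code from Cattani’s paper): purines step the walk up, pyrimidines step it down, and the cumulative sum converts the symbolic sequence into a numeric series whose scaling behavior can then be probed with fractal or wavelet methods.

```python
# Minimal DNA-walk sketch: purines (A, G) step +1, pyrimidines (C, T) step -1.
# The cumulative sum turns a symbolic sequence into a numeric "walk" whose
# scaling behavior can then be analyzed with fractal/wavelet techniques.

def dna_walk(seq):
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    walk, y = [], 0
    for base in seq.upper():
        y += steps.get(base, 0)  # ambiguous bases such as N contribute nothing
        walk.append(y)
    return walk

print(dna_walk("ATGCGT"))  # -> [1, 0, 1, 0, 1, 0]
```

Other mappings (e.g. strong/weak or keto/amino pairings) are equally common; the purine/pyrimidine rule above is just one conventional choice.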
This approach has been used to model the complex nature of cancer as discussed in a 2011 Seminars in Oncology paper
Abstract: Cancer is a highly complex disease due to the disruption of tissue architecture. Thus, tissues, and not individual cells, are the proper level of observation for the study of carcinogenesis. This paradigm shift from a reductionist approach to a systems biology approach is long overdue. Indeed, cell phenotypes are emergent modes arising through collective non-linear interactions among different cellular and microenvironmental components, generally described by “phase space diagrams”, where stable states (attractors) are embedded into a landscape model. Within this framework, cell states and cell transitions are generally conceived as mainly specified by gene-regulatory networks. However, the system’s dynamics is not reducible to the integrated functioning of the genome-proteome network alone; the epithelia-stroma interacting system must be taken into consideration in order to give a more comprehensive picture. Given that cell shape represents the spatial geometric configuration acquired as a result of the integrated set of cellular and environmental cues, we posit that fractal-shape parameters represent “omics” descriptors of the epithelium-stroma system. Within this framework, function appears to follow form, and not the other way around.

As the authors conclude:

“Transitions from one phenotype to another are reminiscent of phase transitions observed in physical systems. The description of such transitions could be obtained by a set of morphological, quantitative parameters, like fractal measures. These parameters provide reliable information about system complexity.”

Gene expression also displays a fractal nature. In a Frontiers in Physiology paper by Mahboobeh Ghorbani, Edmond A. Jonckheere and Paul Bogdan, “Gene Expression Is Not Random: Scaling, Long-Range Cross-Dependence, and Fractal Characteristics of Gene Regulatory Networks“,

the authors show that gene expression time series display fractal and long-range dependence characteristics.

Abstract: Gene expression is a vital process through which cells react to the environment and express functional behavior. Understanding the dynamics of gene expression could prove crucial in unraveling the physical complexities involved in this process. Specifically, understanding the coherent complex structure of transcriptional dynamics is the goal of numerous computational studies aiming to study and finally control cellular processes. Here, we report the scaling properties of gene expression time series in Escherichia coli and Saccharomyces cerevisiae. Unlike previous studies, which report the fractal and long-range dependency of DNA structure, we investigate the individual gene expression dynamics as well as the cross-dependency between them in the context of gene regulatory network. Our results demonstrate that the gene expression time series display fractal and long-range dependence characteristics. In addition, the dynamics between genes and linked transcription factors in gene regulatory networks are also fractal and long-range cross-correlated. The cross-correlation exponents in gene regulatory networks are not unique. The distribution of the cross-correlation exponents of gene regulatory networks for several types of cells can be interpreted as a measure of the complexity of their functional behavior.


Given that a multitude of complex biomolecular networks and biomolecules can be described by fractal patterns, the development of bioinformatic algorithms would enhance our understanding of the interdependence and cross-functionality of these multiple biological networks, particularly in disease and drug resistance.  The article below by Pedro Moreno describes the development of such bioinformatic algorithms.

Pedro A. Moreno
Escuela de Ingeniería de Sistemas y Computación, Facultad de Ingeniería, Universidad del Valle, Cali, Colombia
E-mail: pedro.moreno@correounivalle.edu.co

Eje temático: Ingeniería de sistemas / System engineering
Received: September 19, 2012
Accepted: December 16, 2013




The first draft of the human genome (HG) sequence was published in 2001 by two competing consortia. Since then, several structural and functional characteristics of the HG organization have been revealed. Today, more than 2,000 HGs have been sequenced, and these findings are having a strong impact on academia and public health. Despite all this, a major bottleneck, called genome interpretation, persists: that is, the lack of a theory that explains the complex puzzle of coding and non-coding features that compose the HG as a whole. Ten years after the HG was sequenced, two recent studies, discussed here within the multifractal formalism, allow proposing a nonlinear theory that helps interpret the structural and functional variation of the genetic information of genomes. The present review article discusses this new approach, called “Multifractal bioinformatics”.

Keywords: Omics sciences, bioinformatics, human genome, multifractal analysis.

1. Introduction

Omic Sciences and Bioinformatics

In order to study genomes, their life properties, and the pathological consequences of their impairment, the Human Genome Project (HGP) was created in 1990. Since then, about 500 Gbp (EMBL) represented in thousands of prokaryotic genomes and tens of different eukaryotic genomes have been sequenced (NCBI, 1000 Genomes, ENCODE). Today, genomics is defined as the set of sciences and technologies dedicated to the comprehensive study of the structure, function and origin of genomes. Several types of genomics have arisen as a result of the expansion and implementation of genomics in the study of the Central Dogma of Molecular Biology (CDMB), Figure 1 (above). The catalog of different types of genomics uses the suffix “-omics”, meaning “set of”, to denote the new massive approaches of the new omics sciences (Moreno et al, 2009). Given the large amount of genomic information available in the databases and the urgency of its actual interpretation, the balance has begun to lean heavily toward the bioinformatics infrastructure requirements of research laboratories, Figure 1 (below).

Bioinformatics, or computational biology, is defined as the application of computer and information technology to the analysis of biological data (Mount, 2004). It is an interdisciplinary science that requires the use of computing, applied mathematics, statistics, computer science, artificial intelligence, biophysics, biochemistry, genetics, and molecular biology. Bioinformatics was born from the need to understand the sequences of nucleotide or amino acid symbols that make up DNA and proteins, respectively. These analyses are made possible by the development of powerful algorithms that predict and reveal an infinity of structural and functional features in genomic sequences, such as gene location, discovery of homologies between macromolecules in databases (BLAST), algorithms for phylogenetic analysis, for regulatory analysis, or for the prediction of protein folding, among others. This great development has created a multiplicity of approaches, giving rise to new types of bioinformatics, such as the Multifractal Bioinformatics (MFB) proposed here.

1.1 Multifractal Bioinformatics and Theoretical Background

MFB is a proposal to analyze the information content of genomes and their life properties in a non-linear way. It is part of a specialized sub-discipline called “nonlinear bioinformatics”, which uses a number of related techniques for the study of nonlinearity (fractal geometry, Hurst exponents, power laws, wavelets, among others) applied to the study of biological problems (http://pharmaceuticalintelligence.com/tag/fractal-geometry/). Its application requires a detailed knowledge of the structure of the genome to be analyzed and an appropriate command of multifractal analysis.
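
As an illustration of one tool in this kit, the Hurst exponent of a numeric series can be estimated by classical rescaled-range (R/S) analysis. This is a generic Python sketch, not the authors' implementation; for uncorrelated noise the estimate should fall near 0.5, while long-range dependent series drift toward 1.

```python
import numpy as np

def hurst_rs(series, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent by rescaled-range (R/S) analysis."""
    series = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(series) - n + 1, n):
            chunk = series[start:start + n]
            dev = np.cumsum(chunk - chunk.mean())
            r = dev.max() - dev.min()   # range of cumulative deviations
            s = chunk.std()             # standard deviation of the chunk
            if s > 0:
                rs_vals.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs_vals)))
    slope, _ = np.polyfit(log_n, log_rs, 1)  # slope of log(R/S) vs log(n) = H
    return slope

rng = np.random.default_rng(0)
h = hurst_rs(rng.standard_normal(2048))  # white noise: H should be near 0.5
```

Small-sample R/S estimates are biased slightly upward; serious analyses use corrections (e.g. Anis-Lloyd) or alternatives such as detrended fluctuation analysis.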

1.2 From the Worm Genome toward Human Genome

To explore a complex genome such as the HG, it is useful to first implement multifractal analysis (MFA) in a simpler genome in order to show its practical utility. For example, the genome of the small nematode Caenorhabditis elegans is an excellent model from which to extrapolate lessons about complex organisms. Thus, if the MFA explains some of the structural properties of that genome, it is expected that the same analysis will reveal similar properties in the HG.

The C. elegans nuclear genome is composed of about 100 Mbp, with six chromosomes distributed into five autosomes and one sex chromosome. The molecular structure of the genome is particularly homogeneous along the chromosome sequences, due to the presence of several regular features, including large contents of genes and introns of similar sizes. The C. elegans genome also has a regional organization of the chromosomes, mainly because the majority of the repeated sequences are located in the chromosome arms, Figure 2 (left) (C. elegans Sequencing Consortium, 1998). Given these regular and irregular features, the MFA could be an appropriate approach to analyze such distributions.

Meanwhile, the HG sequencing revealed a surprising mosaicism of coding (genes) and noncoding (repetitive DNA) sequences, Figure 2 (right) (Venter et al., 2001). This structure of 6 Gbp is divided into 23 pairs of chromosomes (in diploid cells), and these highly regionalized sequences introduce complex patterns of regularity and irregularity for understanding gene structure, the composition of repetitive DNA sequences, and their role in the study and application of the life sciences. The coding regions of the genome are estimated at ~25,000 genes, which constitute 1.4% of the HG. These genes are embedded in a giant sea of various types of non-coding sequences, which compose 98.6% of the HG (popularly misnamed “junk DNA”). The non-coding regions are characterized by many types of repeated DNA sequences: 10.6% consists of Alu sequences, a type of SINE (short interspersed elements) preferentially located near genes. LINEs, MIR, MER, LTR, DNA transposons and introns are other types of non-coding sequences, which together form about 86% of the genome. Some of these sequences overlap with one another, as with CpG islands, which complicates the analysis of the genomic landscape. This standard genomic landscape was recently revised: the latest studies show that 80.4% of the HG is functional, owing to the discovery of more than five million “switches” that operate and regulate gene activity, re-evaluating the concept of “junk DNA” (The ENCODE Project Consortium, 2012).

Given that all these genomic variations, both in the worm and in humans, produce regionalized genomic landscapes, it is proposed that fractal geometry (FG) would allow measuring how the genetic information content is fragmented. In this paper, the methodology and the nonlinear descriptive models for each of these genomes are reviewed.

1.3 The MFA and its Application to Genome Studies

Most problems in physics are implicitly non-linear in nature, giving rise to phenomena such as chaos. Chaos theory deals with (non-linear) dynamic systems that are very sensitive to initial conditions yet nonetheless rigorously deterministic, that is, their behavior can be completely determined by knowing the initial conditions (Peitgen et al, 1992). In turn, the FG is an appropriate tool to study chaotic dynamic systems (CDS). In other words, the FG and chaos are closely related, because the region of space toward which a chaotic orbit tends asymptotically has a fractal structure (a strange attractor). Therefore, the FG allows studying the framework on which CDS are defined (Moon, 1992). And this is how the genome structure and function are expected to be organized.

The MFA is an extension of the FG and is related to (Shannon) information theory, disciplines that have been very useful for studying the information content of a sequence of symbols. Mandelbrot established the FG in the 1980s as a geometry capable of measuring the irregularity of nature by calculating the fractal dimension (D), an exponent derived from a power law (Mandelbrot, 1982). The value of D gives a measure of the level of fragmentation, or the information content, of a complex phenomenon, because D measures the degree of scaling of the system’s fragmented self-similarity. Thus, the FG looks for self-similar properties in structures and processes at different scales of resolution, and these self-similarities are organized following scaling or power laws.

Sometimes one exponent is not sufficient to characterize a complex phenomenon, so more exponents are required. The multifractal formalism allows this: it applies when many subsets of fractals with different scaling properties coexist simultaneously, described by a large number of exponents or fractal dimensions. As a result, when a multifractal singularity spectrum is generated, the scaling behavior of the frequency of symbols in a sequence can be quantified (Vélez et al, 2010).

The MFA has been implemented to study the spatial heterogeneity of theoretical and experimental fractal patterns in different disciplines. In post-genomic times, the MFA has been used to study multiple biological problems (Vélez et al, 2010). Nonetheless, very little attention has been given to using the MFA to characterize the structural genetic information content of genomes from images obtained by the Chaos Game Representation (CGR). The first studies at this level were made recently for the analysis of the C. elegans genome (Vélez et al, 2010) and the human genome (Moreno et al, 2011). The MFA methodology applied to the study of these genomes is developed below.

2. Methodology

The Multifractal Formalism from the CGR

2.1 Data Acquisition and Molecular Parameters

Databases for the C. elegans genome and the 36.2 Hs_refseq HG version were downloaded from the NCBI FTP server. Then, several strategies were designed to fragment the genomic DNA sequences into different length ranges. For example, the C. elegans genome was divided into 18 fragments, Figure 2 (left), and the human genome into 9,379 fragments. According to their annotation systems, the contents of molecular parameters for coding sequences (genes, exons and introns), noncoding sequences (repetitive DNA, Alu, LINEs, MIR, MER, LTR, promoters, etc.) and coding/non-coding DNA (TTAGGC, AAAAT, AAATT, TTTTC, TTTTT, CpG islands, etc.) were counted for each sequence.
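
The motif-counting step can be sketched as follows (Python; the fragment below is invented for illustration, and a real pipeline would read sequences and annotations from the NCBI files):

```python
# Hypothetical illustration of the motif-counting step: given a genomic
# fragment, count occurrences of coding/non-coding motifs such as those
# listed above (TTAGGC, AAAAT, ...), including overlapping matches.

def count_motif(seq, motif):
    seq, motif = seq.upper(), motif.upper()
    return sum(1 for i in range(len(seq) - len(motif) + 1)
               if seq[i:i + len(motif)] == motif)

fragment = "TTAGGCTTAGGCAAAAT"  # invented toy fragment
counts = {m: count_motif(fragment, m) for m in ("TTAGGC", "AAAAT", "TTTTT")}
print(counts)  # -> {'TTAGGC': 2, 'AAAAT': 1, 'TTTTT': 0}
```

Counting overlapping matches matters for low-complexity motifs like TTTTT, where non-overlapping counts would understate the content.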

2.2 Construction of the CGR

2.3 Fractal Measurement by the Box Counting Method

Subsequently, the CGR, a recursive algorithm (Jeffrey, 1990; Restrepo et al, 2009), is applied to each selected DNA sequence, Figure 3 (above, left), and from it an image is obtained, which is then quantified by the box-counting algorithm. For example, Figure 3 (above, left) shows a CGR image for a human DNA sequence of 80,000 bp. Here, dark regions represent sub-quadrants with a high number of points (or nucleotides), and clear regions sections with a low number of points. The calculation of D for the Koch curve by the box-counting method is illustrated by a progression of changes in the grid size, with its Cartesian graph, in Table 1.
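
Jeffrey's recursive CGR rule is compact enough to sketch directly (a simplified Python illustration, not the authors' code, and using one common corner assignment; conventions vary): each nucleotide moves the current point halfway toward the corner of the unit square assigned to that base.

```python
# Sketch of Jeffrey's (1990) chaos game representation: each nucleotide pulls
# the current point halfway toward its assigned unit-square corner, so the
# final cloud of points encodes the sequence's k-mer statistics geometrically.

CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    x, y = 0.5, 0.5  # start at the centre of the square
    pts = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward the corner
        pts.append((x, y))
    return pts

print(cgr_points("AC"))  # -> [(0.25, 0.25), (0.125, 0.625)]
```

Rasterizing the resulting point cloud onto a grid yields the grayscale image that the box-counting step then quantifies.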

The CGR image for a given DNA sequence is quantified by a standard fractal analysis. A fractal is a fragmented geometric figure whose parts are approximate copies of the whole at different scales, that is, the figure has self-similarity. The D is basically the scaling rule that the figure obeys. Generally, a power law is given by the following expression:

N(E) = E^D     (1)

where N(E) is the number of parts required to cover the figure when a scaling factor E is applied. The power law permits calculating the fractal dimension as:

D = ln N(E) / ln E     (2)

The box-counting algorithm obtains D by covering the figure with disjoint boxes of size ɛ = 1/E and counting the number of boxes required. Figure 4 (above, left) shows the multifractal measure at moment q = 1.
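
A minimal box-counting sketch (Python, illustrative only): count occupied boxes at several box sizes and take the slope of ln N(ɛ) versus ln(1/ɛ). A uniformly filled square should give D close to 2.

```python
import numpy as np

# Box-counting sketch: cover the unit square with boxes of side eps and count
# how many boxes contain at least one point; D is the slope of
# log N(eps) versus log(1/eps).

def box_count(points, eps):
    boxes = {(int(x / eps), int(y / eps)) for x, y in points}
    return len(boxes)

def box_dimension(points, eps_list=(1/4, 1/8, 1/16, 1/32)):
    logs = [(np.log(1 / e), np.log(box_count(points, e))) for e in eps_list]
    xs, ys = zip(*logs)
    slope, _ = np.polyfit(xs, ys, 1)
    return slope

# Sanity check: a densely filled unit square of random points gives D near 2.
rng = np.random.default_rng(0)
pts = rng.random((20000, 2)).tolist()
d = box_dimension(pts)
```

Applied to a CGR point cloud instead of uniform noise, the same routine yields the D of the sequence's image.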

2.4 Multifractal Measurement

When the box-counting algorithm is generalized to the multifractal case according to the method of moments q, we obtain equation (3) (Gutiérrez et al, 1998; Yu et al, 2001):

Σ_i (M_i / M)^q ~ ɛ^((q-1)·Dq)     (3)

where M_i is the number of points falling in the i-th grid box, M is the total number of points, and ɛ is the box size. Thus, the MFA is used when multiple scaling rules are applied. Figure 4 (above, right) shows the calculation of the multifractal measures at different moments q (partition function). Here, linear regressions must have a coefficient of determination equal or close to 1. From each linear regression a Dq is obtained, and together they generate a spectrum of generalized fractal dimensions Dq for all integers q, Figure 4 (below, left). So, the multifractal spectrum is obtained as the limit:

Dq = lim(ɛ→0) [ln Σ_i (M_i / M)^q] / [(q-1) ln ɛ]     (4)

The variation of the integer q allows emphasizing different regions and discriminating their fractal behavior. Positive q values emphasize the dense regions; a high Dq is synonymous with richness of structure and properties in those regions. Negative q values emphasize the scarce regions; there, a high Dq likewise indicates a lot of structure and properties. In real-world applications, the limit Dq is readily approximated from the data using a linear fitting: the transformation of equation (3) yields

ln Σ_i (M_i / M)^q = (q-1) Dq ln ɛ     (5)

which shows that ln Σ_i (M_i / M)^q for a fixed q is a linear function of ln ɛ; Dq can therefore be evaluated as the slope of the fitted relationship between ln Σ_i (M_i / M)^q and (q-1) ln ɛ. The methodologies and approaches for the box-counting method and the MFA are detailed in Moreno et al, 2000; Yu et al, 2001; Moreno, 2005. For a rigorous mathematical development of the MFA from images, consult the Multifractal system article on Wikipedia.
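
The method of moments can be sketched the same way (illustrative Python, assuming 2-D points such as a CGR cloud; not the authors' implementation): compute the partition sum for each box size and read Dq off the slope just described. A uniformly filled square is monofractal, so Dq stays near 2 for every q.

```python
import numpy as np

# Method-of-moments sketch: for each box size eps, compute the partition sum
# of (M_i / M)^q over occupied boxes; D_q is the slope of ln(sum) versus
# (q - 1) * ln(eps).  Note q = 1 needs a separate limiting formula.

def partition_sum(points, eps, q):
    counts = {}
    for x, y in points:
        key = (int(x / eps), int(y / eps))
        counts[key] = counts.get(key, 0) + 1
    m = sum(counts.values())
    return sum((mi / m) ** q for mi in counts.values())

def d_q(points, q, eps_list=(1/4, 1/8, 1/16, 1/32)):
    xs = [(q - 1) * np.log(e) for e in eps_list]
    ys = [np.log(partition_sum(points, e, q)) for e in eps_list]
    slope, _ = np.polyfit(xs, ys, 1)
    return slope

rng = np.random.default_rng(1)
pts = rng.random((20000, 2)).tolist()
d2 = d_q(pts, q=2.0)  # correlation dimension; near 2 for a filled square
```

For a genuinely multifractal point set, repeating this over a range of q (e.g. -20 to 20, as in the article) traces out the full Dq spectrum.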

2.5 Measurement of Information Content

Subsequently, from the spectrum of generalized dimensions Dq, the degree of multifractality ΔDq (MD) is calculated as the difference between the maximum and minimum values of Dq: ΔDq = Dq_max – Dq_min (Ivanov et al, 1999). When ΔDq is high, the multifractal spectrum is rich in information and highly aperiodic; when ΔDq is small, the resulting dimension spectrum is poor in information and highly periodic. It is expected, then, that aperiodicity in the genome would be related to highly polymorphic, aperiodic genomic structures, and periodic regions to highly repetitive, not very polymorphic genomic structures. The correlation exponent τ(q) = (q – 1)Dq, Figure 4 (below, right), can also be obtained from the multifractal dimension Dq. The generalized dimension also provides significant specific information: D(q = 0) is equal to the capacity (box-counting) dimension, D(q = 1) to the information dimension, and D(q = 2) to the correlation dimension. Based on these multifractal parameters, many structural genomic properties can be quantified, related, and interpreted.
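
These summary parameters are one-liners once the Dq spectrum is in hand (a Python sketch of the definitions above; the spectra here are invented for illustration):

```python
# Degree of multifractality: the spread of the generalized dimension
# spectrum, Delta D_q = Dq_max - Dq_min; correlation exponent tau(q) = (q-1)*Dq.

def multifractality_degree(dq_spectrum):
    return max(dq_spectrum) - min(dq_spectrum)

def tau(q, dq):
    return (q - 1) * dq

# A flat (monofractal) spectrum has Delta D_q = 0; a wide one is "rich".
flat = [2.0, 2.0, 2.0]   # invented monofractal spectrum
wide = [2.5, 2.0, 1.5]   # invented multifractal spectrum
print(multifractality_degree(flat), multifractality_degree(wide))  # -> 0.0 1.0
```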

2.6 Multifractal Parameters and Statistical and Discrimination Analyses

Once the multifractal parameters are calculated (Dq for q in (-20, 20), ΔDq, τ(q), etc.), correlations with the molecular parameters are sought. These relations are established by plotting the genome molecular parameters versus the MD, by discriminant analysis, with Cartesian graphs in 2-D, Figure 5 (below, left), and 3-D, combining multifractal and molecular parameters. Finally, simple linear regression analysis, multivariate analysis, and analyses by ranges and clustering are performed to establish statistical significance.
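
As a toy illustration of this correlation step (Python; the numbers are invented, and the actual study used discriminant and multivariate analyses over thousands of genome fragments):

```python
import numpy as np

# Hypothetical sketch: regress a molecular parameter (e.g. Alu content per
# fragment) against the multifractality degree (MD) and report the fit.
# All values below are invented for illustration only.

alu_content = np.array([5.0, 8.0, 12.0, 15.0, 20.0])    # % Alu per fragment
md          = np.array([0.31, 0.40, 0.55, 0.62, 0.80])  # Delta D_q per fragment

slope, intercept = np.polyfit(alu_content, md, 1)  # least-squares line
r = np.corrcoef(alu_content, md)[0, 1]             # Pearson correlation
```

A strongly positive r in such a plot is the kind of evidence behind the Alu/multifractality relationship reported in Section 3.2.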

3. Results and Discussion

3.1 Non-linear Descriptive Model for the C. elegans Genome

Analyzing the C. elegans genome with the multifractal formalism revealed what the symmetry and asymmetry of the genome’s nucleotide composition had suggested. The multifractal scaling of the C. elegans genome is of interest because it indicates that the molecular structure of the chromosome may be organized as a system operating far from equilibrium following nonlinear laws (Ivanov et al, 1999; Burgos and Moreno-Tovar, 1996). This can be discussed from two points of view:

1) When comparing C. elegans chromosomes with each other, the X chromosome showed the lowest multifractality, Figure 5 (above). This means that the X chromosome is operating close to equilibrium, which results in increased genetic instability. This instability of the X could selectively contribute to the molecular mechanism that determines sex (XX or X0) during meiosis. The X chromosome would thus be operating closer to equilibrium in order to maintain its particular sexual dimorphism.

2) When comparing different chromosome regions of the C. elegans genome, changes in multifractality were found in relation to the regional organization (at the center and arms) exhibited by the chromosomes, Figure 5 (below, left). These behaviors are associated with changes in the content of repetitive DNA, Figure 5 (below, right). The results indicated that the chromosome arms are even more complex than previously anticipated. Thus, TTAGGC telomere sequences would be operating far from equilibrium to protect the genetic information encoded by the entire chromosome.

All these biological arguments may explain why the C. elegans genome is organized in a nonlinear way. These findings provide insight into how to quantify and understand the organization of the non-linear structure of the C. elegans genome, which may be extended to other genomes, including the HG (Vélez et al, 2010).

3.2 Nonlinear Descriptive Model for the Human Genome

Once the multifractal approach was validated in the C. elegans genome, the HG was analyzed exhaustively. This allowed us to propose a nonlinear model for the HG structure, which will be discussed from three points of view.

1) It was found that the high multifractality of the HG depends strongly on the content of Alu sequences and, to a lesser extent, on the content of CpG islands. These contents would be located primarily in highly aperiodic regions, taking the chromosome far from equilibrium and giving it greater genetic stability, protection and attraction of mutations, Figure 6 (A-C). Thus, hundreds of regions in the HG may have high genetic stability, and the most important genetic information of the HG, the genes, would be safeguarded from environmental fluctuations. Other repeated elements (LINEs, MIR, MER, LTRs) showed no significant relationship, Figure 6 (D). Consequently, the human multifractal map developed in Moreno et al, 2011 constitutes a good tool to identify regions rich in genetic information and genomic stability.

2) The multifractal context seems to be a significant requirement for the structural and functional organization of thousands of genes and gene families. Thus, a high multifractal (aperiodic) context appears to be a “genomic attractor” for many genes (KOGs, KEGGs), Figure 6 (E), and some gene families, Figure 6 (F), involved in genetic and deterministic processes, in order to maintain deterministic regulatory control in the genome, although most HG sequences may be subject to a complex epigenetic control.

3) The classification of human chromosomes and the analysis of chromosome regions may have medical implications (Moreno et al, 2002; Moreno et al, 2009). The structure of low nonlinearity exhibited by some chromosomes (or chromosome regions) implies an environmental predisposition: they are potential targets for structural or numerical chromosomal alterations, Figure 6 (G). Additionally, sex chromosomes should have low multifractality to maintain sexual dimorphism and, probably, X chromosome inactivation.

All these fractal and biological arguments could explain why Alu elements are shaping the HG in a nonlinear manner (Moreno et al, 2011). Finally, the multifractal modeling of the HG serves as a theoretical framework for examining new discoveries made by the ENCODE project and new approaches to human epigenomes. That is, the non-linear organization of the HG might help explain why most of the HG is expected to be functional.

4. Conclusions

All these results show that the multifractal formalism is appropriate for quantifying and evaluating the genetic information content of genomes and for relating it to the known molecular anatomy of the genome and some of its expected properties. Thus, MFB allows interpreting in a logical manner the structural nature and variation of the genome.

MFB allows understanding why a number of chromosomal diseases are likely to occur in the genome, thus opening a new perspective toward personalized medicine for studying and interpreting the HG and its diseases.

The entire genome contains nonlinear information organizing it and supposedly making it function, leading to the conclusion that virtually 100% of the HG is functional. Bioinformatics in general is enriched with a novel approach (MFB), making it possible to quantify the genetic information content of any DNA sequence, with practical applications in different disciplines of biology, medicine and agriculture. This novel breakthrough in computational genomic analysis and disease contributes to defining biology as a “hard” science.

MFB opens a door to developing a research program toward the establishment of an integrative discipline that contributes to “breaking” the code of human life (http://pharmaceuticalintelligence.com/page/3/).

5. Acknowledgements

Thanks to the directives of the EISC, the Universidad del Valle and the School of Engineering for offering an academic, scientific and administrative space for conducting this research. Likewise, thanks to the co-authors (professors and students) who participated in the implementation of excerpts from some of the works cited here. Finally, thanks to Colciencias for the biotechnology project grant # 1103-12-16765.

6. References

Blanco, S., & Moreno, P.A. (2007). Representación del juego del caos para el análisis de secuencias de ADN y proteínas mediante el análisis multifractal (método “box-counting”). In The Second International Seminar on Genomics and Proteomics, Bioinformatics and Systems Biology (pp. 17-25). Popayán, Colombia.

Burgos, J.D., & Moreno-Tovar, P. (1996). Zipf scaling behavior in the immune system. BioSystems, 39, 227-232.

C. elegans Sequencing Consortium. (1998). Genome sequence of the nematode C. elegans: a platform for investigating biology. Science, 282, 2012-2018.

Gutiérrez, J.M., Iglesias, A., Rodríguez, M.A., Burgos, J.D., & Moreno, P.A. (1998). Analyzing the multifractal structure of DNA nucleotide sequences. In M. Barbie & S. Chillemi (Eds.), Chaos and Noise in Biology and Medicine (chap. 4). Hackensack (NJ): World Scientific Publishing Co.

Ivanov, P.Ch., Nunes, L.A., Goldberger, A.L., Havlin, S., Rosenblum, M.G., Struzik, Z.R., & Stanley, H.E. (1999). Multifractality in human heartbeat dynamics. Nature, 399, 461-465.

Jeffrey, H.J. (1990). Chaos game representation of gene structure. Nucleic Acids Research, 18, 2163-2175.

Mandelbrot, B. (1982). La geometría fractal de la naturaleza. Barcelona, España: Tusquets Editores.

Moon, F.C. (1992). Chaotic and fractal dynamics. New York: John Wiley.

Moreno, P.A. (2005). Large scale and small scale bioinformatics studies on the Caenorhabditis elegans genome. Doctoral thesis. Department of Biology and Biochemistry, University of Houston, Houston, USA.

Moreno, P.A., Burgos, J.D., Vélez, P.E., Gutiérrez, J.M., et al. (2000). Multifractal analysis of complete genomes. In Proceedings of the 12th International Genome Sequencing and Analysis Conference (pp. 80-81). Miami Beach (FL).

Moreno, P.A., Rodríguez, J.G., Vélez, P.E., Cubillos, J.R., & Del Portillo, P. (2002). La genómica aplicada en salud humana. Colombia Ciencia y Tecnología. Colciencias, 20, 14-21.

Moreno, P.A., Vélez, P.E., & Burgos, J.D. (2009). Biología molecular, genómica y post-genómica. Pioneros, principios y tecnologías. Popayán, Colombia: Editorial Universidad del Cauca.

Moreno, P.A., Vélez, P.E., Martínez, E., Garreta, L., Díaz, D., Amador, S., Gutiérrez, J.M., et al. (2011). The human genome: a multifractal analysis. BMC Genomics, 12, 506.

Mount, D.W. (2004). Bioinformatics. Sequence and genome analysis. New York: Cold Spring Harbor Laboratory Press.

Peitgen, H.O., Jürgens, H., & Saupe, D. (1992). Chaos and Fractals. New Frontiers of Science. New York: Springer-Verlag.

Restrepo, S., Pinzón, A., Rodríguez, L.M., Sierra, R., Grajales, A., Bernal, A., Barreto, E., et al. (2009). Computational biology in Colombia. PLoS Computational Biology, 5(10), e1000535.

The ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57-74.

Vélez, P.E., Garreta, L.E., Martínez, E., Díaz, N., Amador, S., Gutiérrez, J.M., Tischer, I., & Moreno, P.A. (2010). The Caenorhabditis elegans genome: a multifractal analysis. Genet and Mol Res, 9, 949-965.

Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., et al. (2001). The sequence of the human genome. Science, 291, 1304-1351.

Yu, Z.G., Anh, V., & Lau, K.S. (2001). Measure representation and multifractal analysis of complete genomes. Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, 64, 031903.


Other articles on Bioinformatics on this Open Access Journal include:

Bioinformatics Tool Review: Genome Variant Analysis Tools

2017 Agenda – BioInformatics: Track 6: BioIT World Conference & Expo ’17, May 23-25, 2017, Seaport World Trade Center, Boston, MA

Better bioinformatics

Broad Institute, Google Genomics combine bioinformatics and computing expertise

Autophagy-Modulating Proteins and Small Molecules Candidate Targets for Cancer Therapy: Commentary of Bioinformatics Approaches

CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics

Read Full Post »

LIVE 9/21 8AM to 10:55 AM Exploring the Versatility of CRISPR/Cas9 at CHI’s 14th Discovery On Target, 9/19 – 9/22/2016, Westin Boston Waterfront, Boston



Leaders in Pharmaceutical Business Intelligence (LPBI) Group is a

Media Partner of CHI for CHI’s 14th Annual Discovery on Target, taking place September 19 – 22, 2016 in Boston.

In Attendance, streaming LIVE using Social Media

Aviva Lev-Ari, PhD, RN






COMMENTS BY Stephen J Williams, PhD



8:00 Chairperson’s Opening Remarks

TJ Cradick, Ph.D., Head of Genome Editing, CRISPR Therapeutics




8:10 Functional Genomics Using CRISPR-Cas9: Technology and Applications

Neville Sanjana, Ph.D., Core Faculty Member, New York Genome Center and Assistant Professor, Department of Biology & Center for Genomics and Systems Biology, New York University


CRISPR/Cas9 is easier to target to multiple genomic loci because the RNA specifies DNA targeting; with zinc finger nucleases or TALENs, the protein specifies DNA targeting


  • This feature of CRISPR allows you to make a quick, big, and cheap array: a Genome-Scale CRISPR Knock-Out (GeCKO) screening library
  • How do you scale up the sgRNA design to the whole genome? For all genes in RefSeq, identify constitutive exons using RNA-sequencing data from 16 primary human tissues (a lot of target sites end with ‘GG’); changing bases on the 3’ side negates the CRISPR system, but changing them on the 5’ side still works fine
  • Rank sequences to be specific for the target
  • Cloned the array into lentivirus and put in selectable markers
  • GeCKO displays high consistency between reagents for the same gene versus siRNA; GeCKO has high screening sensitivity
  • 98% of the genome is noncoding, so what about making a library for noncoding regions (miRNAs, promoter regions?)
  • So you design the sgRNA library by taking 100 kb of gene-adjacent regions
  • They looked at CUL3 (data will soon be published in Science)
  • Do a transcription factor ChIP to verify the lack of binding of the transcription factor of interest
  • Can also target histone marks on promoter and enhancer elements
  • NYU wants to explore these noncoding screens
  • sanjanalab.org
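The scale-up described in the notes, scanning exonic sequence for candidate sgRNA target sites, can be sketched as a simple PAM scan: Cas9 requires an NGG motif (the PAM) immediately 3’ of a 20-nt protospacer, which is why so many usable sites “end with GG”. A minimal sketch in Python, forward strand only; the function name is mine, and real pipelines such as GeCKO also score off-target potential and scan the reverse strand:

```python
import re

def find_spacer_candidates(seq, spacer_len=20):
    """Return (spacer, position) pairs for every spacer_len-nt
    protospacer immediately followed by an NGG PAM (forward strand)."""
    seq = seq.upper()
    # zero-width lookahead so overlapping candidates are not skipped
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len
    return [(m.group(1), m.start()) for m in re.finditer(pattern, seq)]

# toy 'exon' fragment, for illustration only
exon = "ATGCTAGCTAGGCTAGCTAGGAGGTCTAGCTAGCTAGGCGG"
for spacer, pos in find_spacer_candidates(exon):
    print(pos, spacer)
```

Mismatches near the 3’ (PAM-proximal) end of the spacer abolish cutting, which matches the note that 3’-side base changes negate the system while 5’-side changes are tolerated.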




8:40 Therapeutic Gene Editing With CRISPR/Cas9

TJ Cradick, Ph.D., Head of Genome Editing, CRISPR Therapeutics


NHEJ is the “down and dirty” repair of a single break’s nonhomologous ends, but when you have two breaks, NHEJ repair can introduce inversions or deletions


    • High-throughput screens are fine but can limit your view of genomic context; genome searches pick unique sites, so use bioinformatic programs to design specific guide RNAs
    • Bioinformatics-directed, genome-wide, functional screens
    • Compared COSMID and CCTop: 320 COSMID off-target sites, 333 CCTop off-target sites
    • The Joung lab’s GUIDE-seq genome-wide assay is useful to design guides
    • Shortening the guide may improve specificity; sometimes lengthening the guide gives better sensitivity


  • Manufacturing of autologous gene-corrected product via ex vivo gene correction (Vertex and Bayer are partners in this)



They need to use clones from multiple microarrays before using GUIDE-seq, but GUIDE-seq is better for REMOVING the off-targets than for actually producing the sgRNA library you want (it seems the methods for library development are not fully advanced enough to do this)


The scores from the sgRNA design programs do not always give the best result because some sgRNAs are genome-context dependent
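Off-target finders such as COSMID and CCTop enumerate genomic sites that nearly match a guide. A naive mismatch scan conveys the idea; this is a sketch, not any tool’s actual algorithm:

```python
def off_target_sites(guide, genome, max_mismatches=3):
    """Naive scan: report every genome position whose window is within
    max_mismatches of the guide. Real tools (COSMID, CCTop, GUIDE-seq
    analysis) also weight mismatch position and handle PAMs, bulges,
    and the reverse strand; this only conveys the idea."""
    guide, genome = guide.upper(), genome.upper()
    hits = []
    for i in range(len(genome) - len(guide) + 1):
        window = genome[i:i + len(guide)]
        mm = sum(a != b for a, b in zip(guide, window))
        if mm <= max_mismatches:
            hits.append((i, mm))
    return hits

# toy example: one perfect site and one 1-mismatch off-target
guide = "GATTACAGATTACA"
genome = "TTGATTACAGATTACATTTGATTACTGATTACATT"
print(off_target_sites(guide, genome, max_mismatches=1))
```

This also illustrates the last bullet above: a shorter guide has fewer positions at which mismatches can be tolerated, so the same mismatch budget is more stringent.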

9:10 Towards Combinatorial Drug Discovery: Mining Heterogeneous Phenotypes from Large Scale RNAi/Drug Perturbations

Arvind Rao, Ph.D., Assistant Professor, Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center


Bioinformatics in CRISPR screens: they performed image analysis of light microscopy of breast cancer cells and looked for phenotypic changes


  • Then they modeled in a small pilot and then used the algorithm on 20,000 images (made morphometric measurements)
  • Can train statistical algorithms to build a decision tree for how you classify data points
  • Although their algorithms worked well, there was also human input from scientists

Aggregate hit-ranking programs are available on the web, like LINKS
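The decision-tree idea in these notes can be illustrated with a toy classifier over hypothetical morphometric features (area, elongation). The features, labels, and thresholding rule here are invented for illustration and are not MD Anderson’s actual pipeline:

```python
# Hypothetical morphometric measurements per cell: (area, elongation),
# with a phenotype label assigned by a human scorer. All values invented.
training = [
    ((120.0, 1.1), "normal"),
    ((115.0, 1.3), "normal"),
    ((300.0, 1.2), "enlarged"),
    ((280.0, 1.4), "enlarged"),
    ((118.0, 3.5), "spindle"),
    ((125.0, 3.1), "spindle"),
]

def fit_thresholds(data):
    """Pick split points at the overall feature means: a crude stand-in
    for the decision-tree training described in the talk."""
    areas = [f[0] for f, _ in data]
    elong = [f[1] for f, _ in data]
    return {"area": sum(areas) / len(areas),
            "elongation": sum(elong) / len(elong)}

def classify(features, t):
    """Two-level decision 'tree': first split on area, then elongation."""
    area, elongation = features
    if area > t["area"]:
        return "enlarged"
    if elongation > t["elongation"]:
        return "spindle"
    return "normal"

t = fit_thresholds(training)
predictions = [classify(f, t) for f, _ in training]
print(predictions)
```

The human-input point in the notes maps onto this sketch directly: the labels come from scientists, and so does the choice of which features are worth splitting on.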




10:25 CRISPR in Stem Cell Models of Eye Disease

Alexander Bassuk, M.D., Ph.D., Associate Professor of Pediatrics, Department of Molecular and Cellular Biology, University of Iowa


Blind biathlete Michael Stone, who has had eye disease since his teenage years, helped fund and start the clinical trial for Stargardt disease; he had one bad copy of ABCA4, heterozygous (heritable in Ashkenazi Jewish populations) – a recessive heritable mutation causing juvenile macular degeneration

  • Another male in the family also had the disease, but he had a different mutation, in the RPGR gene
  • December 2015 paper: Precision Medicine: Genetic Repair of Retinitis Pigmentosa in Patient-Derived Stem Cells
  • They were able to correct the RPGR gene in patient-derived iPSCs; however, efficiency of repair was low; they need scarless repair (the current repair leaves changes in the DNA), clinical-grade iPSCs, and a humanized model of RPGR


10:55 CRISPR in Mouse Models of Eye Disease

Vinit Mahajan, M.D., Ph.D., Assistant Professor of Ophthalmology and Visual Sciences, University of Iowa College of Medicine

  • with degeneration of the retina you will see brown spots; the macula will often be preserved but retinal cells damaged; with RPGR mutations there are problems with peripheral vision; retinitis pigmentosa gives tunnel vision with no peripheral vision (a mouse model of PDE6 knockout recapitulates this phenotype)
  • PDE6 is linked to the rhodopsin GTP pathway
  • the rd1 -/- mouse has something that looks like retinitis pigmentosa; it has mutant PDE6, actually a nonsense mutation in rd1, so they tried CRISPR to fix it in mice
  • with the CRISPR fix of the rd1 nonsense mutation, the optic nerve looked comparable to normal and the retinal structure was restored
  • photoreceptor layers: some recovery but not complete
  • sequencing results show the DNA is a mosaic, so they are not correcting 100% but only 35%, yet this still leads to a phenotypic recovery; NHEJ was about 12% to 25%, with large deletions
  • histology is restored in CRISPR-repaired mice
  • CRISPR off-target effects: WGS and analysis for variants (SNVs/indels), also looked at on-target and off-target regions; there were no off-target SNVs or indels, and among variants that did not pass quality-control screening, not a single SNV
  • a rhodopsin mutation accounts for a large % of patients (RhoD190N)
  • injection of gene therapy vectors: AAV vector carrying CRISPR and Cas9 repair templates

CAPN5 mouse models

  • a family in Iowa has a dominant mutation in CAPN5; the retina degenerates
  • used CRISPR to generate a mouse model with a mutation in CAPN5 similar to the family’s mutation
  • compared to other transgenic methods, CRISPR is faster for producing a mouse model


Meeting #: #BostonDOT16

Meeting @: @BostonDOT



Read Full Post »

10:15AM 11/13/2014 – 10th Annual Personalized Medicine Conference at the Harvard Medical School, Boston

Reporter: Aviva Lev-Ari, PhD, RN


REAL TIME Coverage of this Conference by Dr. Aviva Lev-Ari, PhD, RN – Director and Founder of LEADERS in PHARMACEUTICAL BUSINESS INTELLIGENCE, Boston http://pharmaceuticalintelligence.com

10:15 a.m. Panel Discussion — IT/Big Data

IT/Big Data

The human genome is composed of 6 billion nucleotides (using the genetic alphabet of T, C, G and A). As the cost of sequencing the human genome is decreasing at a rapid rate, it might not be too far into the future that every human being will be sequenced at least once in their lifetime. The sequence data together with the clinical data are going to be used more and more frequently to make clinical decisions. If that is true, we need to have secure methods of storing, retrieving and analyzing all of these data.  Some people argue that this is a tsunami of data that we are not ready to handle. The panel will discuss the types and volumes of data that are being generated and how to deal with it.
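A back-of-envelope sketch of why storage is a concern, using the 6 billion nucleotides quoted above; the 2-bits-per-base and 30x-coverage FASTQ figures below are common rules of thumb, not numbers from the panel:

```python
DIPLOID_BASES = 6_000_000_000  # ~6e9 nucleotides, as quoted in the panel intro

# Bare sequence: a 4-letter alphabet needs 2 bits per base
raw_bytes = DIPLOID_BASES * 2 / 8
print(f"raw sequence: {raw_bytes / 1e9:.1f} GB")

# Rule-of-thumb assumption: 30x short-read coverage stored as
# uncompressed FASTQ, roughly 2 bytes per sequenced base
fastq_bytes = DIPLOID_BASES * 30 * 2
print(f"30x reads as FASTQ: {fastq_bytes / 1e12:.2f} TB")
```

The gap between the two numbers is the point: the finished sequence is small, but the raw sequencing data behind it, multiplied across every patient, is what drives the “tsunami of data” worry.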

IT/Big Data


Amy Abernethy, M.D.
Chief Medical Officer, Flatiron

Role of Informatics, SW and HW in PM. Big data and Healthcare

How labs and clinics can be connected. Oncologists and hematologists use labs in the clinical setting; the role of IT and technology in the clinicians’ environment

Compare Stanford Medical Center and Harvard Medical Center and Duke Medical Center — THREE different models in Healthcare data management

Create novel solutions: Capture the voice of the patient for integration of component: Volume, Veracity, Value

Decisions need to be made in short time frame, documentation added after the fact

No system can be perfect in all aspects

Understanding clinical record for conversion into data bases – keeping quality of data collected

Key Topics


Stephen Eck, M.D., Ph.D.
Vice President, Global Head of Oncology Medical Sciences,
Astellas, Inc.

Small data expert; great advantage to small data. Population data allows for longitudinal studies.

Big Mac Big Data – Big is Good — Is the data being collected suitable for what it is used for? Is it robust? What are its limitations? What does the data analysis mean?

Data analysis in Chemical Libraries – now annotated

Diversity in data noted by MDs; the nuances are very great. Using Medical Records for building Billing Systems

In cases where the data needed is not known or not available, one uses the data that is available — this limits the scope of the valuable solutions that can be arrived at

In Clinical Trial: needs of researchers, billing clinicians — in one system

Translation of data on disease to data object

Signal-to-Noise Problem — thus Big Data provides validity and power


J. Michael Gaziano, M.D., M.P.H., F.R.C.P.
Scientific Director, Massachusetts Veterans Epidemiology Research
and Information Center (MAVERIC), VA Boston Healthcare System;
Chief Division of Aging, Brigham and Women’s Hospital;
Professor of Medicine, Harvard Medical School

At BWH since 1987 at 75%, pushing forward the Genomics agenda; at the VA system at 25%. The VA is horizontally data-integrated, embedding research and knowledge: a baseline questionnaire, 200,000 phenotypes; questionnaire and genomics data to be integrated; data to be curated hierarchically; simple phenotypes, then validated phenotypes; probability of susceptibility for actual disease. Genomics Medicine will benefit clinicians

Data must be of visible quality; the VA collects data via telephone – on a med compliance study, on ability to tolerate medication

–>>Annotation assisted in building a tool for neurologists on Alzheimer’s Disease (the AlzSWAN knowledge base) (see also Genotator, a Disease-Agnostic Tool for Annotation)

–>>Curation of data is very different than statistical analysis of Clinical Trial Data

–>>Integration of data at the VA and at BWH are two different models of SUCCESSFUL data integration; accessing the data also uses a different model

–>>Data extraction from the Big data — an issue

–>>Where the answers are in the data, build algorithms that will pick up causes of disease: Alzheimer’s – very difficult to do

–>>system around all stakeholders: investment in connectivity, moving data, individual silo, HR, FIN, Clinical Research

–>>Biobank data and data quality


Krishna Yeshwant, M.D.
General Partner, Google Ventures;
Physician, Brigham and Women’s Hospital

Computer Scientist and Medical Student. Where is the technology going?

Messy situation in the interaction of IT and HC. Boston and Silicon Valley are focusing on consumers; Google engineers are interested in developing medical and HC applications (HUGE interest). Applications or wearables: new companies in this space, from the Computer Science world to Medicine, at the enterprise level (EMR) or consumer level (wearables); both areas are very active in Silicon Valley

IT in the hospital is HARDER than IT in any other environment; great progress in the last 5 years on security of data and privacy. Sequencing data: the cost of big data management with the highest security

Constrained data vs non-constrained data

Opportunities for Government cooperation as a Lead needed for standardization of data objects


Questions from the Podium:

  • Where is the truth: do we have all the tools for genomic data usage, or don’t we?
  • Question on interoperability
  • Big valuable data vs. Big Data
  • quality, uniformity, large cohorts, comprehensive Cancer Centers
  • Can volume of data compensate for quality of data?
  • Data from imaging – quality and interpretation – THREE radiologists will read cancer screenings




– See more at: http://personalizedmedicine.partners.org/Education/Personalized-Medicine-Conference/Program.aspx#sthash.qGbGZXXf.dpuf











Read Full Post »

DNA Sequencing Technology

Reporter: Larry H Bernstein, MD, FCAP

Focus on DNA Sequencing Technology
Nature Biotechnology, Feb 2013; 31(2).

Knocking on the clinic door. Nature Biotechnology 2012; 1009. http://dx.doi.org/10.1038/nbt.2428
The New York Genome Center – pp. 1021–1022. http://dx.doi.org/10.1038/nbt.2429
Direct-to-consumer genomics reinvents itself – pp. 1027–1029
Malorye Allison: By putting its foot in the door at the FDA, can 23andMe reinvigorate direct-to-consumer genomics?

Genomic DNA is fragmented into random pieces and cloned as a bacterial library. DNA from individual bacterial clones is sequenced and the sequence is assembled by using overlapping DNA regions. (Photo credit: Wikipedia)
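The caption’s “assembled by using overlapping DNA regions” is the core of shotgun assembly. A greedy overlap-merge sketch shows the principle; real assemblers must also handle sequencing errors, repeats, and both strands:

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap:
    the core idea of shotgun assembly, without errors, repeats,
    or reverse strands."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left; a real assembler reports contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads

contigs = greedy_assemble(["ATGCGTAC", "GTACCATT", "CATTGGA"])
print(contigs)
```

With the three toy reads above, two merges of 4-base overlaps reconstruct a single contig.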



Read Full Post »

2013 Genomics: The Era Beyond the Sequencing of the Human Genome: Francis Collins, Craig Venter, Eric Lander, et al.

Curator: Aviva Lev-Ari, PhD, RN


One decade following the completion of the Sequencing of the Human Genome, the field of Genomics, the discipline that emerged as a result of the project’s completion, has FOUR concentrations:



Sequencing Human Genome: the Contributions of Francis Collins and Craig Venter

By: Jill Adams, Ph.D. (Freelance science writer in Albany, NY) © 2008 Nature Education
Citation: Adams, J. (2008) Sequencing human genome: the contributions of Francis Collins and Craig Venter. Nature Education 1(1)
How did it become possible to sequence the 3 billion base pairs in the human genome? More than a quarter of a century’s worth of work from hundreds of scientists made such projects possible.

Before the middle of the twentieth century, the gene was an abstract concept thought to physically resemble a “bead on a string,” and within the scientific community, it was accepted that each gene was associated with a single protein, enzyme, or metabolic disorder. However, this began to change during the 1950s with the birth of modern molecular genetics. In 1952, Alfred Hershey and Martha Chase proved that DNA was the molecule of heredity, and shortly thereafter, Watson, Crick, Franklin, and Wilkins solved the three-dimensional structure of DNA. By 1959, Jerome Lejeune had demonstrated that Down syndrome was linked to chromosomal abnormalities (Lejeune et al., 1959). Next, the 1961 discovery of mRNA (Jacob & Monod, 1964) and the 1966 cracking of the genetic code (Figure 1; Nirenberg et al., 1966) made it possible to predict protein sequences based on DNA sequence alone. Nonetheless, although it was well established by this time that DNA was the heredity material and that each nucleus must contain the complete DNA required to instruct the chemical processes of an organism, the details of reading individual gene sequences, let alone whole genomes, were out of the technical grasp of scientists.
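The “cracking of the genetic code” is exactly what lets software predict a protein from DNA alone: read the sequence three bases at a time and look each codon up in the 64-entry code table. A minimal sketch, with only a handful of codons included for brevity:

```python
# A tiny subset of the 64-codon genetic code, for illustration only
CODON_TABLE = {
    "ATG": "M", "TTT": "F", "GGA": "G", "TGG": "W",
    "AAA": "K", "GCC": "A", "TAA": "*",  # '*' marks a stop codon
}

def translate(dna):
    """Translate a DNA coding sequence, codon by codon, until a stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3].upper(), "X")  # 'X' = unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGTTTGGATGGAAAGCCTAA"))
```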

A large part of the reason for this inability to read gene sequences was the fact that there were simply very few sequences available to read; furthermore, the tools required to identify, isolate, and manipulate desired stretches of DNA were just evolving. Then, during the late 1960s and early 1970s, the combined work of several groups of researchers culminated in the isolation of proteins from prokaryotes that cut DNA at specific sites, allowing it to be spliced with DNA from other species (Meselson & Yuan, 1968; Jackson et al., 1972; Cohen et al., 1973). With these tools in place, the recombinant DNA age was about to allow scientists to start cloning genes en masse for the first time. Indeed, with the advent of Maxam-Gilbert DNA sequencing in the mid-1970s (Maxam & Gilbert, 1977), it actually became possible to read the entire sequence of a cloned gene, perhaps 1,000 to 30,000 base pairs long, with relative ease.


Collins and Other Researchers Master Gene Mapping


Thanks to these advances, mapping of important disease genes was all the rage by the 1980s, and Francis Collins was one of the masters of this process. Collins made a name for himself by discovering the location of three important disease genes—those responsible for cystic fibrosis, Duchenne muscular dystrophy, and Huntington’s disease. The accomplishments were a result of both cutting-edge cloning techniques like chromosome jumping (Collins et al., 1987; Richards et al., 1988) and plain perseverance. Collins wasn’t the only researcher actively “gene hunting” at this time, however; hundreds of other investigators were also racing to publish detailed descriptions of every new disease gene found.

During the 1980s, the importance of genes was obvious, but determining their location on chromosomes or their sequence of DNA nucleotides was laborious. Early studies of the genome were technically challenging and slow. Reagents were expensive, and the conditions for performing many reactions were temperamental. It therefore took several years to sequence single genes, and most genes were only partially cloned and described. Scientists had already reached the milestone of fully sequencing their first genome—that of the ΦX174 bacteriophage, whose 5,375 nucleotides had been determined in 1977 (Sanger et al., 1977b)—but this endeavor proved much easier than sequencing the genomes of more complex life forms. Indeed, the prospect of sequencing the 4.6 million base pairs of the E. coli genome or the 3 billion nucleotides of the human genome seemed close to impossible. For example, an article published in the New York Times in 1987 noted that only 500 human genes had been sequenced (Kanigel, 1987). At the time, that was thought to be about 1% of the total, and given the pace of discovery, it was believed that complete sequencing of the human genome would take at least 100 years.

In addition to questions about the technical challenges and costs associated with sequencing large genomes, a number of concerns about the scientific basis of these endeavors were also raised. Why spend the time, money, and resources to sequence the whole genome when only a small percentage of it was actually genes? With the huge scale of these projects, there was a logic to prioritizing certain tasks over others—specifically, the targeted sequencing of coding sequences (genes). Thus, instead of sequencing the raw genome, many researchers sought to study cDNA collections; these are DNA strands that are generated by collecting mRNA from a tissue, then converting it back to complementary DNA. Because cDNA starts as a message in a cell, it represents an actively expressed gene. Moreover, because cells behave differently in different tissues and at different developmental stages, specialized cDNA libraries are valuable tools for assessing what specific genes are at work in a cell at any given time. Scientists could therefore use these libraries to prioritize their sequencing in order to focus on coding sequences first.

At the same time, researchers were also working to identify many more polymorphic genetic markers to use as tools in gene mapping. Polymorphisms are the individual DNA base changes that make each of us unique at the level of DNA. The number of known human polymorphisms and microsatellite repeats increased to more than 2,000 by 1992—or 1 per every 2.5 million bases or so (Weissenbach et al., 1992). As researchers characterized more and more polymorphic markers, their chances of mapping a gene of interest to its chromosomal location increased dramatically.


Venter Combines Approaches to Make Sequencing Faster and Less Expensive


Thus, by the late 1980s, multiple approaches for sequencing DNA were in use, but costs and time constraints were still a limiting factor to research. However, this all began to change with the work of National Institutes of Health (NIH) scientist J. Craig Venter. For several years, Venter had been using automated DNA sequencers to sequence portions of chromosomes associated with Huntington’s disease and myotonic dystrophy (Adams et al., 1991, 1992). Next, Venter tapped collections of cDNA molecules made from brain tissues. Then, in a 1991 paper, he described how he harnessed the power of his high-tech equipment to sequence more than 600 expressed sequence tags (ESTs) from a brain cDNA collection, identifying about half of them as genes, far more than anyone else had ever reported in a single paper to date. Not only did Venter’s paper make an impact, but so did his claims that in his laboratory alone, he could sequence as many as 10,000 ESTs a year at the low cost of $0.12/base. The next year, in a second paper, Venter published the sequences of more than 2,000 genes, although some were incomplete. This brought the total to 2,500 genes sequenced in one laboratory, which was as many as had been sequenced in the entire world to that point (Figure 2).
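Taking Venter’s quoted throughput and per-base cost at face value, the arithmetic is simple; the 400-base average EST length below is an assumed figure for illustration, not a number from the text:

```python
ests_per_year = 10_000     # Venter's claimed throughput
cost_per_base = 0.12       # dollars per base, as quoted
est_length = 400           # assumed average EST read length (illustrative)

annual_bases = ests_per_year * est_length
annual_cost = annual_bases * cost_per_base
print(f"{annual_bases:,} bases/year, ~${annual_cost:,.0f}/year")
```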

Many scientists spoke out in criticism of Venter’s brash approach. They noted that by sequencing ESTs, Venter was missing promoter sequences and other sites on DNA that were important for the regulation of gene expression. Furthermore, many critics argued that a focus on cheap volume was no substitute for careful, painstaking science. However, Venter’s speed also spurred other groups—namely, the NIH effort led by James Watson—to step up their efforts to finish the Human Genome Project sooner.

In 1992, Venter left the NIH and, with the help of a venture capitalist, started a nonprofit research institute at which he quickly set up 30 automated sequencers. Venter’s aim in doing so was to complete the sequencing of the human genome faster than the government-backed (“public”) effort. This competition would later culminate in the simultaneous publication of the draft human genome sequence by both public and private efforts, ahead of schedule and below budget.

The events that occurred from the discovery of DNA’s structure and role as a heredity molecule up through Venter’s high-throughput EST experiments roughly delimit what is now known as the pregenomic era of molecular biology. The molecular tools and methods developed during this era were essential to reaching the milestone of sequencing the entire human genome.


References and Recommended Reading

Adams, M. D., et al. Complementary DNA sequencing: “Expressed sequence tags” and the Human Genome Project. Science 252, 1651–1656 (1991)

———. Sequence identification of 2,375 human brain genes. Nature 355, 632–634 (1992) doi:10.1038/355632a0 (link to article)

Cohen, S. N., et al. Construction of biologically functional bacterial plasmids in vitro. Proceedings of the National Academy of Sciences 70, 3240–3244 (1973)

Collins, F. S., et al. Construction of a general human chromosome jumping library, with application to cystic fibrosis. Science 235, 1046–1049 (1987)

Dulbecco, R. A turning point in cancer research: Sequencing the human genome. Science 231, 1055–1056 (1986) doi:10.1126/science.3945817

Jackson, D. A., et al. Biochemical method for inserting new genetic information into DNA of simian virus 40: Circular SV40 DNA molecules containing lambda phage genes and the galactose operon of Escherichia coli. Proceedings of the National Academy of Sciences 69, 2904–2909 (1972)

Jacob, F., & Monod, J. Biochemical and genetic mechanisms of regulation in the bacterial cell. Bulletin de Societe Chimique de France 46, 1499–1532 (1964)

Kanigel, R. The genome project. New York Times, 13 December (1987)

Lejeune, J., et al. Mongolism: A chromosomal disease (trisomy). Bulletin de l’Academie Nationale de Medecine 143, 256–265 (1959)

Maxam, A., & Gilbert, W. A new method of sequencing DNA. Proceedings of the National Academy of Sciences 74, 560–564 (1977)

Meselson, M., & Yuan, R. DNA restriction enzyme from E. coli. Nature 217, 1110–1114 (1968)

Nirenberg, M. W., et al. The RNA code and protein synthesis. Cold Spring Harbor Symposia on Quantitative Biology 31, 11–24 (1966)

Richards, J. E., et al. Chromosome jumping from D4S10 (G8) toward the Huntington disease gene. Proceedings of the National Academy of Sciences 85, 6437–6441 (1988)

Sanger, F., et al. Nucleotide sequence of bacteriophage phi X174 DNA. Nature 265, 687–695 (1977a) (link to article)

Sanger, F., et al. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences 74, 5463–5467 (1977b)

Weissenbach, J., et al. A second-generation linkage map of the human genome. Nature 359, 794–801 (1992) doi:10.1038/359794a0 (link to article)

Davies, K. Cracking the Genome: Inside the Race to Unlock Human DNA (New York, Free Press, 2001)



Contributors to Genomics recognized by Dan David Prize Awards



Laureates 2012 – 2012 Future – Genome Research


Founding Director, Broad Institute Harvard and MIT and director of its Genome Biology Program, Cambridge, MA, USA

Prof. Eric Lander has been a major intellectual force in genomics research. Building on his background in mathematics, he placed genomics on a firm quantitative foundation.

With David Botstein and Phil Green he developed algorithms to allow effective use of polymorphism data for genetic mapping and published the first genetic linkage map of the human genome. As the human genome project got underway, he demonstrated an unusual ability to innovate in the organization of high-throughput methods first in creating genetic maps of the mouse and rat genomes and later as a major contributor to the Human Genome Project.

Lander was a powerful and respected voice in the planning and execution of the genome project. The Center he led contributed much of the data, he pioneered many of the analyses of genome sequence data, and he led in the writing of the landmark publication describing the Human Genome Project first as a draft sequence in Nature, 2001 and later as a full sequence in Nature, 2004. This has become the standard human reference sequence.

Lander has also been at the forefront of applying the genome sequence to the study of human disease, generating the first deep SNP catalogs, applying them to understand the haploid structure of the genome and more recently, championing the use of common variation to the study of complex traits. He has led efforts to understand the functional elements of the human genome, generating genome sequence from multiple other mammals to delineate the conserved elements and to define noncoding RNAs and characterize chromatin states.

Among Prof. Lander’s awards are:  Honorary Degree, Columbia University; Honorary Doctorate, Lund University, Sweden; Honorary Doctorate, University of Massachusetts at Lowell; Gairdner Foundation International Award, Canada; Max Delbruck Medal, Berlin; Honorary Doctorate, Mount Sinai School of Medicine, New York; Honorary Doctorate, Tel Aviv University; Millennium Lecturer, The White House; Member of the American Academy of Arts and Sciences; Member of the American Academy of Achievement; and Member of the U.S. National Academy of Sciences.

Beyond his immediate scientific contributions, Eric Lander has attracted talented investigators to the field and fostered their careers. He has also served the community, most recently as co-Chair of the President’s Council of Advisors on Science and Technology.



Eric S. Lander, Ph.D.
By Karen Hopkin

In many ways, Eric Lander’s career has taken as many twists and turns as there are in the helical strands of DNA that he now spends his time trying to decode. Before turning his attention to the human genome, Lander worked as a mathematician, an economist, and even a newspaper reporter, amassing an impressive array of awards and achievements along the way. If the equation that describes Lander’s life story has a common denominator, it would have to be his pursuit of intellectual challenge.

It all began with math. From the start, he was captivated by the power and beauty of numbers. “Math is so elegant. Ideas dovetail perfectly with other ideas to form beautiful intellectual edifices,” he says. What’s more, these mathematical constructions can be used to describe and understand the world around us—making mathematics, to Lander’s mind, the purest product of human thought and “the highest form of crystallized abstraction.”

Lander was a master of mathematics. He placed second in a national math test and had the highest grades in his class at Stuyvesant High, one of New York City’s top schools for students who show a talent in math or science. His paper on quasi-perfect numbers—which the 17-year-old Lander proved exist only in theory—won him the Westinghouse Prize. His work at Princeton, where he received his undergraduate degree in mathematics, earned him a Rhodes scholarship at Oxford University. There, Lander completed his graduate degree in pure mathematics. He was well on his way to living his life as a chalk-stained mathematician, but he realized something was missing. “I loved pure mathematics,” says Lander. “But I didn’t want to make it a life.”

“Mathematics is kind of monastic,” he notes. “It’s a very lonely and individual pursuit. And I’m not a very good monk. I like doing things with people.”

This connection with people set into motion the series of happy accidents that would eventually draw Lander into a biology lab. When Lander returned from Oxford, a Princeton professor sent Lander’s résumé to a statistician at Harvard’s School of Public Health, who passed it along to someone at the Business School. Lander was offered a job at Harvard—teaching economics. “I knew no economics whatsoever,” he admits. “But I figured you can learn that stuff.”

Lander was a quick study and a decent teacher, but economics did not provide him with the intellectual stimulation he needed. Fortunately, his little brother did. Arthur Lander, a neuroscientist by training, sent his sibling some papers about mathematical neurobiology. Lander realized that he couldn’t fully understand the research until he learned a bit more about neurobiology. And he couldn’t handle the neurobiology without studying some cell biology, which he couldn’t grasp until he tackled molecular biology. So Lander opted to audit a biology course at Harvard and spent his evenings cloning fruit fly genes in the lab. “I essentially picked up biology on the street corner,” he says with a smile. Of course, in Cambridge—home of Harvard and the Massachusetts Institute of Technology—people who hang out on street corners are just as likely to be discussing biology as anything else.

After a lecture one night, Lander ran into David Botstein, a geneticist at MIT who had developed methods for scanning the genome to find an individual gene that may play a role in disease. He was hoping next to develop a means to untangle the genetics behind more complex human disorders that are thought to arise from subtle disturbances in dozens or hundreds of genes—cancer, diabetes, schizophrenia, even obesity.

The two got to arguing (as good New Yorkers will) about how statistics could be used to search for the genes involved in complex human diseases. Soon, they had the outline of a solution. Lander secured a position as a fellow at the Whitehead Institute for Biomedical Research, where he set to work on the problem. The appointment was a bit unusual—Lander was still a professor at the Harvard Business School—but he made enough progress to receive a MacArthur fellowship for his efforts.

Now a geneticist, Lander joined MIT as a tenured faculty member and a year later he launched the Whitehead Institute/MIT Center for Genome Research, becoming director of one of the first genome sequencing centers in the world. “It was a chaotic career path,” notes Lander. “But everything worked out okay.”

As head of the center, Lander helped build a series of maps that show the basic layout of the human and mouse genomes. In addition to providing the scaffolding needed to assemble the full human genome sequence, completed last year, these maps have proved useful for pinpointing the location of genes involved in disease. For Lander, that’s what his efforts are all about. “Disease is my motivation,” he says. “All the information about one’s risk for disease is hiding in the genome. The goal is to tease out that information.

“A cell already knows what it will be, what it will do,” he adds. “So it’s just a matter of persuading the cell to tell us what it knows.” Lander knows how to be persuasive. Already he and his colleagues at the Whitehead Institute have teased out genes involved in diabetes and gained knowledge that will help scientists diagnose and treat cancers. Whitehead researchers have produced approximately one-third of the human genome sequence. But prying the secrets from the human genome is work that is really just beginning.

The first problem: The human genome is big. Imagine someone dumping 1,000 volumes of the Encyclopaedia Britannica in your living room, says Lander. “How would you tackle all that information? Would you read all the spines first? Or would you start at ‘aardvark’ and go from there?”

But size isn’t the only obstacle. The human genome is also written in code. Scientists are still learning how to decipher the information encrypted in the 3 billion letters that provide the instructions for assembling and operating a human being. The human genome may represent a “book of life,” but it is not yet an open book.

“Looking at the genome is not like looking down at Earth from space and seeing all the clouds and oceans,” says Lander. “You have to think of the questions you want to ask. And then you have to figure out how to ask them.

“That’s my main job,” says Lander. “Thinking about the questions.”

Asking these questions often requires new techniques. And for someone who loves data, who wants answers, the waiting can be the hardest part. “Most days are spent just getting things ready,” says Lander. “So you have to be reasonably good at delayed gratification.” For example, before Lander and his team could build a map of the human genome, they spent months developing new biochemical procedures, new robotics, and new analytical software. “Once everything was in place, making the map was fun.”

Biology may involve a lot of grunt work—certainly more than mathematics does—but Lander doesn’t seem to mind. “The highs, when they come, are better than anything you could imagine.

“Getting to pursue new ideas and new directions, always thinking about new things—it’s intoxicating, it’s addicting,” he says. “I could never give it up.”




Laureates 2012 – 2012 Future – Genome Research


Founder, Chairman, and President of the J. Craig Venter Institute, Rockville, MD and La Jolla, CA, USA and CEO of Synthetic Genomics Inc., La Jolla, CA, USA.

Dr. J. Craig Venter has made numerous contributions to genomics—from ESTs and the first genome of a living species, to the human genome and environmental genomics, to the most recent accomplishments of constructing the first synthetic bacterial cell.

Venter’s initial efforts focused on identifying human genes through random cDNA sequencing (using expressed sequence tags, or ESTs), which by his 1995 publication had identified fragments of about half of all human genes.

Venter led the group that produced the first full genome sequence of a bacterium, H. influenzae, using their whole-genome shotgun approach. Five years later, Venter co-founded a company, Celera Genomics, to extend the whole-genome shotgun method with newly developed algorithms and instrumentation to sequence the Drosophila, human, mouse, rat and mosquito genomes. His group published a draft human sequence simultaneously with the publicly funded Human Genome Project in 2001.
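The whole-genome shotgun strategy can be sketched in miniature: break the genome into random overlapping reads, then merge reads by their longest overlaps until the source sequence is reconstructed. The toy greedy assembler below uses invented reads and is only an illustration of the principle; Celera's actual assembler relied on far more sophisticated overlap-layout-consensus graph algorithms.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b."""
    best = 0
    for k in range(min_len, min(len(a), len(b)) + 1):
        if a[-k:] == b[:k]:
            best = k
    return best

def greedy_assemble(reads):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, 0, 1)  # (overlap length, index i, index j)
        for i in range(len(reads)):
            for j in range(len(reads)):
                if i != j:
                    olen = overlap(reads[i], reads[j])
                    if olen > best[0]:
                        best = (olen, i, j)
        olen, i, j = best
        if olen == 0:
            break  # no overlaps left among remaining reads
        merged = reads[i] + reads[j][olen:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return "".join(reads)

# three invented overlapping reads drawn from a 16-base "genome"
reads = ["ATGGCGTG", "GCGTGCAAT", "CAATTGCC"]
print(greedy_assemble(reads))  # prints ATGGCGTGCAATTGCC
```

Greedy merging works on this tiny example but fails on real genomes, where repeats longer than a read make overlaps ambiguous; handling repeats is exactly what made the human shotgun assembly contentious and difficult.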

Venter went on to apply high-throughput sequencing to ocean microbial populations and the human gut, contributing greatly to the rapidly expanding field of metagenomics. More recently, Venter has focused much of his group’s efforts on synthetic genomics, first synthesizing the phiX174 viral genome and transplanting the genome of M. mycoides into a cell of a related species. In 2010 he and the team combined those two technologies, using synthetic oligonucleotides to recreate a 1.1 million base pair bacterial genome, and placed it in a new host, thereby constructing the largest synthetically made genome and the first synthetic bacterial cell.

Dr. Venter has received numerous awards and honors including: The 2008 National Medal of Science; Washington, DC, Member, National Academy of Sciences, Washington, DC; Member of the American Society of Microbiology; Honorary Doctor of Science – Syracuse University; the Benjamin Rush Medal – College of William and Mary, VA; Honorary Doctor of Science – Mount Sinai School of Medicine, New York; Scientist of the Year – ARCS Foundation, San Diego; Doctor of Science Honoris Causa – University of Melbourne; Doctorat Honoris Causa – University of Montreal; Doctor of Science Honoris Causa – Imperial College, London; Scripps Institute of Oceanography Nierenberg Prize, La Jolla, CA; Honorary Doctor of Science, Chung Yuan University, Taipei; Presidential Distinguished Scientific Award; World Health Award, Presented by Mikhail Gorbachev, World Awards, Vienna, Austria; University College London Prize in Clinical Science – London, England; Honorary Doctor of Technology, Royal Institute of Technology, Stockholm, Sweden; Medal of the Presidency, Italian Republic, Rimini, Italy; Prince of Asturias Award for Technical and Scientific Research; Fellow, American Academy of Arts and Sciences, Washington, DC; and the Exceptional Service Award for Exploring Genomes.




Laureates 2012 – 2012 Future – Genome Research


Anthony B. Evnin Professor of Genomics; Director, Lewis-Sigler Institute for Integrative Genomics; Director, Certificate Program in Quantitative and Computational Biology, Princeton University, Princeton, NJ, USA

Prof. David Botstein has been the intellectual leader of genomics since its inception. He created modern human genetics, championed the Human Genome Project, devised microarrays to exploit genome information for the global assessment of gene expression and has fostered systems biology. He has mentored numerous young scientists in the field, first at MIT, later at Stanford and most recently at Princeton.

Botstein’s 1980 paper “Construction of a Genetic Linkage Map in Man Using Restriction Fragment Length Polymorphisms” was the first to explicitly argue that it would be possible to build a sufficiently dense map of markers through the human genome to permit the mapping of disease genes in families by monitoring the transmission of those markers and disease status through the families. The vision outlined in this paper provided not only the clearest early motivation for the initiation of the human genome project, but its clarity and beauty drew many scientists into the field of genomics.
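The statistical core of that proposal is the LOD score: the log10 likelihood ratio for observing r recombinants among n informative meioses under linkage at recombination fraction theta, versus free recombination (theta = 0.5). A score above 3 is conventionally taken as evidence that a marker and a disease locus are linked. The sketch below uses invented counts for illustration, not data from the paper.

```python
import math

def lod_score(recombinants, meioses, theta):
    """LOD = log10 likelihood ratio for linkage at recombination
    fraction theta versus free recombination (theta = 0.5)."""
    n, r = meioses, recombinants
    like_linked = (theta ** r) * ((1 - theta) ** (n - r))
    like_unlinked = 0.5 ** n
    return math.log10(like_linked / like_unlinked)

# e.g. 2 recombinants seen in 20 informative meioses: scan theta
# over a grid and keep the maximum (the maximum-likelihood estimate)
best = max((lod_score(2, 20, t / 100), t / 100) for t in range(1, 50))
print(best)  # LOD just above 3 at theta = 0.10
```

The maximum falls at theta = r/n, and even this small family set clears the LOD 3 threshold; mapping a disease gene then amounts to repeating the calculation against a dense genome-wide panel of markers such as RFLPs.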

Following these seminal contributions he has been an intellectual participant in many of the most important developments in genomics, the most prominent examples of which are: a) the development, along with Pat Brown, of methods to measure and statistically analyze gene expression profiles and the application of these methods to the identification of subtypes of cancer. It would be impossible to overstate the impact of this work, both in terms of basic biological research and the direction of thinking about molecular taxonomies of disease; b) articulation of the need to organize genes into biological groupings to permit systematic pathway analyses, and the initiation of generic systems to do so. These last areas are clear antecedents of what is now coming to be known as systems biology, of which David Botstein is again one of the key intellectual figures.
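The expression-profiling idea in (a) can be sketched in miniature: represent each sample by a vector of per-gene expression values, then group samples whose profiles correlate. The data, sample names, and two-seed grouping below are invented for illustration; the actual work applied hierarchical clustering to genome-wide microarray data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# rows = tumor samples, columns = hypothetical expression values
profiles = {
    "tumor_A1": [5.1, 0.2, 4.8, 0.1],
    "tumor_A2": [4.9, 0.3, 5.2, 0.2],
    "tumor_B1": [0.2, 6.0, 0.1, 5.5],
    "tumor_B2": [0.1, 5.8, 0.3, 5.9],
}

# assign each sample to the seed it correlates with best: a crude
# stand-in for hierarchical clustering of expression profiles
seeds = ["tumor_A1", "tumor_B1"]
subtypes = {s: max(seeds, key=lambda c: pearson(profiles[s], profiles[c]))
            for s in profiles}
print(subtypes)
```

Even this crude correlation grouping separates the two invented "subtypes"; the molecular taxonomies of cancer described above emerged from the same logic applied to thousands of genes at once.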
Among Prof. Botstein’s awards and honors are: Member of the US National Academy of Sciences, the Eli Lilly and Company Award in Microbiology, the Genetics Society of America Medal, the Allen Award of the American Society of Human Genetics, and the Gruber Prize in Genetics.



Laureates 2011 – 2011 Past – Evolution


Professor, Department of Biology, Stanford University, Stanford, CA, USA

Prof. Feldman has produced conceptual results of broad interest in the domain of animal and plant evolution. His work has led to highly focused insights of cultural significance such as the out-of-Africa model of human evolution, as well as cultural preferences in different civilizations. His work not only explores basic scientific topics, but investigates the societal consequences of the conclusions he draws in terms of models of evolution.

Prof. Feldman originated the quantitative theory of genetic modifiers of recombination, mutation, and dispersal. His work was the first to show that the pattern of interactions among genes determined whether sex would evolve.

With Cavalli-Sforza, he originated the quantitative theory of cultural evolution. The application of this theory to the culture of son preference in China, and his work on the significance of male/female birth ratio in that country, seems likely to have very important social management consequences, leading to attempts by the Chinese authorities to reduce this preference.

Prof. Feldman demonstrated that today’s worldwide pattern of genomic variation is largely due to the sequence of human migrations over the 60,000 years since modern humans left Africa. His finding that about 10 percent of genomic variation is between continents has inspired much of the subsequent discussion on the meaning of race.

Prof. Feldman and collaborators originated “niche construction,” a generalization of evolutionary theory that stresses the feedbacks between organismic evolution and environmental dynamics, demonstrating via his model that phenotypes have a much more active role in evolution than previously thought. This has profoundly influenced subsequent work in evolutionary ecology.

Feldman’s findings have triggered the development of new scientific fields in both the humanities and life sciences. He sheds light on many key issues of evolution, including hominid evolution and the evolution of culture. Feldman has done much demographic work on trends important to humanity’s future.

Among Marcus Feldman’s honors are: Elected Fellow, American Association for the Advancement of Science; Elected Member, American Academy of Arts and Sciences; Doctor Philosophiae Honoris Causa, Hebrew University of Jerusalem; Doctor Philosophiae Honoris Causa, Tel Aviv University; membership on the editorial boards of various scientific journals; and membership on various international committees and foundations.




Laureates 2011 – 2011 Future – Ageing-Facing the Challenge


Professor of Genetics, Department of Molecular Biology, Massachusetts General Hospital, Harvard University

Gary Ruvkun has made a major contribution to the future of human health with the discovery of conserved hormonal signaling pathways with universal influence on animal aging. He is a key figure in defining the genetic basis for human health during aging through his discovery of a core set of hormonal signals and signaling pathways that regulate aging and lifespan in animal models and that are likely to act in humans as well.

In a series of reports starting in the early 1990s Ruvkun defined an insulin signaling pathway that regulates aging in the C. elegans worm and showed that the essential elements of this pathway are conserved in mice and humans. He discovered that like mammals, C. elegans uses an insulin-like signaling pathway to control its metabolism and longevity, suggesting that insulin-like regulation of longevity and metabolism is ancient and universal.

The Ruvkun lab discovered the molecular identity of the many genes in the pathway, including the daf-2 insulin receptor, the many insulins that act upstream of the daf-2 receptor, the signal transduction components downstream of the insulin receptor such as age-1, daf-18, pdk-1, akt-1, and akt-2, and the downstream transcription factors daf-16 and daf-3, to reveal the signaling pathway from hormone to membrane receptor to the gene expression changes in the nucleus that regulate metabolism and longevity. Their finding that the DAF-16/FoxO transcription factor is coupled to insulin signaling via conserved interactions with the kinases AKT and PDK also points to these transcriptional cascades as key in metabolic responses to insulin. This finding has been important for understanding the defects in diabetes as well as for aging research, since the mammalian orthologs of daf-16, the FoxO transcription factors, are regulated by insulin and are emerging now as key outputs of insulin signaling.

Recent insulin signaling mutant analyses in mouse and humans have validated the generality of these discoveries in other animals. Not surprisingly, an insulin-like pathway is now a major theme in animal aging regulation, with many reports of insulin-like regulation of lifespan in Drosophila, mouse, and even humans beginning to emerge.

This work had an enormous impact on aging research relevant to longevity and later-life health. These findings catalyzed developments across biogerontology by defining hormone interventions with direct relevance to clinical practice and drug development.

Ruvkun is now using RNAi screens and comparative genomics to reveal the downstream genes regulated by insulin signaling. He discovered a connection between longevity and small RNA pathways, with the production of specific small RNA factors induced in long lived mutant animals.

Among Gary Ruvkun’s awards are: Benjamin Franklin Medal, Franklin Institute; Albert Lasker Award for Basic Medical Research; member of the American Academy of Arts and Sciences; and member of the National Academy of Sciences.




Laureates 2005 – 2005 Future – Materials Science – Tissue Engineering


Robert Langer is the Kenneth L. Germeshausen Professor of Chemical and Biomedical Engineering at the Massachusetts Institute of Technology, USA.

Prof. Langer has pioneered the field of biomaterials and tissue engineering. He has contributed to the development of biocompatible polymers for drug delivery and synthetic polymers to form specific tissue structures creating the field of tissue engineering. His work has allowed the controlled release of macromolecules using biocompatible polymers.

Prof. Langer is also responsible for the creation of numerous novel biomaterials, such as shape memory polymers and materials with switchable surfaces, aerosols and microchips. His work has led to the development of synthetic polymers to deliver cells to form specific tissue structures.

He has been a prolific contributor to this new field of materials science. He has mentored numerous students and post docs who have themselves become leaders in the field.

In 2002 he was awarded the Charles Stark Draper Prize of the NAE. He has won numerous other awards and is one of the few people who have been elected to all three US National Academies (Science, Engineering and Medicine).




Laureates 2002 – 2002 Future – Life Sciences


Prof. Robert H. Waterston (born 1943 in Michigan, USA) obtained a bachelor’s degree in engineering from Princeton University in 1965 and received both a medical degree and a doctorate in pathology from the University of Chicago in 1972. After a postdoctoral fellowship at the Medical Research Council Laboratory of Molecular Biology in Cambridge, England, Prof. Waterston joined the Washington University faculty in 1976. He is James S. McDonnell Professor of Genetics, head of the Department of Genetics, and director of the School of Medicine’s Genome Sequencing Center, which he founded in 1993. The center was a principal member of the International Human Genome Sequencing Consortium, the public effort to complete the working draft.

He was a recipient of an American Heart Association Established Investigator Award from 1980 to 1985, and held a John Simon Guggenheim Fellowship from 1985 to 1986. He has served as a member of several NIH study sections and as chairman of the NIH’s Molecular Cytology Study Section. He currently serves on the NIH Advisory Council.

Prof. Waterston is a member of Sigma Xi, Alpha Omega Alpha, the Genetics Society and the American Society of Cell Biology. He has published more than 70 peer-reviewed scientific articles.

“It’s powerful information, and the potential benefits are enormous,” Prof. Waterston says. “We all have a responsibility to educate ourselves about the issues. To realize its great promise, scientific information of this sort must be available in an unrestricted form to citizens and scientists everywhere.” 

“For the next hundred years, scientists will use these foundations to make increasingly detailed discoveries about how human beings and other organisms work,” says geneticist Robert H. Waterston of the advances in genetics research. “As a result, more and more will be understood about all aspects of human health, behavior, and disease – and ultimately about therapy and prevention.”



Laureates 2002 – 2002 Future – Life Sciences


Sir John Sulston subsequently received the Nobel Prize for Medicine in 2002.

Sir John Sulston graduated from Cambridge University in 1963. After completing his Ph.D. on the chemical synthesis of DNA, he moved to the USA to study prebiotic chemistry (the origins of life on Earth). In 1969, Sir John joined Sydney Brenner’s group at the Medical Research Council Laboratory of Molecular Biology in Cambridge, where he studied the biology and genetics of the nematode worm Caenorhabditis elegans. He and his team collaborated with Bob Waterston at Washington University in the USA to sequence the genome of this model organism. In 1992, Sir John was appointed the first Director of the Sanger Centre in Cambridgeshire, which carried out the UK’s contribution to the international Human Genome Project. He stepped down as Director in September 2000.

Sir John Sulston is co-author with Georgina Ferry of The Common Thread: A Story of Science, Politics, Ethics and the Human Genome, to be published by Bantam Press in February 2002. The book tells the story of the sequencing of the human genome from the point of view of one of its leading figures, and discusses what the achievement means for future medical treatments and our understanding of ourselves. In light of the recent ‘gene rush’ by companies to stake claims to parts of the genome, the authors argue that the information it contains should be freely available for the benefit of all, and not carved up for private profit. “The human genome will be the foundation of biology for decades, centuries or millennia to come”.



Laureates 2002 – 2002 Future – Life Sciences


Prof. Sydney Brenner subsequently received the Nobel Prize for Medicine in 2002.

Prof. Sydney Brenner’s sustained contributions during the course of a scientific career spanning 40 years are exceptional both in their novelty and in their impact on biology.

During 1957 – 1973, he provided fundamental insights into the genetic code. In 1957, he produced a theoretical paper that presented a formal demonstration of the impossibility of all overlapping codes, insisting that further efforts in deciphering the genetic code be restricted to non-overlapping codes. In 1961 he, together with Francis Crick and others, published evidence for the triplet nature of the genetic code deduced from the frame-shift mutagenesis experiments, which remain a tour de force. He published, together with François Jacob and Matthew Meselson, their discovery of messenger RNA, a finding that provided fundamental insights into translation of the genetic code. In 1964 and succeeding years, Prof. Brenner and others published a demonstration of the colinearity of a gene and deciphered nonsense codons by genetics. During the mid-1960s Prof. Brenner, together with François Jacob and François Cuzin, established the fundamental principles underlying the regulation of DNA replication in E. coli. From 1974 to 1990, Prof. Brenner and his colleagues introduced the eukaryotic model C. elegans and demonstrated its utility for studying development. He developed the genetic methodology for dissecting the organism’s developmental program, especially of the nervous system. His students have proved the wisdom of his choice by extending the model to aging and apoptosis. Now that the genome sequence of C. elegans is complete, the usefulness of this system is greatly enhanced. During the 1980s and 1990s, Prof. Brenner made great political and scientific contributions to the establishment of recombinant DNA technology in general and to the human genome project in particular. Among other things, he introduced the study of the puffer fish, one of the very few vertebrate organisms to have very little “junk” DNA.

Prof. Sydney Brenner was born in South Africa on 13 January 1927 and studied medicine and science at the University of Witwatersrand, Johannesburg. He went on to Oxford, working in the Physical Chemistry Laboratory, and received his D.Phil. in 1952. After a brief return to South Africa, he joined the MRC Unit in the Cavendish Laboratory at Cambridge in 1956. He worked there and in its successor, the MRC Laboratory of Molecular Biology at Cambridge, where he was Director from 1979 to 1987. In 1987 he became Director of the MRC Unit of Molecular Genetics, retiring from the MRC in 1992. He is now Director of the Molecular Sciences Institute, a private research institute in Berkeley, California.

Last year, aged 74, Prof. Brenner accepted an offer to become a research professor at the Salk Institute for Biological Studies. He said: “I don’t want to retire to play golf. Science is one’s hobby and one’s work and one’s pleasure.”


Read Full Post »

Reporter:  Aviva Lev-Ari, PhD, RN

Call for Open-Access Publishing in Genomics

January 14, 2013

SAN DIEGO (GenomeWeb News) – Open-access datasets, software, and bioinformatics strategies have become more or less de rigueur in genomics research.

But the field may also be poised to change the way other sorts of information from scientific studies are conveyed to other researchers and to the broader public, according to open-access proponent Michael Eisen, a computational and evolutionary biology researcher at the University of California, Berkeley.

Eisen, a Public Library of Science co-founder, spoke during the morning plenary session here at the International Plant and Animal Genome Conference.

In his presentation, he argued that the inability to freely and unreservedly access the full text of all genome studies performed to date may have led to missed opportunities for the field.

Using the bacteriophage phiX174 genome sequence as an example, he proposed that the general thinking in the genomics field has developed in ways that promote open access to sequence data and related software. But, he said, the same type of access is not necessarily available to those interested in delving into the details and rationale behind genomics studies, since the corresponding papers may not be accessible in an open-access format.

The UK Medical Research Council Laboratory of Molecular Biology‘s Frederick Sanger and colleagues described the phiX174 sequence in 1977, in a publication that’s generally considered to be the first genome paper. The sequence data presented in that study is now freely available, Eisen explained, in part owing to the advent of sequence databases such as the European Molecular Biology Laboratory Nucleotide Sequence database or the National Center for Biotechnology Information’s sequence database, GenBank.

During the past decade or more, funding agency requirements and pressure from within the genomics community have contributed to the widespread adoption of these and other public genomics resources and repositories.

As these databases have grown and become accepted within the genomics community, Eisen argued that they have spurred the development of computational methods for analyzing genome sequences and datasets that may not have existed otherwise. “Imagine where we would be had we not made the fortunate decision to liberate genome sequences,” he said.

But analogous strategies for combing through text from genomics studies in their entirety have not developed in the same manner, according to Eisen, who noted that the text of the phiX174 genome paper remains behind a pay wall.

“We’ve allowed [journal access] and [data access] to follow very different fates,” said Eisen, who says there are ways to use the information housed within the scientific literature more easily and productively.

He urged attendees to consider publishing their own work in open-access publications. Beyond that, though, Eisen also noted that the community is well positioned to influence the ways in which research information is disseminated, since genomics data increasingly serves as a resource for other spheres of research.



Read Full Post »

Consumer Market for Personal DNA Sequencing: Part 4

Reporter: Aviva Lev-Ari, PhD RN

FDA Warning for the Leader of Consumer Market for Personal DNA Sequencing Part 4

Word Cloud by Daniel Menzin

This is Part 4 of the series on the Present and Future Frontier of Research in Genomics.

UPDATED on 12/6/2013

23andMe Suspends Health Interpretations

December 06, 2013

Direct-to-consumer genetic testing company 23andMe has stopped offering its health-related test to new customers, bringing it into line with a request from the US Food and Drug Administration.

In a letter sent on Nov. 22, FDA said that 23andMe had not adequately responded to its concerns regarding the validity of its Personal Genome Service. The letter instructed 23andMe to “immediately discontinue marketing” the service until it receives authorization from the agency.

According to a post at the company’s blog from CEO Anne Wojcicki, 23andMe customers who purchased their kits on or after Nov. 22 “will not have access to health-related results.” They will, though, have access to ancestry information and their raw genetic data. Wojcicki notes that the customers may have access to the health interpretations in the future depending on FDA marketing authorization. Those customers are also being offered a refund.

Customers who purchased their kits before Nov. 22 will have access to all reports.

“We remain firmly committed to fulfilling our long-term mission to help people everywhere have access to their own genetic data and have the ability to use that information to improve their lives,” a notice at the 23andMe site says.

In a letter appearing in the Wall Street Journal earlier this week, FDA Commissioner Margaret Hamburg wrote that the agency “supports the development of innovative tests.” As an example, she pointed to its recent clearance of sequencing-based tests from Illumina.

She added that the agency also understands that some consumers do want to know more about their genomes and their genetic risk of disease, and that a DTC model would let consumers take an active role in their health.

“The agency’s desire to review these particular tests is solely to ensure that they are safe, do what they claim to do and that the results are communicated in a way that a consumer can understand,” Hamburg said.

In a statement, 23andMe’s Wojcicki says that the company remains committed to its ethos of allowing people access to their genetic information. “Our goal is to work cooperatively with the FDA to provide that opportunity in a way that clearly demonstrates the benefit to people and the validity of the science that underlies the test,” Wojcicki adds.


UPDATED on 11/27/2013

FDA Tells Google-Backed 23andMe to Halt DNA Test Service



FDA Letter to 23andME

Department of Health and Human Services

Public Health Service
Food and Drug Administration
10903 New Hampshire Avenue
Silver Spring, MD 20993

Nov 22, 2013

Ann Wojcicki
23andMe, Inc.
1390 Shoreline Way
Mountain View, CA 94043
Document Number: GEN1300666
Re: Personal Genome Service (PGS)
Dear Ms. Wojcicki,
The Food and Drug Administration (FDA) is sending you this letter because you are marketing the 23andMe Saliva Collection Kit and Personal Genome Service (PGS) without marketing clearance or approval in violation of the Federal Food, Drug and Cosmetic Act (the FD&C Act).
This product is a device within the meaning of section 201(h) of the FD&C Act, 21 U.S.C. 321(h), because it is intended for use in the diagnosis of disease or other conditions or in the cure, mitigation, treatment, or prevention of disease, or is intended to affect the structure or function of the body. For example, your company’s website at http://www.23andme.com/health (most recently viewed on November 6, 2013) markets the PGS for providing “health reports on 254 diseases and conditions,” including categories such as “carrier status,” “health risks,” and “drug response,” and specifically as a “first step in prevention” that enables users to “take steps toward mitigating serious diseases” such as diabetes, coronary heart disease, and breast cancer. Most of the intended uses for PGS listed on your website, a list that has grown over time, are medical device uses under section 201(h) of the FD&C Act. Most of these uses have not been classified and thus require premarket approval or de novo classification, as FDA has explained to you on numerous occasions.
Some of the uses for which PGS is intended are particularly concerning, such as assessments for BRCA-related genetic risk and drug responses (e.g., warfarin sensitivity, clopidogrel response, and 5-fluorouracil toxicity) because of the potential health consequences that could result from false positive or false negative assessments for high-risk indications such as these. For instance, if the BRCA-related risk assessment for breast or ovarian cancer reports a false positive, it could lead a patient to undergo prophylactic surgery, chemoprevention, intensive screening, or other morbidity-inducing actions, while a false negative could result in a failure to recognize an actual risk that may exist. Assessments for drug responses carry the risks that patients relying on such tests may begin to self-manage their treatments through dose changes or even abandon certain therapies depending on the outcome of the assessment. For example, false genotype results for your warfarin drug response test could have significant unreasonable risk of illness, injury, or death to the patient due to thrombosis or bleeding events that occur from treatment with a drug at a dose that does not provide the appropriately calibrated anticoagulant effect. These risks are typically mitigated by International Normalized Ratio (INR) management under a physician’s care. The risk of serious injury or death is known to be high when patients are either non-compliant or not properly dosed; combined with the risk that a direct-to-consumer test result may be used by a patient to self-manage, serious concerns are raised if test results are not adequately understood by patients or if incorrect test results are reported.
Your company submitted 510(k)s for PGS on July 2, 2012 and September 4, 2012, for several of these indications for use. However, to date, your company has failed to address the issues described during previous interactions with the Agency or provide the additional information identified in our September 13, 2012 letter for (b)(4) and in our November 20, 2012 letter for (b)(4), as required under 21 CFR 807.87(1). Consequently, the 510(k)s are considered withdrawn, see 21 C.F.R. 807.87(1), as we explained in our letters to you on March 12, 2013 and May 21, 2013. To date, 23andMe has failed to provide adequate information to support a determination that the PGS is substantially equivalent to a legally marketed predicate for any of the uses for which you are marketing it; no other submission for the PGS device that you are marketing has been provided under section 510(k) of the Act, 21 U.S.C. § 360(k).
The Office of In Vitro Diagnostics and Radiological Health (OIR) has a long history of working with companies to help them come into compliance with the FD&C Act. Since July of 2009, we have been diligently working to help you comply with regulatory requirements regarding safety and effectiveness and obtain marketing authorization for your PGS device. FDA has spent significant time evaluating the intended uses of the PGS to determine whether certain uses might be appropriately classified into class II, thus requiring only 510(k) clearance or de novo classification and not PMA approval, and we have proposed modifications to the device’s labeling that could mitigate risks and render certain intended uses appropriate for de novo classification. Further, we provided ample detailed feedback to 23andMe regarding the types of data it needs to submit for the intended uses of the PGS.  As part of our interactions with you, including more than 14 face-to-face and teleconference meetings, hundreds of email exchanges, and dozens of written communications, we provided you with specific feedback on study protocols and clinical and analytical validation requirements, discussed potential classifications and regulatory pathways (including reasonable submission timelines), provided statistical advice, and discussed potential risk mitigation strategies. As discussed above, FDA is concerned about the public health consequences of inaccurate results from the PGS device; the main purpose of compliance with FDA’s regulatory requirements is to ensure that the tests work.
However, even after these many interactions with 23andMe, we still do not have any assurance that the firm has analytically or clinically validated the PGS for its intended uses, which have expanded from the uses that the firm identified in its submissions. In your letter dated January 9, 2013, you stated that the firm is “completing the additional analytical and clinical validations for the tests that have been submitted” and is “planning extensive labeling studies that will take several months to complete.” Thus, months after you submitted your 510(k)s and more than 5 years after you began marketing, you still had not completed some of the studies and had not even started other studies necessary to support a marketing submission for the PGS. It is now eleven months later, and you have yet to provide FDA with any new information about these tests.  You have not worked with us toward de novo classification, did not provide the additional information we requested necessary to complete review of your 510(k)s, and FDA has not received any communication from 23andMe since May. Instead, we have become aware that you have initiated new marketing campaigns, including television commercials that, together with an increasing list of indications, show that you plan to expand the PGS’s uses and consumer base without obtaining marketing authorization from FDA.
Therefore, 23andMe must immediately discontinue marketing the PGS until such time as it receives FDA marketing authorization for the device. The PGS is in class III under section 513(f) of the FD&C Act, 21 U.S.C. 360c(f). Because there is no approved application for premarket approval in effect pursuant to section 515(a) of the FD&C Act, 21 U.S.C. 360e(a), or an approved application for an investigational device exemption (IDE) under section 520(g) of the FD&C Act, 21 U.S.C. 360j(g), the PGS is adulterated under section 501(f)(1)(B) of the FD&C Act, 21 U.S.C. 351(f)(1)(B).  Additionally, the PGS is misbranded under section 502(o) of the Act, 21 U.S.C. § 352(o), because notice or other information respecting the device was not provided to FDA as required by section 510(k) of the Act, 21 U.S.C. § 360(k).
Please notify this office in writing within fifteen (15) working days from the date you receive this letter of the specific actions you have taken to address all issues noted above. Include documentation of the corrective actions you have taken. If your actions will occur over time, please include a timetable for implementation of those actions. If corrective actions cannot be completed within 15 working days, state the reason for the delay and the time within which the actions will be completed. Failure to take adequate corrective action may result in regulatory action being initiated by the Food and Drug Administration without further notice. These actions include, but are not limited to, seizure, injunction, and civil money penalties.
We have assigned a unique document number that is cited above. The requested information should reference this document number and should be submitted to:
James L. Woods, WO66-5688
Deputy Director
Patient Safety and Product Quality
Office of In vitro Diagnostics and Radiological Health
10903 New Hampshire Avenue
Silver Spring, MD 20993
If you have questions relating to this matter, please feel free to call Courtney Lias, Ph.D. at 301-796-5458, or log onto our web site at www.fda.gov for general information relating to FDA device requirements.
Sincerely yours,
Alberto Gutierrez
Office of In vitro Diagnostics
and Radiological Health
 Center for Devices and Radiological Health



Cancer Diagnostics by Genomic Sequencing: ‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities



Personal Genetics: An Intersection Between Science, Society, and Policy

Saturday, February 16, 2013: 8:30 AM-11:30 AM

Room 203 (Hynes Convention Center)

On 26 June 2000, scientists announced the completion of a rough draft of the human genome, the result of the $3 billion publicly funded Human Genome Project. In the decade since, the cost of genome sequencing has plummeted, coinciding with the development of deep sequencing technologies and allowing, for the first time, personalized genetic medicine. The advent of personal genetics has profound implications for society that are only beginning to be discussed, even as the technologies are rapidly maturing and entering the market. This symposium will focus on how the genomic revolution may affect our society in coming years and how best to reach out to the general public on these important issues. How has the promise of genomics, as stated early in the last decade, matched the reality we observe today? What are the new promises — and pitfalls — of genomics and personal genetics as of 2013? What are the ethical implications of easy and inexpensive human genome sequencing, particularly with regard to ownership and control of genomic datasets, and what stakeholder interests must be addressed? How can the scientific community engage with the public at large to improve understanding of the science behind these powerful new technologies? The symposium will comprise three 15-minute talks from representatives of relevant sectors (academia/education, journalism, and industry), followed by a 45-minute panel discussion with the speakers.


Peter Yang, Harvard University


Brenna Krieger, Harvard University

and Kevin Bonham, Harvard University


James Thornton, Harvard University



Ting Wu, Harvard University

Personal Genetics and Education

Mary Carmichael, Boston Globe

The Media and the Personal Genetics Revolution

Brian Naughton, 23andMe Inc.

Commercialization of Personal Genomics: Promise and Potential Pitfalls

Mira Irons, Children’s Hospital Boston

Personal Genomic Medicine: How Physicians Can Adapt to a Genomic World

Sheila Jasanoff, Harvard University

Citizenship and Personal Genomics

Jonathan Gitlin, National Human Genome Research Institute

Personal Genomics and Science Policy


How to Tailor Cancer Therapy to the particular Genetics of a patient’s Cancer

‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities PRESENTED in the following FOUR PARTS. Recommended to be read in its entirety for completeness and arrival to the End Point of Present and Future Frontier of Research in Genomics

Part 1:

Research Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine


Part 2:

LEADERS in the Competitive Space of Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment


Part 3:

Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research


Part 4:

The Consumer Market for Personal DNA Sequencing


Part 4:

The Consumer Market for Personal DNA Sequencing

How does 23andMe genotype my DNA?

Technology and Standards

23andMe is a DNA analysis service providing information and tools for individuals to learn about and explore their DNA. We use the Illumina OmniExpress Plus Genotyping BeadChip (shown here). In addition to the variants already included on the chip by Illumina, we’ve included our own, customized set of variants relating to conditions and traits of interest. Technical information on the performance of the chip can be found on Illumina’s website.

All of the laboratory testing for 23andMe is done in a CLIA-certified laboratory.

Once our lab receives your sample, DNA is extracted from cheek cells preserved in your saliva. The lab then copies the DNA many times — a process called “amplification” — growing the tiny amount extracted from your saliva until there is enough to be genotyped.

In order to be genotyped, the amplified DNA is “cut” into smaller pieces, which are then applied to our DNA chip, a small glass slide with millions of microscopic “beads” on its surface (read more about this technology). Each bead is attached to a “probe”, a bit of DNA that matches one of the approximately one million genetic variants that we test. The cut pieces of your DNA stick to the matching DNA probes. A fluorescent signal on each probe provides information that can tell us which version of that genetic variant your DNA corresponds to.

Although the human genome is estimated to contain about 10-30 million genetic variants, many of them are correlated due to their proximity to each other. Thus, one genetic variant is often representative of many nearby variants, and the approximately one million variants on our genotyping chip provide very good coverage of common variation across the entire genome.
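The idea that one genotyped variant can stand in for its correlated neighbors can be sketched with a toy simulation. This illustrates linkage disequilibrium in general, not 23andMe’s actual pipeline; the allele frequencies and the 95% concordance figure below are invented for the example:

```python
# Toy illustration of why one "tag" SNP can stand in for nearby correlated
# variants: when two SNPs are in strong linkage disequilibrium, genotypes
# at one largely predict genotypes at the other.
import numpy as np

rng = np.random.default_rng(0)

# Simulate 1,000 individuals at a tag SNP (genotype coded as 0/1/2 copies
# of the minor allele), then a neighboring SNP that matches the tag SNP
# 95% of the time and is re-drawn at random otherwise.
tag = rng.choice([0, 1, 2], size=1000, p=[0.49, 0.42, 0.09])
flip = rng.random(1000) < 0.05
neighbor = np.where(flip, rng.choice([0, 1, 2], size=1000), tag)

# Squared Pearson correlation (r^2) is the standard LD summary;
# values near 1 mean the tag SNP is a good proxy for the neighbor.
r2 = np.corrcoef(tag, neighbor)[0, 1] ** 2
print(f"r^2 between tag SNP and neighbor: {r2:.2f}")
```

With strongly correlated neighbors, r² approaches 1, which is why roughly one million well-chosen sites can provide good coverage of common variation across the genome.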

Our research team has also hand-picked tens of thousands of additional genetic variants linked to various conditions and traits in the scientific literature to analyze on our genotyping chip. As a result we can provide you with personal genetic information available only through 23andMe.

Genetics service 23andMe announced some new cash in the bank today with a $50 million raise from Yuri Milner, 23andMe CEO Anne Wojcicki, Google’s Sergey Brin (who also happens to be Wojcicki’s husband), New Enterprise Associates, MPM Capital, and Google Ventures.

With today’s new funding also comes the reduction of the price of its genome analysis service to $99. This isn’t special holiday pricing (as 23andMe has run repeatedly in the past), the company tells me, but rather what its normal pricing will be from now on.

This move is overdue, at least as far as 23andMe’s business model is concerned. Just yesterday TechCrunch Conference Chair Susan Hobbs told me she was waiting for another $99 pricing deal to buy the Personal Genome Analysis product. Sure, 23andMe has experimented with various pricing models, including subscription, since its founding in 2007, but it had been at an official and prohibitive $299 price point until today. It has also apparently been rigorously beta-testing various price points over the past couple of weeks, at some points experimenting with prices lower than $99.

For comparison, the company’s original pricing began at $999 and offered subscribers just 14 health and trait reports versus today’s 244 reports, as well as genetic ancestry information. Natera, Counsyl and Pathway Genomics are also in the genomics space, but they work by offering their services through doctors rather than direct to consumer.

Since the company’s launch five years ago, it’s had 180K civilians profile their DNA, and representative Catherine Afarian tells us that, post-price drop and funding, its goal is to reach a million customers in 2013. This is a supremely ambitious goal considering it wants to turn an average user acquisition rate of 36K per year into one of 820K in one year alone.

But Afarian isn’t fazed and brings up how the company once sold out 20k in $99 account inventory on something called “DNA Day.” “Once we can offer the service at $99 it means the average American will buy in,” she said.

That $299 was too pricey, according to Hobbs, but $99 might be just right. She said the $99 price point, which yes, is less than an iPhone, was the main factor in her decision to buy in. “23andMe is more ‘nice-to-know’ information rather than ‘need-to-know’ information. It’s nice to know your ancestry. It’s more of a need to know that you are predisposed genetically for a type of cancer, so that you may take precautionary measures,” she said, implying that the data given by 23andMe isn’t necessarily vital medical information, or actionable when it is. While 23andMe can give you indicators about certain disease risks, it doesn’t close the loop, as in tell you what to do to prevent these diseases.

“Its [utility] depends on your genetic data,” said Afarian when I asked her about the usefulness of the product. “If you’ve got a Factor 5 that puts you at risk for clotting, you might want to invest in anti-clotting socks. [And] there’s always something about themselves that people didn’t know.”

Hobbs eventually said that she wouldn’t buy it, but only because she was looking for more exact lineage information for her little girl, and DNA tests that trace paternal lineage require a Y chromosome. Afarian countered this hesitation as well, saying that what makes 23andMe unique is that it looks not only at your Y or your mitochondrial DNA, but also at your autosomal DNA, which does show some patrilineal information for females who lack that precious Y.

While still sort of a novelty, the potential for 23andMe goes beyond lineage and hopefully that extra $50 million will go further than keeping the price low and into research. The company hopes that a million users will result in a giant database of 23andWe genetic info that can be used to spot trends, like which genes mean a higher risk of diabetes/cancer, etc. Which is great if it happens but for now remains a pipe dream for 23andMe/We.


12/13/2012 @ 5:23PM

What Is 23andMe Really Selling: The Moral Quandary At The Center Of The Personalized Genomics Revolution

This week, 23andme, the personalized genomics company founded by Anne Wojcicki, wife of Google co-founder Sergey Brin, got an influx of investment cash ($50 million). According to their press release, they are using the money to bring the cost of their genetic test down to $99 (it was previously $299), which, they hope, will inspire the masses to get tested.

So should the masses indulge?

I prefer a quantified self approach to this question. At the heart of the quantified self movement lies a very simple idea: metrics make us better. For devotees, this means “self-tracking,” using everything from the Nike FuelBand to the Narcissistic Personality Inventory to gather large quantities of personal data and—the bigger idea—using that data to improve performance.

If you consider that performance suffers when health suffers, then a genetic test can be seen as a kind of metric used to improve performance. This strikes me as the best way to evaluate the idea, and it leads us to ask the same question about personalized genomics that the quantified self movement asks about every other metric: will it improve performance?

Arguments rage all over the place on this one, but the short answer is that SNP tests—which is the kind of DNA scan 23andme relies upon—don’t tell us all that much (yet). They read about a million variant sites out of the roughly three billion bases in the genome, and the impact those variants have on long-term health outcomes is still in dispute. For example, the nature/nurture split is normally viewed at 30/70—meaning environmental factors play a far more significant role in long-term health outcomes than genetics.
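For a sense of scale, here is the arithmetic behind those numbers (using the approximate figures quoted in this article: about one million assayed sites against roughly three billion bases in the genome):

```python
# The claim above in numbers: a genotyping chip reading ~1 million sites
# directly assays only a tiny fraction of the ~3 billion bases in a genome.
sites_on_chip = 1e6
genome_bases = 3e9
fraction = sites_on_chip / genome_bases
print(f"Directly assayed: {fraction:.2%} of the genome")  # 0.03%
```

The rest of the genome is inferred, if at all, only through the correlation structure among nearby variants.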

Moreover, all of the performance metrics used by the quantified self movement are used for behavior modification—to drive self-improvement. Personalized genomics isn’t there yet. As Stanford University’s Nobel Prize-winning RNA researcher Andy Fire once told me, “if someone off the street is looking for pointers on how to live a healthier life, there’s nothing these tests will tell you besides basic physician advice like ‘eat right, don’t smoke and get plenty of exercise.’”

And even with more well-regarded SNP tests, like the ones that examine the BRCA1 and BRCA2 markers for breast cancer, the question of what to do with a result remains. NYU Langone Medical Center bioethicist Arthur Caplan explains it like this: “Say you test positive for a breast cancer disposition—then what are you going to do? The only preventative step you can take is to chop off your breasts.”

So if prevention is not available, the only thing left is fear and anxiety. Unfortunately, in the past few decades, there have been hundreds of studies linking stress to everything from immunological disorders to heart disease to periodontal troubles. So while finding out you may be at risk for Parkinson’s may make you feel informed, that knowledge isn’t going to stop you from developing the disease—but the resulting stress may contribute to a host of other complications.

This brings up a different question: if personalized genomics can’t yet help us much and could possibly hurt us—where’s the upside?

Turns out there’s a big upside: citizen science. SNP tests are not yet all that informative because we need more data. 23andme talks about the “power of one million people,” meaning that if one million people take these tests, the resulting genetic database could lead to big research breakthroughs, and these could lead to all sorts of health/performance improvements.

This is what 23andme is really selling for $99 a pop—a crowdsourced shot at unraveling a few more DNA mysteries.

And this also means that the question at the heart of the personalized genomics industry is not about metrics at all—it’s about morals: Should I risk my health for the greater good?


You can browse your data for all of the variants we test using the Browse Raw Data feature, or download your data here.

before you buy (59) »

What unexpected things might I learn?

How does 23andMe genotype my DNA?

Can I use the saliva collection kit for infants and toddlers?

getting started (20) »

When and how do I get my data?

How do I collect saliva samples?

How long will it take for my sample to reach the lab?

account/profile settings (20) »

Which Ancestry setting in My Profile should I choose?

How do I use Browse Raw Data?

What do the options under the “Account” link in the upper right-hand corner control?

product features (145) »

I know that a particular person is my relative. What’s the probability that we share a sufficient amount of DNA to be detected by Relative Finder?

What is the average percent DNA shared for different types of cousins?

How does Relative Finder estimate the Predicted Relationship?

research initiatives (8) »

What do I get in return for taking surveys?

What is your research goal?

What is 23andMe Research?








Read Full Post »

Author & Curator: Aviva Lev-Ari, PhD, RN

Cancer Diagnostics by Genomic Sequencing: ‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities

How to Tailor Cancer Therapy to the particular Genetics of a patient’s Cancer


‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities

PRESENTED in the following FOUR PARTS. Recommended to be read in its entirety for completeness and arrival to the End Point of Present and Future Frontier of Research in Genomics

Part 1:

Research Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine

Part 2:

LEADERS in the Competitive Space of Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment


Part 3:

Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research


Part 4:

The Consumer Market for Personal DNA Sequencing



Part 1:

Research Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine


In Part 1, we will address the following FIVE DIRECTIONS in Genomics Research

  • ‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities
  • Sequencing DNA from individual cells vs. “humans as a whole”: single-cell sequencing is changing the way that researchers think of humans as a whole.
  • Promising Research Directions By Watson, 1/10/2013
  • Disruption of Cancer Metabolism targeted by Metabolic Gatekeeper
  • Molecular Analysis of the different Stages of  Cancer Progression for Targeting Therapy


Predictive Biomarkers and Personalized Medicine

No to Sequencing Patient’s DNA, No to Sequencing Patient’s Tumor, Yes to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities


MD Anderson Research

Targeted agents matched with tumor molecular aberrations.

Molecular analysis: patients whose tumors had an aberration were treated with matched targeted therapy and compared with consecutive patients who were not treated with matched targeted therapy.

  • 40.2% of patients had one or more molecular aberration.
  • Among patients with one aberration, matched therapy produced a higher response rate: 27% vs. 5%.
  • Longer time to treatment failure (TTF): 5.2 vs. 2.2 months.
  • Longer survival: 13.4 vs. 9 months.
  • In patients with one molecular aberration, matched targeted therapy was associated with longer TTF than their prior systemic therapy: 5.2 vs. 3.1 months.
  • Matched therapy was an independent factor predicting response and TTF.

Although this was not a randomized study, and patients had diverse tumor types and a median of five prior therapies, the results suggest that identifying specific molecular abnormalities and choosing therapy based on those abnormalities is relevant in phase I clinical trials.

Clin Cancer Res. 2012 Nov 15;18(22):6373-83. doi: 10.1158/1078-0432.CCR-12-1627. Epub 2012 Sep 10.

Personalized medicine in a phase I clinical trials program: the MD Anderson Cancer Center initiative.

Tsimberidou AM, Iskander NG, Hong DS, Wheler JJ, Falchook GS, Fu S, Piha-Paul S, Naing A, Janku F, Luthra R, Ye Y, Wen S, Berry D, Kurzrock R.


Department of Investigational Cancer Therapeutics, Phase I Clinical Trials Program, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA. atsimber@mdanderson.org



Opinion by Dr. Pierluigi Scalia, 1/11/2013.

Using nanotechnology to target and treat abnormal cancer cells and tissues adds a powerful weapon toward eradicating the disease in the foreseeable future. However, focusing on weapons when we still have not found a reliable way to build the personalized “shooting target” (cancer fingerprinting) constitutes, in my opinion, the single most relevant barrier to the adoption of personalized treatments.


Ritu Saxena’s interview


Other studies supporting this perspective


p53 gene deletion predicts for poor survival and non-response to therapy with purine analogs in chronic B-cell leukemias


Chromosome aberrations in solid tumors


Chromosome aberrations in B-cell chronic lymphocytic leukemia: reassessment based on molecular cytogenetic analysis


Multivariate analysis of prognostic factors in CLL: clinical stage, IGVH gene mutational status, and loss or mutation of the p53 gene are independent prognostic factors


Clonal analysis of delayed karyotypic abnormalities and gene mutations in radiation-induced genetic instability.


Comprehensive genetic characterization of CLL: a study on 506 cases analysed with chromosome banding analysis, interphase FISH, IgVH status and …


Detection of aberrations of the p53 alleles and the gene transcript in human tumor cell lines by single-strand conformation polymorphism analysis


Genetic aberrations detected by comparative genomic hybridization are associated with clinical outcome in renal cell carcinoma


VH mutation status, CD38 expression level, genomic aberrations, and survival in chronic lymphocytic leukemia


Microarray gene expression profiling of B-cell chronic lymphocytic leukemia subgroups defined by genomic aberrations and VH mutation status


… nucleophosmin (NPM1) predicts favorable prognosis in younger adults with acute myeloid leukemia and normal cytogenetics: interaction with other gene mutations


Transformation of follicular lymphoma to diffuse large cell lymphoma is associated with a heterogeneous set of DNA copy number and gene expression alterations

Pax 6 Gene Research and the Pancreas


Molecular analysis of the cyclin-dependent kinase inhibitor gene p27/Kip1 in human malignancies

Molecular genetic analysis of oligodendroglial tumors shows preferential allelic deletions on 19q and 1p.

Cytogenetic analysis of soft tissue sarcomas: recurrent chromosome abnormalities in malignant peripheral nerve sheath tumors (MPNST)

Radiation-induced genomic instability: delayed cytogenetic aberrations and apoptosis in primary human bone marrow cells



Gene Mutation Aberration & Analysis of Gene Abnormalities



Sequencing DNA from individual cells vs “humans as a whole.”

Sequencing DNA from individual cells is changing the way that researchers think of humans as a whole.

The ability to sequence single cells meant that researchers could take another approach. Working with a team at the Chinese sequencing powerhouse BGI, Auton sequenced nearly 200 sperm cells and was able to estimate the recombination rate for the man who had donated them. The work is not yet published, but Auton says that the group found an average of 24.5 recombination events per sperm cell, which is in line with estimates from indirect experiments [2]. Stephen Quake, a bioengineer at Stanford University in California, has performed similar experiments in 100 sperm cells and identified several places in the genome in which recombination is more likely to occur. The location of these recombination ‘hotspots’ could help population biologists to map the position of genetic variants associated with disease.

Quake also sequenced half a dozen of those 100 sperm in greater depth, and was able to determine the rate at which new mutations arise: about 30 mutations per billion bases per generation [3], which is slightly higher than what others have found. “It’s basically the population biology of a sperm sample,” Quake says, and it will allow researchers to study meiosis and recombination in greater detail.
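As a back-of-the-envelope check on the figures above (roughly 30 new mutations per billion bases per generation, across a genome of roughly three billion bases):

```python
# Rough check of the quoted rate: ~30 new mutations per billion bases per
# generation, applied across a ~3-billion-base genome, implies on the order
# of 90 de novo mutations per genome per generation.
rate_per_base = 30 / 1e9   # mutations per base per generation
genome_size = 3e9          # approximate bases in a human genome
expected_mutations = rate_per_base * genome_size
print(f"Expected new mutations per genome per generation: {expected_mutations:.0f}")
```

The genome size here is an approximation for illustration; the exact figure depends on which assembly and which portions of the genome are counted.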




Nature 491, 27–29 (01 November 2012) doi:10.1038/491027a




Promising Research Directions By Watson, 1/10/2013

The main reason drugs that target genetic glitches are not cures is that cancer cells have a work-around. If one biochemical pathway to growth and proliferation is blocked by a drug, the cancer cells activate a different, equally effective pathway.

Watson advocates a different approach: targeting features that all cancer cells, especially those in metastatic cancers, have in common.

One such feature is a protein in cells called Myc, which controls more than 1,000 other molecules inside cells, including many involved in cancer. Studies suggest that turning off Myc causes cancer cells to self-destruct in a process called apoptosis.

“Blocking production of Myc is an interesting line of investigation. I think there’s promise in that,” says cancer biologist Hans-Guido Wendel of Sloan-Kettering.

The main obstacle to moving beyond “personalized medicine” that targets a patient’s specific cancer-causing mutation, Watson wrote, may be “the inherently conservative nature of today’s cancer research establishments.”



Opinion by Dr. Stephen Williams, 1/11/2013

Kudos to both Watson and Weinstein for stating we really need to delve into tumor biology to determine functional pathways (like metabolism) which are a common feature of the malignant state ( also see my posting on differentiation therapy).




Disruption of Cancer Metabolism targeted by Metabolic Gatekeeper


Figure’s SOURCE:

Figure brought to my attention by Dr. Tilda Barlyia, 1/10/2013


Author: Yevgeniy Grigoryev

In the 1920s, the German physiologist Otto Warburg proposed that cancer cells generate energy in ways that are distinct from normal cells. Healthy cells mainly metabolize sugar via respiration in the mitochondria, switching only to glycolysis in the cytoplasm when oxygen levels are low. In contrast, cancer cells rely on glycolysis all the time, even under oxygen-rich scenarios. This shift in how energy is produced—the ‘Warburg effect’, as the observation came to be known—is now recognized as a primary driver of tumor formation, but a mechanistic explanation for the phenomenon has remained elusive.

Now, researchers have implicated a chromatin regulator known as SIRT6 as a key mediator of the switch to glycolysis in cancer cells, a finding that could lead to new therapeutic modalities. “This work is very significant for the cancer field,” says Andrei Seluanov, a cancer biologist at the University of Rochester in New York State who studies SIRT6 but was not involved in the latest study. “It establishes the role of SIRT6 as a tumor suppressor and shows that SIRT6 loss leads to tumor formation in mice and humans.”

SIRT6 encodes one of seven mammalian proteins called sirtuins, a group of histone deacetylases that play a role in regulating metabolism, lifespan and aging. SIRT1—which is activated by resveratrol, a molecule found in the skin of red grapes—is perhaps the best known sirtuin, but several of the others are now the focus of active investigation as therapeutic targets for a range of conditions, from metabolic syndrome to cancer. Just last month, for example, a paper in Nature Medicine demonstrated that SIRT6 plays an important role in heart disease.

Six years ago, a team led by Raul Mostoslavsky, a molecular biologist at the Massachusetts General Hospital Cancer Center in Boston, first showed that SIRT6 protects mice from DNA damage and has anti-aging properties. In 2010, the same team established SIRT6 as a critical regulator of glycolysis. Now, reporting today in Cell, Mostoslavsky and his colleagues have shown that SIRT6 function is lost in cancer cells—thus definitively establishing SIRT6 as a potent tumor suppressor.

In the latest study, the researchers showed that mouse embryonic cells genetically engineered to lack SIRT6 proliferated much faster than normal cells, growing from 5,000 cells to 200,000 cells in three days. In contrast, SIRT6-expressing cells grew at less than half that rate over the same time period. When injected into adult mice, these SIRT6-deficient cells also rapidly formed tumors, but this tumor growth was reversed when the scientists put SIRT6 back into the cells.
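Assuming simple exponential growth, the proliferation figures reported above imply a strikingly short doubling time for the SIRT6-deficient cells. This is a rough calculation from the quoted numbers, not a figure from the paper itself:

```python
# Growth-rate check on the numbers reported above: SIRT6-null cells going
# from 5,000 to 200,000 cells in 3 days is a 40-fold expansion, i.e. about
# 5.3 population doublings, or one doubling roughly every 13.5 hours.
import math

start, end, hours = 5_000, 200_000, 3 * 24
fold_change = end / start                # 40x
doublings = math.log2(fold_change)       # ~5.32 doublings
doubling_time_h = hours / doublings      # ~13.5 h per doubling
print(f"{fold_change:.0f}-fold growth, {doublings:.2f} doublings, "
      f"doubling time of about {doubling_time_h:.1f} h")
```

Real cell cultures rarely grow perfectly exponentially, so this is only an order-of-magnitude comparison against the slower SIRT6-expressing cells.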

“Our study provides a proof-of-concept that inhibiting glycolysis in SIRT6-deficient cells and tumors could provide a potential therapeutic approach to combat cancer,” says Mostoslavsky. “Additionally, SIRT6 may be a valuable prognostic biomarker for cancer detection.”

Currently, there are no approved anti-glycolytic drugs against cancer. However, the latest findings indicate that pharmacologically elevating SIRT6 levels might help keep tumor growth at bay. And there’s preliminary data to suggest that the work will translate from the bench to the clinic: looking at a range of cancers from human patients, Mostoslavsky’s team showed that the higher the level of SIRT6, the better the prognosis and the longer the survival times.



Molecular Analysis of the different Stages of Cancer Progression: The Example of Breast Cancer


Figure’s SOURCE:

The molecular pathology of breast cancer progression

Alessandro Bombonati and Dennis C Sgroi, Journal of Pathology, J Pathol 2011; 223: 307–317

(wileyonlinelibrary.com) DOI: 10.1002/path.2808


Post by Dr. Tilda Barlyia and Comments on   “The Molecular Pathology of Breast Cancer Progression”



The Paradigm Shift in Human Genomics will follow the following FIVE DIRECTIONS:

  • No to Sequencing Patient’s DNA, No to Sequencing Patient’s Tumor, Yes to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities
  • Sequencing DNA from individual cells vs. “humans as a whole”: single-cell sequencing is changing the way that researchers think of humans as a whole.
  • Promising Research Directions By Watson, 1/10/2013
  • Disruption of Cancer Metabolism targeted by Metabolic Gatekeeper
  • Molecular Analysis of the different Stages of  Cancer Progression for Targeting Therapy

Read Full Post »
