Posts Tagged ‘Nucleic acid sequence’

Genomics and epigenetics link to DNA structure

Larry H. Bernstein, MD, FCAP, Curator



Sequence and Epigenetic Factors Determine Overall DNA Structure



Atomic-level simulations show electrostatic forces between each atom. [Alek Aksimentiev, University of Illinois at Urbana-Champaign]


The traditional view of DNA's highly ordered organization holds that various proteins interact with specific DNA sequences to mediate the molecule's dynamic structure. However, recent evidence has emerged that stretches of homologous DNA sequences can associate preferentially with one another, even in the absence of proteins.

Researchers at the University of Illinois Center for the Physics of Living Cells, Johns Hopkins University, and Ulsan National Institute of Science and Technology (UNIST) in South Korea found that DNA molecules interact directly with one another in ways that are dependent on the sequence of the DNA and epigenetic factors, such as methylation.

The researchers described evidence they found for sequence-dependent attractive interactions between double-stranded DNA molecules that neither involve intermolecular strand exchange nor are mediated by DNA-binding proteins.

“DNA molecules tend to repel each other in water, but in the presence of special types of cations, they can attract each other just like nuclei pulling each other by sharing electrons in between,” explained lead study author Hajin Kim, Ph.D., assistant professor of biophysics at UNIST. “Our study suggests that the attractive force strongly depends on the nucleic acid sequence and also the epigenetic modifications.”

The investigators used atomic-level supercomputer simulations to measure the forces between a pair of double-stranded DNA helices and proposed that the distribution of methyl groups on the DNA is the key to regulating this sequence-dependent attraction. To verify these findings experimentally, the scientists observed single pairs of DNA molecules confined within nanoscale lipid vesicles.

“Here we combine molecular dynamics simulations with single-molecule fluorescence resonance energy transfer experiments to examine the interactions between duplex DNA in the presence of spermine, a biological polycation,” the authors wrote. “We find that AT-rich DNA duplexes associate more strongly than GC-rich duplexes, regardless of the sequence homology. Methyl groups of thymine act as a steric block, relocating spermine from major grooves to interhelical regions, thereby increasing DNA–DNA attraction.”

The findings from this study were published recently in Nature Communications in an article entitled “Direct Evidence for Sequence-Dependent Attraction Between Double-Stranded DNA Controlled by Methylation.”

After conducting numerous further simulations, the research team concluded that direct DNA–DNA interactions could play a central role in how chromosomes are organized in the cell and which ones are expanded or folded up compactly, ultimately determining functions of different cell types or regulating the cell cycle.

“Biophysics is a fascinating subject that explores the fundamental principles behind a variety of biological processes and life phenomena,” Dr. Kim noted. “Our study requires cross-disciplinary efforts from physicists, biologists, chemists, and engineering scientists and we pursue the diversity of scientific disciplines within the group.”

Dr. Kim concluded by stating that “in our lab, we try to unravel the mysteries within human cells based on the principles of physics and the mechanisms of biology. In the long run, we are seeking ways to prevent chronic illnesses and diseases associated with aging.”


Direct evidence for sequence-dependent attraction between double-stranded DNA controlled by methylation

Jejoong Yoo, Hajin Kim, Aleksei Aksimentiev, and Taekjip Ha
Nature Communications 7, 11045 (2016). DOI: 10.1038/ncomms11045


Although proteins mediate highly ordered DNA organization in vivo, theoretical studies suggest that homologous DNA duplexes can preferentially associate with one another even in the absence of proteins. Here we combine molecular dynamics simulations with single-molecule fluorescence resonance energy transfer experiments to examine the interactions between duplex DNA in the presence of spermine, a biological polycation. We find that AT-rich DNA duplexes associate more strongly than GC-rich duplexes, regardless of the sequence homology. Methyl groups of thymine act as a steric block, relocating spermine from major grooves to interhelical regions, thereby increasing DNA–DNA attraction. Indeed, methylation of cytosines makes attraction between GC-rich DNA as strong as that between AT-rich DNA. Recent genome-wide chromosome organization studies showed that remote contact frequencies are higher for AT-rich and methylated DNA, suggesting that the direct DNA–DNA interactions we report here may play a role in chromosome organization and gene regulation.

Formation of a DNA double helix occurs through Watson–Crick pairing mediated by the complementary hydrogen bond patterns of the two DNA strands and base stacking. Interactions between double-stranded (ds)DNA molecules in typical experimental conditions containing mono- and divalent cations are repulsive1, but can turn attractive in the presence of high-valence cations2. Theoretical studies have identified the ion–ion correlation effect as a possible microscopic mechanism of the DNA condensation phenomena3, 4, 5. Theoretical investigations have also suggested that sequence-specific attractive forces might exist between two homologous fragments of dsDNA6, and this ‘homology recognition’ hypothesis was supported by in vitro atomic force microscopy7 and in vivo point mutation assays8. However, the systems used in these measurements were too complex to rule out other possible causes such as Watson–Crick strand exchange between partially melted DNA or protein-mediated association of DNA.

Here we present direct evidence for sequence-dependent attractive interactions between dsDNA molecules that neither involve intermolecular strand exchange nor are mediated by proteins. Further, we find that the sequence-dependent attraction is controlled not by homology—contradictory to the ‘homology recognition’ hypothesis6—but by the methylation pattern. Unlike the previous in vitro study that used monovalent (Na+) or divalent (Mg2+) cations7, we presumed that for the sequence-dependent attractive interactions to operate, polyamines would have to be present. Polyamines are biological polycations present at millimolar concentrations in most eukaryotic cells and are essential for cell growth and proliferation9, 10. Polyamines are also known to condense DNA in a concentration-dependent manner2, 11. In this study, we use spermine (Sm4+), which contains four positively charged amine groups per molecule.

Sequence dependence of DNA–DNA forces

To characterize the molecular mechanisms of DNA–DNA attraction mediated by polyamines, we performed molecular dynamics (MD) simulations in which two effectively infinite parallel dsDNA molecules, 20 base pairs (bp) each in a periodic unit cell, were restrained to maintain a prescribed inter-DNA distance; the DNA molecules were free to rotate about their axes. The two DNA molecules were submerged in a 100 mM aqueous solution of NaCl that also contained 20 Sm4+ molecules; thus, the total charge of Sm4+, 80 e, was equal in magnitude to the total charge of the DNA (2 × 2 × 20 e, two unit charges per base pair; Fig. 1a). Repeating such simulations at various inter-DNA distances and applying weighted histogram analysis12 yielded the change in the interaction free energy (ΔG) as a function of the DNA–DNA distance (Fig. 1b,c). In broad agreement with previous experimental findings13, ΔG had a minimum, ΔGmin, at an inter-DNA distance of 25–30 Å for all sequences examined, showing that two duplex DNA molecules can indeed attract each other. The free energy of inter-duplex attraction was at least an order of magnitude smaller than the Watson–Crick interaction free energy of a DNA duplex of the same length. A minimum of ΔG was not observed in the absence of polyamines, for example, when divalent or monovalent ions were used instead14, 15.
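The procedure described above (a series of restrained simulations at fixed inter-DNA distances, followed by weighted histogram analysis) can be sketched in a minimal form. The function below is an illustrative, self-contained WHAM iteration in Python, not the authors' analysis code; the force constant, temperature and window spacing in any use of it are assumptions for illustration.

```python
import numpy as np

def wham(samples, centers, k_spring, bin_edges, kT=0.593, n_iter=500):
    """Minimal WHAM: recover an unbiased free-energy profile G(d) from
    umbrella-sampling windows restrained at different inter-DNA distances.

    samples   : list of 1-D arrays, distances sampled in each window
    centers   : restraint centres d0 for each window (Angstrom)
    k_spring  : harmonic force constant (kcal/mol/A^2)
    bin_edges : histogram bin edges along the distance coordinate
    kT        : thermal energy (kcal/mol, ~298 K)
    """
    mids = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    # histogram of each biased window
    hists = np.array([np.histogram(s, bins=bin_edges)[0] for s in samples])
    n_k = hists.sum(axis=1)                       # samples per window
    # bias energy of each window evaluated at each bin centre
    bias = 0.5 * k_spring * (mids[None, :] - np.asarray(centers)[:, None]) ** 2
    f_k = np.zeros(len(samples))                  # per-window free-energy shifts
    for _ in range(n_iter):                       # self-consistent iteration
        denom = (n_k[:, None] * np.exp((f_k[:, None] - bias) / kT)).sum(axis=0)
        p = hists.sum(axis=0) / np.maximum(denom, 1e-300)   # unbiased P(d)
        f_k = -kT * np.log(np.maximum(
            (np.exp(-bias / kT) * p[None, :]).sum(axis=1), 1e-300))
    G = -kT * np.log(np.maximum(p, 1e-300))
    return mids, G - G.min()                      # profile relative to its minimum
```

Because the recovered ΔG(d) is defined only up to an additive constant, it is reported relative to its minimum, matching how |ΔGmin| is quoted in the text.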

Figure 1: Polyamine-mediated DNA sequence recognition observed in MD simulations and smFRET experiments.

(a) Set-up of MD simulations. A pair of parallel 20-bp dsDNA duplexes is surrounded by aqueous solution (semi-transparent surface) containing 20 Sm4+ molecules (which compensate exactly the charge of the DNA) and 100 mM NaCl. Under periodic boundary conditions, the DNA molecules are effectively infinite. A harmonic potential (not shown) is applied to maintain the prescribed distance between the dsDNA molecules. (b,c) Interaction free energy of the two DNA helices as a function of the DNA–DNA distance for repeat-sequence DNA fragments (b) and DNA homopolymers (c). (d) Schematic of the experimental design. A pair of 120-bp dsDNA molecules labelled with a Cy3/Cy5 FRET pair was encapsulated in a ~200-nm diameter lipid vesicle; the vesicles were immobilized on a quartz slide through biotin–neutravidin binding. Sm4+ molecules added after immobilization penetrated into the porous vesicles. The fluorescence signals were measured using a total internal reflection microscope. (e) Typical fluorescence signals indicative of DNA–DNA binding. Brief jumps in the FRET signal indicate binding events. (f) The fraction of traces exhibiting binding events at different Sm4+ concentrations for AT-rich, GC-rich, AT nonhomologous and CpG-methylated DNA pairs. The sequence of the CpG-methylated DNA specifies the methylation sites (CG sequence, orange), restriction sites (BstUI, triangle) and primer region (underlined). The degree of attractive interaction for the AT nonhomologous and CpG-methylated DNA pairs was similar to that of the AT-rich pair. All measurements were done at [NaCl] = 50 mM and T = 25 °C. (g) Design of the hybrid DNA constructs: 40-bp AT-rich and 40-bp GC-rich regions were flanked by 20-bp common primers. The two labelling configurations permit distinguishing parallel from anti-parallel orientation of the DNA. (h) The fraction of traces exhibiting binding events as a function of NaCl concentration at a fixed concentration of Sm4+ (1 mM). The fraction is significantly higher for the parallel orientation of the DNA fragments.

Unexpectedly, we found that DNA sequence has a profound impact on the strength of the attractive interaction. The absolute value of ΔG at the minimum relative to the value at maximum separation, |ΔGmin|, showed a clearly rank-ordered dependence on the DNA sequence: |ΔGmin| of (A)20 > |ΔGmin| of (AT)10 > |ΔGmin| of (GC)10 > |ΔGmin| of (G)20. Two trends can be noted. First, AT-rich sequences attract each other more strongly than GC-rich sequences16. For example, |ΔGmin| of (AT)10 (1.5 kcal mol−1 per turn) is about twice |ΔGmin| of (GC)10 (0.8 kcal mol−1 per turn) (Fig. 1b). Second, duplexes having identical AT content but different partitioning of the nucleotides between the strands (that is, (A)20 versus (AT)10, or (G)20 versus (GC)10) exhibit statistically significant differences (~0.3 kcal mol−1 per turn) in the value of |ΔGmin|.

To validate the findings of the MD simulations, we performed single-molecule fluorescence resonance energy transfer (smFRET)17 experiments on vesicle-encapsulated DNA molecules. An equimolar mixture of donor- and acceptor-labelled 120-bp dsDNA molecules was encapsulated in sub-micron-sized, porous lipid vesicles18 so that we could observe and quantitate rare binding events between a pair of dsDNA molecules without triggering large-scale DNA condensation2. Our DNA constructs were long enough to ensure dsDNA–dsDNA binding that is stable on the timescale of an smFRET measurement, but shorter than the DNA’s persistence length (~150 bp (ref. 19)) to avoid intramolecular condensation20. The vesicles were immobilized on a polymer-passivated surface, and fluorescence signals from individual vesicles containing one donor and one acceptor were selectively analysed (Fig. 1d). Binding of two dsDNA molecules brings their fluorescent labels into close proximity, increasing the FRET efficiency (Fig. 1e).

FRET signals from individual vesicles were diverse. Sporadic binding events were observed in some vesicles, while others exhibited stable binding; traces indicative of frequent conformational transitions were also observed (Supplementary Fig. 1A). Such diverse behaviours could be expected from non-specific interactions of two large biomolecules having structural degrees of freedom. No binding events were observed in the absence of Sm4+ (Supplementary Fig. 1B) or when no DNA molecules were present. To quantitatively assess the propensity of forming a bound state, we chose to use the fraction of single-molecule traces that showed any binding events within the observation time of 2 min (Methods). This binding fraction for the pair of AT-rich dsDNAs (AT1, 100% AT in the middle 80-bp section of the 120-bp construct) reached a maximum at ~2 mM Sm4+ (Fig. 1f), which is consistent with the results of previous experimental studies2, 3. In accordance with the prediction of our MD simulations, GC-rich dsDNAs (GC1, 75% GC in the middle 80 bp) showed a much lower binding fraction at all Sm4+ concentrations (Fig. 1b,c). Regardless of the DNA sequence, the binding fraction dropped back to zero at high Sm4+ concentrations, likely due to the resolubilization of the now positively charged DNA–Sm4+ complexes2, 3, 13.
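The binding-fraction metric used above (the fraction of traces showing at least one binding event within the observation time) could be computed from raw donor/acceptor intensity traces roughly as follows. This is a hypothetical sketch: the FRET-efficiency threshold and the minimum run length that counts as a binding event are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def fret_efficiency(donor, acceptor):
    """Apparent FRET efficiency E = I_A / (I_A + I_D) for each frame."""
    total = donor + acceptor
    return np.divide(acceptor, total, out=np.zeros_like(total, dtype=float),
                     where=total > 0)

def binding_fraction(traces, e_threshold=0.5, min_frames=3):
    """Fraction of traces showing at least one binding event, defined here
    (illustratively) as a run of >= min_frames consecutive frames with
    FRET efficiency above e_threshold.

    traces: list of (donor, acceptor) intensity arrays, one pair per vesicle.
    """
    def has_event(E):
        run = 0
        for e in E:
            run = run + 1 if e > e_threshold else 0
            if run >= min_frames:
                return True
        return False
    events = [has_event(fret_efficiency(d, a)) for d, a in traces]
    return sum(events) / len(events)
```

In practice one would also apply background subtraction and crosstalk correction to the intensities before computing E; those steps are omitted here for brevity.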

Because the donor and acceptor fluorophores were attached to the same sequence of DNA, it remained possible that the sequence homology between the donor-labelled DNA and the acceptor-labelled DNA was necessary for their interaction6. To test this possibility, we designed another AT-rich DNA construct AT2 by scrambling the central 80-bp section of AT1 to remove the sequence homology (Supplementary Table 1). The fraction of binding traces for this nonhomologous pair of donor-labelled AT1 and acceptor-labelled AT2 was comparable to that for the homologous AT-rich pair (donor-labelled AT1 and acceptor-labelled AT1) at all Sm4+ concentrations tested (Fig. 1f). Furthermore, this data set rules out the possibility that the higher binding fraction observed experimentally for the AT-rich constructs was caused by inter-duplex Watson–Crick base pairing of the partially melted constructs.

Next, we designed a DNA construct named ATGC, containing, in its middle section, a 40-bp AT-rich segment followed by a 40-bp GC-rich segment (Fig. 1g). By attaching the acceptor to the end of either the AT-rich or the GC-rich segment, we could compare the likelihood of observing the parallel binding mode, which brings the two AT-rich segments together, with that of the anti-parallel binding mode. Measurements at 1 mM Sm4+ and 25 or 50 mM NaCl indicated a preference for the parallel binding mode by ~30% (Fig. 1h). Therefore, AT content can modulate DNA–DNA interactions even in a complex sequence context. Note that increasing the concentration of NaCl while keeping the concentration of Sm4+ constant enhances competition between Na+ and Sm4+ counterions, which reduces the concentration of Sm4+ near the DNA and hence the frequency of dsDNA–dsDNA binding events (Supplementary Fig. 2).

Methylation determines the strength of DNA–DNA attraction

Analysis of the MD simulations revealed the molecular mechanism of the polyamine-mediated sequence-dependent attraction (Fig. 2). In the case of the AT-rich fragments, the bulky methyl group of the thymine base blocks Sm4+ binding to the N7 nitrogen atom of adenine, which is the cation-binding hotspot21, 22. As a result, Sm4+ is not found in the major grooves of the AT-rich duplexes and resides mostly near the DNA backbone (Fig. 2a,d). Such relocated Sm4+ molecules bridge the two DNA duplexes better, accounting for the stronger attraction16, 23, 24, 25. In contrast, a significant amount of Sm4+ is adsorbed to the major groove of the GC-rich helices, which lacks the cation-blocking methyl group (Fig. 2b,e).

Figure 2: Molecular mechanism of polyamine-mediated DNA sequence recognition.

(a–c) Representative configurations of Sm4+ molecules at the DNA–DNA distance of 28 Å for the (AT)10–(AT)10 (a), (GC)10–(GC)10 (b) and (GmC)10–(GmC)10 (c) DNA pairs. The backbone and bases of DNA are shown as ribbons and molecular bonds, respectively; Sm4+ molecules are shown as molecular bonds. Spheres indicate the locations of the N7 atoms and the methyl groups. (d–f) The average distributions of cations for the three sequence pairs featured in a–c. Top: density of Sm4+ nitrogen atoms (d = 28 Å) averaged over the corresponding MD trajectory and the z axis. White circles (20 Å in diameter) indicate the locations of the DNA helices. Bottom: the average density of Sm4+ nitrogen (blue), DNA phosphate (black) and sodium (red) atoms projected onto the DNA–DNA distance axis (x axis). The plot was obtained by averaging the corresponding heat-map data over y = [−10, 10] Å. See Supplementary Figs 4 and 5 for the cation distributions at d = 30, 32, 34 and 36 Å.

If indeed the extra methyl group in thymine, which is not found in cytosine, is responsible for stronger DNA–DNA interactions, we can predict that cytosine methylation, which occurs naturally in many eukaryotic organisms and is an essential epigenetic regulation mechanism26, would also increase the strength of DNA–DNA attraction. MD simulations showed that the GC-rich helices containing methylated cytosines (mC) lose the adsorbed Sm4+ (Fig. 2c,f) and that |ΔGmin| of (GC)10 increases on methylation of cytosines to become similar to |ΔGmin| of (AT)10 (Fig. 1b).

To experimentally assess the effect of cytosine methylation, we designed another GC-rich construct, GC2, that had the same GC content as GC1 but a higher density of CpG sites (Supplementary Table 1). The CpG sites were then fully methylated using the M.SssI methyltransferase (Supplementary Fig. 3; Methods). As predicted by the MD simulations, methylation of the GC-rich constructs increased the binding fraction to the level of the AT-rich constructs (Fig. 1f).

The sequence dependence of |ΔGmin| and its relation to the Sm4+ adsorption patterns can be rationalized by examining the number of Sm4+ molecules shared by the dsDNA molecules (Fig. 3a). An Sm4+ cation adsorbed to the major groove of one dsDNA is separated from the other dsDNA by at least 10 Å, contributing much less to the effective DNA–DNA attractive force than a cation positioned between the helices, that is, a ‘bridging’ Sm4+ (ref. 23). An adsorbed Sm4+ also repels other Sm4+ molecules due to like-charge repulsion, lowering the concentration of bridging Sm4+. To demonstrate that the concentration of bridging Sm4+ controls the strength of DNA–DNA attraction, we computed the number of bridging Sm4+ molecules, Nspm (Fig. 3b). Indeed, the number of bridging Sm4+ molecules ranks in the same order as |ΔGmin|: Nspm of (A)20 > Nspm of (AT)10 ≈ Nspm of (GmC)10 > Nspm of (GC)10 > Nspm of (G)20. Thus, the number density of nucleotides carrying a methyl group (T and mC) is the primary determinant of the strength of the attractive interaction between two dsDNA molecules. At the same time, the spatial arrangement of the methyl-group-carrying nucleotides can affect the interaction strength as well (Fig. 3c). The number of methyl groups and their distribution in the (AT)10 and (GmC)10 duplex DNA are identical, and so are their interaction free energies: |ΔGmin| of (AT)10 ≈ |ΔGmin| of (GmC)10. For AT-rich DNA sequences, clustering of the methyl groups repels Sm4+ from the major groove more efficiently than when the same number of methyl groups is distributed along the DNA (Fig. 3b). Hence, |ΔGmin| of (A)20 > |ΔGmin| of (AT)10. For GC-rich DNA sequences, clustering of the cation-binding sites (N7 nitrogen) attracts more Sm4+ than when such sites are distributed along the DNA (Fig. 3b); hence, |ΔGmin| is larger for (GC)10 than for (G)20.
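The bridging-Sm4+ count Nspm described above (nitrogen atoms inside a rectangular prism centred between the two helices; the Fig. 3 legend gives a 10 Å × 20 Å cross-section) might be computed per trajectory frame along these lines. The geometry conventions (helix axes along z, helices at y = 0) are assumptions of this sketch, not a reproduction of the authors' analysis code.

```python
import numpy as np

def count_bridging(nitrogen_xyz, dna1_axis_x, dna2_axis_x,
                   half_x=5.0, half_y=10.0):
    """Count Sm4+ nitrogen atoms inside a rectangular prism centred midway
    between two parallel DNA helices (axes assumed along z, separated along
    x, both at y = 0): 10 A wide in x, 20 A in y, periodic in z.

    nitrogen_xyz: (N, 3) array of Sm4+ nitrogen coordinates (Angstrom).
    """
    mid_x = 0.5 * (dna1_axis_x + dna2_axis_x)
    dx = np.abs(nitrogen_xyz[:, 0] - mid_x)   # distance from midplane
    dy = np.abs(nitrogen_xyz[:, 1])           # distance from the axis plane
    return int(np.count_nonzero((dx <= half_x) & (dy <= half_y)))
```

Averaging this count over the frames of a trajectory, at each restrained inter-DNA distance, would give the Nspm(d) curves of Fig. 3b.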

Figure 3: Methylation modulates the interaction free energy of two dsDNA molecules by altering the number of bridging Sm4+.

(a) Typical spatial arrangement of Sm4+ molecules around a pair of DNA helices. The phosphate groups of DNA and the amine groups of Sm4+ are shown as red and blue spheres, respectively. ‘Bridging’ Sm4+ molecules reside between the DNA helices. Orange rectangles illustrate the volume used for counting the number of bridging Sm4+ molecules. (b) The number of bridging amine groups as a function of the inter-DNA distance. The total number of Sm4+ nitrogen atoms was computed by averaging over the corresponding MD trajectory and the 10 Å (x axis) by 20 Å (y axis) rectangular prism volume (a) centred between the DNA molecules. (c) Schematic representation of the dependence of the interaction free energy of two DNA molecules on their nucleotide sequence. The number and spatial arrangement of nucleotides carrying a methyl group (T or mC) determine the interaction free energy of two dsDNA molecules.

Genome-wide investigations of chromosome conformations using the Hi-C technique revealed that AT-rich loci form tight clusters in the human nucleus27, 28. Gene or chromosome inactivation is often accompanied by increased methylation of DNA29 and compaction of facultative heterochromatin regions30. The consistency between those phenomena and our findings suggests that the polyamine-mediated sequence-dependent DNA–DNA interaction might play a role in chromosome folding and epigenetic regulation of gene expression.

  1. Rau, D. C., Lee, B. & Parsegian, V. A. Measurement of the repulsive force between polyelectrolyte molecules in ionic solution: hydration forces between parallel DNA double helices. Proc. Natl Acad. Sci. USA 81, 2621–2625 (1984).
  2. Raspaud, E., Olvera de la Cruz, M., Sikorav, J. L. & Livolant, F. Precipitation of DNA by polyamines: a polyelectrolyte behavior. Biophys. J. 74, 381–393 (1998).
  3. Besteman, K., Van Eijk, K. & Lemay, S. G. Charge inversion accompanies DNA condensation by multivalent ions. Nat. Phys. 3, 641–644 (2007).
  4. Lipfert, J., Doniach, S., Das, R. & Herschlag, D. Understanding nucleic acid-ion interactions. Annu. Rev. Biochem. 83, 813–841 (2014).
  5. Grosberg, A. Y., Nguyen, T. T. & Shklovskii, B. I. The physics of charge inversion in chemical and biological systems. Rev. Mod. Phys. 74, 329–345 (2002).
  6. Kornyshev, A. A. & Leikin, S. Sequence recognition in the pairing of DNA duplexes. Phys. Rev. Lett. 86, 3666–3669 (2001).
  7. Danilowicz, C. et al. Single molecule detection of direct, homologous, DNA/DNA pairing. Proc. Natl Acad. Sci. USA 106, 19824–19829 (2009).
  8. Gladyshev, E. & Kleckner, N. Direct recognition of homology between double helices of DNA in Neurospora crassa. Nat. Commun. 5, 3509 (2014).
  9. Tabor, C. W. & Tabor, H. Polyamines. Annu. Rev. Biochem. 53, 749–790 (1984).
  10. Thomas, T. & Thomas, T. J. Polyamines in cell growth and cell death: molecular mechanisms and therapeutic applications. Cell. Mol. Life Sci. 58, 244–258 (2001).


Gene Expression: Algorithms for Protein Dynamics

Reporter:  Aviva Lev-Ari, PhD, RN

Stanford-developed algorithm reveals complex protein dynamics behind gene expression


Michael Snyder

In yet another coup for a research concept known as “big data,” researchers at the Stanford University School of Medicine have developed a computerized algorithm to understand the complex and rapid choreography of hundreds of proteins that interact in mind-boggling combinations to govern how genes are flipped on and off within a cell.

To do so, they coupled findings from 238 DNA-protein-binding experiments performed by the ENCODE project — a massive, multiyear international effort to identify the functional elements of the human genome — with a laboratory-based technique to identify binding patterns among the proteins themselves.

The analysis is sensitive enough to have identified many previously unsuspected, multipartner trysts. It can also be performed quickly and repeatedly to track how a cell responds to environmental changes or crucial developmental signals.

“At a very basic level, we are learning who likes to work with whom to regulate around 20,000 human genes,” said Michael Snyder, PhD, professor and chair of genetics at Stanford. “If you had to look through all possible interactions pair-wise, it would be ridiculously impossible. Here we can look at thousands of combinations in an unbiased manner and pull out important and powerful information. It gives us an unprecedented level of understanding.”

Snyder is the senior author of a paper describing the research published Oct. 24 in Cell. The lead authors are postdoctoral scholars Dan Xie, PhD, Alan Boyle, PhD, and Linfeng Wu, PhD.

Proteins control gene expression by either binding to specific regions of DNA, or by interacting with other DNA-bound proteins to modulate their function. Previously, researchers could only analyze two to three proteins and DNA sequences at a time, and were unable to see the true complexities of the interactions among proteins and DNA that occur in living cells.

The challenge resembled trying to figure out interactions in a crowded mosh pit by studying a few waltzing couples in an otherwise empty ballroom, and it has severely limited what could be learned about the dynamics of gene expression.

The ENCODE (Encyclopedia of DNA Elements) project was a five-year collaboration of more than 440 scientists in 32 labs around the world to reveal the complex interplay among regulatory regions, proteins and RNA molecules that governs when and how genes are expressed. The project has been generating a treasure trove of data for researchers to analyze for the last eight years.

In this study, the researchers combined data from genomics (a field devoted to the study of genes) and proteomics (which focuses on proteins and their interactions). They studied 128 proteins, called trans-acting factors, which are known to regulate gene expression by binding to regulatory regions within the genome. Some of the regions control the expression of nearby genes; others affect the expression of genes great distances away.

The researchers used 238 data sets generated by the ENCODE project to study the specific DNA sequences bound by each of the 128 trans-acting factors. But these factors aren’t monogamous; they bind many different sequences in a variety of protein-DNA combinations. Xie, Boyle and Snyder designed a machine-learning algorithm to analyze all the data and identify which trans-acting factors tend to be seen together and which DNA sequences they prefer.
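As a much-simplified stand-in for the machine-learning analysis described above, the tendency of two trans-acting factors to be "seen together" can be scored pairwise from their called binding sites, for example by the Jaccard overlap of the genomic bins each factor occupies. This sketch is hypothetical and far simpler than the published algorithm; the factor names and bin ids below are illustrative.

```python
from itertools import combinations

def cooccurrence(binding_sites):
    """Pairwise co-binding scores for trans-acting factors.

    binding_sites: dict mapping factor name -> set of genomic bin ids where
    a ChIP-seq peak was called for that factor. Each unordered pair of
    factors is scored by the Jaccard similarity of its two bin sets
    (size of the intersection over size of the union), so 1.0 means the
    factors always co-occur and 0.0 means they never do.
    """
    scores = {}
    for a, b in combinations(sorted(binding_sites), 2):
        sa, sb = binding_sites[a], binding_sites[b]
        union = sa | sb
        scores[(a, b)] = len(sa & sb) / len(union) if union else 0.0
    return scores
```

With 128 factors this produces 128 × 127 / 2 = 8,128 pair scores in one pass, which is the sense in which "thousands of combinations" can be examined in an unbiased manner.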

Wu then performed immunoprecipitation experiments, which use antibodies to identify protein interactions in the cell nucleus. In this way, they were able to tell which proteins interacted directly with one another, and which were seen together because their preferred DNA binding sites were adjoining.

“Before our work, only the combination of two or three regulatory proteins were studied, which oversimplified how gene regulators collaborate to find their targets,” Xie said. “With our method we are able to study the combination of more than 100 regulators and see a much more complex structure of collaboration. For example, it had been believed that a key regulator of cell proliferation called FOS typically only works with JUN protein family members. We show, in addition to JUN, FOS has different partners under different circumstances. In fact, we found almost all the canonical combinations of two or three trans-acting factors have many more partners than we previously thought.”

To broaden their analysis, the researchers included data from other sources that explored protein-binding patterns in five cell types. They found that patterns of co-localization among proteins, in which several proteins are found clustered closely on the DNA to govern gene expression, vary according to cell type and the conditions under which the cells are grown. They also found that many of these clusters can be explained through interactions among proteins, and that not every protein bound to DNA directly.

“We’d like to understand how these interactions work together to make different cell types and how they gain their unique identities in development,” Snyder said. “Furthermore, diseased cells will have a very different type of wiring diagram. We hope to understand how these cells go astray.”

Other Stanford co-authors include life science research assistant Jie Zhai and life science research associate Trupti Kawli, PhD.

The research was supported by the National Human Genome Research Institute (grants U54HG004558 and U54HG006996).

Information about Stanford’s Department of Genetics, which also supported the work, is available at http://genetics.stanford.edu.

Krista Conger | Tel (650) 725-5371
M.A. Malone | Tel (650) 723-6912

Stanford Medicine integrates research, medical education and patient care at its three institutions – Stanford University School of Medicine, Stanford Hospital & Clinics and Lucile Packard Children’s Hospital. For more information, please visit the Office of Communication & Public Affairs site at



Dynamic trans-Acting Factor Colocalization in Human Cells

Cell, Volume 155, Issue 3, 713-724, 24 October 2013
Copyright © 2013 Elsevier Inc. All rights reserved.


    • Highlights
    • Colocalization patterns of 128 TFs in human cells
    • An application of SOMs to study high-dimensional TF colocalization patterns
    • Colocalization patterns are dynamic through stimulation and across cell types
    • Many TF colocalizations can be explained by protein-protein interaction


    Different trans-acting factors (TFs) collaborate and act in concert at distinct loci to perform accurate regulation of their target genes. To date, the cobinding of TF pairs has been investigated in a limited context both in terms of the number of factors within a cell type and across cell types and the extent of combinatorial colocalizations. Here, we use an approach to analyze TF colocalization within a cell type and across multiple cell lines at an unprecedented level. We extend this approach with large-scale mass spectrometry analysis of immunoprecipitations of 50 TFs. Our combined approach reveals large numbers of interesting TF-TF associations. We observe extensive change in TF colocalizations both within a cell type exposed to different conditions and across multiple cell types. We show distinct functional annotations and properties of different TF cobinding patterns and provide insights into the complex regulatory landscape of the cell.
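One of the highlights above mentions self-organizing maps (SOMs) for studying high-dimensional TF colocalization patterns. A minimal SOM is sketched below under illustrative assumptions: the grid size, learning schedule, and the encoding of each genomic locus as a vector of per-factor binding signals are all choices of this example, not the paper's implementation. The map places loci with similar co-binding patterns onto nearby grid units.

```python
import numpy as np

def train_som(data, grid=(6, 6), n_iter=2000, seed=0):
    """Minimal self-organizing map for co-binding vectors.

    data: (n_samples, n_features) array; one row per genomic locus, one
    column per trans-acting factor. Returns unit weights of shape
    (grid_h, grid_w, n_features).
    """
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h * w, data.shape[1]))
    # grid coordinates of each unit, for the neighbourhood function
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        frac = t / n_iter
        sigma = max(0.5, (max(h, w) / 2) * (1 - frac))     # shrinking radius
        lr = 0.5 * (1 - frac)                              # decaying rate
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        g = np.exp(-d2 / (2 * sigma ** 2))[:, None]        # neighbourhood weight
        weights += lr * g * (x - weights)
    return weights.reshape(h, w, -1)

def map_to_unit(weights, x):
    """Grid position (row, col) of the best-matching unit for a vector x."""
    flat = weights.reshape(-1, weights.shape[-1])
    bmu = int(np.argmin(((flat - x) ** 2).sum(axis=1)))
    return divmod(bmu, weights.shape[1])
```

After training, counting how many loci land on each unit, per cell type or condition, gives the kind of map on which shifts in colocalization patterns can be compared.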


    Personalized medicine aims to assess medical risks, and to monitor, diagnose and treat patients according to their specific genetic composition and molecular phenotype. The advent of genome sequencing and the analysis of physiological states has proven to be powerful (Cancer Genome Atlas Research Network, 2011). However, its implementation for the analysis of otherwise healthy individuals, for estimation of disease risk and medical interpretation, is less clear. Much of the genome is difficult to interpret and many complex diseases, such as diabetes, neurological disorders and cancer, likely involve a large number of different genes and biological pathways (Ashley et al., 2010; Grayson et al., 2011; Li et al., 2011), as well as environmental contributors that can be difficult to assess. As such, the combination of genomic information with a detailed molecular analysis of samples will be important for predicting, diagnosing and treating diseases, as well as for understanding the onset, progression and prevalence of disease states (Snyder et al., 2009).

    Presently, healthy and diseased states are typically followed using a limited number of assays that analyze a small number of markers of distinct types. With the advancement of many new technologies, it is now possible to analyze upward of 10^5 molecular constituents. For example, DNA microarrays have allowed the subcategorization of lymphomas and gliomas (Mischel et al., 2003), and RNA sequencing (RNA-Seq) has identified breast cancer transcript isoforms (Li et al., 2011; van der Werf et al., 2007; Wu et al., 2010; Lapuk et al., 2010). Although transcriptome and RNA-splicing profiling are powerful and convenient, they provide only a partial portrait of an organism’s physiological state. Transcriptomic data, when combined with genomic, proteomic and metabolomic data, are expected to provide a much deeper understanding of normal and diseased states (Snyder et al., 2010). To date, comprehensive integrative omics profiles have been limited and have not been applied to the analysis of generally healthy individuals.

    To obtain a better understanding of: (1) how to generate an integrative personal omics profile (iPOP) and examine as many biological components as possible, (2) how these components change during healthy and diseased states, and (3) how this information can be combined with genomic information to estimate disease risk and gain new insights into diseased states, we performed extensive omics profiling of blood components from a generally healthy individual over a 14-month period (24 months total when including time points with other molecular analyses). We determined the whole-genome sequence (WGS) of the subject and, together with transcriptomic, proteomic, metabolomic, and autoantibody profiles, used this information to generate an iPOP. We analyzed the iPOP of the individual over the course of healthy states and two viral infections (Figure 1A). Our results indicate that disease risk can be estimated from a whole-genome sequence and that, by regularly monitoring health states with iPOP, disease onset may also be observed. The wealth of information provided by detailed longitudinal iPOP revealed unexpected molecular complexity, which exhibited dynamic changes during healthy and diseased states, and provided insight into multiple biological processes. Detailed omics profiling coupled with genome sequencing can provide molecular and physiological information of medical significance. This approach can be generalized for personalized health monitoring and medicine.


    Read Full Post »

    Reporter: Aviva Lev-Ari, PhD, RN

    The 6/13/2013 Supreme Court Decision is covered on this Open Access Online Scientific Journal

    Genomics & Ethics: DNA Fragments are Products of Nature or Patentable Genes?

    Geneticist Ricki Lewis, PhD: Genetics Errors in Supreme Court Decision of 6/13/2013

    DNA Science Blog


    Earlier today, my “in” box began to fill with info from everyone I’ve ever met letting me know that the Supreme Court had ruled on the Myriad case about patenting the breast cancer genes BRCA1 and BRCA2. I also received a dozen pitches from PR people offering me all manner of instant interviews with lawyers, doctors, bioethicists, and health care analysts.

    No one offered me an interview with a geneticist – a person who knows something about DNA. So being such a person myself, I decided to take a look at the decision. And I found errors – starting right smack in the opening paragraph.

    “Scientists can extract DNA from cells to isolate specific segments for study. They can also synthetically create exons-only strands of nucleotides known as composite DNA (cDNA). cDNA contains only the exons that occur in DNA, omitting the intervening exons.”

    The definition is correct; the terminology is not. “cDNA” does not stand for “composite DNA.” It stands for “complementary DNA.”

    cDNA came into fashion when I was in grad school, circa 1977. Like many genetics terms, it has a very precise meaning, something I pay attention to because I write human genetics books, including 10 editions of a textbook.

    A cDNA is termed “complementary” because it is complementary in nucleotide base sequence to the messenger RNA (mRNA) that is made from the gene. Enzymes cut from the mRNA the sequences (introns) that do not encode amino acids and retain those (exons) that do encode protein. So a cDNA represents the part of a gene that is actually used to tell the cell to make protein. End of biology lesson.
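    To make the complementarity concrete, here is a toy sketch (the short sequence is invented, and real reverse transcription is enzymatic, not string manipulation): a cDNA strand is obtained by taking the DNA complement of each mRNA base, read antiparallel.

```python
# Toy model of deriving a cDNA from a processed (intron-free) mRNA.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def cdna_from_mrna(mrna: str) -> str:
    """Return the cDNA strand complementary to an mRNA, read antiparallel:
    A pairs with T, U with A, G with C, and C with G."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))

mrna = "AUGGCCUAA"           # exons only: start codon, one codon, stop codon
print(cdna_from_mrna(mrna))  # -> TTAGGCCAT
```

    The point for the Court’s terminology: the molecule is defined by complementarity to the mRNA, not by being a “composite” of anything.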

    A cDNA is created in the laboratory; it is not a DNA sequence that occurs in nature. Hence part 2 of the Supreme Court’s decision, which acknowledges Myriad’s right to a test based on complementary DNA, or cDNA.

    I did a Google search for “composite DNA” and found only media parroting of today’s decision, plus a few old forensics uses. So a caveat: my conclusion that the term is incorrect and invented is based on negative evidence. If I’m wrong, mea culpa in advance and I will feel like an idiot.

    But cDNA isn’t the only error. I soon found another. On page 16, footnote #8 discusses a pseudogene as resulting from “random incorporation of fragments of cDNA.” That’s not even close to what a pseudogene is.

    A pseudogene results from a DNA replication error that makes an extra copy of a gene. Over time, one copy mutates itself into a form that can’t do its job. The pseudogene remains in the genome like a ghost of a functional gene. The mutations occur at random because the pseudogene, not being used, isn’t subject to natural selection – that’s probably what the Court means by “random.” The globin gene locus on chromosome 11 is chock full of pseudogenes. This is such a classic example of basic genetics that my head is about to explode.

    And how on earth is the Supreme Court’s definition of a pseudogene supposed to happen, in nature or otherwise? A cDNA exists in a lab dish. A gene exists in a cell that is part of an organism. How does the cDNA “randomly incorporate” itself inside the cell? Jump in from the dish? Part of the footnote states, “… given pseudogenes’ apparently random origins … ” Pseudogenes’ origins aren’t random at all. They happen in specific genes that tend to have repeats in the sequence, “confusing” the replication enzymes.

    Today’s decision is undoubtedly a wonderful leap forward for patients, their families, and researchers. And some may think I am nitpicking. But these two errors jumped right out at me — I’d troll for more but I want to post this. What else is wrong? How can we trust the decision if the science is wrong? And what is the background of the people who research the decisions?

    I know nothing about the law, zero, which is why I’m not writing about that. But the science in something as important as a Supreme Court decision should accurately use the language of the field under discussion.


    Read Full Post »

    Author: Tilda Barliya PhD

    The fields of DNA and RNA nanotechnology are considered among the most dynamic research areas in drug delivery and molecular medicine. Both DNA and RNA have a wide range of medical applications, including drug delivery, genetic immunization, metabolite and nucleic acid detection, gene regulation, siRNA delivery for cancer treatment (I), and analytical and therapeutic applications.

    Seeman (6,7) pioneered the concept 30 years ago of using DNA as a material for creating nanostructures; this has led to an explosion of knowledge in the now well-established field of DNA nanotechnology. The unique properties that distinguish RNA from DNA in terms of free energy, folding, noncanonical base-pairing, base-stacking, and in vivo transcription and processing provide sufficient rationale to regard RNA nanotechnology as its own technological discipline. Herein, we will discuss the advantages of DNA nanotechnology and its use in medicine.

    So what is the rationale for using DNA nanotechnology (3)?

    • Genetic studies – its application in various biological fields like biomedicine, cancer research, medical devices and genetic engineering.
    • Its unique properties of structural stability, programmability of sequences, and predictable self-assembly.
    DNA origami

    Structures made from DNA using the DNA-origami method (Rothemund, 2006)

    Structural DNA nanotechnology rests on three pillars: [1] Hybridization; [2] Stably branched DNA; and [3] Convenient synthesis of designed sequences.


    Hybridization. The self-association (self-assembly) of complementary nucleic acid molecules, or parts of molecules, is implicit in all aspects of structural DNA nanotechnology. Individual motifs are formed by the hybridization of strands designed to produce particular topological species. A key aspect of hybridization is the use of sticky-ended cohesion to combine pieces of linear duplex DNA; this has been a fundamental component of genetic engineering for over 35 years (7). Not only is hybridization critical to the formation of structure, but it is deeply involved in almost all the sequence-dependent nanomechanical devices that have been constructed, and it is central to many attempts to build structural motifs in a sequential fashion (7,8).
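    Sticky-ended cohesion can be sketched in a few lines (the four-base overhangs below are invented for illustration): two duplex fragments join only when one single-stranded overhang is the Watson-Crick complement of the other, read antiparallel.

```python
# Minimal sketch of sticky-ended cohesion between two DNA overhangs.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def cohere(overhang_a: str, overhang_b: str) -> bool:
    """True if overhang_a (read 5'->3') base-pairs with overhang_b
    read in the opposite (antiparallel) direction."""
    return len(overhang_a) == len(overhang_b) and all(
        PAIR[x] == y for x, y in zip(overhang_a, reversed(overhang_b))
    )

print(cohere("TGCA", "TGCA"))  # True: TGCA is self-complementary antiparallel
print(cohere("TGCA", "AAAA"))  # False: G cannot pair with A
```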

    Stably Branched DNA

    Branched DNA molecules are central to DNA nanotechnology. It is the combination of in vitro hybridization and synthetic branched DNA that makes it possible to use DNA as a construction material. Such branched DNA molecules are thought to be intermediates in genetic recombination (such as Holliday junctions).

    Convenient Synthesis of Designed Sequences

    Biologically derived branched DNA molecules, such as Holliday junctions, are inherently unstable because they exhibit sequence symmetry; i.e., the four strands actually consist of two pairs of strands with the same sequence. This symmetry enables an isomerization known as branch migration, which allows the branch point to relocate. DNA nanotechnology has therefore entailed sequence design that attempts to minimize sequence symmetry in every way possible.
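    One crude way to picture sequence-symmetry minimization (the screening criterion and sequences below are invented for illustration): scan a candidate strand and reject it if any short subsequence recurs, since internal repeats are what permit unwanted pairings such as branch migration.

```python
# Toy symmetry check: flag a strand whose 6-base subsequences repeat.
def has_repeat(seq: str, k: int = 6) -> bool:
    """True if any length-k subsequence occurs more than once in seq."""
    seen = set()
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in seen:
            return True
        seen.add(sub)
    return False

print(has_repeat("ATGCATGGATCCGTA"))  # False: all 6-mers are unique
print(has_repeat("ATGCATGCATGCATG"))  # True: ATGCAT recurs
```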

    One of the most remarkable innovations in structural DNA nanotechnology in recent years is DNA origami, invented in 2006 by Paul Rothemund (1) (see figure above). DNA origami uses the genome of a virus together with a large number of shorter DNA strands to enable the creation of numerous DNA-based structures. The shorter DNA strands force the long viral DNA to fold into a pattern defined by the interaction between the long and the short strands (1,2).

    Rothemund believes that one application of patterned DNA origami would be the creation of a ‘nanobreadboard’ to which diverse components could be added. The attachment of proteins, for example, might allow novel biological experiments aimed at modelling complex protein assemblies and examining the effects of spatial organization, whereas molecular electronic or plasmonic circuits might be created by attaching nanowires, carbon nanotubes or gold nanoparticles (1).

    DNA nanotechnology and Biological Application

    The physical and chemical properties of nanomaterials such as polymers, semiconductors, and metals present diverse advantages for various in vivo applications (3,9 ). For example:

    • Therapeutics – In cancer for example, nanosystems that are designed from biological materials such as DNA and RNA are ‘programmed’ to be able to evade most, if not all, drug-resistance mechanisms. Based on these properties, most nanosystems are able to deliver high concentrations of drugs to cancer cells while curtailing damage to surrounding healthy cells (2b, 3, 9, 11, 15).
    • Biosensors – capable of picking up very specific biological signals and converting them into electrical outputs that can be analyzed for identification. Biosensors are efficient as they have a high ratio of surface area to volume as well as adjustable electronic, magnetic, optical, and biological properties (3, 12, 13, 14).
    • Amin and colleagues have developed a biotinylated DNA thin-film-coated fiber optic reflectance biosensor for the detection of streptavidin aerosols. DNA thin films were prepared by dropping DNA samples onto a polymer optical fiber, which responded quickly to the specific biomolecules in the atmosphere. This approach of coating optical fibers with DNA nanostructures could be very useful in the future for detecting atmospheric bio-aerosols with high sensitivity and specificity (3, 14)
    • Computing – Another aspect uses the programmability of DNA to create devices that are capable of computing. Here, the structure of the assembled DNA is not of primary interest. Instead, control of the DNA sequence is used in the creation of computational algorithms, such as artificial neural networks. Qian et al., for example, built on the richness of DNA computing and strand-displacement circuitry to show how molecular systems can exhibit autonomous brain-like behaviours. Using a simple DNA gate architecture that allows experimental scale-up of multilayer digital circuits, they systematically transformed arbitrary linear threshold circuits (an artificial neural network model) into DNA strand-displacement cascades that function as small neural networks (3, 10).
    • Additional features: third-generation DNA sequencers (II), biomimetic systems, energy transfer and photonics, etc.
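    The linear threshold circuits mentioned under Computing can be sketched abstractly; Qian et al. compiled gates of this kind into DNA strand-displacement cascades (the weights and threshold below are arbitrary illustrative values, not taken from their paper).

```python
# A linear threshold gate: the abstract neural-network unit that
# strand-displacement cascades can implement chemically.
def threshold_gate(inputs, weights, threshold):
    """Fire (output 1) iff the weighted sum of binary inputs meets the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# A 3-input majority gate: fires when at least two inputs are on.
print(threshold_gate([1, 1, 0], [1, 1, 1], 2))  # -> 1
print(threshold_gate([1, 0, 0], [1, 1, 1], 2))  # -> 0
```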


    DNA nanotechnology is an evolving field that affects medicine, computation, materials science, and physics. DNA nanostructures offer unprecedented control over shape, size, mechanical flexibility and anisotropic surface modification. Clearly, proper control over these aspects can increase circulation times by orders of magnitude, as can be seen for long-circulating particles such as erythrocytes and various pathogenic particles evolved to overcome this issue. The use of DNA in DNA/protein-based matrices makes these structures inherently amenable to structural tunability. More research in this direction will certainly be developed, making DNA a promising biomaterial in tissue engineering. Future development of novel ways in which DNA could play a much more comprehensive role in biological computation and data storage is envisaged.


    1. Paul W. K. Rothemund. Folding DNA to create nanoscale shapes and patterns. NATURE 2006 (March 16)|Vol 440: 297-302. http://www.nature.com/nature/journal/v440/n7082/full/nature04586.html


    2. Andre V. Pinheiro, Dongran Han, William M. Shih and Hao Yan. Challenges and opportunities for structural DNA nanotechnology. Nature Nanotechnology 2011 Dec | VOL 6: 763-772.  http://www.nature.com/nnano/journal/v6/n12/pdf/nnano.2011.187.pdf

    2b. Thi Huyen La, Thi Thu Thuy Nguyen, Van Phuc Pham, Thi Minh Huyen Nguyen and Quang Huan Le. Using DNA nanotechnology to produce a drug delivery system. Adv. Nat. Sci.: Nanosci. Nanotechnol. 4 (2013) 015002 (7pp). http://iopscience.iop.org/2043-6262/4/1/015002 and http://iopscience.iop.org/2043-6262/4/1/015002/pdf/2043-6262_4_1_015002.pdf

    3. Muniza Zahid, Byeonghoon Kim, Rafaqat Hussain, Rashid Amin and Sung H Park. DNA nanotechnology: a future perspective. Nanoscale Research Letters 2013, 8:119. http://www.nanoscalereslett.com/content/8/1/119

    4. By: Cientifica Ltd 2007. The Nanotech Revolution in Drug Delivery. http://www.cientifica.com/WhitePapers/054_Drug%20Delivery%20White%20Paper.pdf

    5. Gemma Campbell. Nanotechnology and its implications for the health of the E.U citizen: Diagnostics, drug discovery and drug delivery. Institute of Nanotechnology and Nanoforum. http://www.nano.org.uk/nanomednet/images/stories/Reports/diagnostics,%20drug%20discovery%20and%20drug%20delivery.pdf

    6. Peixuan Guo, Haque F, Brent Hallahan, Randall Reif and Hui Li. Uniqueness, Advantages, Challenges, Solutions, and Perspectives in Therapeutics Applying RNA Nanotechnology. Nucleic Acid Ther. 2012 August; 22(4): 226–245. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426230/

    7. Seeman NC. Nanomaterials based on DNA. Annu. Rev. Biochem. 2010;79:65–87. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3454582/

    8. Yin P, Choi HMT, Calvert CR, Pierce NA. Programming biomolecular self-assembly pathways. Nature.2008;451:318–323.  http://www.ncbi.nlm.nih.gov/pubmed/18202654

    9. Yan Lee P, Wong KY: Nanomedicine: a new frontier in cancer therapeutics. Curr Drug Deliv 2011, 8(3):245-253. http://www.eurekaselect.com/73728/article

    10. Qian, L.L., Winfree, E., and Bruck, J. Neural Network Computation with DNA Strand Displacement Cascades. Nature 2011 475, 368-372.  http://www.nature.com/nature/journal/v475/n7356/full/nature10262.html

    11. Acharya S, Dilnawaz F, Sahoo SK: Targeted epidermal growth factor receptor nanoparticle bioconjugates for breast cancer therapy. Biomaterials 2009, 30(29):5737-5750. http://www.sciencedirect.com/science/article/pii/S0142961209006929

    12. Bohunicky B, Mousa SA: Biosensors: the new wave in cancer diagnosis. Nanotechnology, Science and Applications 2011, 4:1-10. http://www.dovepress.com/biosensors-the-new-wave-in-cancer-diagnosis-peer-reviewed-article-NSA-recommendation1

    13. Sanvicens N, Mannelli I, Salvador J, Valera E, Marco M: Biosensors for pharmaceuticals based on novel technology. Trends Anal Chem 2011, 30:541-553. http://www.sciencedirect.com/science/article/pii/S016599361100015X

    14. Amin R, Kulkarni A, Kim T, Park SH: DNA thin film coated optical fiber biosensor. Curr Appl Phys 2011, 12(3):841-845. http://www.sciencedirect.com/science/article/pii/S1567173911005888

    15. Choi, Y.; Baker, J. R. Targeting Cancer Cells with DNA Assembled Dendrimers: A Mix and Match Strategy for Cancer. Cell Cycle 2005, 4, 669–671. http://www.ncbi.nlm.nih.gov/pubmed/15846063  http://www.landesbioscience.com/journals/cc/article/1684/

    Other related articles on this Open Access Online Scientific Journal include the following

    I. By: Ziv Raviv PhD. The Development of siRNA-Based Therapies for Cancer. https://pharmaceuticalintelligence.com/2013/05/09/the-development-of-sirna-based-therapies-for-cancer/

    II. By: Tilda Barliya PhD. Nanotechnology, personalized medicine and DNA sequencing. https://pharmaceuticalintelligence.com/2013/01/09/nanotechnology-personalized-medicine-and-dna-sequencing/

    III. By: Larry Bernstein MD FACP. DNA Sequencing Technology. https://pharmaceuticalintelligence.com/2013/03/03/dna-sequencing-technology/

    IV. By: Venkat S Karra PhD. Measuring glucose without needle pricks: nano-sized biosensors made the test easy. https://pharmaceuticalintelligence.com/2012/09/04/measuring-glucose-without-needle-pricks-nano-sized-biosensors-made-the-test-easy/

    Read Full Post »

    Curator: Aviva Lev-Ari, PhD, RN

    In their discussion, the researchers argue that the U.S. Supreme Court now has a chance to shape the balance between the medical good versus inventor protection, adding that, in their opinion, the court should limit the patenting of existing nucleotide sequences, due to their broad scope and non-specificity in the human genome.

    “I am extremely pro-patent, but I simply believe that people should not be able to patent a product of nature,” Dr. Mason says. “Moreover, I believe that individuals have an innate right to their own genome, or to allow their doctor to look at that genome, just like the lungs or kidneys. Failure to resolve these ambiguities perpetuates a direct threat to genomic liberty, or the right to one’s own DNA.”


    Supreme Court May Decide Whether We Own Our Genes

    March 26, 2013
    Image Credit: Photos.com

    Brett Smith for redOrbit.com – Your Universe Online

    They may be responsible for everything in your life, from conception to death, they may be inside every living cell in your body – but you do not own your own genes, legally speaking.

    According to a report in Genome Medicine, patents essentially cover the entire human genome, hampering research and raising the question of “genomic liberty.”

    The legal standing of genomic patents could change next month when the Supreme Court reviews patent rights for two key breast and ovarian cancer genes, BRCA1 and BRCA2, which include segments of genetic code as small as 15 nucleotides, known as 15mers.

    “This is, so to speak, patently ridiculous,” said report co-author Dr. Christopher E. Mason of Weill Cornell Medical College. “If patent claims that use these small DNA sequences are upheld, it could potentially create a situation where a piece of every gene in the human genome is patented by a phalanx of competing patents.”

    In their report, Mason and Dr. Jeffrey Rosenfeld, an assistant professor of medicine at the University of Medicine & Dentistry of New Jersey, looked at patents for two different categories of DNA fragments:

    • long and
    • short.

    They revealed that 41 percent of the human genome is covered by “long” DNA patents that can include whole genes. And because many genes share similar patented sequences within their code, the combination of all the “short” DNA patents covers 100 percent of the genome.

    “This demonstrates that short patent sequences are extremely non-specific and that a 15mer claim from one gene will always cross-match and patent a portion of another gene as well,” Mason said. “This means it is actually impossible to have a 15mer patent for just one gene.”
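    The cross-matching argument can be sketched in a few lines (the two “gene” sequences below are invented and far shorter than real genes): enumerate every 15-mer in one sequence and intersect it with the 15-mers of another; any shared 15-mer means a claim on the first gene also covers part of the second.

```python
# Toy version of the 15-mer cross-match: does a patent-sized fragment
# of gene A also occur in an unrelated gene B?
def kmers(seq: str, k: int = 15) -> set:
    """All length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

gene_a = "ATGGCGTACGTTAGCATTGACCA"  # invented "patented" gene fragment
gene_b = "CCCTACGTTAGCATTGACTGGGA"  # invented unrelated gene

shared = kmers(gene_a) & kmers(gene_b)  # 15-mers claimed by A that also hit B
print(sorted(shared))                   # -> ['TACGTTAGCATTGAC']
```

    Run genome-wide, this intersection is what makes short-sequence claims so non-specific: with only 4^15 possible 15-mers, collisions across unrelated genes are unavoidable.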

    To reach their conclusions, the researchers first looked at small sequences within BRCA1 and noticed that one of Myriad’s BRCA1 patents also covered almost 690 other human genes. Some of these genes are unrelated to breast cancer – instead being associated with brain development and heart functioning.

    Next, researchers determined how many known genes are covered by 15mers in current patent claims. They found 58 patents covered at least ten percent of all bases of all human genes. The broadest patent claim matched 91.5 percent of human genes. When the team took patented 15mers and matched them to known genes, they found 100 percent of known genes are patented.

    Finally, the team also looked at “long” DNA sequences from existing gene patents, ranging from a few dozen to thousands of base pairs. They found these long sequences added up to 41 percent of known human genes.

    “There is a real controversy regarding gene ownership due to the overlap of many competing patent claims. It is unclear who really owns the rights to any gene,” Rosenfeld said. “While the Supreme Court is hearing one case concerning just the BRCA1 patent, there are also many other patents whose claims would cover those same genes.

    “Do we need to go through every gene to look at who made the first claim to that gene, even if only one small part? If we resort to this rule, then the first patents to be granted for any DNA will have a vast claim over portions of the human genome,” he added.

    Another legal question surrounds patented DNA sequences that cross species boundaries. The researchers found one company has the rights to 84 percent of all human genes for a patent they received for cow breeding.

    Source: Brett Smith for redOrbit.com – Your Universe Online



    Human Genome: Name Your Price

    Posted March 27, 2013 – 12:51 by a staff writer

    Weill Cornell Medical College researchers have issued a warning that, according to the patent system, the vast majority of humans on the planet don’t ‘own’ their own genes, and in fact their biological make-up is being exploited for profit. Even seemingly innocent research into cow breeding can cover human genetic make-up.

    As spotted by a Slashdot user, two researchers combing through patents on human DNA discovered that over 40,000 patents on DNA molecules have effectively claimed the human genome for profit. A report in the medical journal Genome Medicine said that humans may be losing their grip on “individual genomic liberty”.

    Looking at two kinds of patented DNA sequences – long and short fragments – the researchers found that 41 percent of the human genome is covered by DNA patents that can cover entire genes. According to the research, if all of the short-sequence patents were allowed in aggregate, they could cover 100 percent of the human genome.

    Lead author Dr Christopher E Mason and co-author Dr Jeffrey Rosenfeld warned that short sequences from patents cover “virtually the entire genome, even outside of genes”. A Weill Cornell assistant professor asked: “How is it possible that my doctor cannot look at my DNA without being concerned about patent infringement?”

    There will be a Supreme Court hearing about genomic patent rights next month that will debate the morality of a molecular diagnostic company claiming patents on key cancer genes, as well as on any small sequence of code within the BRCA1 gene. Cornell explained that at present, genes are able to be patented by researchers working in companies and institutions who discover genes that have potentially useful applications, like in testing for cancer risks. Because the patents can be held by companies or organisations, it is possible for the patent owner to charge doctors thousands of dollars for each diagnostic test.

    The authors pointed out that in their own research it is common to come across a patented gene “almost every day”. Their paper promises to examine how genes may have been impacted by held patents, and the extent of intellectual property on the genome. Gene patents can also reach across species – for example, a company may have a patent for breeding cows that also covers a large percentage of human genes. They cited one company that owns 84 percent of all human genes because of a patent for cow breeding.

    “There is a real controversy regarding gene ownership due to the overlap of many competing patent claims. It is unclear who really owns the rights to any gene,” Dr Rosenfeld said. “Do we need to go through every gene to look at who made the first claim to that gene, even if only one small part? If we resort to this rule, then the first patents to be granted for any DNA will have a vast claim over portions of the human genome.”

    Lead author Dr Mason insisted he is pro-patent, but believes people “should not be able to patent a product of nature”.

    “I believe that individuals have an innate right to their own genome,” he said.


    Other related articles on Genomics and Ethics on this Open Access Online Scientific Journal include the following:

    Aviva Lev-Ari, PhD, RN

    20.2 Understanding the Role of Personalized Medicine

    Larry H Bernstein, MD, FACP

    20.3 Attitudes of Patients about Personalized Medicine

    Larry H Bernstein, MD, FACP

    20.4  Genome Sequencing of the Healthy

    Larry H. Bernstein, MD, FACP and Aviva Lev-Ari, PhD, RN

    20.5   Genomics in Medicine – Tomorrow’s Promise

    Larry H. Bernstein, MD, FACP

    20.6  The Promise of Personalized Medicine

    Larry H. Bernstein, MD, FACP

    Read Full Post »

    Genomics of Bacterial and Archaeal Viruses

    Reporter: Larry H Bernstein, MD, FCAP
    https://pharmaceuticalintelligence.com/?p=10187/Genomics of Bacterial and Archaeal Viruses/


    Image Source: Created by Noam Steiner Tomer 7/31/2020
    Genomics of Bacterial and Archaeal Viruses: Dynamics within the Prokaryotic Virosphere
    M Krupovic, D Prangishvili, RW Hendrix, and DH Bamford
    Over the past few years, the viruses of prokaryotes have been transformed in the view of microbiologists from simply being convenient experimental model systems into being a major component of the biosphere. They are
    • the global champions of diversity,
    • they constitute a majority of organisms on the planet,
    • they have large roles in the planet’s ecosystems,
    • they exert a significant—some would say dominant—force on
    • the evolution of their bacterial and archaeal hosts, and
    • they have been doing this for billions of years,
    • possibly for as long as there have been cells.
    This transformation in status (or, rather, our expanded appreciation of the importance of these viruses in the biosphere) is due to a few significant developments in both understanding and technology.
    (i) It has become clear that the population sizes of these viruses are astoundingly large. This realization grew out of electron microscopic enumerations of tailed phage virions in coastal seawater, and numerous measurements in other environments have been made since then. A current estimate based on these measurements is that
    • there are 10^31 individual tailed phage virions in the global biosphere—
    • enough to reach for 200 million light years if laid end to end—and measurements of population turnover suggest that
    • it takes roughly 10^24 productive infections per second to maintain the global population.
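    The “laid end to end” figure above checks out with back-of-the-envelope arithmetic, assuming a tailed phage virion measures roughly 200 nm from capsid to tail tip (a typical order of magnitude, not a value from the review):

```python
# Rough check of the 200-million-light-year claim for 1e31 virions.
virions = 1e31
virion_length_m = 200e-9   # assumed ~200 nm per tailed phage virion
light_year_m = 9.46e15     # metres in one light year

total_ly = virions * virion_length_m / light_year_m
print(f"{total_ly:.1e} light years")  # ~2.1e8, i.e. about 200 million
```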
    (ii) Advances in DNA sequencing technology have led to dramatic qualitative improvements in how we understand the
    • genetic structure of viral populations,
    • the mechanisms of viral evolution, and
    • the diversity of viral sequences.
    The majority of newly determined gene and protein sequences of these viruses have no relatives detectable in the public sequence databases, and
    • analysis of metagenomic data provides strong evidence that
    • there is more genetic diversity in the genes of the viruses of prokaryotes
      • than in any other compartment of the biosphere.
    (iii) Facilitated by these conceptual and technical advances, studies of bacterial and archaeal viruses as important components of global biology have flourished. These viruses are revealed as important players in
    • carbon and energy cycling in the oceans and other natural environments and
    • as major agents in the ecology and evolution of their cellular hosts.
    (iv) The isolation and characterization of new viruses have accelerated. This has been especially important for the archaeal viruses, where the discovery of new viruses and of new virus types had lagged behind bacteriophage discovery. For the bacteriophages, the isolation of newly discovered viruses has helped improve the still extremely sparse coverage of sequence diversity and the narrow phylogenetic range of hosts represented by current data.
    (v) High-resolution structures determined
    • for capsid proteins and other virion proteins,
    • together with information about virion assembly mechanisms,
    • have allowed surprising inferences about ancestral connections among genes whose DNA sequences and encoded protein sequences
      • have diverged to the point that they are no longer detectably related.
    Schematic diagram of the hexon of a virus capsid (Photo credit: Wikipedia)

    Adsorption of virions to cells (Photo credit: Wikipedia)

    Polio virus (picornavirus) (Photo credit: Sanofi Pasteur)

    Read Full Post »

    Genomics in Medicine – Tomorrow’s Promise

    Reporter: Larry H Bernstein, MD, FCAP

    Genomics in Medicine: Today’s Issues, Tomorrow’s Promise

    KM Beima-Sofie, EH Dorfman, JM Kocarnik, MY Laurino
    Feb 13, 2013 Medscape Genomic Medicine

    What do you think about these issues before reading this piece?

    The Broader Implications of Genetic Sciences
    The 62nd annual meeting of the American Society of Human Genetics (ASHG), which was held in San Francisco, California, in November 2012, featured a diverse array of research in basic, clinical, and population science contributed by human geneticists across the globe.
    Genetic Sequencing Moves Beyond the Laboratory
    Several presentations at the meeting focused on the lessons learned from the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project. The goal of the project was to
    • develop and validate a cost-effective and high-throughput sequencing technology
    • capable of analyzing the DNA sequence in the exome, which
    • consists of all protein-coding regions in the human genome.
    At previous ASHG meetings, presentations and discussions largely focused on
    • the development of sequencing technology and on applications of this technology for research.
    Now that sequencing is an increasing reality, this year’s conference featured presentations on
    • what to do with the resulting information, in both research and clinical settings.
    Issues discussed include the challenges of
    • interpreting sequence data,
    • determining which results should be returned to various parties, and
    • the potential impacts of different testing techniques.
    Results from the NHLBI Exome Sequencing Project and other projects are fueling the discussion of legal issues surrounding gene patenting, a hotly debated topic that is currently under consideration by the US Supreme Court. During a plenary session on gene discovery and patent law, particular focus was given to the lawsuit brought by the American Civil Liberties Union against Myriad Genetics,
    • contesting the company’s patent of the BRCA1 and BRCA2 genes for hereditary breast and ovarian cancer.
    At present, Myriad has exclusive rights to offer clinical genetic testing for these genes; one of the main arguments of the lawsuit is
    • that gene patents hinder the pursuit of confirmatory tests and limit the testing options available to women.

    Read Full Post »

    Consumer Market for Personal DNA Sequencing: Part 4

    Reporter: Aviva Lev-Ari, PhD RN


    This Part 4 of the series on the Present and Future Frontier of Research in Genomics has been UPDATED on 12/6/2013.

    23andMe Suspends Health Interpretations

    December 06, 2013

    Direct-to-consumer genetic testing company 23andMe has stopped offering its health-related test to new customers, bringing it in line with a request from the US Food and Drug Administration.

    In a letter sent on Nov. 22, FDA said that 23andMe had not adequately responded to its concerns regarding the validity of its Personal Genome Service. The letter instructed 23andMe to “immediately discontinue marketing” the service until it receives authorization from the agency.

    According to a post at the company’s blog from CEO Anne Wojcicki, 23andMe customers who purchased their kits on or after Nov. 22 “will not have access to health-related results.” They will, though, have access to ancestry information and their raw genetic data. Wojcicki notes that the customers may have access to the health interpretations in the future depending on FDA marketing authorization. Those customers are also being offered a refund.

    Customers who purchased their kits before Nov. 22 will have access to all reports.

    “We remain firmly committed to fulfilling our long-term mission to help people everywhere have access to their own genetic data and have the ability to use that information to improve their lives,” a notice at the 23andMe site says.

    In a letter appearing in the Wall Street Journal earlier this week, FDA Commissioner Margaret Hamburg wrote that the agency “supports the development of innovative tests.” As an example, she pointed to its recent clearance of sequencing-based tests from Illumina.

    She added that the agency also understands that some consumers do want to know more about their genomes and their genetic risk of disease, and that a DTC model would let consumers take an active role in their health.

    “The agency’s desire to review these particular tests is solely to ensure that they are safe, do what they claim to do and that the results are communicated in a way that a consumer can understand,” Hamburg said.

    In a statement, 23andMe’s Wojcicki says that the company remains committed to its ethos of allowing people access to their genetic information. “Our goal is to work cooperatively with the FDA to provide that opportunity in a way that clearly demonstrates the benefit to people and the validity of the science that underlies the test,” Wojcicki adds.


    UPDATED on 11/27/2013

    FDA Tells Google-Backed 23andMe to Halt DNA Test Service



    FDA Letter to 23andMe

    Department of Health and Human Services

    Public Health Service
    Food and Drug Administration
    10903 New Hampshire Avenue
    Silver Spring, MD 20993

    Nov 22, 2013

    Ann Wojcicki
    23andMe, Inc.
    1390 Shoreline Way
    Mountain View, CA 94043
    Document Number: GEN1300666
    Re: Personal Genome Service (PGS)
    Dear Ms. Wojcicki,
    The Food and Drug Administration (FDA) is sending you this letter because you are marketing the 23andMe Saliva Collection Kit and Personal Genome Service (PGS) without marketing clearance or approval in violation of the Federal Food, Drug and Cosmetic Act (the FD&C Act).
    This product is a device within the meaning of section 201(h) of the FD&C Act, 21 U.S.C. 321(h), because it is intended for use in the diagnosis of disease or other conditions or in the cure, mitigation, treatment, or prevention of disease, or is intended to affect the structure or function of the body. For example, your company’s website at http://www.23andme.com/health (most recently viewed on November 6, 2013) markets the PGS for providing “health reports on 254 diseases and conditions,” including categories such as “carrier status,” “health risks,” and “drug response,” and specifically as a “first step in prevention” that enables users to “take steps toward mitigating serious diseases” such as diabetes, coronary heart disease, and breast cancer. Most of the intended uses for PGS listed on your website, a list that has grown over time, are medical device uses under section 201(h) of the FD&C Act. Most of these uses have not been classified and thus require premarket approval or de novo classification, as FDA has explained to you on numerous occasions.
    Some of the uses for which PGS is intended are particularly concerning, such as assessments for BRCA-related genetic risk and drug responses (e.g., warfarin sensitivity, clopidogrel response, and 5-fluorouracil toxicity) because of the potential health consequences that could result from false positive or false negative assessments for high-risk indications such as these. For instance, if the BRCA-related risk assessment for breast or ovarian cancer reports a false positive, it could lead a patient to undergo prophylactic surgery, chemoprevention, intensive screening, or other morbidity-inducing actions, while a false negative could result in a failure to recognize an actual risk that may exist. Assessments for drug responses carry the risks that patients relying on such tests may begin to self-manage their treatments through dose changes or even abandon certain therapies depending on the outcome of the assessment. For example, false genotype results for your warfarin drug response test could have significant unreasonable risk of illness, injury, or death to the patient due to thrombosis or bleeding events that occur from treatment with a drug at a dose that does not provide the appropriately calibrated anticoagulant effect. These risks are typically mitigated by International Normalized Ratio (INR) management under a physician’s care. The risk of serious injury or death is known to be high when patients are either non-compliant or not properly dosed; combined with the risk that a direct-to-consumer test result may be used by a patient to self-manage, serious concerns are raised if test results are not adequately understood by patients or if incorrect test results are reported.
    Your company submitted 510(k)s for PGS on July 2, 2012 and September 4, 2012, for several of these indications for use. However, to date, your company has failed to address the issues described during previous interactions with the Agency or provide the additional information identified in our September 13, 2012 letter for (b)(4) and in our November 20, 2012 letter for (b)(4), as required under 21 CFR 807.87(1). Consequently, the 510(k)s are considered withdrawn, see 21 C.F.R. 807.87(1), as we explained in our letters to you on March 12, 2013 and May 21, 2013. To date, 23andMe has failed to provide adequate information to support a determination that the PGS is substantially equivalent to a legally marketed predicate for any of the uses for which you are marketing it; no other submission for the PGS device that you are marketing has been provided under section 510(k) of the Act, 21 U.S.C. § 360(k).
    The Office of In Vitro Diagnostics and Radiological Health (OIR) has a long history of working with companies to help them come into compliance with the FD&C Act. Since July of 2009, we have been diligently working to help you comply with regulatory requirements regarding safety and effectiveness and obtain marketing authorization for your PGS device. FDA has spent significant time evaluating the intended uses of the PGS to determine whether certain uses might be appropriately classified into class II, thus requiring only 510(k) clearance or de novo classification and not PMA approval, and we have proposed modifications to the device’s labeling that could mitigate risks and render certain intended uses appropriate for de novo classification. Further, we provided ample detailed feedback to 23andMe regarding the types of data it needs to submit for the intended uses of the PGS.  As part of our interactions with you, including more than 14 face-to-face and teleconference meetings, hundreds of email exchanges, and dozens of written communications, we provided you with specific feedback on study protocols and clinical and analytical validation requirements, discussed potential classifications and regulatory pathways (including reasonable submission timelines), provided statistical advice, and discussed potential risk mitigation strategies. As discussed above, FDA is concerned about the public health consequences of inaccurate results from the PGS device; the main purpose of compliance with FDA’s regulatory requirements is to ensure that the tests work.
    However, even after these many interactions with 23andMe, we still do not have any assurance that the firm has analytically or clinically validated the PGS for its intended uses, which have expanded from the uses that the firm identified in its submissions. In your letter dated January 9, 2013, you stated that the firm is “completing the additional analytical and clinical validations for the tests that have been submitted” and is “planning extensive labeling studies that will take several months to complete.” Thus, months after you submitted your 510(k)s and more than 5 years after you began marketing, you still had not completed some of the studies and had not even started other studies necessary to support a marketing submission for the PGS. It is now eleven months later, and you have yet to provide FDA with any new information about these tests.  You have not worked with us toward de novo classification, did not provide the additional information we requested necessary to complete review of your 510(k)s, and FDA has not received any communication from 23andMe since May. Instead, we have become aware that you have initiated new marketing campaigns, including television commercials that, together with an increasing list of indications, show that you plan to expand the PGS’s uses and consumer base without obtaining marketing authorization from FDA.
    Therefore, 23andMe must immediately discontinue marketing the PGS until such time as it receives FDA marketing authorization for the device. The PGS is in class III under section 513(f) of the FD&C Act, 21 U.S.C. 360c(f). Because there is no approved application for premarket approval in effect pursuant to section 515(a) of the FD&C Act, 21 U.S.C. 360e(a), or an approved application for an investigational device exemption (IDE) under section 520(g) of the FD&C Act, 21 U.S.C. 360j(g), the PGS is adulterated under section 501(f)(1)(B) of the FD&C Act, 21 U.S.C. 351(f)(1)(B).  Additionally, the PGS is misbranded under section 502(o) of the Act, 21 U.S.C. § 352(o), because notice or other information respecting the device was not provided to FDA as required by section 510(k) of the Act, 21 U.S.C. § 360(k).
    Please notify this office in writing within fifteen (15) working days from the date you receive this letter of the specific actions you have taken to address all issues noted above. Include documentation of the corrective actions you have taken. If your actions will occur over time, please include a timetable for implementation of those actions. If corrective actions cannot be completed within 15 working days, state the reason for the delay and the time within which the actions will be completed. Failure to take adequate corrective action may result in regulatory action being initiated by the Food and Drug Administration without further notice. These actions include, but are not limited to, seizure, injunction, and civil money penalties.
    We have assigned a unique document number that is cited above. The requested information should reference this document number and should be submitted to:
    James L. Woods, WO66-5688
    Deputy Director
    Patient Safety and Product Quality
    Office of In vitro Diagnostics and Radiological Health
    10903 New Hampshire Avenue
    Silver Spring, MD 20993
    If you have questions relating to this matter, please feel free to call Courtney Lias, Ph.D. at 301-796-5458, or log onto our web site at www.fda.gov for general information relating to FDA device requirements.
    Sincerely yours,
    Alberto Gutierrez
    Office of In vitro Diagnostics
    and Radiological Health
     Center for Devices and Radiological Health



    Cancer Diagnostics by Genomic Sequencing: ‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities



    Personal Genetics: An Intersection Between Science, Society, and Policy

    Saturday, February 16, 2013: 8:30 AM-11:30 AM

    Room 203 (Hynes Convention Center)

    On 26 June 2000, scientists announced the completion of a rough draft of the human genome, the result of the $3 billion publicly funded Human Genome Project. In the decade since, the cost of genome sequencing has plummeted, coinciding with the development of deep sequencing technologies and allowing, for the first time, personalized genetic medicine. The advent of personal genetics has profound implications for society that are only beginning to be discussed, even as the technologies are rapidly maturing and entering the market. This symposium will focus on how the genomic revolution may affect our society in coming years and how best to reach out to the general public on these important issues. How has the promise of genomics, as stated early in the last decade, matched the reality we observe today? What are the new promises — and pitfalls — of genomics and personal genetics as of 2013? What are the ethical implications of easy and inexpensive human genome sequencing, particularly with regard to ownership and control of genomic datasets, and what stakeholder interests must be addressed? How can the scientific community engage with the public at large to improve understanding of the science behind these powerful new technologies? The symposium will comprise three 15-minute talks from representatives of relevant sectors (academia/education, journalism, and industry), followed by a 45-minute panel discussion with the speakers.


    Peter Yang, Harvard University


    Brenna Krieger, Harvard University

    and Kevin Bonham, Harvard University


    James Thornton, Harvard University



    Ting Wu, Harvard University

    Personal Genetics and Education

    Mary Carmichael, Boston Globe

    The Media and the Personal Genetics Revolution

    Brian Naughton, 23andMe Inc.

    Commercialization of Personal Genomics: Promise and Potential Pitfalls

    Mira Irons, Children’s Hospital Boston

    Personal Genomic Medicine: How Physicians Can Adapt to a Genomic World

    Sheila Jasanoff, Harvard University

    Citizenship and Personal Genomics

    Jonathan Gitlin, National Human Genome Research Institute

    Personal Genomics and Science Policy

    THIS IS A SERIES OF FOUR POINTS OF VIEW IN SUPPORT OF the Paradigm Shift in Human Genomics

    How to Tailor Cancer Therapy to the particular Genetics of a patient’s Cancer

    ‘No’ to Sequencing Patient’s DNA, ‘No’ to Sequencing Patient’s Tumor, ‘Yes’ to focus on Gene Mutation Aberration & Analysis of Gene Abnormalities PRESENTED in the following FOUR PARTS. Recommended to be read in its entirety for completeness and arrival at the End Point of Present and Future Frontier of Research in Genomics

    Part 1:

    Research Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine


    Part 2:

    LEADERS in the Competitive Space of Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment


    Part 3:

    Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research


    Part 4:

    The Consumer Market for Personal DNA Sequencing



    How does 23andMe genotype my DNA?

    Technology and Standards

    23andMe is a DNA analysis service providing information and tools for individuals to learn about and explore their DNA. We use the Illumina OmniExpress Plus Genotyping BeadChip. In addition to the variants already included on the chip by Illumina, we’ve included our own, customized set of variants relating to conditions and traits of interest. Technical information on the performance of the chip can be found on Illumina’s website.

    All of the laboratory testing for 23andMe is done in a CLIA-certified laboratory.

    Once our lab receives your sample, DNA is extracted from cheek cells preserved in your saliva. The lab then copies the DNA many times — a process called “amplification” — growing the tiny amount extracted from your saliva until there is enough to be genotyped.

    In order to be genotyped, the amplified DNA is “cut” into smaller pieces, which are then applied to our DNA chip, a small glass slide with millions of microscopic “beads” on its surface (read more about this technology). Each bead is attached to a “probe”, a bit of DNA that matches one of the approximately one million genetic variants that we test. The cut pieces of your DNA stick to the matching DNA probes. A fluorescent signal on each probe provides information that can tell us which version of that genetic variant your DNA corresponds to.
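    The readout step described above can be sketched in code: each variant site yields two fluorescent intensities, one per allele-specific probe, and the balance between them determines the genotype call. The following is an illustrative toy, not Illumina’s actual calling algorithm; the threshold values and the function name are invented for demonstration.

    ```python
    # Toy two-channel genotype caller. Thresholds are invented; real pipelines
    # learn cluster boundaries from many samples rather than fixing them by hand.
    import math

    def call_genotype(intensity_a, intensity_b, ref="A", alt="G"):
        """Call one variant from the fluorescent intensities of its two probes."""
        total = intensity_a + intensity_b
        if total < 100:                # too dim: hybridization failed
            return "no-call"
        # theta in [0, 1] measures the balance between the two allele signals
        theta = (2 / math.pi) * math.atan2(intensity_b, intensity_a)
        if theta < 0.25:
            return ref + ref           # homozygous reference
        elif theta > 0.75:
            return alt + alt           # homozygous alternate
        else:
            return ref + alt           # heterozygous

    print(call_genotype(900, 40))      # strong A-probe signal -> "AA"
    print(call_genotype(450, 470))     # balanced signals      -> "AG"
    print(call_genotype(30, 20))       # dim                   -> "no-call"
    ```

    In production software the theta boundaries are fitted per variant by clustering thousands of samples, which is why arrays report per-site call rates and quality scores.
    
    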

    Although the human genome is estimated to contain about 10-30 million genetic variants, many of them are correlated due to their proximity to each other. Thus, one genetic variant is often representative of many nearby variants, and the approximately one million variants on our genotyping chip provide very good coverage of common variation across the entire genome.
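    The correlation between nearby variants described above is linkage disequilibrium, usually quantified with the r² statistic: when r² between two sites is near 1, genotyping one site effectively genotypes the other. A minimal sketch of the standard calculation, with made-up haplotype counts:

    ```python
    # r^2 between two biallelic sites from phased haplotype counts.
    # The counts below are invented for illustration.
    def r_squared(n_ab, n_aB, n_Ab, n_AB):
        n = n_ab + n_aB + n_Ab + n_AB
        p_a = (n_ab + n_aB) / n          # frequency of allele a at site 1
        p_b = (n_ab + n_Ab) / n          # frequency of allele b at site 2
        d = n_ab / n - p_a * p_b         # disequilibrium coefficient D
        return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

    # Two sites almost always inherited together on the same haplotype:
    print(round(r_squared(n_ab=48, n_aB=2, n_Ab=2, n_AB=48), 2))  # 0.85
    ```

    An r² of about 0.85 means the first site “tags” the second well; this is how roughly a million well-chosen sites can stand in for tens of millions of common variants.
    
    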

    Our research team has also hand-picked tens of thousands of additional genetic variants linked to various conditions and traits in the scientific literature to analyze on our genotyping chip. As a result we can provide you with personal genetic information available only through 23andMe.

    Genetics service 23andMe announced some new cash in the bank today with a $50 million raise from Yuri Milner, 23andMe CEO Anne Wojcicki, Google’s Sergey Brin (who also happens to be Wojcicki’s husband), New Enterprise Associates, MPM Capital, and Google Ventures.

    With today’s new funding also comes the reduction of the price of its genome analysis service to $99. This isn’t special holiday pricing (as 23andMe has run repeatedly in the past), the company tells me, but rather what its normal pricing will be from now on.

    This move is overdue, at least as far as 23andMe’s business model is concerned. Just yesterday TechCrunch Conference Chair Susan Hobbs told me she was waiting for another $99 pricing deal to buy the Personal Genome Analysis product. Sure, 23andMe has experimented with various pricing models, including subscription, since its founding in 2007, but had been at an official and prohibitive $299 price point until today. It’s also apparently been rigorously beta-testing various price points in the past couple of weeks, at some point experimenting with prices lower than $99.

    For comparison, the company’s original pricing began at $999 and offered subscribers just 14 health and trait reports versus today’s 244 reports, as well as genetic ancestry information. Natera, Counsyl and Pathway Genomics are also in the genomics space, but they work by offering their services through doctors rather than direct to consumer.

    Since the company’s launch five years ago, it’s had 180K civilians profile their DNA, and representative Catherine Afarian tells us that, post-price drop and funding, its goal is to reach a million customers in 2013. This is a supremely ambitious goal considering it wants to turn an average user acquisition rate of 36K per year into one of 820K in one year alone.
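    How ambitious that goal is can be checked directly from the article’s own figures:

    ```python
    # Arithmetic behind the "supremely ambitious goal" claim, using the
    # customer figures quoted in the article.
    customers_so_far = 180_000      # profiles in the first five years
    years_so_far = 5
    goal_2013 = 1_000_000

    avg_rate = customers_so_far / years_so_far      # historical acquisition rate
    needed_in_2013 = goal_2013 - customers_so_far   # new customers required

    print(f"{avg_rate:,.0f} per year on average")
    print(f"{needed_in_2013:,} needed in 2013")
    ```
    
    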

    But Afarian isn’t fazed and brings up how the company once sold out 20k in $99 account inventory on something called “DNA Day.” “Once we can offer the service at $99 it means the average American will buy in,” she said.

    That $299 was too pricey, according to Hobbs, but $99 might be just right. She said the $99 price point, which yes, is less than an iPhone, was the main factor in her decision to buy in. “23andMe is more ‘nice-to-know’ information rather than ‘need-to-know’ information. It’s nice to know your ancestry. It’s more of a need to know that you are predisposed genetically for a type of cancer, so that you may take precautionary measures,” she said, implying that the data given by 23andMe isn’t necessarily vital medical information, or actionable when it is. While 23andMe can give you indicators about certain disease risks, it doesn’t close the loop, as in tell you what to do to prevent these diseases.

    “Its [utility] depends on your genetic data,” said Afarian when I asked her about the usefulness of the product. “If you’ve got a Factor 5 that puts you at risk for clotting, you might want to invest in anti-clotting socks. [And] there’s always something about themselves that people didn’t know.”

    Hobbs eventually said that she wouldn’t buy it, but only because she was looking into more exact lineage information for her little girl, and DNA tests need a Y chromosome to show paternal lineage. Afarian countered this hesitation too, saying that what makes 23andMe unique is that it looks not only at your Y or your mitochondrial DNA, but also at your autosomal DNA, which does show some patrilineal information for females who lack that precious Y.

    While still sort of a novelty, the potential for 23andMe goes beyond lineage, and hopefully that extra $50 million will go further than keeping the price low and into research. The company hopes that a million users will result in a giant database of 23andWe genetic info that can be used to spot trends, like which genes mean a higher risk of diabetes, cancer, and so on. That would be great if it happens, but for now it remains a pipe dream for 23andMe/We.


    12/13/2012

    What Is 23andMe Really Selling: The Moral Quandary At The Center Of The Personalized Genomics Revolution

    This week, 23andme, the personalized genomics company founded by Anne Wojcicki, wife of Google co-founder Sergey Brin, got an influx of investment cash ($50 million). According to their press release, they are using the money to bring the cost of their genetic test down to $99 (it was previously $299) which, they hope, will inspire the masses to get tested.

    So should the masses indulge?

    I prefer a quantified-self approach to this question. At the heart of the quantified-self movement lies a very simple idea: metrics make us better. For devotees, this means “self-tracking,” using everything from the Nike FuelBand to the Narcissism Personality Index to gather large quantities of personal data and—the bigger idea—use that data to improve performance.

    If you consider that performance suffers when health suffers, then a genetic test can be seen as a kind of metric used to improve performance. This strikes me as the best way to evaluate this idea, and it leads us to ask the same question about personalized genomics that the quantified-self movement asks about every other metric: will it improve performance?

    Arguments rage all over the place on this one, but the short answer is that SNP tests—which are the kind of DNA scan 23andMe relies upon—don’t tell us all that much (yet). They analyze about a million variant sites out of a genome of some three billion base pairs, and the impact those million sites have on long-term health outcomes is still in dispute. For example, the nature/nurture split is normally viewed as 30/70—meaning environmental factors play a far more significant role in long-term health outcomes than genetics.

    Moreover, all of the performance metrics used by the quantified-self movement are used for behavior modification—to drive self-improvement. Personalized genomics isn’t there yet. As Stanford University’s Nobel Prize-winning RNA researcher Andy Fire once told me, “if someone off the street is looking for pointers on how to live a healthier life, there’s nothing these tests will tell you besides basic physician advice like ‘eat right, don’t smoke and get plenty of exercise.’”

    And even with more well-regarded SNP tests, like the ones that examine the BRCA1 and BRCA2 markers for breast cancer, the actionable options are limited. NYU Langone Medical Center bioethicist Arthur Caplan explains it like this, “Say you test positive for a breast cancer disposition—then what are you going to do? The only preventative step you can take is to chop off your breasts.”

    So if prevention is not available, the only thing left is fear and anxiety. Unfortunately, in the past few decades, there have been hundreds of studies linking stress to everything from immunological disorders to heart disease to periodontal troubles. So while finding out you may be at risk for Parkinson’s may make you feel informed, that knowledge isn’t going to stop you from developing the disease—but the resulting stress may contribute to a host of other complications.

    This brings up a different question: if personalized genomics can’t yet help us much and could possibly hurt us—where’s the upside?

    Turns out there’s a big upside: Citizen science. SNP tests are not yet viable because we need more info. 23andme talks about the “power of one million people,” meaning, if one million take these tests then the resulting genetic database could lead to big research breakthroughs and these could lead to all sorts of health/performance improvements.

    This is what 23andMe is really selling for $99 a pop—a crowdsourced shot at unraveling a few more DNA mysteries.

    And this also means that the question at the heart of the personalized genomics industry is not about metrics at all—it’s about morals: Should I risk my health for the greater good?


    You can browse your data for all of the variants we test using the Browse Raw Data feature, or download your data here.


    Read Full Post »

    Reporter and Curator: Dr. Sudipta Saha, Ph.D.

    With the completion of the mapping of the human genome, we now have access to all the DNA sequence information responsible for human biology. Together with microarray technology, we are ushering in a new era in reproductive medicine—the era of Reproductive Genomics.

    Whole genome microarray analysis of the testis and ovary suggests that a substantial part of the genome is expressed in reproductive tissues and many of them are likely to be important for normal reproduction. Yet adequate expression and functional information is only available for less than 10% of them. Hence, one of the important questions in reproductive studies now is ‘how do we associate function with the genes expressed in reproductive tissues?’ The establishment of mutations in animal models such as the mouse represents one powerful approach to address this question.

    Animal models have played critical roles in improving our understanding of mechanisms and pathogenesis of diseases. Mouse knockout models have often provided highly needed functional validation of genes implicated in human diseases. The rapid advance of human genetics in areas such as

    • single nucleotide polymorphisms (SNPs) and
    • haplotyping technology

    now allows the identification of disease-associated single nucleotide variation at a much faster pace. Functional examination of those candidate genes is needed to determine if those genes or variants are indeed involved in reproductive disease. Generating mutations in murine homologs of candidate genes represents a direct way to determine their roles, and mouse models will further allow the dissection of genetic pathways underlying the disease condition and provide models to test possible drug treatments. Thus, how to generate mouse models efficiently becomes a priority issue in the Genomics era of Reproductive Medicine.

    It is known that generating a mouse knockout is no small endeavor, even for a mouse research lab, often requiring specialized expertise and experience in

    • molecular biology,
    • embryonic stem (ES) biology and
    • mouse husbandry.

    Therefore, it could be intimidating for people who have little experience in mouse research. Fortunately, there are some technological developments in the mouse community that make the task of generating mouse mutations less intimidating to people unfamiliar with mouse genetics. One of these developments is the effort led by the International Gene Trap Consortium (IGTC) to generate a library of mouse mutant ES cells covering most of the genes in the mouse genome. This method saves researchers the tedious and sometimes challenging tasks of making knockout vectors and screening ES cell colonies and directly provides researchers an ES cell clone carrying the mutation of the gene of interest.

    Because gene trapping generates mutations by mechanisms different from the traditional knockout method, and because its efficacy in targeting reproductive genes, which are often expressed only in later development or in adults, has not been fully established, it is necessary to examine the benefits and limitations of this technology from the perspective of reproductive medicine, so that reproductive researchers and physicians interested in mouse models can become familiar with it.

    With this in mind, we provide an overview of the gene trapping mutagenesis method and its possible application to Reproductive Medicine. We evaluate gene trapping as a method in terms of its efficiency in comparison with traditional knockout methods and use an in-house software program to screen the IGTC database for existing cell lines with possible mutations in genes expressed in various reproductive tissues. Among over seven thousand genes highly expressed in human ovaries, almost half have existing gene trap lines.

    Additionally, of 900 human seminal fluid proteins, 43% have gene trap hits in their mouse homologs. Our analysis suggests that gene trapping is an effective mutagenesis method for identifying the genetic basis of reproductive diseases, and that mutations in many important reproductive genes are already present in the database. Given the rapid growth in the number of gene trap lines, the continuing evolution of gene trap vectors, and their easy accessibility to the scientific community, gene trapping could provide a fast and efficient way of generating mouse mutations for a particular gene of interest, or for multiple genes in a pathway at the same time. Consequently, we recommend that gene trapping be considered when planning mouse models of human reproductive disease and that the IGTC be the first stop for those interested in searching for and generating mouse mutations in genes of interest.
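    The database screen described above can be pictured with a short sketch. This is not the authors' SpiderGene program; the gene symbols, record layout, and function name below are hypothetical, illustrating only the core idea of intersecting a tissue-specific gene list with trap-line annotations keyed by gene symbol.

```python
# Illustrative sketch (not the authors' SpiderGene program): matching a list of
# reproductive genes against a local export of gene trap lines. Cell-line IDs
# and gene symbols here are invented toy data.

def trap_coverage(genes_of_interest, trap_lines):
    """Return the covered genes and the fraction with at least one trap line."""
    trapped_genes = {line["gene_symbol"] for line in trap_lines}
    covered = [g for g in genes_of_interest if g in trapped_genes]
    return covered, len(covered) / len(genes_of_interest)

# Toy data: two of three query genes have existing trap lines.
trap_lines = [
    {"cell_line": "XG001", "gene_symbol": "Zp3"},
    {"cell_line": "XG002", "gene_symbol": "Gdf9"},
]
covered, fraction = trap_coverage(["Zp3", "Gdf9", "Foxl2"], trap_lines)
print(covered)  # ['Zp3', 'Gdf9']
```

    In practice the same intersection would be run against the IGTC's exported annotations rather than an in-memory list.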

    Gene trapping is a high-throughput approach to generating mutations in murine ES cells through vectors that simultaneously disrupt and report the expression of the endogenous gene at the point of insertion. First-generation vectors trapped genes that were actively transcribed in undifferentiated ES cells. Depending on the regions into which they integrate, these vectors can be roughly divided into two classes:

    • promoter trap vectors and
    • gene trap vectors.

    Promoter trap vectors contain promoterless reporter regions, usually βgeo (a fusion of neomycin phosphotransferase and β-galactosidase), and thus must integrate into an exon of a transcriptionally active locus for the cell to be selected by neomycin resistance or LacZ staining. Gene trap vectors offer more utility through their added ability to integrate into an intron. These vectors contain a splice acceptor (SA) site positioned at the 5′-end of the reporter gene, allowing the vector to be spliced to the endogenous gene to form a fusion transcript. Later improvements added an internal ribosome entry site (IRES) between the SA site and the reporter gene sequence; as a result, the reporter gene can be translated even when it is not fused to the trapped gene. Second-generation vectors sought to trap genes that are transcriptionally silent in ES cells. Although these vectors still contain a promoterless reporter gene with a 5′ SA sequence, the antibiotic resistance gene is under the control of a constitutive promoter. Consequently, antibiotic selection is independent of the expression of the trapped gene, whereas expression of the reporter gene is still regulated by the endogenous promoter.

    A disadvantage of these vectors is that all integration events give rise to resistant ES cells regardless of whether the vector has integrated into a gene locus. To increase trapping efficiency, a new class of polyA gene trap vectors was developed in which the polyadenylation signal of the neo gene was replaced by a splice donor sequence, thereby requiring the vector to trap an endogenous polyA signal for expression of neo. These vectors were recently shown to have a bias toward insertion near the 3′-end of a gene, due to nonsense-mediated mRNA decay of the fusion transcript. An improved polyA trap vector, UPATrap, was developed to overcome this bias using an IRES sequence placed downstream of a marker containing a termination codon. Gene trap vectors are usually introduced by retroviral infection or electroporation of plasmid DNA, with each approach having its own advantages and disadvantages.

    While relatively difficult to manipulate, retroviral gene traps display a preference for insertion at the 5′-end of genes, which is advantageous for generating null alleles. Moreover, the multiplicity of infection with retroviruses can be tightly controlled to achieve a single trap event or the simultaneous disruption of many genes. However, integration may be biased toward certain ‘hotspots’ in the genome.

    In contrast, plasmid-based gene trap vectors integrate more randomly into the genome. Such integration can, however, potentially result in a functional partial protein and a hypomorphic phenotype. Additionally, plasmid vectors usually result in multiple integrations in 20–50% of cell lines. The most common approach for identifying the gene trap integration site is to use 5′ or 3′ rapid amplification of cDNA ends (RACE) to amplify the fusion transcript. The sequence provides a DNA tag for identification of the disrupted gene and can be used for genotypic screens. Mutagenesis screens can also be performed on the basis of gene function or expression, and expression data combined with sequence tag information can elucidate novel expression patterns of known genes or suggest gene function.

    Gene trapping has proven to be an efficacious mutagenesis technique compared with other methods such as

    • spontaneous mutations,
    • fortuitous transgene integration and
    • N-ethyl-N-nitrosourea (ENU) mutagenesis.

    We have been able to use our SpiderGene program to identify genes in reproductive tissues that are present in the IGTC database, and moreover to narrow down those with expression restricted to the testis and ovary. Gene trapping possesses enormous potential for researchers in the reproductive field seeking to create mouse models of a gene mutation. The improving versatility of gene trap vectors has enabled groups to trap an increasing number of genes in various organisms, including Arabidopsis, zebrafish, and Drosophila.

    The gene trap effort has perhaps been most extensive in the mouse, with over 57,000 cell lines representing more than 40% of known genes. These large-scale screens will likely achieve trapping of the entire mouse genome in the coming years, but the power of gene trapping will only be fully demonstrated by its usefulness in investigator-driven, focused functional analyses.

    In our laboratory, future work will focus on generating knockout mice in order to investigate gene function and to identify gene products that might have therapeutic value in reproduction. As screening efforts continue, gene trapping will remain a valuable tool in mouse genomics and will undoubtedly yield new discoveries in reproductive physiology and pathology.


    Read Full Post »

    Curator: Aviva Lev-Ari, PhD, RN

    Population Genetics

    HAPAA: a tool for ancestral haploblock reconstruction. Given the genotype of an individual of admixed ancestry (for instance, as derived from an Illumina genotyping array), HAPAA finds the source population for each segment of the individual’s genome.
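    HAPAA’s actual model is more sophisticated, but the segment-labeling task can be illustrated with a toy hidden Markov model: each hidden state is a source population with its own allele frequencies, a small switching probability favors long same-ancestry segments, and Viterbi decoding recovers the most likely state at each SNP. The two-population setup and all numbers below are invented for illustration.

```python
import math

# Toy sketch in the spirit of HAPAA (not its actual model): a two-state HMM
# over SNPs, decoded with Viterbi. freqs[s][i] = P(allele 1 | population s).

def viterbi(obs, freqs, switch=0.01):
    """obs: 0/1 alleles; returns the most likely population label per SNP."""
    states = range(len(freqs))

    def emit(s, i):  # log emission probability, guarded against log(0)
        p = freqs[s][i] if obs[i] == 1 else 1 - freqs[s][i]
        return math.log(max(p, 1e-12))

    stay, move = math.log(1 - switch), math.log(switch)
    V = [[emit(s, 0) - math.log(len(freqs)) for s in states]]
    back = []
    for i in range(1, len(obs)):
        row, ptr = [], []
        for s in states:
            scores = [V[-1][t] + (stay if t == s else move) for t in states]
            t_best = max(states, key=lambda t: scores[t])
            row.append(scores[t_best] + emit(s, i))
            ptr.append(t_best)
        V.append(row)
        back.append(ptr)
    path = [max(states, key=lambda s: V[-1][s])]
    for ptr in reversed(back):  # follow back-pointers to recover the path
        path.append(ptr[path[-1]])
    return path[::-1]

# Alleles typical of population 0 at the first three SNPs, then population 1.
freqs = [[0.9, 0.9, 0.9, 0.1, 0.1, 0.1],   # population 0
         [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]]   # population 1
print(viterbi([1, 1, 1, 1, 1, 1], freqs))  # [0, 0, 0, 1, 1, 1]
```

    The small `switch` probability is what makes the decoder prefer a few long ancestry blocks over SNP-by-SNP flipping.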

    Protein Interaction Networks

    Graemlin: a tool for aligning multiple global protein interaction networks. Graemlin also supports searching for homology between a query module of proteins and a database of interaction networks.

    Machine Learning

    CONTRA: conditionally trained models for sequence analysis. See CONTRAlign, a protein sequence aligner with very high accuracy, especially on twilight alignments, and CONTRAfold, an RNA secondary structure prediction tool. Stay tuned for more…

    RNA Structure Prediction

    CONTRAfold: Prediction of RNA secondary structure with a Conditional Log-Linear model that relies on automatically trained parameters, rather than on a physics-based energy model of RNA folding.
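    CONTRAfold’s trained log-linear scores are beyond a short example, but its dynamic-programming skeleton is shared with the classic Nussinov algorithm, which simply maximizes the number of nested base pairs. The sketch below shows that baseline, not CONTRAfold itself.

```python
# Nussinov-style base-pair maximization: a classic baseline for RNA secondary
# structure prediction. CONTRAfold replaces this simple pair count with
# automatically trained log-linear scores over the same kind of recursion.

def max_pairs(seq, min_loop=3):
    """Maximum number of nested Watson-Crick/GU pairs with hairpin loops >= min_loop."""
    if len(seq) == 0:
        return 0
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                 # case 1: j unpaired
            for k in range(i, j - min_loop):    # case 2: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

# A hairpin: three G-C stem pairs around an AAAU loop.
print(max_pairs("GGGAAAUCCC"))  # 3
```

    The "j pairs with k" case splits the interval at k, so branching (multiloop) structures are covered implicitly by the recursion.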

    Protein Alignment

    CONTRAlign: A protein sequence aligner that users can optionally train on feature sets such as secondary structure and solvent accessibility; see the CONTRA project above.
    A protein multiple sequence aligner that exhibits high accuracy on popular benchmarks.
    A protein multiple aligner that automatically finds domain structures of sequences with shuffled and repeated domain architectures.

    Motif Finding

    MotifCut: a non-parametric, graph-based motif finding algorithm.
    MotifScan: a non-parametric method for representing motifs and scanning DNA sequences for known motifs.
    CompareProspector: motif finding with Gibbs sampling and alignment.

    Genomic Alignment

    Stanford ENCODE: Multiple Alignments of 1% of the Human genome.
    Typhon: BLAST-like sequence search to a multiple alignments database.
    LAGAN: tools for genomic alignment. These include the MLAGAN multiple alignment tool, and Shuffle-LAGAN for alignment with rearrangements.

    Microarray Analysis

    Application of Independent Component Analysis (ICA) to microarrays.

    Researchers Hope New Database Becomes Universal Cancer Genomics Tool

    Swiss scientists hope that a new online database called “arrayMap” will bring cancer genomics to the desktop, laptop, and tablet computers of pathologists and researchers everywhere.

    The database combines genomic information from three sources: large repositories such as the NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA); journal literature; and submissions from individual investigators. It incorporates more than 42,000 genomic copy number arrays—comparisons of normal and abnormal DNA—from 195 cancer types.

    “arrayMap includes a wider range of human cancer copy number samples than any single repository,” said principal investigator Michael Baudis, M.D. Ease of access, visualization, and data manipulation, he added, are top priorities in its ongoing development.

    A product of the University of Zurich Institute for Molecular Life Sciences, where Baudis researches bioinformatics and oncogenomics, arrayMap illustrates the importance of copy number abnormalities (CNA)—dysfunctional DNA gains or losses that visibly lengthen or shorten certain chromosomes—in the diagnosis, staging, and treatment of various malignancies.

    “I have this particular tumor type—are there any CNAs in it that can tell me anything about prognosis or treatment?” said Michael Rossi, Ph.D., director of the Winship Cancer Institute cancer genomics program at the Emory University School of Medicine in Atlanta. “Data mining tools like arrayMap are incredibly useful to help answer such questions.”

    arrayMap – genomic arrays for copy number profiling in human cancer

    arrayMap is a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides an entry point for meta-analysis and systems level data integration of high-resolution oncogenomic CNA data. The current data reflects:

    • 42,875 genomic copy number arrays
    • 634 experimental series
    • 256 array platforms
    • 197 ICD-O cancer entities
    • 480 publications (PubMed entries)

    For the majority of samples, probe-level visualization as well as customized data representation facilitate gene-level and genome-wide data review. Results from multi-case selections can be connected to downstream data analysis and visualization tools, as provided through our Progenetix project.
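    As a rough illustration of what a copy number array measures, the sketch below computes probe-level log2 tumor/normal intensity ratios and classifies a segment by its mean. Real arrayMap/Progenetix pipelines rely on proper segmentation algorithms (e.g., circular binary segmentation); the thresholds, intensities, and function names here are invented for illustration.

```python
import math

# Minimal sketch of copy number calling from array intensities (illustrative
# only; production CNA pipelines use statistical segmentation, e.g. CBS).

def log2_ratios(tumor, normal):
    """Per-probe log2 ratio of tumor to matched-normal intensity."""
    return [math.log2(t / n) for t, n in zip(tumor, normal)]

def call_segment(ratios, gain=0.2, loss=-0.2):
    """Classify a segment as gain/loss/neutral by its mean probe log2 ratio."""
    mean = sum(ratios) / len(ratios)
    if mean >= gain:
        return "gain"
    if mean <= loss:
        return "loss"
    return "neutral"

# Toy segment: tumor shows ~1.5x intensity over matched normal; a one-copy
# gain in a diploid background would give log2(3/2) ~ 0.58.
ratios = log2_ratios([620, 640, 590, 610], [400, 410, 395, 405])
print(call_segment(ratios))  # gain
```

    The gain/loss thresholds are deliberately loose here; real analyses calibrate them against platform noise and tumor purity.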

    arrayMap is developed by the group “Theoretical Cytogenetics and Oncogenomics” at the Institute of Molecular Life Sciences of the University of Zurich.

    These tools were developed for our research projects. You are welcome to try them out, but there is only sparse documentation. If more support and/or custom analysis is needed, please contact Michael Baudis regarding a collaborative project.

    MIT: A New Approach Uses Compression to Speed Up Genome Analysis

    Public-Domain Computing Resources

    Structural Bioinformatics

    The BetaWrap program detects the right-handed parallel beta-helix super-secondary structural motif in primary amino acid sequences by using beta-strand interactions learned from non-beta-helix structures.
    Wrap-and-pack detects beta-trefoils in protein sequences by using both pairwise beta-strand interactions and 3-D energetic packing information.
    The BetaWrapPro program predicts right-handed beta-helices and beta-trefoils by using both sequence profiles and pairwise beta-strand interactions, and returns coordinates for the structure.
    The MSARi program identifies conserved RNA secondary structure in non-coding RNA genes and mRNAs by searching multiple sequence alignments of a large set of candidate sequences for correlated arrangements of reverse-complementary regions.
    The Paircoil2 program predicts coiled-coil domains in protein sequences by using pairwise residue correlations obtained from a coiled-coil database. The original Paircoil program is still available for use.
    The MultiCoil program predicts the location of coiled-coil regions in amino acid sequences and classifies the predictions as dimeric or trimeric. An updated version, Multicoil2, will soon be available.
    The LearnCoil Histidase Kinase program uses an iterative learning algorithm to detect possible coiled-coil domains in histidase kinase receptors.
    The LearnCoil-VMF program uses an iterative learning algorithm to detect coiled-coil-like regions in viral membrane-fusion proteins.
    The Trilogy program discovers novel sequence-structure patterns in proteins by exhaustively searching through three-residue motifs using both sequence and structure information.
    The ChainTweak program efficiently samples from the neighborhood of a given base configuration by iteratively modifying a conformation using a dihedral angle representation.
    The TreePack program uses a tree-decomposition based algorithm to solve the side-chain packing problem more efficiently. This algorithm is more efficient than SCWRL 3.0 while maintaining the same level of accuracy.
    PartiFold: Ensemble prediction of transmembrane protein structures. Using statistical mechanics principles, partiFold computes residue contact probabilities and sample super-secondary structures from sequence only.
    tFolder: Prediction of beta sheet folding pathways. Predict a coarse grained representation of the folding pathway of beta sheet proteins in a couple of minutes.
    RNAmutants: Algorithms for exploring the RNA mutational landscape. Predict the effect of mutations on structures and, reciprocally, the influence of structures on mutations. A tool for molecular evolution studies and RNA design.
    AmyloidMutants is a statistical mechanics approach for de novo prediction and analysis of wild-type and mutant amyloid structures. Based on the premise of protein mutational landscapes, AmyloidMutants energetically quantifies the effects of sequence mutation on fibril conformation and stability.


    GLASS aligns large orthologous genomic regions using an iterative global alignment system. Rosetta identifies genes based on conservation of exonic features in sequences aligned by GLASS.
    RNAiCut – Automated Detection of Significant Genes from Functional Genomic Screens.
    MinoTar – Predict microRNA Targets in Coding Sequence.

    Systems Biology

    The Struct2Net program predicts protein-protein interactions (PPIs) by integrating structure-based information with other functional annotations (e.g., GO terms, co-expression, and co-localization). The structure-based interaction prediction is conducted using the protein threading server RAPTOR combined with logistic regression.
    IsoRank is an algorithm for global alignment of multiple protein-protein interaction (PPI) networks. The intuition is that a protein in one PPI network is a good match for a protein in another network if the former’s neighbors are good matches for the latter’s neighbors.
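    That intuition can be sketched as a fixed-point iteration in the spirit of IsoRank: each node pair’s similarity is repeatedly recomputed from its neighbors’ similarities, mixed with a uniform prior. This is a simplified illustration; the toy networks, alpha value, and normalization below are invented, not the published algorithm’s exact formulation.

```python
# Sketch of the IsoRank intuition on two toy PPI networks (adjacency lists):
# R[i][j] is high when i's neighbors match j's neighbors. Simplified and
# illustrative; not the published algorithm's exact formulation.

def isorank(adj1, adj2, alpha=0.8, iters=50):
    n1, n2 = len(adj1), len(adj2)
    deg1 = [max(1, len(a)) for a in adj1]
    deg2 = [max(1, len(b)) for b in adj2]
    prior = 1.0 / (n1 * n2)
    R = [[prior] * n2 for _ in range(n1)]
    for _ in range(iters):
        new = [[0.0] * n2 for _ in range(n1)]
        for i in range(n1):
            for j in range(n2):
                # Neighbor-support term, degree-normalized.
                s = sum(R[u][v] / (deg1[u] * deg2[v])
                        for u in adj1[i] for v in adj2[j])
                new[i][j] = alpha * s + (1 - alpha) * prior
        total = sum(map(sum, new))              # renormalize each sweep
        R = [[x / total for x in row] for row in new]
    return R

# Two small path networks 0-1-2; the hub (node 1) in each should match best.
adj1 = [[1], [0, 2], [1]]
adj2 = [[1], [0, 2], [1]]
R = isorank(adj1, adj2)
best = max(((i, j) for i in range(3) for j in range(3)),
           key=lambda p: R[p[0]][p[1]])
print(best)  # (1, 1)
```

    The uniform prior stands in for the sequence-similarity term that IsoRank mixes with topological support.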


    t-sample is an online algorithm for time-series experiments that allows an experimenter to determine which biological samples should be hybridized to arrays to recover expression profiles within a given error bound.


    Compressive genomics


    Nature Biotechnology 30, 627–630 (2012) doi:10.1038/nbt.2241

    Published online 10 July 2012


    BMIR is committed to the development of research tools as part of its goal to provide reusable, computational building blocks to facilitate the development of a vast array of systems. Some of these resources are described below.


    The National Center for Biomedical Ontology (NCBO)


    The National Center for Biomedical Ontology is a consortium of leading biologists, clinicians, informaticians, and ontologists who develop innovative technology and methods that allow scientists to create, disseminate, and manage biomedical information and knowledge in machine-processable form.




    Protégé is a free, open-source platform that provides its community of more than 80,000 users with a suite of tools to construct domain models and knowledge-based applications with ontologies.




    PharmGKB curates information that establishes knowledge about the relationships among drugs, diseases and genes, including their variations and gene products. Our mission is to catalyze pharmacogenomics research.




    About Simbios

    Simbios, the National NIH Center for Physics-based Simulation of Biological Structures is devoted to helping biomedical researchers understand biological form and function. It provides infrastructure, software, and training to assist users as they create novel drugs, synthetic tissues, medical devices, and surgical interventions.

    Simbios scientists investigate structure-function studies on a wide scale of biology – from molecules to organisms, and are currently focusing on challenging biological problems in RNA folding, myosin dynamics, neuromuscular biomechanics and cardiovascular dynamics.


    Stanford BioMedical Informatics Research (BMIR) – Publications by Project

    There are 8 publications for the project “Genomic Nosology for Medicine (GNOMED)”.

    Identifying compartment-specific non-HLA targets after renal transplantation by integrating transcriptome and ‘‘antibodyome’’ measures
    L. Li, P. Wadia, M. Sarwal, N. Kambham, T. Sigdel, D. B. Miklos, R. Chen, M. Naesens, A. J. Butte
    PNAS, 106, 11, 4148-4153. Published in 2009
    Using SNOMED-CT For Translational Genomics Data Integration
    J. Dudley, D. P. Chen, A. J. Butte
    Ronald Cornet, Kent Spackman (eds.): Representing and sharing knowledge using SNOMED. Proceedings of the 3rd International Conference on Knowledge Rep, Phoenix (AZ), USA, CEUR Workshop Proceedings, ISSN 1613-0073, online CEUR-WS.org/Vol-410/, 91-96. Published in 2008
    The Ultimate Model Organism
    A. J. Butte
    Science, 320, 5874, 325-327. Published in 2008
    Novel Integration of Hospital Electronic Medical Records and Gene Expression Measurements to Identify Genetic Markers of Maturation
    D. P. Chen, S. C. Weber, P. S. Constantinou, T. A. Ferris, H. J. Lowe, A. J. Butte
    Pacific Symposium on Biocomputing, Big Island, Hawaii, 13, 243-254. Published in 2008
    Enabling Integrative Genomic Analysis of High-Impact Human Diseases through Text Mining
    J. Dudley, A. J. Butte
    Pacific Symposium on Biocomputing, Big Island, Hawaii, 13, 580-591. Published in 2008
    Methodologies for Extracting Functional Pharmacogenomic Experiments from International Repository
    Y. Lin, A. P. Chiang, P. Yao, R. Chen, A. J. Butte, R. S. Lin
    AMIA Annual Symposium, Chicago, IL, 463-467. Published in 2007
    Clinical Arrays of Laboratory Measures, or “Clinarrays”, Built from an Electronic Health Record Enable Disease Subtyping by Severity
    D. P. Chen, S. C. Weber, P. S. Constantinou, T. A. Ferris, H. J. Lowe, A. J. Butte
    AMIA Annual Symposium, Chicago, IL, 115-119. Published in 2007
    Finding Disease-Related Genomic Experiments Within an International Repository: First Steps in Translational Bioinformatics
    A. J. Butte, R. Chen
    Annual Symposium of the American Medical Informatics Association, Washington, D.C., 106-10. Published in 2006

    Featured Publications

    The National Center for Biomedical Ontology
    M. A. Musen, N. F. Noy, C. G. Chute, M. A. Storey, B. Smith, N. H. Shah
    . Published in 2011
    Prototyping a Biomedical Ontology Recommender Service
    C. Jonquet, N. H. Shah, M. A. Musen
    Bio-Ontologies: Knowledge in Biology, SIG, ISMB ECCB 2009, Stockholm, Sweden. Published in 2009
    Translational bioinformatics applications in genome medicine
    A. J. Butte
    Genome Medicine, 1, 6, 64. Published in 2009
    Identifying compartment-specific non-HLA targets after renal transplantation by integrating transcriptome and ‘‘antibodyome’’ measures
    L. Li, P. Wadia, M. Sarwal, N. Kambham, T. Sigdel, D. B. Miklos, R. Chen, M. Naesens, A. J. Butte
    PNAS, 106, 11, 4148-4153. Published in 2009
    Technology for Building Intelligent Systems: From Psychology to Engineering
    M. A. Musen
    Modeling Complex Systems, Bill Shuart, Will Spaulding and Jeffrey Poland, U Nebraska P, Lincoln, Nebraska, Vol 52 of the Nebraska Symposium on Motivation, 145-184. Published in 2009
    Software-Engineering Challenges of Building and Deploying Reusable Problem Solvers
    M. J. O’Connor, C. I. Nyulas, A. Okhmatovskaia, D. Buckeridge, S. W. Tu, M. A. Musen
    Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 24, 3. Published in 2009
    Data-Driven Methods to Discover Molecular Determinants of Serious Adverse Drug Events
    A. P. Chiang, A. J. Butte
    Clinical Pharmacology and Therapeutics, 28 January 2009, Advance online publication, doi:10.1038/clpt.2008.274. Published in 2009
    Knowledge-Data Integration for Temporal Reasoning in a Clinical Trial System
    M. J. O’Connor, R. D. Shankar, D. B. Parrish, A. K. Das
    International Journal of Medical Informatics, 78, Suppl. 1, S77-S85. Published in 2009
    GeneChaser: Identifying all biological and clinical conditions in which genes of interest are differentially expressed
    R. Chen, R. Mallelwar, A. Thosar, S. Venkatasubrahmanyam, A. J. Butte
    BMC Bioinformatics, 9, 1, 548. (doi:10.1186/1471-2105-9-548). Published in 2008
    FitSNPs: highly differentially expressed genes are more likely to have variants associated with disease
    R. Chen, A. A. Morgan, J. Dudley, A. M. Deshpande, L. Li, K. Kodama, A. P. Chiang, A. J. Butte
    Genome Biology, 9, 12, R170 (doi:10.1186/gb-2008-9-12-r170). Published in 2008
    Translational Bioinformatics: Coming of Age
    A. J. Butte
    Journal of the American Medical Informatics Association, JAMIA, 15, 6, 709-14. Published in 2008
    An Ontology-Driven Framework for Deploying JADE Agent Systems
    C. I. Nyulas, M. J. O’Connor, S. W. Tu, A. Okhmatovskaia, D. Buckeridge, M. A. Musen
    IEEE/WIC/ACM International Conference on Intelligent Agent Technology, Sydney, Australia, 2, 573-577. Published in 2008
    Understanding Detection Performance in Public Health Surveillance: Modeling Aberrancy-Detection Algorithms
    D. Buckeridge, A. Okhmatovskaia, S. W. Tu, C. I. Nyulas, M. J. O’Connor, M. A. Musen
    Journal of the American Medical Informatics Association, 15, 6, 760-769. Published in 2008
    Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer’s Disease
    K. S. Supekar, V. Menon, M. A. Musen, D. L. Rubin, M. Greicius
    Public Library of Science-Computational Biology., PLoS Computational Biology, June 2008. Published in 2008
    Medical Imaging on the Semantic Web: Annotation and Image Markup
    D. L. Rubin, P. Mongkolwat, V. Kleper, K. S. Supekar, D. S. Channin
    AAAI Spring Symposium Series, Semantic Scientific Knowledge Integration, Stanford. Published in 2008
    The Ultimate Model Organism
    A. J. Butte
    Science, 320, 5874, 325-327. Published in 2008
    BioPortal: A Web Portal to Biomedical Ontologies
    D. L. Rubin, D. de Abreu Moreira, P. P. Kanjamala, M. A. Musen
    AAAI Spring Symposium Series, Symbiotic Relationships between Semantic Web and Knowledge Engineering, Stanford University, (in press). Published in 2008
    AILUN: reannotating gene expression data automatically
    R. Chen, L. Li, A. J. Butte
    Nature Methods, 4, 11, 879. Published in 2007
    Evaluation and Integration of 49 Genome-wide Experiments and the Prediction of Previously Unknown Obesity-related Genes
    S. B. English, A. J. Butte
    Bioinformatics, Epub. Published in 2007
    Protege: A Tool for Managing and Using Terminology in Radiology Applications
    D. L. Rubin, N. F. Noy, M. A. Musen
    Journal of Digital Imaging, J Digit Imaging. Published in 2007
    Efficiently Querying Relational Databases using OWL and SWRL
    M. J. O’Connor, R. D. Shankar, S. W. Tu, C. I. Nyulas, A. K. Das, M. A. Musen
    The First International Conference on Web Reasoning and Rule Systems, Innsbruck, Austria, Springer, LNCS 4524, 361-363. Published in 2007
    Creation and implications of a phenome-genome network
    A. J. Butte, I. S. Kohane
    Nature Biotechnology, 24, 1, 55 – 62. Published in 2006


    National Center for Simulation of Biological Structures (SimBioS) at Stanford University

    National Center for the Multiscale Analysis of Genomic and Cellular Networks (MAGNet) at Columbia University

    National Alliance for Medical Image Computing (NA-MIC) at Brigham and Women’s Hospital, Boston, MA

    Integrating Biology and the Bedside (I2B2) at Brigham and Women’s Hospital, Boston, MA

    National Center for Biomedical Ontology (NCBO) at Stanford University

    Integrate Data for Analysis, Anonymization, and Sharing (IDASH) at the University of California, San Diego



    Read Full Post »