
Predict Toxicity from Structure

Larry H. Bernstein, MD, FCAP, Curator





Toxicity Prediction Directly from Chemical Structure

  • Predicts key toxicity parameters (Ames mutagenicity, rat acute dose LD50 following iv or po administration) and aqueous solubility directly from structure.
  • No in vitro physicochemical or toxicity data required.
  • Saves money and time by allowing toxicity to be assessed virtually (no synthesis required).
  • Provides an early-stage filter for directing chemistry and prioritizing screening.

A virtual screening tool for predicting toxicity from chemical structure alone.


chemTox Input Requirements
  • Chemical structure, e.g. SMILES, mol or sdf

chemTox Data Delivery
  • Predicted toxicity measures: Ames mutagenicity, rat acute dose LD50 following iv or po administration
  • Predicted aqueous solubility


chemTox is implemented as a node for the KNIME analytics platform which executes a model of the workflow illustrated in Figure 1 below.

Figure 1
chemTox workflow.
KNIME is free to download. Cyprotex then provides the bespoke toxicity modules (chemTox) for the KNIME platform.
The following properties are reported:

  1. Ames mutagenicity classification (mutagenic/non-mutagenic).
  2. Ames mutagenicity probability (probability of being a mutagen).
  3. Rat LD50 (mmol/kg) following acute administration by intravenous route.
  4. Rat LD50 (mmol/kg) following acute administration by oral route.
  5. Aqueous solubility (mol/l).
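
The LD50 values are reported in molar units (mmol/kg), whereas the classification cut-offs quoted for Figure 2 are in mg/kg. The conversion only needs the molecular weight. The following is a minimal sketch of it; the function names and the example compound values are hypothetical illustrations and are not part of chemTox or KNIME.

```python
# Hypothetical helpers for illustration; not part of chemTox or KNIME.

# Oral LD50 cut-offs in mg/kg, taken from the three Figure 2 panels.
ORAL_LD50_CUTOFFS_MG_PER_KG = (5.0, 50.0, 300.0)


def ld50_mmol_to_mg_per_kg(ld50_mmol_per_kg, mol_weight_g_per_mol):
    """Convert an LD50 from mmol/kg to mg/kg.

    One mmol of a compound weighs `mol_weight_g_per_mol` milligrams,
    so the conversion is a single multiplication.
    """
    if ld50_mmol_per_kg < 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("LD50 must be non-negative and molecular weight positive")
    return ld50_mmol_per_kg * mol_weight_g_per_mol


def cutoffs_exceeded(ld50_mg_per_kg):
    """Return the cut-offs (mg/kg) that the given LD50 falls below."""
    return [c for c in ORAL_LD50_CUTOFFS_MG_PER_KG if ld50_mg_per_kg < c]


# Made-up example: predicted oral LD50 of 0.2 mmol/kg, molecular weight 180 g/mol.
example_mg_per_kg = ld50_mmol_to_mg_per_kg(0.2, 180.0)
example_flags = cutoffs_exceeded(example_mg_per_kg)
print(example_mg_per_kg, example_flags)
```

A lower LD50 means higher acute toxicity, so a compound that falls below more cut-offs is the more hazardous one.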

Output is a KNIME data table facilitating saving to a file or database, or using the predictions as inputs to subsequent workflow steps.

Model Development
  • Models are quantitative structure-property relationships (QSPR) that calculate the toxicity properties of interest from a compound’s structural descriptors.
  • Models have been trained using large, well-validated datasets.
  • Training was performed using up-to-date, rigorous statistical pattern recognition methods.
  • Repeated 10-fold cross-validation was used to generate robust statistics for prediction performance.
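
Repeated 10-fold cross-validation, as named in the last bullet, can be sketched in a few lines. This is a generic illustration only: the source does not describe how Cyprotex implemented it, and the toy "model" (predicting the majority class), the labels, and all function names here are assumptions made for the example.

```python
import random
from statistics import mean, stdev


def repeated_kfold_indices(n_samples, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        # Fold i takes every k-th item of the shuffled order,
        # so fold sizes differ by at most one.
        folds = [order[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, f, train_idx, test_idx


def majority_class_accuracy(labels, train_idx, test_idx):
    """Toy stand-in for model fitting: predict the most common training label."""
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


# Made-up labels (1 = mutagenic, 0 = non-mutagenic), purely for illustration.
labels = [1] * 60 + [0] * 40
scores = [
    majority_class_accuracy(labels, train, test)
    for _, _, train, test in repeated_kfold_indices(len(labels), k=10, repeats=3)
]
print(f"mean accuracy {mean(scores):.3f} +/- {stdev(scores):.3f} over {len(scores)} folds")
```

Each compound is held out exactly once per repeat, and repeating the partition with a different shuffle reduces the dependence of the reported statistics on any single split.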

Data from chemTox

Figure 2
Receiver operating characteristic (ROC) plots for classification of rat acute LD50 following oral administration: (a) LD50 < 5 mg/kg; (b) LD50 < 50 mg/kg; (c) LD50 < 300 mg/kg.

Table 1
Statistics for predicted Ames mutagenicity based on 4336 compounds (1935 non-mutagenic and 2401 mutagenic).

  Area under the ROC curve    0.905
  Sensitivity                 0.85
  Specificity                 0.82
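
For readers unfamiliar with the statistics in Table 1, the sketch below shows how sensitivity, specificity, and the area under the ROC curve are computed from true labels and predicted probabilities. The labels and probabilities are made up for illustration and the function names are hypothetical; nothing here reproduces the chemTox data or code.

```python
def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity when a compound is called positive if prob >= threshold."""
    tp = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p >= threshold)
    fn = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p < threshold)
    tn = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p < threshold)
    fp = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= threshold)
    # Sensitivity: fraction of true mutagens detected.
    # Specificity: fraction of non-mutagens correctly cleared.
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: 1 = mutagenic, 0 = non-mutagenic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen probability threshold, whereas the AUC summarizes ranking quality across all thresholds.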

Model variants can be generated with a different balance of sensitivity and specificity to suit different screening requirements.
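
The source does not say how such variants are produced. One common way to trade sensitivity against specificity in a probabilistic classifier is to move the decision threshold, and the sketch below illustrates that idea only. The data and function names are made up for the example, and this should not be read as a description of the chemTox method.

```python
def rates_at(y_true, y_prob, threshold):
    """Sensitivity and specificity at a given probability threshold."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    sens = sum(p >= threshold for p in pos) / len(pos)
    spec = sum(p < threshold for p in neg) / len(neg)
    return sens, spec


def threshold_for_sensitivity(y_true, y_prob, min_sensitivity):
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity is at least `min_sensitivity`.

    Sensitivity can only fall as the threshold rises, so the highest
    qualifying threshold also gives the best specificity available.
    """
    for thr in sorted(set(y_prob), reverse=True):
        sens, spec = rates_at(y_true, y_prob, thr)
        if sens >= min_sensitivity:
            return thr, sens, spec
    raise ValueError("no threshold reaches the requested sensitivity")


# Made-up data: 1 = mutagenic, 0 = non-mutagenic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.2, 0.1]

# A cautious early screen that must catch every mutagen...
cautious = threshold_for_sensitivity(y_true, y_prob, min_sensitivity=1.0)
# ...versus a screen that tolerates misses to avoid false alarms.
permissive = threshold_for_sensitivity(y_true, y_prob, min_sensitivity=0.5)
print(cautious, permissive)
```

In this toy data, demanding full sensitivity lowers the threshold and costs specificity, while relaxing the sensitivity requirement does the opposite.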


SJ Williams

Tempting, but other databases such as ToxNet have used SAR (structure-activity relationships) for Ames prediction, and that is usually as far as it can go. This is usually a first screen, but long-term carcinogenicity studies are still required for obvious reasons. In addition, no system can predict IDILI (idiosyncratic drug-induced liver injury), which has plagued and still plagues the drug development industry. This is where relationships need to be worked on, and the problem will need a more in-depth ‘omic strategy, as there are no SARs that correlate with these “idiosyncratic” toxicities. Be that as it may, some chemicals have failed the Ames test yet were not carcinogenic in subchronic or chronic tests, so it is still good to conduct those studies anyway, even though most regulatory bodies give a failed Ames test a thumbs down.
