5/26/2023
Evaluation of ChatGPT for information for cancer patients: answering questions regarding misconceptions
Reporter: Stephen J. Williams, Ph.D.
The National Cancer Institute (NCI) maintains an excellent and comprehensive website for patient information, complete with common questions and answers that patients may have regarding their cancer type and therapy. Its page “Common Cancer Myths and Misconceptions” is meant to alleviate a cancer patient’s fears by providing correct information that answers any misconceptions or misinformation a patient might have concerning their treatment and prognosis. The site is well visited but, in many instances, patients are overwhelmed by the load of information, so a chatbot may ease a patient’s trepidation in interacting with the site and extracting the correct information they need. A paper in JNCI Cancer Spectrum, “Using ChatGPT to evaluate cancer myths and misconceptions: artificial intelligence and cancer information,” posed the questions from this page to ChatGPT and evaluated the output for accuracy. The answers supplied by ChatGPT were judged accurate by a team of cancer experts. ChatGPT may therefore provide an excellent chatbot experience for new cancer patients and help alleviate fears and misconceptions concerning their tumor type.
Paper citation: Johnson SB, King AJ, Warner EL, Aneja S, Kann BH, Bylund CL. Using ChatGPT to evaluate cancer myths and misconceptions: artificial intelligence and cancer information. JNCI Cancer Spectr. 2023 Mar 1;7(2):pkad015. doi: 10.1093/jncics/pkad015. PMID: 36929393; PMCID: PMC10020140.
Abstract
Data about the quality of cancer information that chatbots and other artificial intelligence systems provide are limited. Here, we evaluate the accuracy of cancer information on ChatGPT compared with the National Cancer Institute’s (NCI’s) answers by using the questions on the “Common Cancer Myths and Misconceptions” web page. The NCI’s answers and ChatGPT answers to each question were blinded, and then evaluated for accuracy (accurate: yes vs no). Ratings were evaluated independently for each question, and then compared between the blinded NCI and ChatGPT answers. Additionally, word count and Flesch-Kincaid readability grade level for each individual response were evaluated. Following expert review, the percentage of overall agreement for accuracy was 100% for NCI answers and 96.9% for ChatGPT outputs for questions 1 through 13 (ĸ = ‒0.03, standard error = 0.08). There were few noticeable differences in the number of words or the readability of the answers from NCI or ChatGPT. Overall, the results suggest that ChatGPT provides accurate information about common cancer myths and misconceptions.
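For context on the readability metrics mentioned above (word count and Flesch-Kincaid grade level), the sketch below shows one common way to compute them in Python. This is not the authors’ code, and the syllable counter is only a rough heuristic; dedicated readability tools may give slightly different grades.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, with a small correction for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def words_and_fk_grade(text: str) -> tuple[int, float]:
    """Return (word count, Flesch-Kincaid grade level) for a block of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    grade = 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59
    return len(words), round(grade, 1)

# Example with a short, made-up answer (not one of the study's responses)
answer = ("Cancer is not necessarily a death sentence. The prognosis depends on the type "
          "and stage of the cancer, overall health, and response to treatment.")
print(words_and_fk_grade(answer))
```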
The queries are given below as Table 1 in their paper (source: https://pubmed.ncbi.nlm.nih.gov/36929393/).
Table 1.
Cancer myths and misconceptions questions, answers, and descriptive information (WC = word count; FKG = Flesch-Kincaid grade level)

| Question | NCI answer | NCI WC | NCI FKG | ChatGPT answer | ChatGPT WC | ChatGPT FKG |
|---|---|---|---|---|---|---|
| Is cancer a death sentence? |  | 133 | 15.8 | Cancer is not necessarily a death sentence. The prognosis for people with cancer depends on many factors, including the type and stage of the cancer, the person’s overall health, and the effectiveness of treatment. Some types of cancer can be cured, while others may be managed through treatments such as surgery, chemotherapy, radiation, or targeted therapies. In some cases, people with cancer may live many years with the disease, while in other cases, cancer may be more aggressive and may progress quickly. It is important to work with a healthcare team to determine the best treatment plan based on the specifics of your situation. | 104 | 11.4 |
| Will eating sugar make my cancer worse? | No. Although research has shown that cancer cells consume more sugar (glucose) than normal cells, no studies have shown that eating sugar will make your cancer worse or that, if you stop eating sugar, your cancer will shrink or disappear. However, a high-sugar diet may contribute to excess weight gain, and obesity is associated with an increased risk of developing several types of cancer. For more information, see the NCI fact sheet on Obesity and Cancer. | 76 | 12.7 |  | 234 | 13.7 |
| Do artificial sweeteners cause cancer? | No. Researchers have conducted studies on the safety of the artificial sweeteners (sugar substitutes) saccharin (Sweet ‘N Low®, Sweet Twin®, NectaSweet®); cyclamate; aspartame (Equal®, NutraSweet®); acesulfame potassium (Sunett®, Sweet One®); sucralose (Splenda®); and neotame and found no evidence that they cause cancer in humans. All of these artificial sweeteners except for cyclamate have been approved by the Food and Drug Administration for sale in the United States. For more information, see the NCI fact sheet on Artificial Sweeteners and Cancer. | 80 | 9.5 |  | 157 | 14.9 |
| Is cancer contagious? |  | 174 | 13.9 | Cancer is not a contagious disease, meaning it cannot be transmitted from person to person like a cold or the flu. Cancer is a disease in which abnormal cells divide and grow uncontrollably, forming tumors. There are many different types of cancer, and each type can have a different cause. Some types of cancer are caused by genetic factors, while others may be caused by exposure to certain substances, such as tobacco smoke or radiation. Some types of cancer may also be caused by infections, such as the human papillomavirus (HPV), which can cause certain types of cancer, including cervical cancer and some types of head and neck cancer. However, these infections are not transmitted through casual contact and are not contagious in the same way that infectious diseases are. | 130 | 11.2 |
| Does my attitude—positive or negative—determine my risk of getting or surviving cancer? | To date, there is no convincing scientific evidence that links a person’s “attitude” to their risk of developing or dying from cancer. If you have cancer, it’s normal to feel sad, angry, or discouraged sometimes and positive or upbeat at other times. People with a positive attitude may be more likely to maintain social connections and stay active, and physical activity and emotional support may help you cope with your cancer. For more information, see the NCI fact sheet on Stress and Cancer. | 83 | 11.2 |  | 167 | 20.2 |
| Can cancer surgery or a tumor biopsy cause cancer to spread in the body? | The chance that surgery will cause cancer to spread to other parts of the body is extremely low. Following standard procedures, surgeons use special methods and take many steps to prevent cancer cells from spreading during biopsies or surgery to remove tumors. For example, if they must remove tissue from more than one area of the body, they use different surgical tools for each area. For information about how cancer spreads in the body, see our page on Metastatic Cancer. | 80 | 11.0 |  | 215 | 13.4 |
| Will cancer get worse if exposed to air? | No. Exposure to air will not make tumors grow faster or cause cancer to spread to other parts of the body. For information about how cancer spreads in the body, see our page on Metastatic Cancer. | 36 | 8.8 |  | 198 | 12.8 |
| Do cell phones cause cancer? | No, not according to the best studies completed so far. Cancer is caused by genetic mutations, and cell phones emit a type of low-frequency energy that does not damage genes. For more information, see the NCI fact sheet on Cell Phones and Cancer Risk. | 44 | 7.5 |  | 196 | 15.3 |
| Do power lines cause cancer? | No, not according to the best studies completed so far. Power lines emit both electric and magnetic energy. The electric energy emitted by power lines is easily shielded or weakened by walls and other objects. The magnetic energy emitted by power lines is a low-frequency form of radiation that does not damage genes. For more information, see the NCI fact sheet on Electromagnetic Fields and Cancer. | 66 | 13.2 |  | 180 | 13.7 |
| Are there herbal products that can cure cancer? | No. Although some studies suggest that alternative or complementary therapies, including some herbs, may help patients cope with the side effects of cancer treatment, no herbal products have been shown to be effective for treating cancer. In fact, some herbal products may be harmful when taken during chemotherapy or radiation therapy because they may interfere with how these treatments work. Cancer patients should talk with their doctor about any complementary and alternative medicine products—including vitamins and herbal supplements—they may be using. For more information, see Complementary and Alternative Medicine. | 91 | 15.0 |  | 154 | 15.5 |
| If someone in my family has cancer, am I likely to get cancer, too? |  | 128 | 13.5 |  | 164 | 17.4 |
| If no one in my family has had cancer, does that mean I’m risk-free? | No. Based on the most recent data, about 38% of men and women will be diagnosed with cancer at some point during their lives. Most cancers are caused by genetic changes that occur throughout a person’s lifetime as a natural result of aging and exposure to environmental factors, such as tobacco smoke and radiation. Other factors, such as what kind of food you eat, how much you eat, and whether you exercise, may also influence your risk of developing cancer. For more information, see Cancer Causes and Risk Factors. | 90 | 11.8 |  | 157 | 18.7 |
| Do antiperspirants or deodorants cause breast cancer? | No. The best studies so far have found no evidence linking the chemicals typically found in antiperspirants and deodorants with changes in breast tissue. For more information, see the NCI fact sheet on Antiperspirants/Deodorants and Breast Cancer. | 38 | 12.6 |  | 188 |  |
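The agreement statistics quoted in the abstract above (100% for NCI answers, 96.9% for ChatGPT, and a near-zero kappa) come from comparing binary accuracy ratings on these questions. A minimal sketch of how percent agreement and Cohen’s kappa are computed from such ratings, using invented placeholder ratings rather than the study’s data:

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of binary (1 = accurate, 0 = not accurate) ratings."""
    n = len(ratings_a)
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each set's marginal 'accurate' rate
    p_a = sum(ratings_a) / n
    p_b = sum(ratings_b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical accuracy ratings for 13 questions (placeholders, not the study's data)
nci_ratings     = [1] * 13
chatgpt_ratings = [1] * 12 + [0]

agreement = sum(a == b for a, b in zip(nci_ratings, chatgpt_ratings)) / 13
print(f"Percent agreement: {100 * agreement:.1f}%")
print(f"Cohen's kappa: {cohens_kappa(nci_ratings, chatgpt_ratings):.2f}")
```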
5/20/2023
Synchronous Bilateral Breast Cancer: A Case Report Piloting and Evaluating the Implementation of the AI-Powered Large Language Model (LLM) ChatGPT
Reporter: Stephen J. Williams, Ph.D.
A report by Naik et al. (2023) compares case-report text for a rare type of breast cancer, synchronous bilateral breast cancer (s-BBC), generated either by ChatGPT alone or by a combination of ChatGPT and human curators, with the aim of increasing the accuracy and speed of case report generation. Clinically, synchronous bilateral breast cancer is defined as the presence of invasive or in situ breast cancer in both breasts diagnosed within 6 months of each other. The diagnosis is difficult, as s-BBC presents at different stages and with different histopathological characteristics, including prognostic factors. The authors present the case report of s-BBC along with the treatment, prognosis, and clinical management; details of clinical MRI images as well as molecular profiles of the cancer are given. To generate the report, the authors supplied ChatGPT with ten PubMed references as part of the query used to produce an AI-authored case report.
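The paper does not reproduce the authors’ exact prompts, but the general workflow of supplying reference material and asking a chat model to draft case-report text can be sketched with the OpenAI Python client, as below. The model name, prompt wording, and reference strings are placeholders, not the authors’ actual queries (they worked through the ChatGPT interface itself).

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Placeholder citations standing in for the ten PubMed references the authors supplied
references = [
    "PMID 00000001: Outcomes of synchronous bilateral breast cancer ...",
    "PMID 00000002: Breast-conserving surgery in bilateral disease ...",
]

prompt = (
    "Using only the references below, draft a discussion section for a case report "
    "of synchronous bilateral breast cancer (s-BBC). Cite references by PMID.\n\n"
    + "\n".join(references)
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```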
As credit is given in the acknowledgements:
Acknowledgments
The manuscript includes input from the AI-powered large language model (LLM) ChatGPT (OpenAI, San Francisco, CA), although a significant (<50%) component of the text is manually (human) generated. Text generated by ChatGPT versus human input, as well as text analysis via GPTzero (OpenAI, San Francisco, CA) of the three queries to ChatGPT and one query via domain-trained LLM (Bloom), as well as author impressions of the writing process, are available in the appendix.
Abstract
Primary breast carcinoma is the most common cancer type in women, and although bilateral synchronous breast cancers (s-BBC) remain quite rare, the reported incidence may increase with the adoption of more sensitive imaging modalities. Here, we present a case of histomorphological and clinically distinct s-BBC, together with a discussion of clinical management decisions, prognosis, and treatment standards and how these relate to outcomes vis-à-vis more established standards in unifocal breast carcinoma. The case report also constitutes a pilot and formal evaluation of a large language model (LLM) of ChatGPT as a tool to aid in generating a single patient case report.
You can read the full report here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10183235/
Below are Dr. Gurda’s comments, which are found at the conclusion of the paper:
Author impressions (Dr. Gurda)
Unfortunately, there is no such thing as a “free lunch” and not everything is as easy as it seems. The discussion sections generated by my queries to ChatGPT seemed to make sense, and superficially, the text generated appeared sensible, but the numbers and the citations produced were entirely fictitious. The authors and the journals existed; the article titles and numbers made general sense but were apparently pulled out of “thin air”, or rather generated by a confluence of text in the training set. As an example, ChatGPT text reads:
“In a retrospective study of 166 patients with s-BBC who underwent BCS, the 10-year overall survival rate was 77%, and the disease-free survival rate was 67%.
The study found that the presence of lobular carcinoma in situ, an extensive intraductal component, and lymph node involvement were significant predictors of worse outcomes (1). (1) Sun J, Huo L, Xie C, et al. The prognosis of synchronous bilateral breast cancer: a single-center experience. PLoS One. 2015;10(6):e0128343.”
It sounds credible, but this study does not exist. The authors exist, the journal exists, but the article and, most importantly, the data cited do not. Overall, for the purposes of generating a discussion section, ChatGPT at this point in its development (early 2023) appears to mostly have the characteristics of a ‘confidence man’ — superficially accurate and strong, but the substance rings hollow. Nonetheless, as stated elsewhere, I found ChatGPT and other LLMs to be useful in generating a skeletonized outline, organizing my thoughts about the subject in a logical manner (even if I had to gather the evidence for the arguments being made in the discussion ‘manually’ later), and in general as a motivation to get started and/or overcome writer’s block. Lastly, the full text of the article. What was generated by ChatGPT (verbatim) is highlighted, and manual human entry and edits are in standard text (Figures 4, 5):

Figure 3: ChatGPT queries for generation of the report; ten PubMed articles were supplied as input. Source: Naik HR, Prather AD, Gurda GT. Synchronous Bilateral Breast Cancer: A Case Report Piloting and Evaluating the Implementation of the AI-Powered Large Language Model (LLM) ChatGPT. Cureus. 2023 Apr 14;15(4):e37587. doi: 10.7759/cureus.37587. PMID: 37193434; PMCID: PMC10183235.
Note that the authors had to add the references in manually. ChatGPT, as discussed in last week’s article, does not return the sources of its information.

Figure 4: ChatGPT-generated (yellow highlight) and human-generated (non-highlighted) parts of the case report for s-BBC. Note that many annotated features, such as references and figure numbers, had to be manually curated.
Source: Naik HR, Prather AD, Gurda GT. Synchronous Bilateral Breast Cancer: A Case Report Piloting and Evaluating the Implementation of the AI-Powered Large Language Model (LLM) ChatGPT. Cureus. 2023 Apr 14;15(4):e37587. doi: 10.7759/cureus.37587. PMID: 37193434; PMCID: PMC10183235.
Also note that ChatGPT could not determine which figure in a paper, or which data, should be referred to in the AI-generated case report; these also had to be added manually by the authors.
And last to note:
The authors (not the AI) obtained consent or consent waivers from all study participants. Note how human involvement is needed for proper regulatory compliance. It is probable that an AI-generated report would not have included such a statement or obtained consent, which underscores the initial question:
Is AI and ChatGPT, used in clinical reporting, HIPAA compliant?
Naik HR, Prather AD, Gurda GT. Synchronous Bilateral Breast Cancer: A Case Report Piloting and Evaluating the Implementation of the AI-Powered Large Language Model (LLM) ChatGPT. Cureus. 2023 Apr 14;15(4):e37587. doi: 10.7759/cureus.37587. PMID: 37193434; PMCID: PMC10183235.
5/06/2023
Can ChatGPT Revolutionize How Cancer Patients Retrieve Relevant Information? A Critique of a head-to-head Comparison with Google Search
Reporter: Stephen J. Williams, Ph.D.
A recent paper in JNCI Cancer Spectrum, “Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift,” discusses the utility of ChatGPT versus Google Search as a source of cancer patient information on disease and treatment [1]. The authors compared Google Search results with ChatGPT results for a series of queries related to cancer, cancer risk, and outcomes, specifically questions comparable to those frequently asked by patients. However, contrary to the paradigm shift suggested by the title, a more nuanced analysis of these results suggests that each format has its pros and cons as a source of reliable patient information, and that ChatGPT is in its infancy with respect to its ability to supply simple yet comprehensive information.
Both tools were asked questions such as:
- What are the most common cancer types?
- What exactly is cancer?
The authors reported the main response from Google and the top 5 responses from ChatGPT. Google returned “snippets,” one- to two-sentence answers stating, for example, that breast, prostate, and lung cancer are the most common cancers. These Google snippets always included the source, usually as a referenced weblink. ChatGPT answers were more nuanced, providing contextualization according to the patient’s characteristics (for example, sex or age). When asked “What is cancer?”, ChatGPT, across the 5 returned responses, showed variability in supplemental information such as causes, treatments, and symptoms. However, unlike Google Search, ChatGPT did not return weblinks.
This unreferenced material would be a hindrance to the professional, who requires a hyperlink to a source in order to investigate and verify additional related information. The authors suggest that the disruptive nature of ChatGPT lies in its ability to formulate responses and advice to complex questions, rather than simple queries.
This divergence in utility for a patient versus a professional can be seen when analyzing the responses to more specific questions involving discrete medical terminology.
The authors asked GPT and Google these three questions:
- Is coughing a sign of lung cancer
- What is the prognosis of metastatic pancreatic cancer
- Does pembrolizumab cause fever, and should I go to the hospital
Let’s look at the differences in response:
The first notable difference is in supplemental information and sources. While ChatGPT does not give you the source of its information, Google returns informative and authoritative weblinks:
Is coughing a sign of lung cancer? → 7,510,000 weblinks from Google
Google response: “The main symptoms of lung cancer include a cough that does not go away after 3 weeks; a long-standing cough that gets worse; chest infections that keep coming back”
Note: this is short but informative enough for a patient; it does not contain much scientific detail, but the link to the NHS source is sufficient.
ChatGPT responses: Its responses include the statement “Coughing can sometimes be a sign of lung cancer, but…” followed by descriptions of other diseases for which coughing is a symptom. None of the responses give a duration of symptoms, unlike the Google response.
What is the prognosis of metastatic pancreatic cancer? → 29,500,000 weblinks from Google
This was more telling as to the difference between the clinical relevance between Google and ChatGPT.
Google response snippet: “Stage IV pancreatic cancer has a five-year survival rate of 1%. The average patient diagnosed with late-stage pancreatic cancer will live for about 1 year after diagnosis”, with a weblink from Johns Hopkins. The information is clinically relevant, as it comes from an authoritative source of medical information and makes clear that metastatic disease is referred to as stage IV. In addition, Google, using the Johns Hopkins source, gives correct and up-to-date information on PROGNOSTIC values. However, look at the ChatGPT responses:
ChatGPT responses:
Response 1: “Metastatic pancreatic cancer is a very serious and difficult to treat form of the disease. The prognosis for people with metastatic pancreatic cancer is generally poor, with most people surviving less than six months after being diagnosed. However, it is important to remember that each person is different, and some people may be able to live longer with the disease. Treatment options, such as chemotherapy and radiation, can help to extend life expectancy and improve quality of life. It is important to work with a healthcare team to develop a treatment plan that is tailored to your individual needs.”
Response 2: “The prognosis for metastatic pancreatic cancer is generally poor. This type of cancer has a tendency to spread quickly and is often difficult to treat. In most cases, the cancer has already spread to other parts of the body by the time it is diagnosed, making it difficult to completely remove the cancer. The average life expectancy for people with this type of cancer is about six months, although some people may live longer with treatment. However, treatment options for metastatic pancreatic cancer are limited and may not be effective in all cases. It is important to talk with a doctor about the specific prognosis for an individual’s case.”
Response 3: “The prognosis of metastatic pancreatic cancer, also known as stage 4 pancreatic cancer, is generally not very good. This is because by the time pancreatic cancer has reached this advanced stage, it has typically spread to other parts of the body and is more difficult to treat. Most people with stage 4 pancreatic cancer survive for less than six months after being diagnosed, and only a small percentage of people live for more than one year. However, it is important to remember that every person is different, and some people may respond better to treatment than others. It is always best to discuss your specific situation with your doctor.”
Note that ChatGPT takes a more conversational approach and pitches the language at a less technical reading level (useful for patients). However, it takes about five tries before ChatGPT gives prognostic values (putting the prognosis at six months, with no mention of the five-year survival rate that most clinicians report), and it talks more about treatment, which most clinicians would refer to as predictive rather than prognostic information.
Does pembrolizumab cause fever, and should I go to the hospital? → 119,000 weblinks from Google
Note that a Google search returns the manufacturer’s website on Keytruda side effects at https://www.keytruda.com/side-effects/; however, ChatGPT’s first response is rather disappointing:
“I’m sorry, but I am not able to browse the internet and therefore do not have access to information about the effects of pembrolizumab. I would strongly recommend speaking with a healthcare professional if you are experiencing fever and are concerned about your health. They will be able to provide you with personalized medical advice based on your specific situation.”
For a patient chatbot this would be very frustrating. Other answers state that pembrolizumab is a type of cancer treatment called an immune checkpoint inhibitor and that fever is a side effect which should be reported to your doctor, but this is only the third possible response from ChatGPT:
“Pembrolizumab is a type of cancer treatment known as an immune checkpoint inhibitor. It can cause side effects such as fever, but it is important to discuss any concerns with your doctor. If you are experiencing fever or other severe symptoms, you should contact your doctor immediately and follow their advice. They will be able to provide you with personalized recommendations based on your individual situation. In some cases, it may be necessary to go to the hospital for further evaluation and treatment.”
The fourth response lists pembrolizumab as a type of cancer therapy called immunotherapy, which is a more understandable term for the patient.
In summary, while Google returns very informative weblinks, ChatGPT has the capacity to relate information in a more conversational tone and with less technical verbiage, more suitable for patients. However, it could be frustrating for patients to have to go through multiple iterations to obtain a suitable answer, and no references or source information are given. For the clinician, ChatGPT seems to have more learning and training to do before it reaches a level suitable for clinical use. ChatGPT’s ability to relate to patients, though, could be a benefit.
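Since much of the comparison above hinges on ChatGPT returning several different answers to the same question, note that this variability can be sampled programmatically. A minimal sketch, assuming the OpenAI Python client rather than the public ChatGPT interface the authors describe; the model name is a placeholder:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

question = "What is the prognosis of metastatic pancreatic cancer?"

# Request five completions in one call to see how much the answers vary
response = client.chat.completions.create(
    model="gpt-4",      # placeholder model name
    messages=[{"role": "user", "content": question}],
    n=5,                # number of alternative answers to sample
    temperature=1.0,    # higher temperature -> more variation between answers
)

for i, choice in enumerate(response.choices, start=1):
    print(f"Response {i}:\n{choice.message.content}\n")
```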
Link for Source article: https://academic.oup.com/jncics/article/7/2/pkad010/7049531
Reference
- Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ: Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectrum 2023, 7(2).
4/23/2023
ChatGPT and Potential Healthcare Implications of Large Language Models – a talk by Dr. George Shih, M.D., FACR
This talk was given at the March NCI Imaging and Informatics Community Webinar (IICW). During the event, Weill Cornell Medicine’s Dr. George Shih discusses ChatGPT and large language models.
ChatGPT is an AI chatbot, built on top of large language models, with the potential to impact many aspects of healthcare. Specifically, Dr. Shih discusses:
- examples of ChatGPT in healthcare, including current limitations.
- potential implications of ChatGPT and large language models for researchers, providers, patients, and more.
- how ChatGPT and AI’s evolution over time might impact users.
Both the Center for Biomedical Informatics and Information Technology and the Cancer Imaging Program organize the monthly NCI IICW. On the first Monday of every month, this event features scientific presentations and project updates. To receive updates on future topics and to access previous presentations and recordings, visit the IICW webpage.
Dr. Shih focuses his talk on the use of ChatGPT for analyzing DICOM images in radiology. ChatGPT did a reasonable job at report generation and even at coding in Python; however, many radiologists are still hesitant to use it because ChatGPT has problems with some medical terminology, although Dr. Shih believes that will be addressed over time. One interesting example was a case study given to ChatGPT with five choices for further evaluation. Much like multiple radiologists, ChatGPT suggested an answer not given in the list of choices; surprisingly, it did not use any of the choices given to it but drew on the literature to find the best suggestion.
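As a hypothetical illustration of the kind of Python coding task discussed in the talk, the snippet below reads basic header metadata from a DICOM file using the pydicom library. It is not code from Dr. Shih’s presentation or from ChatGPT, and the file path is a placeholder.

```python
import pydicom

def summarize_dicom(path: str) -> str:
    """Read a DICOM file and return a one-line summary of key header fields."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # header only, skip pixel data
    modality = ds.get("Modality", "unknown modality")
    body_part = ds.get("BodyPartExamined", "unspecified body part")
    study_date = ds.get("StudyDate", "unknown date")
    description = ds.get("StudyDescription", "")
    return f"{modality} study of {body_part} on {study_date}: {description}"

if __name__ == "__main__":
    print(summarize_dicom("example.dcm"))  # 'example.dcm' is a placeholder path
```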
However, he does state that ChatGPT is not good at giving citations or referencing the literature correctly. Note the very good explanation of how GPT-3 works (OpenAI built ChatGPT on it, and then, with human instructors, it became ChatGPT-4): the basis of ChatGPT is an algorithm that predicts the next word, much like your phone does when you start typing a word.
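To make the “predicting the next word” idea concrete, here is a toy sketch: a bigram table that, like a phone keyboard, suggests the word most often seen after the previous one. Real models such as ChatGPT use transformer networks over subword tokens, so this is only a cartoon of the principle, with a made-up miniature corpus.

```python
from collections import Counter, defaultdict

corpus = (
    "cancer is not a death sentence . cancer is a disease in which abnormal "
    "cells divide . the prognosis depends on the type and stage of the cancer ."
).split()

# Count how often each word follows each other word (a bigram table)
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word most frequently seen after `word` in the corpus."""
    counts = next_word_counts.get(word)
    return counts.most_common(1)[0][0] if counts else "?"

print(predict_next("cancer"))  # -> 'is' (seen twice after 'cancer')
print(predict_next("the"))     # -> 'prognosis' (candidates tied; first seen wins)
```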
Watch the video of this talk
4/13/2023
The utility of ChatGPT for cancer treatment information
Abstract
The use of large language models (LLMs) such as ChatGPT for medical question-answering is becoming increasingly popular. However, there are concerns that these models may generate and amplify medical misinformation. Because cancer patients frequently seek to educate themselves through online resources, some individuals will likely use ChatGPT to obtain cancer treatment information. This study evaluated the performance and robustness of ChatGPT in providing breast, prostate, and lung cancer treatment recommendations that align with National Comprehensive Cancer Network (NCCN) guidelines. Four prompt templates were created to explore how differences in how the query is posed impacts response. ChatGPT output was scored by 3 oncologists and a 4th oncologist adjudicated in cases of disagreement. ChatGPT provided at least one NCCN-concordant recommendation for 102/104 (98%) prompts. However, 35/102 (34.3%) of these also included a recommendation that was at least partially non-concordant with NCCN guidelines. Responses varied based on prompt type. In conclusion, ChatGPT did not perform well at reliably and robustly providing cancer treatment recommendations. Patients and clinicians should be aware of the limitations of ChatGPT and similar technologies for self-education.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The authors thank the Woods Foundation for their generous support of this work.
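The headline figures in the abstract above (102/104 prompts, or 98%, with at least one NCCN-concordant recommendation, and 35/102, or 34.3%, of those also containing a partially non-concordant recommendation) are simple tallies over adjudicated scores. A minimal sketch of that bookkeeping, with invented placeholder records rather than the study’s data:

```python
# Each prompt's adjudicated scores: did the output contain at least one
# NCCN-concordant recommendation, and did it also contain a non-concordant one?
# These records are invented placeholders, not the study's data.
scored_prompts = [
    {"any_concordant": True,  "any_nonconcordant": False},
    {"any_concordant": True,  "any_nonconcordant": True},
    {"any_concordant": False, "any_nonconcordant": True},
    # ... one record per prompt (104 in the study)
]

n_total = len(scored_prompts)
n_concordant = sum(p["any_concordant"] for p in scored_prompts)
n_mixed = sum(p["any_concordant"] and p["any_nonconcordant"] for p in scored_prompts)

print(f"At least one concordant recommendation: {n_concordant}/{n_total} "
      f"({100 * n_concordant / n_total:.1f}%)")
print(f"Of those, also partially non-concordant: {n_mixed}/{n_concordant} "
      f"({100 * n_mixed / n_concordant:.1f}%)")
```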